Architecture Notes

This page explains why the repository and runtime are split the way they are. The system is easier to reason about when it is viewed as a set of cooperating planes instead of a single application.

Figure 1: Architecture planes

    flowchart TB
    subgraph Experience Plane
        FE[Frontend]
        GW[API Gateway]
    end

    subgraph Intelligence Plane
        RAG[RAG service]
        PKG[lalandre_rag and lalandre_core]
    end

    subgraph Knowledge Plane
        PG[PostgreSQL]
        QD[Qdrant]
        NEO[Neo4j]
        REDIS[Redis]
    end

    subgraph Pipeline Plane
        BACK[backend/ Rust web API]
        ING[Rust importer services<br/>+ ingestion_service crate]
        WORK[chunking, embedding,<br/>extraction workers]
    end

    subgraph Observability Plane
        PROM[Prometheus]
        GRAF[Grafana]
    end

    FE --> GW --> RAG
    RAG --> PKG
    RAG --> PG
    RAG --> QD
    RAG --> NEO
    RAG --> REDIS
    BACK --> ING --> WORK
    WORK --> PG
    WORK --> QD
    WORK --> NEO
    GW --> PROM
    RAG --> PROM
    WORK --> PROM
    PROM --> GRAF
    

Main architectural principles

  • Thin services, thick shared packages. The reusable logic lives in packages/, while HTTP or worker entry points stay in services/.

  • Online and offline paths are separated. The chat runtime should stay responsive even when ingestion jobs are heavy.

  • Storage is specialized by concern. PostgreSQL stores canonical text and operational state, Qdrant handles vector retrieval, Neo4j stores relationship structure, and Redis supports queueing and transient runtime state.

  • The agentic layer is bounded. LLM-based planning is allowed to shape retrieval, but not to perform arbitrary side effects.

  • Observability is a first-class concern. Metrics, dashboards, and trace metadata are part of the architecture, not an afterthought.
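
The bounded-agent principle above can be illustrated with a small validator. This is a minimal sketch, not the project's implementation: the action names (vector_search, graph_expand, sql_lookup), the PlanStep type, and validate_plan are hypothetical names chosen for illustration. The idea is that the planner's output is treated as data and checked against an allowlist of read-only retrieval actions before anything runs.

```python
from dataclasses import dataclass, field

# Hypothetical allowlist: read-only retrieval actions the planner may request.
ALLOWED_ACTIONS = frozenset({"vector_search", "graph_expand", "sql_lookup"})


@dataclass(frozen=True)
class PlanStep:
    """One step proposed by the LLM planner (illustrative shape, not a real API)."""
    action: str
    params: dict = field(default_factory=dict)


class PlanRejected(ValueError):
    """Raised when a plan asks for something outside the allowlist."""


def validate_plan(steps, max_steps=5):
    """Return the steps as a list if every one is an allowed retrieval action."""
    if len(steps) > max_steps:
        raise PlanRejected(f"plan has {len(steps)} steps, limit is {max_steps}")
    for step in steps:
        if step.action not in ALLOWED_ACTIONS:
            raise PlanRejected(f"action not permitted: {step.action!r}")
    return list(steps)
```

With this shape, a plan such as [PlanStep("vector_search", {"k": 5})] passes, while a step named "delete_document" raises PlanRejected. The step limit is one assumed way to keep planning cost predictable.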

Figure 2: Online path versus offline path

    flowchart LR
    USER[User query] --> UI[Frontend]
    UI --> GW[API Gateway]
    GW --> RS[RAG service]
    RS --> STORES[PG + Qdrant + Neo4j]
    RS --> SSE[SSE events and grounded answer]

    SOURCES[Legal sources and documents] --> ING[Importer services]
    ING --> PG[(PostgreSQL acts and subdivisions)]
    PG --> W1[Chunking worker]
    W1 --> QD[(Qdrant)]
    W1 --> XQ[extract_jobs]
    W1 --> EQ[embed_jobs__preset]
    EQ --> W2[Embedding worker]
    XQ --> W3[Extraction worker]
    W2 --> PG
    W2 --> QD
    W3 --> PG
    W3 --> NEO[(Neo4j)]
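
The offline fan-out in Figure 2 can be sketched with in-memory queues. This is an illustration under stated assumptions, not the workers' actual code: a deque stands in for the Redis-backed queues, the chunking is a naive fixed-width split, the direct chunking-worker write to Qdrant is left out, and reading the suffix of embed_jobs__preset as an embedding preset name is a guess. The function names are hypothetical.

```python
from collections import defaultdict, deque

# In-memory stand-in for the queue backend (Redis in this document).
queues = defaultdict(deque)


def embed_queue_name(preset):
    # Assumption: the suffix of "embed_jobs__preset" names an embedding preset.
    return f"embed_jobs__{preset}"


def chunk_text(text, size=40):
    """Naive fixed-width chunking, only so there is something to enqueue."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def chunking_worker(act_id, text, preset="default"):
    """Split one act into chunks and fan out to the two downstream queues."""
    chunks = chunk_text(text)
    for index, chunk in enumerate(chunks):
        queues[embed_queue_name(preset)].append(
            {"act_id": act_id, "chunk_index": index, "text": chunk}
        )
    # Assumption: one extraction job per act.
    queues["extract_jobs"].append({"act_id": act_id})
    return len(chunks)


def drain(queue_name, handler):
    """Process every pending job on a queue and return the handler results."""
    results = []
    while queues[queue_name]:
        results.append(handler(queues[queue_name].popleft()))
    return results
```

Because the chunking worker only enqueues jobs, the embedding and extraction workers can be scaled or paused independently, which is the separation the diagram is meant to show.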
    

Storage responsibilities

Store | Main responsibility | Why it exists separately
--- | --- | ---
PostgreSQL | canonical document text, subdivisions, operational metadata, dashboard SQL panels | transactional source of truth and relational querying
Qdrant | vector similarity search over chunks and subdivisions | low-latency semantic retrieval
Neo4j | act-to-act and relation-oriented graph exploration | graph traversal and relationship-aware evidence
Redis | queueing and transient operational state | decouple online runtime from async workers
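
One way to picture the split between vector retrieval and canonical text is a two-step lookup behind narrow interfaces. The sketch below is hypothetical: VectorIndex, TextStore, and retrieve are illustrative names, the in-memory classes stand in for Qdrant and PostgreSQL, and it assumes the vector index returns chunk identifiers that are then resolved against the text store.

```python
from typing import Protocol


class VectorIndex(Protocol):
    """Narrow contract for the vector store (Qdrant's role in this document)."""
    def search(self, vector: list[float], k: int) -> list[str]: ...


class TextStore(Protocol):
    """Narrow contract for the canonical text store (PostgreSQL's role)."""
    def get_text(self, chunk_id: str) -> str: ...


class InMemoryVectorIndex:
    """Test double: brute-force nearest neighbours by squared distance."""
    def __init__(self, vectors):
        self._vectors = vectors

    def search(self, vector, k):
        def distance(item):
            return sum((a - b) ** 2 for a, b in zip(item[1], vector))
        ranked = sorted(self._vectors.items(), key=distance)
        return [chunk_id for chunk_id, _ in ranked[:k]]


class InMemoryTextStore:
    """Test double: chunk id to text mapping."""
    def __init__(self, texts):
        self._texts = texts

    def get_text(self, chunk_id):
        return self._texts[chunk_id]


def retrieve(query_vector, index: VectorIndex, texts: TextStore, k=2):
    """The vector index finds ids; the text store supplies the canonical text."""
    return [texts.get_text(chunk_id) for chunk_id in index.search(query_vector, k)]
```

Keeping each store behind its own small interface means shared packages can be tested with in-memory doubles, and a change to one storage contract stays local to its adapter.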

Why the repository is split by folders

Folder | Why it exists
--- | ---
frontend/ | UI, SSE consumption, and chat-side source rendering
services/ | deployable Python services/workers plus Rust importer crates
packages/ | shared Python logic reused by multiple services
backend/ | Rust web/document API
infra/ | local and deployment-time infrastructure definitions
monitoring/ | Grafana, Prometheus, exporters, and observability assets
docs/ | maintained Sphinx technical documentation

What matters most for maintainers

  • If you change retrieval behavior, the most important folders are packages/lalandre_rag/ and services/rag-service/.

  • If you change importer behavior, look at services/*_service/, services/ingestion_service/, and the async workers.

  • If you change the Rust web/document API, look at backend/.

  • If you change operational behavior, you will likely touch both application code and monitoring/.

  • If you change storage contracts, update both the relevant package and the architecture/operations docs.