Ingestion Pipeline

The indexing path is asynchronous and intentionally decoupled from the online query runtime.

    flowchart LR
    SRC[Source importers] --> PG[(PostgreSQL)]
    PG --> CQ[chunk_jobs]
    CQ --> CH[Chunking worker]
    CH --> EQ[embed_jobs__preset]
    EQ --> EM[Embedding worker]
    EM --> QD[(Qdrant)]
    PG --> XQ[extract_jobs]
    XQ --> EX[Extraction worker]
    EX --> NEO[(Neo4j)]
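
To make the decoupling concrete, the flowchart can be read as a small directed graph. The sketch below copies only the edges shown in the diagram and uses the diagram labels as node names. The helper functions `downstream` and `sinks` are hypothetical and exist only for illustration; they are not part of the codebase.

```python
from collections import deque

# Edges copied from the flowchart; node names are the diagram labels.
PIPELINE = {
    "Source importers": ["PostgreSQL"],
    "PostgreSQL": ["chunk_jobs", "extract_jobs"],
    "chunk_jobs": ["Chunking worker"],
    "Chunking worker": ["embed_jobs__preset"],
    "embed_jobs__preset": ["Embedding worker"],
    "Embedding worker": ["Qdrant"],
    "extract_jobs": ["Extraction worker"],
    "Extraction worker": ["Neo4j"],
}


def downstream(node):
    """Return every node reachable from `node`, in breadth-first order."""
    seen, order, pending = {node}, [], deque([node])
    while pending:
        current = pending.popleft()
        for nxt in PIPELINE.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                pending.append(nxt)
    return order


def sinks(node):
    """Return the terminal stores reachable from `node` (nodes with no outgoing edge)."""
    return [n for n in downstream(node) if n not in PIPELINE]
```

Under this reading, `sinks("chunk_jobs")` reaches only Qdrant and `sinks("extract_jobs")` reaches only Neo4j. The embedding and extraction branches share PostgreSQL as their source but have no edge between them.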

Stages

| Stage | Output | Primary components |
| --- | --- | --- |
| Import | Raw acts and subdivisions | Rust importer services in `services/*_service/`, backed by the shared `services/ingestion_service/` crate |
| Chunking | Chunk jobs and chunk rows | `services/chunking-worker`, `packages/lalandre_chunking` |
| Embedding | Vector collections per preset | `services/embedding-worker`, `services/embedding-service` |
| Extraction | Graph relations and quality metrics | `services/extraction-worker`, `packages/lalandre_extraction` |
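
The handoff from the chunking stage to the embedding stage might look roughly like the sketch below. It is not the actual worker code. In-memory deques stand in for the real queues, and the preset names, the job fields (`act_id`, `chunk_index`, `text`), the fixed-width splitter, and the per-preset fan-out are all assumptions. The queue name pattern `embed_jobs__<preset>` is likewise inferred from the `embed_jobs__preset` label in the diagram.

```python
from collections import defaultdict, deque

# In-memory stand-ins for the real queues; every name below is hypothetical.
queues = defaultdict(deque)
PRESETS = ["small", "large"]  # assumed preset names, for illustration only


def embed_queue(preset):
    # Assumes the diagram's "embed_jobs__preset" means one queue per preset.
    return f"embed_jobs__{preset}"


def split_into_chunks(text, size=40):
    """Naive fixed-width split; a placeholder for the real chunking logic."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def run_chunking_worker():
    """Drain chunk_jobs and hand every chunk to each preset's embed queue."""
    processed = 0
    while queues["chunk_jobs"]:
        job = queues["chunk_jobs"].popleft()
        for index, chunk in enumerate(split_into_chunks(job["text"])):
            for preset in PRESETS:
                queues[embed_queue(preset)].append(
                    {"act_id": job["act_id"], "chunk_index": index, "text": chunk}
                )
        processed += 1
    return processed


# Usage: one 100-character act is split into chunks of 40, 40, and 20 characters.
queues["chunk_jobs"].append({"act_id": "act-1", "text": "x" * 100})
run_chunking_worker()
```

In this sketch the chunking worker only writes to queues and never calls the embedding stage directly, so each stage can be scaled or restarted on its own.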

Operational signals

Queue depth is exposed via Redis-backed metrics and dashboard panels.
Worker execution volume and latency are emitted as Prometheus counters and histograms.
Relation sync, extraction status, and embedding estimates are surfaced on the current Grafana dashboards.
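
As a rough illustration of the three signal types, the sketch below models a counter, a cumulative-bucket histogram (the bucket shape Prometheus histograms use, where each bucket counts observations less than or equal to its upper bound), and a queue-depth reading. It uses only the standard library and is a stand-in, not the workers' instrumentation. The metric names, bucket bounds, and the `record_job` helper are assumptions.

```python
import bisect


class Counter:
    """Monotonically increasing count, e.g. jobs executed."""

    def __init__(self):
        self.value = 0

    def inc(self, amount=1):
        self.value += amount


class Histogram:
    """Bucketed observations; cumulative() mirrors Prometheus 'le' buckets."""

    def __init__(self, bounds):
        self.bounds = sorted(bounds)
        self.counts = [0] * (len(self.bounds) + 1)  # final slot is +Inf
        self.total = 0.0

    def observe(self, value):
        # bisect_left picks the first bound >= value, i.e. the 'le' bucket.
        self.counts[bisect.bisect_left(self.bounds, value)] += 1
        self.total += value

    def cumulative(self):
        running, out = 0, []
        for count in self.counts:
            running += count
            out.append(running)
        return out


JOBS_TOTAL = Counter()                    # hypothetical metric
JOB_SECONDS = Histogram([0.1, 0.5, 1.0])  # hypothetical bucket bounds


def record_job(duration_seconds):
    """Record one finished job: bump the counter and observe its latency."""
    JOBS_TOTAL.inc()
    JOB_SECONDS.observe(duration_seconds)


def queue_depth(queue):
    """Queue depth is simply the number of jobs still waiting."""
    return len(queue)


for seconds in (0.05, 0.3, 0.5, 2.0):
    record_job(seconds)
```

A real worker would presumably use a Prometheus client library and read queue depth from Redis rather than from a local list.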

Note

`backend/` hosts the Rust web/document API. The importer runtime itself lives under `services/*_service/` and reuses `services/ingestion_service/`.

For a worker-by-worker explanation of queue consumption, downstream handoff, and non-query jobs such as summaries and community detection, see Workers.