Ingestion Pipeline
The indexing path is asynchronous and intentionally decoupled from the online query runtime.
flowchart LR
SRC[Source importers] --> PG[(PostgreSQL)]
PG --> CQ[chunk_jobs]
CQ --> CH[Chunking worker]
CH --> EQ[embed_jobs__preset]
EQ --> EM[Embedding worker]
EM --> QD[(Qdrant)]
PG --> XQ[extract_jobs]
XQ --> EX[Extraction worker]
EX --> NEO[(Neo4j)]
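
The flowchart's routing can also be read as a small lookup from each queue to whatever sits downstream of its worker. The following is a minimal Rust sketch under stated assumptions, not the project's actual code. The type names `Queue`, `Store`, and `Handoff` are hypothetical. It also assumes that the `__preset` suffix in `embed_jobs__preset` stands for one embedding queue per preset, and that the chunking worker fans out to every configured preset; the diagram suggests both but does not spell them out.

```rust
/// Queues named in the flowchart.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Queue {
    Chunk,
    /// Assumption: one embedding queue per preset.
    Embed { preset: String },
    Extract,
}

/// Stores that terminate a branch of the flowchart.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Store {
    Qdrant,
    Neo4j,
}

/// What the worker draining a queue writes to next.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Handoff {
    Queue(Queue),
    Store(Store),
}

impl Queue {
    /// Queue name as it appears in the diagram.
    fn name(&self) -> String {
        match self {
            Queue::Chunk => "chunk_jobs".to_string(),
            Queue::Embed { preset } => format!("embed_jobs__{}", preset),
            Queue::Extract => "extract_jobs".to_string(),
        }
    }

    /// Downstream targets of the worker that consumes this queue.
    /// `presets` is only consulted for the (assumed) chunking fan-out.
    fn downstream(&self, presets: &[&str]) -> Vec<Handoff> {
        match self {
            Queue::Chunk => presets
                .iter()
                .map(|p| Handoff::Queue(Queue::Embed { preset: p.to_string() }))
                .collect(),
            Queue::Embed { .. } => vec![Handoff::Store(Store::Qdrant)],
            Queue::Extract => vec![Handoff::Store(Store::Neo4j)],
        }
    }
}
```

Under these assumptions, `Queue::Chunk.downstream(&["small"])` yields a single handoff to the queue named `embed_jobs__small` (the preset name `small` is made up for illustration), while the embedding and extraction queues end at Qdrant and Neo4j respectively. No path leads back toward the query runtime, which matches the decoupling described above.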
Stages
| Stage | Output | Primary components |
|---|---|---|
| Import | Raw acts and subdivisions | Rust importer services in |
| Chunking | Chunk jobs and chunk rows | |
| Embedding | Vector collections per preset | |
| Extraction | Graph relations and quality metrics | |
Operational signals
Queue depth is exposed via Redis-backed metrics and dashboard panels.
Worker execution volume and latency are emitted as Prometheus counters and histograms.
Relation sync, extraction status, and embedding estimates are surfaced on the current Grafana dashboards.
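
To make the counter and histogram signals concrete, here is a minimal in-memory sketch of what a worker might record per run. It is an illustration only: it does not use a Prometheus client library, the names `WorkerMetrics` and `LatencyHistogram` are hypothetical, and the bucket bounds are arbitrary. The one thing it mirrors from Prometheus is the cumulative bucket convention, where each bucket counts every observation less than or equal to its upper bound.

```rust
/// In-memory stand-in for a Prometheus-style histogram (hypothetical type).
struct LatencyHistogram {
    /// Bucket upper bounds in seconds, ascending (the `le` label in Prometheus).
    bounds: Vec<f64>,
    /// Cumulative counts: counts[i] is the number of observations <= bounds[i].
    counts: Vec<u64>,
    /// Total observations; plays the role of the implicit +Inf bucket.
    total: u64,
    /// Sum of all observed values, in seconds.
    sum: f64,
}

impl LatencyHistogram {
    fn new(bounds: &[f64]) -> Self {
        LatencyHistogram {
            bounds: bounds.to_vec(),
            counts: vec![0; bounds.len()],
            total: 0,
            sum: 0.0,
        }
    }

    /// Record one latency sample in every bucket whose bound covers it.
    fn observe(&mut self, seconds: f64) {
        for (bound, count) in self.bounds.iter().zip(self.counts.iter_mut()) {
            if seconds <= *bound {
                *count += 1;
            }
        }
        self.total += 1;
        self.sum += seconds;
    }
}

/// Per-worker signals: a monotonically increasing run counter plus latency.
struct WorkerMetrics {
    runs_total: u64,
    latency: LatencyHistogram,
}

impl WorkerMetrics {
    fn new(bounds: &[f64]) -> Self {
        WorkerMetrics {
            runs_total: 0,
            latency: LatencyHistogram::new(bounds),
        }
    }

    /// Call once per completed job with its wall-clock duration.
    fn record_run(&mut self, seconds: f64) {
        self.runs_total += 1;
        self.latency.observe(seconds);
    }
}
```

With bounds of 0.1, 0.5, and 1.0 seconds, recording runs of 0.05 s, 0.3 s, and 2.0 s leaves the cumulative bucket counts at 1, 2, and 2, with a run counter of 3. The slow run shows up only in the total, which is how a dashboard panel can surface jobs that exceed the largest bucket.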
Note
backend/ hosts the Rust web/document API. The importer runtime itself lives
under services/*_service/ and reuses services/ingestion_service/.
For a worker-by-worker explanation of queue consumption, downstream handoff, and non-query jobs such as summaries and community detection, see Workers.