Ingestion Pipeline

The indexing path is asynchronous and intentionally decoupled from the online query runtime.

    flowchart LR
        SRC[Source importers] --> PG[(PostgreSQL)]
        PG --> CQ[chunk_jobs]
        CQ --> CH[Chunking worker]
        CH --> EQ[embed_jobs__preset]
        EQ --> EM[Embedding worker]
        EM --> QD[(Qdrant)]
        PG --> XQ[extract_jobs]
        XQ --> EX[Extraction worker]
        EX --> NEO[(Neo4j)]
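
To make the two branches of the diagram easier to reason about, the sketch below encodes the same edges as a plain adjacency map and walks it. This is only an illustration: the `PIPELINE` dict and the `path_to` and `sinks` helpers are hypothetical names that do not exist in the repository, and the node labels are copied from the flowchart as-is.

```python
from collections import deque

# Edges copied from the flowchart; the dict layout is an illustrative choice,
# not a structure taken from the codebase.
PIPELINE = {
    "Source importers": ["PostgreSQL"],
    "PostgreSQL": ["chunk_jobs", "extract_jobs"],
    "chunk_jobs": ["Chunking worker"],
    "Chunking worker": ["embed_jobs__preset"],
    "embed_jobs__preset": ["Embedding worker"],
    "Embedding worker": ["Qdrant"],
    "extract_jobs": ["Extraction worker"],
    "Extraction worker": ["Neo4j"],
}


def path_to(target, start="Source importers"):
    """Breadth-first search for the hop sequence from start to target."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in PIPELINE.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # target is not reachable from start


def sinks():
    """Nodes with no outgoing edge, i.e. the stores the pipeline ends in."""
    targets = {node for outs in PIPELINE.values() for node in outs}
    return sorted(targets - set(PIPELINE))


if __name__ == "__main__":
    for store in sinks():
        print(" -> ".join(path_to(store)))
```

Running it prints one line per terminal store, which shows that the vector branch and the graph branch share only the importer and PostgreSQL hops.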
    

Stages

| Stage | Output | Primary components |
| --- | --- | --- |
| Import | Raw acts and subdivisions | Rust importer services in services/*_service/, backed by the shared services/ingestion_service/ crate |
| Chunking | Chunk jobs and chunk rows | services/chunking-worker, packages/lalandre_chunking |
| Embedding | Vector collections per preset | services/embedding-worker, services/embedding-service |
| Extraction | Graph relations and quality metrics | services/extraction-worker, packages/lalandre_extraction |

Operational signals

  • Queue depth is exposed via Redis-backed metrics and dashboard panels.

  • Worker execution volume and latency are emitted as Prometheus counters and histograms.

  • Relation sync, extraction status, and embedding estimates are surfaced on the current Grafana dashboards.
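
As a rough picture of what "counters and histograms" mean for a worker, the sketch below records one observation per processed job using only the standard library. It is a stand-in, not the project's instrumentation: `WorkerMetrics`, `drain`, the status labels, and the bucket bounds are all hypothetical, and the real workers would emit through a Prometheus client and read queue depth from the Redis-backed metrics instead of an in-memory deque.

```python
import time
from bisect import bisect_left
from collections import Counter, deque

# Hypothetical bucket upper bounds in seconds; the real bounds are not stated here.
BUCKETS = (0.01, 0.1, 1.0, float("inf"))


class WorkerMetrics:
    """In-memory stand-in for a job counter plus a cumulative latency histogram."""

    def __init__(self):
        self.jobs_total = Counter()  # keyed by (queue, status)
        self.latency_buckets = [0] * len(BUCKETS)

    def observe(self, queue, status, seconds):
        self.jobs_total[(queue, status)] += 1
        # Prometheus-style buckets are cumulative: increment every bucket
        # whose upper bound is greater than or equal to the observed value.
        for i in range(bisect_left(BUCKETS, seconds), len(BUCKETS)):
            self.latency_buckets[i] += 1


def drain(queue_name, jobs, handler, metrics, clock=time.perf_counter):
    """Process queued jobs until the queue is empty, one observation per job."""
    while jobs:
        job = jobs.popleft()
        started = clock()
        try:
            handler(job)
            status = "ok"
        except Exception:
            status = "error"  # failures are counted, not re-raised, in this sketch
        metrics.observe(queue_name, status, clock() - started)
    return len(jobs)  # remaining queue depth


if __name__ == "__main__":
    metrics = WorkerMetrics()
    depth = drain("chunk_jobs", deque(["job-1", "job-2"]), lambda job: None, metrics)
    print(depth, dict(metrics.jobs_total), metrics.latency_buckets)
```

Passing the clock as a parameter keeps the latency path testable with a fake timer, and counting failures under a separate status label keeps error volume visible alongside throughput.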

Note: backend/ hosts the Rust web/document API. The importer runtime itself lives under services/*_service/ and reuses services/ingestion_service/.

For a worker-by-worker explanation of queue consumption, downstream handoff, and non-query jobs such as summaries and community detection, see Workers.