Workers

The async workers are a core part of the RagLogic AI architecture. They keep indexing and enrichment work off the online query path while still producing the artifacts that the RAG runtime needs in production.

Figure 1: Worker chain

    flowchart LR
        PG[(PostgreSQL acts and subdivisions)] --> CQ[chunk_jobs]
        CQ --> CH[Chunking worker]
        CH --> PGC[(PostgreSQL chunks)]
        CH --> EQ[embed_jobs__preset]
        CH --> XQ[extract_jobs]

        EQ --> EM[Embedding worker]
        EM --> QD[(Qdrant chunks and acts)]

        XQ --> EX[Extraction worker]
        EX --> NEO[(Neo4j relations and communities)]
        EX --> PGS[(PostgreSQL summaries and extraction state)]

Why these workers exist

  • chunking-worker materializes chunk records from legal text subdivisions.

  • embedding-worker turns chunks and acts into vectors for semantic retrieval.

  • extraction-worker builds relationship structure, canonical summaries, and graph communities.

This split matters because each stage has different CPU, memory, I/O, and model requirements. It also allows partial reprocessing without blocking the online chat runtime.
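All three services share the same basic shape: consume a job message, look up a handler for its type, run it. A minimal in-memory sketch of that dispatch loop (the names, the message shape, and the `queue.Queue` stand-in for the real broker are all illustrative, not the actual worker code):

```python
import queue  # in-memory stand-in for the real message broker

def run_worker(jobs, handlers):
    """Drain a job queue, dispatching each message to a handler by type.

    `handlers` maps a job-type string (e.g. "chunk_act") to a callable.
    """
    results = []
    while not jobs.empty():
        job = jobs.get()
        results.append(handlers[job["type"]](job.get("payload", {})))
    return results

# Usage: route a single chunk_act job to a trivial handler.
q = queue.Queue()
q.put({"type": "chunk_act", "payload": {"act_id": 42}})
out = run_worker(q, {"chunk_act": lambda p: ("chunked", p["act_id"])})
```

Because the handlers are the only per-worker part, each stage can be scaled and deployed independently.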

Worker inventory

| Worker | Queue(s) consumed | Main output | Main code |
| --- | --- | --- | --- |
| Chunking worker | chunk_jobs | chunk rows in PostgreSQL; downstream embed/extract jobs | services/chunking-worker/worker.py, packages/lalandre_chunking/ |
| Embedding worker | preset-specific embed_jobs__* queues | vectors in Qdrant, embedding state in PostgreSQL | services/embedding-worker/worker.py, packages/lalandre_embedding/ |
| Extraction worker | extract_jobs | relations in Neo4j, extraction state and summaries, community detection | services/extraction-worker/worker.py, packages/lalandre_extraction/ |

Queue and job model

| Queue | Representative jobs | Purpose |
| --- | --- | --- |
| chunk_jobs | chunk_all, chunk_act | fan out or process chunk generation for acts |
| embed_jobs__preset | embed_all, embed_act | generate embeddings for a specific preset/model |
| extract_jobs | extract_all, extract_act, summarize_all, summarize_act, build_communities | build graph relations, summaries, and graph communities |
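The *_all jobs are fan-out jobs: they expand into one per-act job on the same queue. A sketch of the chunk_all → chunk_act expansion (the `enqueue` callable and job fields are stand-ins, not the real broker API):

```python
def fan_out(act_ids, enqueue):
    """Expand a *_all job into one per-act job, mirroring the
    chunk_all -> chunk_act fan-out in the table above."""
    for act_id in act_ids:
        enqueue("chunk_jobs", {"type": "chunk_act", "act_id": act_id})
    return len(act_ids)

# Usage: capture what would be published to the broker.
published = []
n = fan_out([1, 2, 3], lambda q, job: published.append((q, job)))
```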

What each worker really does

Chunking worker

  • reads acts and subdivisions from PostgreSQL,

  • builds chunk rows, including article-level aggregation when enabled,

  • deletes stale chunk and embedding state on forced reprocessing,

  • enqueues downstream embed_act jobs for every indexing-enabled embedding preset,

  • enqueues downstream extract_act jobs so graph enrichment stays aligned with the latest chunks.
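Put together, the chunk_act flow above can be sketched as follows. The function and field names, the row shape, and the preset structure are illustrative assumptions, not the real schema:

```python
def handle_chunk_act(act, presets, enqueue, force=False):
    """Sketch of the chunk_act flow described in the list above."""
    if force:
        pass  # the real worker deletes stale chunk/embedding state here
    chunks = [
        {"act_id": act["id"], "subdivision_id": s["id"], "text": s["text"]}
        for s in act["subdivisions"]
    ]
    # one embed_act job per indexing-enabled embedding preset
    for preset in presets:
        if preset["indexing_enabled"]:
            enqueue("embed_jobs__" + preset["name"],
                    {"type": "embed_act", "act_id": act["id"]})
    # keep graph enrichment aligned with the fresh chunks
    enqueue("extract_jobs", {"type": "extract_act", "act_id": act["id"]})
    return chunks

# Usage: one act, two presets, only one enabled for indexing.
sent = []
act = {"id": 7, "subdivisions": [{"id": "art-1", "text": "Article 1 ..."}]}
presets = [{"name": "default", "indexing_enabled": True},
           {"name": "legacy", "indexing_enabled": False}]
chunks = handle_chunk_act(act, presets, lambda q, j: sent.append(q))
```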

Embedding worker

  • is bound to one embedding preset and one queue at a time,

  • embeds chunk content into Qdrant chunk collections,

  • optionally builds one act-level vector for EUR-Lex acts,

  • persists embedding-state rows in PostgreSQL so missing vectors can be reconciled later,

  • can re-enqueue an embed_all reconcile job on startup when vectors are missing for the active provider/model/vector size.
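The embedding-state rows are what make that reconcile possible: comparing them against the chunk set reveals which vectors are missing for the active configuration. A sketch under assumed field names (the real schema is not shown in this page):

```python
def missing_chunk_ids(chunk_ids, state_rows, provider, model, dim):
    """Reconcile sketch: return chunks with no embedding-state row for
    the active provider/model/vector size."""
    done = {
        r["chunk_id"]
        for r in state_rows
        if (r["provider"], r["model"], r["dim"]) == (provider, model, dim)
    }
    return [c for c in chunk_ids if c not in done]

# Usage: chunk 2 was embedded with a different model, chunk 3 not at all.
state = [{"chunk_id": 1, "provider": "openai", "model": "m1", "dim": 1536},
         {"chunk_id": 2, "provider": "openai", "model": "m0", "dim": 1536}]
todo = missing_chunk_ids([1, 2, 3], state, "openai", "m1", 1536)
```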

Extraction worker

  • extracts legal relations for one act and syncs them to Neo4j,

  • writes or refreshes canonical summaries,

  • triggers community detection after successful relation extraction,

  • owns the richest job family because it covers relation extraction, summaries, and graph-level community materialization.
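The ordering matters: summaries and community detection are enqueued only after relation extraction succeeds, so the graph never summarizes stale relations. A sketch of that handoff, with the extraction and graph-sync steps stubbed out as callables (names hypothetical):

```python
def handle_extract_act(act_id, extract, sync_to_graph, enqueue):
    """Sketch of extract_act: extract relations, sync them to the graph,
    then enqueue the follow-up jobs named above."""
    relations = extract(act_id)
    sync_to_graph(act_id, relations)
    enqueue("extract_jobs", {"type": "summarize_act", "act_id": act_id})
    enqueue("extract_jobs", {"type": "build_communities"})
    return len(relations)

# Usage: stub out the heavy steps, observe the follow-up jobs.
queued = []
n = handle_extract_act(
    5,
    extract=lambda a: [("amends", "act-9"), ("cites", "act-3")],
    sync_to_graph=lambda a, rels: None,
    enqueue=lambda q, j: queued.append(j["type"]),
)
```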

Figure 2: Event-driven handoff

    sequenceDiagram
        participant PG as PostgreSQL
        participant CH as Chunking worker
        participant EQ as Embed queues
        participant EM as Embedding worker
        participant XQ as Extract queue
        participant EX as Extraction worker
        participant QD as Qdrant
        participant NEO as Neo4j

        PG->>CH: subdivision content
        CH->>PG: insert chunks
        CH->>EQ: enqueue embed_act per preset
        CH->>XQ: enqueue extract_act
        EQ->>EM: embed_act
        EM->>QD: upsert chunk and act vectors
        XQ->>EX: extract_act
        EX->>NEO: write relations
        EX->>XQ: enqueue summarize_act and build_communities

Reconcile behavior

Each worker also has a startup reconcile path so the system can recover from missed jobs or partially indexed data:

  • chunking can enqueue chunk_all,

  • embedding can enqueue embed_all per preset,

  • extraction can enqueue extract_all and summarize_all.

That makes the pipeline more resilient than a pure fire-and-forget queue model.
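The per-worker mapping above can be sketched as a small startup hook. The `needs_reconcile` check and the fan-out job table are illustrative stand-ins for the real missing-data detection:

```python
FANOUT_JOBS = {
    "chunking": ["chunk_all"],
    "embedding": ["embed_all"],  # enqueued per preset queue in practice
    "extraction": ["extract_all", "summarize_all"],
}

def startup_reconcile(worker, needs_reconcile, enqueue):
    """On startup, enqueue the worker's fan-out job(s) when stored state
    looks incomplete; otherwise do nothing."""
    if not needs_reconcile():
        return []
    jobs = [{"type": t} for t in FANOUT_JOBS[worker]]
    for job in jobs:
        enqueue(job)
    return jobs

# Usage: an extraction worker that detects missing data at boot.
sent = []
startup_reconcile("extraction", lambda: True, sent.append)
```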

Metrics and operations

The workers expose their own Prometheus endpoints and feed the runtime dashboards:

  • chunking-worker:9109/metrics

  • embedding-worker:9108/metrics

  • extraction-worker:9107/metrics

Operationally, the most useful signals are:

  • queue depth,

  • jobs executed and failed,

  • worker latency histograms,

  • extraction status and graph-sync state,

  • embedding backlog versus indexed corpus size.
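Two of these signals combine naturally into a single readiness check: queue depth and embedding backlog versus indexed corpus size. A minimal sketch with illustrative thresholds (the cutoffs are assumptions, not documented alert rules):

```python
def pipeline_health(queue_depth, indexed_chunks, corpus_chunks,
                    max_depth=1000, min_coverage=0.99):
    """Fold queue depth and embedding coverage into one health verdict.

    Coverage is the fraction of the corpus that already has vectors;
    both thresholds here are illustrative defaults.
    """
    coverage = indexed_chunks / corpus_chunks if corpus_chunks else 1.0
    return {
        "coverage": round(coverage, 4),
        "healthy": queue_depth <= max_depth and coverage >= min_coverage,
    }

# Usage: a small backlog on an almost fully indexed corpus.
status = pipeline_health(queue_depth=12, indexed_chunks=9_950,
                         corpus_chunks=10_000)
```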