Retrieval and Graph Support
The runtime does not perform a single vector lookup. It uses routing, parallel retrieval, reranking, adaptive cutoff, and optional graph augmentation.
Retrieval stages
| Stage | Behavior |
|---|---|
| Routing | |
| Expansion | Query variants can be derived before retrieval. |
| Parallel search | Semantic and lexical branches run in parallel. |
| Fusion | Weighted or RRF fusion combines ranked candidates. |
| Rerank | HTTP reranker or local fallback refines ordering. |
| Cutoff and MMR | The service trims weak tails and diversifies results. |
| Context enrichment | Acts, metadata, relations, and graph context are added before generation. |
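As an illustration of the Fusion stage, the following is a minimal sketch of Reciprocal Rank Fusion (RRF) over two ranked candidate lists. The function name `rrf_fuse`, the document ids, and the constant `k=60` (a commonly used default for RRF) are assumptions made for this example; they are not taken from the service's code or configuration.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of document ids with Reciprocal Rank Fusion.

    Each document receives 1 / (k + rank) from every list it appears in,
    where rank starts at 1. `k` is an assumed smoothing constant.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical outputs of the semantic and lexical branches.
semantic = ["act-12", "act-7", "act-3"]
lexical = ["act-7", "act-44", "act-12"]

fused = rrf_fuse([semantic, lexical])
print(fused)
```

Because RRF uses only rank positions, it does not require the semantic and lexical scores to be on a comparable scale. A weighted fusion would instead need the raw scores to be normalized first.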
Graph usage
Graph support appears in two distinct places:
- Context enrichment on top of retrieved acts and relationships.
- A parallel Cypher support branch that contributes extra evidence and trace data.
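The parallel layout described above can be sketched with `asyncio`. Everything in this example is hypothetical: the branch functions are stubs, the Cypher string is a placeholder, and the choice to degrade a failed branch to `None` rather than fail the whole request is an assumption, not documented service behavior.

```python
import asyncio


async def semantic_search(query):
    # Stub standing in for the vector branch.
    await asyncio.sleep(0)
    return ["act-12", "act-7"]


async def lexical_search(query):
    # Stub standing in for the keyword branch.
    await asyncio.sleep(0)
    return ["act-7", "act-44"]


async def graph_support(query):
    # Stub standing in for the Cypher support branch: it returns extra
    # evidence plus trace data. The query string is a placeholder.
    await asyncio.sleep(0)
    return {
        "evidence": ["act-7 -> act-3"],
        "trace": ["MATCH (a:Act)-[r]->(b) RETURN a, r, b LIMIT 5"],
    }


async def retrieve_branches(query, use_graph=True):
    """Run the retrieval branches concurrently and collect results by name."""
    branches = {
        "semantic": semantic_search(query),
        "lexical": lexical_search(query),
    }
    if use_graph:
        branches["graph"] = graph_support(query)

    # return_exceptions=True keeps one failing branch from cancelling the rest.
    results = await asyncio.gather(*branches.values(), return_exceptions=True)
    return {
        name: (None if isinstance(result, Exception) else result)
        for name, result in zip(branches, results)
    }


with_graph = asyncio.run(retrieve_branches("example query"))
without_graph = asyncio.run(retrieve_branches("example query", use_graph=False))
print(sorted(with_graph), sorted(without_graph))
```

Keeping the graph branch as a separate named result, instead of merging it into the ranked candidates, makes it straightforward to attach its trace data to the response without affecting fusion.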
The agentic planner that decides when retrieval should clarify, refine, run complementary searches, or compress context is documented in Agentic Runtime.
Operational consequences
- Backend health for PostgreSQL, Qdrant, and Neo4j is exported by `rag-service`.
- Retrieval errors are exported separately from provider errors.
- Phase timing metrics include the graph query and graph retrieval prefixes when those branches run.
For an executive-facing explanation of why this GraphRAG design matters commercially and how its performance has been measured, see GraphRAG Commercial Brief.