Retrieval and Graph Support

Retrieval in the runtime is a multi-stage pipeline rather than a single vector lookup. It combines query routing, parallel semantic and lexical retrieval, fusion, reranking, adaptive cutoff, and optional graph augmentation.

Retrieval stages

| Stage | Behavior |
| --- | --- |
| Routing | `QueryRouter` chooses search profile, granularity, and graph usage. |
| Expansion | Query variants can be derived before retrieval. |
| Parallel search | Semantic and lexical branches run in parallel. |
| Fusion | Weighted or RRF fusion combines ranked candidates. |
| Rerank | HTTP reranker or local fallback refines ordering. |
| Cutoff and MMR | The service trims weak tails and diversifies results. |
| Context enrichment | Acts, metadata, relations, and graph context are added before generation. |
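
The fusion, cutoff, and MMR stages listed above can be illustrated with a minimal sketch. It is not the service's implementation: the function names, the RRF constant `k=60`, the relative-threshold cutoff (a simple stand-in for the adaptive cutoff), the `lam` trade-off, and the example IDs and vectors are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion over best-first lists of document IDs."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def relative_cutoff(ranked, ratio=0.6):
    """Trim the weak tail: keep items scoring at least ratio * top score.

    Assumes non-negative scores sorted best-first.
    """
    if not ranked:
        return []
    threshold = ranked[0][1] * ratio
    return [item for item in ranked if item[1] >= threshold]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(ranked, vectors, top_n, lam=0.7):
    """Greedy Maximal Marginal Relevance over (doc_id, score) pairs."""
    if not ranked:
        return []
    top = ranked[0][1] or 1.0  # normalize relevance relative to the top score
    remaining = list(ranked)
    selected = []
    while remaining and len(selected) < top_n:
        def marginal(item):
            doc_id, score = item
            # Redundancy is the highest similarity to anything already chosen.
            redundancy = max(
                (cosine(vectors[doc_id], vectors[s]) for s, _ in selected),
                default=0.0,
            )
            return lam * (score / top) - (1 - lam) * redundancy
        best = max(remaining, key=marginal)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical branch outputs and embeddings, for illustration only.
semantic = ["act-12", "act-7", "act-3", "act-9"]
lexical = ["act-7", "act-44", "act-12", "act-3"]
vectors = {"act-7": [1.0, 0.0], "act-12": [0.99, 0.1], "act-3": [0.0, 1.0]}

fused = rrf_fuse([semantic, lexical])
trimmed = relative_cutoff(fused)
diverse = mmr(trimmed, vectors, top_n=2)
```

In this example the cutoff drops the two candidates that appear in only one branch, and MMR then prefers `act-3` over `act-12` because `act-12` is nearly identical to the already selected `act-7`.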

Graph usage

Graph support appears in two distinct places:

1. Context enrichment on top of retrieved acts and relationships.
2. A parallel Cypher support branch that contributes extra evidence and trace data.
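
The second of these, a graph branch running alongside the main retrieval path, might look roughly like the sketch below. Everything here is hypothetical: `vector_retrieval` and `graph_support` are stubs standing in for the real retrieval path and the Cypher query, the trace fields are invented, and the choice to record a graph failure in the trace instead of failing the request is an assumption, not documented behavior.

```python
import asyncio


async def vector_retrieval(query):
    # Stub for the fused semantic/lexical retrieval path.
    await asyncio.sleep(0)
    return [{"id": "act-7", "source": "retrieval"}]


async def graph_support(query):
    # Stub for a Cypher query against the graph store.
    await asyncio.sleep(0)
    return [{"id": "act-44", "source": "graph"}]


async def retrieve_with_graph(query, use_graph, graph_branch=graph_support):
    """Run retrieval and, optionally, the graph branch concurrently."""
    trace = {"graph_used": False, "graph_error": None}
    tasks = [vector_retrieval(query)]
    if use_graph:
        tasks.append(graph_branch(query))

    # return_exceptions=True lets one branch fail without cancelling the other.
    results = await asyncio.gather(*tasks, return_exceptions=True)

    primary = results[0]
    if isinstance(primary, BaseException):
        raise primary  # the main retrieval path is treated as mandatory

    evidence = list(primary)
    if use_graph:
        graph_result = results[1]
        if isinstance(graph_result, BaseException):
            # Assumption: a failed graph branch only degrades the answer.
            trace["graph_error"] = repr(graph_result)
        else:
            trace["graph_used"] = True
            evidence.extend(graph_result)
    return evidence, trace


evidence, trace = asyncio.run(retrieve_with_graph("example query", use_graph=True))
```

Keeping the graph branch optional and isolated in this way means the routing decision about graph usage only changes which tasks are scheduled, not how the results are merged.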

The agentic planner that decides when retrieval should clarify, refine, run complementary searches, or compress context is documented in Agentic Runtime.

Operational consequences

- Backend health for PostgreSQL, Qdrant, and Neo4j is exported by rag-service.
- Retrieval errors are exported separately from provider errors.
- Phase timing metrics include the graph query and graph retrieval prefixes when those branches run.
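
As a rough illustration of the last point, per-phase timing can be collected with a small context manager so that graph phases only appear when the graph branches actually execute. The `PhaseTimer` class and the phase names `retrieval`, `graph_query`, and `graph_retrieval` are hypothetical; the real metric names and exporter are not specified here.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class PhaseTimer:
    """Accumulates wall-clock seconds per named phase (hypothetical helper)."""

    def __init__(self):
        self.durations = defaultdict(float)

    @contextmanager
    def phase(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the duration even if the phase raised.
            self.durations[name] += time.perf_counter() - start


def run_request(use_graph):
    timer = PhaseTimer()
    with timer.phase("retrieval"):
        pass  # placeholder for the main retrieval work
    if use_graph:
        # Graph phases are only recorded when the branches run.
        with timer.phase("graph_query"):
            pass  # placeholder for the Cypher query
        with timer.phase("graph_retrieval"):
            pass  # placeholder for graph-based context retrieval
    return dict(timer.durations)


with_graph = run_request(use_graph=True)
without_graph = run_request(use_graph=False)
```

A dashboard built on such metrics would see no graph series at all for requests where routing disabled the graph, which is worth keeping in mind when alerting on missing data.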

For an executive-facing explanation of why this GraphRAG design matters commercially and how its performance has been measured, see GraphRAG Commercial Brief.