Retrieval and Graph Support

The runtime does more than a single vector lookup. It combines routing, query expansion, parallel retrieval, fusion, reranking, adaptive cutoff, and optional graph augmentation.

Retrieval stages

Stage                Behavior
Routing              QueryRouter chooses search profile, granularity, and graph usage.
Expansion            Query variants can be derived before retrieval.
Parallel search      Semantic and lexical branches run in parallel.
Fusion               Weighted or RRF fusion combines ranked candidates.
Rerank               HTTP reranker or local fallback refines ordering.
Cutoff and MMR       The service trims weak tails and diversifies results.
Context enrichment   Acts, metadata, relations, and graph context are added before generation.
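
The fusion, cutoff, and MMR stages can be illustrated with a minimal sketch. It is not the service's implementation: the function names, the RRF constant k=60, the relative-score cutoff rule, and the MMR weight are all assumptions chosen for illustration, and the document IDs and vectors are made up.

```python
from math import sqrt


def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion: score(d) = sum over lists of 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def trim_tail(scored, ratio=0.5):
    """Assumed cutoff rule: keep scores of at least `ratio` times the top score."""
    if not scored:
        return []
    floor = scored[0][1] * ratio
    return [(doc_id, score) for doc_id, score in scored if score >= floor]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(scored, vectors, top_n=3, lam=0.7):
    """Maximal marginal relevance: greedy picks trading relevance for novelty."""
    remaining = dict(scored)
    if not remaining:
        return []
    top = max(remaining.values())  # normalise relevance to the 0..1 range
    selected = []
    while remaining and len(selected) < top_n:
        def gain(doc_id):
            # Similarity to the closest already-selected result.
            redundancy = max(
                (cosine(vectors[doc_id], vectors[s]) for s in selected),
                default=0.0,
            )
            return lam * remaining[doc_id] / top - (1 - lam) * redundancy
        best = max(remaining, key=gain)
        selected.append(best)
        del remaining[best]
    return selected


# Hypothetical ranked IDs from the semantic and lexical branches.
fused = rrf_fuse([["a", "b", "c", "d"], ["b", "c", "a"]])
kept = trim_tail(fused)
vectors = {"b": [1.0, 0.0], "a": [1.0, 0.05], "c": [0.0, 1.0]}
diverse = mmr(kept, vectors, top_n=2, lam=0.5)
```

In this toy input, "d" appears in only one branch and falls below the cutoff, and MMR skips "a" because its vector nearly duplicates the top result.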

Graph usage

Graph support appears in two distinct places:

  1. Context enrichment on top of retrieved acts and relationships.

  2. A parallel Cypher support branch that contributes extra evidence and trace data.
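
A parallel support branch of this kind might be structured as in the following sketch. Everything here is hypothetical: retrieve_acts and cypher_support are stand-ins rather than rag-service functions, and treating a graph failure as non-fatal is an assumption, not documented behavior.

```python
import asyncio


async def retrieve_acts(query):
    # Stand-in for the main retrieval pipeline (hypothetical).
    await asyncio.sleep(0)
    return [{"id": "act-1", "text": f"match for {query}"}]


async def cypher_support(query):
    # Stand-in for the Cypher support branch (hypothetical).
    # A real branch would run a query against Neo4j here.
    await asyncio.sleep(0)
    return {
        "evidence": [{"id": "rel-1", "source": "graph"}],
        "trace": ["cypher: 1 row"],
    }


async def build_context(query, use_graph=True):
    tasks = [retrieve_acts(query)]
    if use_graph:
        tasks.append(cypher_support(query))
    # Run both branches concurrently; collect exceptions instead of raising.
    results = await asyncio.gather(*tasks, return_exceptions=True)

    acts = results[0]
    if isinstance(acts, Exception):
        raise acts  # the main retrieval branch is required

    context = {"acts": acts, "graph_evidence": [], "trace": []}
    if use_graph:
        graph = results[1]
        if isinstance(graph, Exception):
            # Assumption: the graph branch only adds evidence, so its
            # failure is recorded in the trace rather than raised.
            context["trace"].append(f"graph branch failed: {graph!r}")
        else:
            context["graph_evidence"] = graph["evidence"]
            context["trace"].extend(graph["trace"])
    return context


context = asyncio.run(build_context("liability of carriers"))
```

The graph branch adds evidence and trace entries alongside the retrieved acts without delaying the main retrieval path.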

The agentic planner that decides when retrieval should clarify, refine, run complementary searches, or compress context is documented in Agentic Runtime.

Operational consequences

  • Backend health for PostgreSQL, Qdrant, and Neo4j is exported by rag-service.

  • Retrieval errors are exported separately from provider errors.

  • Phase timing metrics include the graph query and graph retrieval prefixes when those branches run.
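
As a rough illustration of the last point, per-phase timings could be collected as below. The PhaseTimer class and the metric names (retrieval.search, graph_query.cypher, graph_retrieval.merge) are invented for this sketch; the names actually exported by rag-service are not specified here.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class PhaseTimer:
    """Accumulates wall-clock seconds per named phase (hypothetical helper)."""

    def __init__(self):
        self.seconds = defaultdict(float)

    @contextmanager
    def phase(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the elapsed time even if the phase raised.
            self.seconds[name] += time.perf_counter() - start


def run_request(timer, use_graph):
    with timer.phase("retrieval.search"):
        pass  # semantic and lexical search would run here
    if use_graph:
        # Graph-prefixed phases are recorded only when the branches run.
        with timer.phase("graph_query.cypher"):
            pass
        with timer.phase("graph_retrieval.merge"):
            pass
    return sorted(timer.seconds)


phases_without_graph = run_request(PhaseTimer(), use_graph=False)
phases_with_graph = run_request(PhaseTimer(), use_graph=True)
```

A request that skips the graph branches emits no graph-prefixed timings, so dashboards built on these metrics should treat missing series as "branch not run" rather than as zero latency.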

For an executive-facing explanation of why this GraphRAG design matters commercially and how its performance has been measured, see GraphRAG Commercial Brief.