Request Flow
The streaming request path is the runtime interaction that most needs to stay aligned with both the code and the UI trace model.
sequenceDiagram
participant UI as Frontend / ChatContext
participant PX as Dev proxy or Nginx
participant GW as API Gateway
participant RS as RAG Service
participant RET as Retrieval stack
participant DB as PG + Qdrant + Neo4j
UI->>PX: POST /chat/query/stream
PX->>GW: rewrite to /api/v1/query/stream
GW->>GW: auth + rate limit + metrics
GW->>RS: proxy stream with user context
RS->>RET: route, plan, retrieve, enrich
RET->>DB: semantic + lexical + graph support
DB-->>RET: ranked candidates and metadata
RS-->>UI: sources, status, token, timings, metadata events
RS->>DB: persist conversation turn
RS-->>UI: done
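The event sequence at the end of the diagram can be illustrated with a small client-side reader. This is a minimal sketch, not the project's implementation: it assumes the event types in the diagram (sources, status, token, timings, metadata, done) arrive as standard SSE `event:` names, and that each `data:` line carries JSON. The `text` key on token payloads and the helper names `parse_sse` and `collect_turn` are hypothetical.

```python
import json
from typing import Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_name, data) pairs from raw SSE lines.

    Follows the standard SSE framing: fields accumulate until a blank
    line, which dispatches the event. Lines starting with ':' are comments.
    """
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data:  # a blank line with no data dispatches nothing
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)


def collect_turn(lines: Iterable[str]) -> Tuple[str, List[str]]:
    """Accumulate token text until a `done` event arrives.

    Assumption: token payloads look like {"text": "..."}; the real
    payload shape is defined by the rag-service, not by this sketch.
    """
    answer: List[str] = []
    seen: List[str] = []
    for name, data in parse_sse(lines):
        seen.append(name)
        if name == "token":
            answer.append(json.loads(data)["text"])
        elif name == "done":
            break
    return "".join(answer), seen
```

A reader like this treats `done` as the only terminal signal, which matches the diagram: the conversation turn is persisted before `done` is sent, so a client that stops at `done` has seen every event for that turn.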
Why this matters
The frontend trace panel depends on phase-level SSE events.
The gateway and rag-service each emit separate request and latency metrics.
Graph support can augment the answer path without replacing the core retrieval result.
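To make the first point concrete, the sketch below folds phase-level events into the kind of per-phase model a trace panel could render. It is illustrative only. The phase names are borrowed from the "route, plan, retrieve, enrich" step in the diagram and may not match the real SSE phase names, and the payload keys `phase`, `state`, and `ms` are assumptions, as are the names `PhaseTrace` and `build_trace`.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple

# Assumed phase names, taken from the diagram's RS->>RET step.
PHASES = ["route", "plan", "retrieve", "enrich"]


@dataclass
class PhaseTrace:
    name: str
    state: str = "pending"
    duration_ms: Optional[float] = None


def build_trace(events: Iterable[Tuple[str, Dict]]) -> List[PhaseTrace]:
    """Fold (event_name, payload) pairs into one entry per phase.

    Hypothetical payload shapes:
      status  -> {"phase": "...", "state": "..."}
      timings -> {"phase": "...", "ms": 12.5}
    Events without a known phase (for example token events) are ignored.
    """
    trace = {name: PhaseTrace(name) for name in PHASES}
    for event_name, payload in events:
        phase = payload.get("phase")
        if phase not in trace:
            continue
        if event_name == "status":
            trace[phase].state = payload.get("state", trace[phase].state)
        elif event_name == "timings":
            trace[phase].duration_ms = payload.get("ms")
    return [trace[name] for name in PHASES]
```

Keeping every phase in the result, even when no event has arrived for it, lets a panel show pending phases up front. That is also why a renamed or dropped phase event shows up immediately as a row that never leaves the pending state.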
Source of truth
Diagram source:
docs/request_flow.puml
Code anchors:
frontend/src/contexts/ChatContext.tsx
services/api-gateway/routers/rag_proxy.py
services/rag-service/routers/stream.py
Notes
Scope checked against current route names and streaming behavior.
SSE phase names are documented in Streaming and Phases.
The PlantUML source should still be regenerated after any major route renaming.