Request Flow

The streaming request path is the runtime interaction that most needs to stay aligned with both the code and the UI trace model.

    sequenceDiagram
    participant UI as Frontend / ChatContext
    participant PX as Dev proxy or Nginx
    participant GW as API Gateway
    participant RS as RAG Service
    participant RET as Retrieval stack
    participant DB as PG + Qdrant + Neo4j

    UI->>PX: POST /chat/query/stream
    PX->>GW: rewrite to /api/v1/query/stream
    GW->>GW: auth + rate limit + metrics
    GW->>RS: proxy stream with user context
    RS->>RET: route, plan, retrieve, enrich
    RET->>DB: semantic + lexical + graph support
    DB-->>RET: ranked candidates and metadata
    RS-->>UI: sources, status, token, timings, metadata events
    RS->>DB: persist conversation turn
    RS-->>UI: done
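
The proxy rewrite step in the diagram can be illustrated with a small sketch. The only mapping the diagram confirms is /chat/query/stream to /api/v1/query/stream; treating it as a general prefix swap is an assumption, and the names PUBLIC_PREFIX, GATEWAY_PREFIX, and rewrite_path are hypothetical and not taken from the dev proxy or Nginx configuration.

```python
# Hypothetical sketch of the proxy rewrite step. Only the mapping
# /chat/query/stream -> /api/v1/query/stream is confirmed by the diagram;
# generalizing it to a prefix swap is an assumption.
PUBLIC_PREFIX = "/chat"
GATEWAY_PREFIX = "/api/v1"


def rewrite_path(path: str) -> str:
    """Map a frontend-facing path to its gateway path.

    Paths outside the public prefix are returned unchanged.
    """
    # Match the prefix only on a path-segment boundary, so that
    # "/chatter" is not mistaken for "/chat".
    if path == PUBLIC_PREFIX or path.startswith(PUBLIC_PREFIX + "/"):
        return GATEWAY_PREFIX + path[len(PUBLIC_PREFIX):]
    return path
```

A check such as rewrite_path("/chat/query/stream") == "/api/v1/query/stream" is a cheap way to notice when a route rename has drifted away from the diagram.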
    

Why this matters

  • The frontend trace panel depends on phase-level SSE events.

  • The gateway and rag-service each emit separate request and latency metrics.

  • Graph support can augment the answer path without replacing the core retrieval result.
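
To make the dependency on phase-level SSE events concrete, here is a minimal sketch of a client-side parser. The event names (sources, status, token, timings, metadata, done) come from the diagram above. The wire framing (standard `event:` and `data:` lines with JSON payloads) and the payload shapes, such as a "text" field on token events, are assumptions and not the documented contract; parse_sse and collect_answer are hypothetical helpers.

```python
import json
from typing import Any, Iterable, Iterator, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, Any]]:
    """Yield (event_name, payload) pairs from raw SSE lines.

    Assumes standard SSE framing: "event:" and "data:" fields, with a
    blank line terminating each event. Payloads are decoded as JSON
    when possible and returned as plain text otherwise.
    """
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # Blank line: dispatch the buffered event, if it carries data.
            if data:
                payload = "\n".join(data)
                try:
                    yield event, json.loads(payload)
                except json.JSONDecodeError:
                    yield event, payload
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # SSE comment or keep-alive line
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # drop the single optional space after the colon
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)


def collect_answer(events: Iterable[Tuple[str, Any]]) -> str:
    """Concatenate token payloads until the done event arrives."""
    parts = []
    for name, payload in events:
        if name == "token":
            parts.append(payload["text"])  # "text" is an assumed field name
        elif name == "done":
            break
    return "".join(parts)


# Hypothetical sample stream, for illustration only.
SAMPLE = [
    "event: status\n", 'data: {"phase": "retrieve"}\n', "\n",
    "event: token\n", 'data: {"text": "Hel"}\n', "\n",
    "event: token\n", 'data: {"text": "lo"}\n', "\n",
    "event: done\n", "data: {}\n", "\n",
]
```

Keeping the parser independent of the event names means a trace panel can dispatch on the name alone, so a renamed or added phase shows up as an unhandled event instead of a parse failure.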

Source of truth

  • Diagram source: docs/request_flow.puml

  • Code anchors:

    • frontend/src/contexts/ChatContext.tsx

    • services/api-gateway/routers/rag_proxy.py

    • services/rag-service/routers/stream.py

Notes

  • Scope checked against current route names and streaming behavior.

  • SSE phase names are documented in Streaming and Phases.

  • The PlantUML source should be regenerated after any major route renaming.