Engine/RAG Dashboard
Purpose and audience
Purpose: provide one operational overview for the online query runtime and the async indexing pipeline.
Audience: engineers and operators diagnosing latency, failure rate, queue pressure, and ingestion backlog.
Source of truth: monitoring/grafana/dashboards/engine-rag.json
Live dashboard: Grafana Engine/RAG dashboard for RagLogic AI
Provisioned panel inventory
Overview and backend health
RAG runtime health and collection availability
Query rate by mode and by outcome
Query latency by mode and phase latency p95
Provider and retrieval error breakdown
Queue depth
Errors per minute
Traffic
Pipeline progress and counters
Relation sync and extraction status
Worker throughput and latency
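The inventory above could be regenerated from the provisioned JSON rather than maintained by hand. The following is a minimal sketch, assuming the file follows the usual Grafana dashboard model: a top-level `panels` array in which row panels may nest further `panels`. The function names and the inline sample dashboard are illustrative and are not part of the repository.

```python
import json
from typing import Iterator


def iter_panels(panels: list[dict]) -> Iterator[dict]:
    """Yield every panel, descending into rows that nest child panels."""
    for panel in panels:
        yield panel
        # Collapsed rows keep their children under a nested "panels" key.
        yield from iter_panels(panel.get("panels", []))


def panel_inventory(dashboard: dict) -> list[tuple[str, str]]:
    """Return (type, title) pairs for all panels in a dashboard document."""
    return [
        (p.get("type", "unknown"), p.get("title", ""))
        for p in iter_panels(dashboard.get("panels", []))
    ]


def load_dashboard(path: str) -> dict:
    """Read a provisioned dashboard JSON file from disk."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


# Hypothetical sample standing in for the real dashboard file.
sample_dashboard = {
    "panels": [
        {
            "type": "row",
            "title": "Overview and backend health",
            "panels": [{"type": "stat", "title": "Queue depth"}],
        },
        {"type": "timeseries", "title": "Traffic"},
    ]
}

inventory = panel_inventory(sample_dashboard)
for panel_type, title in inventory:
    print(f"{panel_type:12} {title}")
```

To run it against the real file, pass the path from the source-of-truth entry, for example `panel_inventory(load_dashboard("monitoring/grafana/dashboards/engine-rag.json"))`.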
Current panel-to-signal mapping
| Expected signal | Current coverage | Notes |
|---|---|---|
| Gateway backend health | Present | |
| Gateway traffic | Present | Query and search rates are graphed |
| Gateway proxy errors | Present | Proxy error rate is graphed |
| RAG backend health | Present | |
| RAG query modes | Present | |
| RAG outcomes | Present | |
| Query latency by mode | Present | |
| RAG phase durations | Present | |
| Provider errors | Present | |
| Retrieval errors | Present | |
| Queue depth | Present | Redis key size panels |
| Worker throughput | Present | Chunking, embedding, extraction job rate panels |
| Worker latency | Present | Histogram quantiles for worker durations |
| Extraction backlog and relation sync | Present | Extraction status and graph sync panels |
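A mapping like this one can be checked mechanically by searching the panel queries for the metric each signal is expected to use. The sketch below assumes Prometheus-backed panels whose targets carry the query in an `expr` field, as in the standard Grafana dashboard model. Only `lalandre_rag_service_phase_duration_seconds` is taken from this page; `example_queue_depth` and the sample dashboard are hypothetical placeholders.

```python
def panel_expressions(dashboard: dict) -> list[str]:
    """Collect every target `expr` string, including panels nested in rows."""
    exprs: list[str] = []
    stack = list(dashboard.get("panels", []))
    while stack:
        panel = stack.pop()
        stack.extend(panel.get("panels", []))
        for target in panel.get("targets", []):
            expr = target.get("expr")
            if expr:
                exprs.append(expr)
    return exprs


def coverage(dashboard: dict, expected: dict[str, str]) -> dict[str, str]:
    """Mark a signal "Present" if its metric name occurs in any panel query."""
    exprs = panel_expressions(dashboard)
    return {
        signal: "Present" if any(metric in e for e in exprs) else "Missing"
        for signal, metric in expected.items()
    }


# Signal name -> metric name substring expected in at least one panel query.
EXPECTED = {
    "RAG phase durations": "lalandre_rag_service_phase_duration_seconds",
    "Queue depth": "example_queue_depth",  # hypothetical metric name
}

# Hypothetical dashboard fragment with a single phase-latency panel.
sample_dashboard = {
    "panels": [
        {
            "title": "Phase latency p95",
            "targets": [
                {
                    "expr": "histogram_quantile(0.95, sum by (le) "
                    "(rate(lalandre_rag_service_phase_duration_seconds_bucket[5m])))"
                }
            ],
        }
    ]
}

report = coverage(sample_dashboard, EXPECTED)
for signal, status in report.items():
    print(f"{signal}: {status}")
```

Substring matching is deliberately crude: it confirms that a metric is graphed somewhere, not that the panel aggregates it correctly.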
Validation notes
The dashboard now exposes the core runtime signals that were previously missing:
- canonical query modes such as `rag`, `llm_only`, `summarize`, and `compare`,
- response outcomes including `grounded`, `weakly_grounded`, `clarify`, and `hard_block`,
- request latency by mode,
- phase latency via `lalandre_rag_service_phase_duration_seconds`,
- provider failures, retrieval failures, and rag-service backend health.
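For readers unfamiliar with how a p95 is derived from a histogram metric such as `lalandre_rag_service_phase_duration_seconds`, the sketch below estimates a quantile from cumulative buckets by linear interpolation. It loosely mirrors what PromQL's `histogram_quantile` does and is not taken from the codebase. The bucket boundaries and counts are invented for illustration.

```python
import math


def histogram_quantile(q: float, buckets: list[tuple[float, float]]) -> float:
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    The last bucket must have an upper bound of +Inf, as Prometheus
    histograms do. Values are assumed to be non-negative durations.
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be in [0, 1]")
    ordered = sorted(buckets)
    if not ordered or not math.isinf(ordered[-1][0]):
        raise ValueError("buckets must end with a +Inf bucket")
    total = ordered[-1][1]
    if total == 0:
        return math.nan  # no observations recorded
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in ordered:
        if count >= rank:
            if math.isinf(bound) or count == prev_count:
                # Past the last finite bound, or an empty bucket at rank 0:
                # fall back to the lower edge of this bucket.
                return prev_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Invented cumulative counts for one phase: 50 requests took <= 0.1 s, etc.
phase_buckets = [(0.1, 50), (0.5, 90), (1.0, 98), (math.inf, 100)]
p95 = histogram_quantile(0.95, phase_buckets)
print(f"estimated p95: {p95:.4f} s")
```

With these sample buckets the 95th observation falls in the 0.5 s to 1.0 s bucket, so the estimate lands inside that range. The precision of any such panel is bounded by how finely the bucket boundaries are chosen.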
This validates the dashboard against the current runtime metrics surface described in the codebase and in the Sphinx operations pages.
Completion checklist
[x] Provisioned dashboard JSON is versioned in the repo.
[x] Existing panels are inventoried.
[x] Runtime metrics have been compared with panel coverage.
[x] Query mode coverage is complete.
[x] Query outcome coverage is complete.
[x] Phase timing coverage is complete.
[x] Provider and retrieval error coverage is complete.
[x] RAG backend health coverage is complete.