RAG API¶
Note
This page is generated automatically from the repository’s maintained Python module inventory.
Retrieval, graph augmentation, summaries, response building, and orchestration.
lalandre_rag¶
Source: packages/lalandre_rag/lalandre_rag/__init__.py
RAG Service Module: high-level orchestration of retrieval, context enrichment, and LLM generation.
lalandre_rag.adapters¶
Source: packages/lalandre_rag/lalandre_rag/adapters/__init__.py
RAG Adapters: adapters for different LLM frameworks.
lalandre_rag.adapters.llamaindex¶
Source: packages/lalandre_rag/lalandre_rag/adapters/llamaindex.py
LlamaIndex Adapter: utilities for using LlamaIndex with context slices.
- class lalandre_rag.adapters.llamaindex.LlamaIndexAdapter(llama_llm)[source]¶
Bases: object
Adapter for using LlamaIndex with context slice objects.
Provides:
- Document-to-node conversion
- TreeSummarize for long documents
- Multi-document comparison
Initialize LlamaIndex adapter
- Parameters:
llama_llm (LLM) – LlamaIndex-compatible LLM client
- static context_slice_key(doc)[source]¶
Return the stable lookup key used for source identifiers.
- Parameters:
doc (ContextSlice)
- Return type:
Tuple[str, int, int | None]
- context_slices_to_nodes(context_slices, source_id_map=None)[source]¶
Convert ContextSlice objects to LlamaIndex NodeWithScore
- Parameters:
context_slices (List[ContextSlice]) – List of context slices
source_id_map (Dict[Tuple[str, int, int | None], str] | None)
- Returns:
List of LlamaIndex nodes with scores
- Return type:
List[NodeWithScore]
- summarize(topic, context_slices, source_id_map=None)[source]¶
Use LlamaIndex TreeSummarize for hierarchical summarization
Better suited to long documents: it summarizes in chunks, then combines the partial summaries.
- Parameters:
topic (str) – Topic to summarize
context_slices (List[ContextSlice]) – Context slices to summarize
source_id_map (Dict[Tuple[str, int, int | None], str] | None)
- Returns:
Summary text
- Return type:
str
- compare(comparison_question, context_slices, celex_list, source_id_map=None)[source]¶
Use LlamaIndex for intelligent multi-document comparison
Groups documents by CELEX and compares systematically
- Parameters:
comparison_question (str) – Question for comparison
context_slices (List[ContextSlice]) – Context slices to compare
celex_list (List[str]) – List of CELEX codes being compared
source_id_map (Dict[Tuple[str, int, int | None], str] | None)
- Returns:
Comparison text
- Return type:
str
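The `source_id_map` parameter shared by these methods keys source identifiers by the `(str, int, int | None)` tuple returned from `context_slice_key`. A minimal sketch of that keying scheme, using a hypothetical stand-in dataclass rather than the real `ContextSlice` (field names here are assumptions for illustration):

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Illustrative stand-in for the real ContextSlice; field names are assumed.
@dataclass
class ContextSlice:
    celex: str
    act_id: int
    chunk_id: Optional[int]
    text: str
    score: float

def context_slice_key(doc: ContextSlice) -> Tuple[str, int, Optional[int]]:
    # Mirrors the (str, int, int | None) key shape documented above.
    return (doc.celex, doc.act_id, doc.chunk_id)

slices = [ContextSlice("32014L0065", 1, 0, "Article 25 ...", 0.91)]
source_id_map = {context_slice_key(s): f"S{i + 1}" for i, s in enumerate(slices)}
print(source_id_map[("32014L0065", 1, 0)])  # S1
```

Because the key is a static function of the slice, the same map can be reused across `context_slices_to_nodes`, `summarize`, and `compare` calls for one question.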
lalandre_rag.agentic¶
Source: packages/lalandre_rag/lalandre_rag/agentic/__init__.py
Typed agentic orchestration package for RAG planning.
lalandre_rag.agentic.deps¶
Source: packages/lalandre_rag/lalandre_rag/agentic/deps.py
Dependency container for the PydanticAI planning runtime.
- class lalandre_rag.agentic.deps.RetrievalServiceProtocol(*args, **kwargs)[source]¶
Bases: Protocol
Protocol for the retrieval service consumed by the planning graph.
- class lalandre_rag.agentic.deps.ContextServiceProtocol(*args, **kwargs)[source]¶
Bases: Protocol
Protocol for retrieval-result enrichment services.
- class lalandre_rag.agentic.deps.QueryRouterProtocol(*args, **kwargs)[source]¶
Bases: Protocol
Protocol for deterministic or LLM-assisted retrieval routing.
- class lalandre_rag.agentic.deps.GraphRAGServiceProtocol(*args, **kwargs)[source]¶
Bases: Protocol
Protocol for optional graph-retrieval services.
- class lalandre_rag.agentic.deps.CommunityEnricherProtocol(*args, **kwargs)[source]¶
Bases: Protocol
Protocol for community-summary enrichers used by global mode.
- class lalandre_rag.agentic.deps.AgenticPlanningDeps(retrieval_service, context_service, llm, lightweight_llm, rag_prompt, query_router, graph_rag_service, community_enricher, question, top_k, score_threshold, filters, include_relations, include_subjects, include_full_content, return_sources, collections, granularity, graph_depth, use_graph, embedding_preset, retrieval_depth, chat_history, progress_callback=None, preamble_callback=None, token_callback=None, final_answer_callback=None)[source]¶
Bases: object
Runtime dependencies injected into the planning graph.
- Parameters:
retrieval_service (RetrievalServiceProtocol)
context_service (ContextServiceProtocol)
llm (Any)
lightweight_llm (Any)
rag_prompt (ChatPromptTemplate)
query_router (QueryRouterProtocol)
graph_rag_service (GraphRAGServiceProtocol | None)
community_enricher (CommunityEnricherProtocol | None)
question (str)
top_k (int)
score_threshold (float | None)
filters (Dict[str, Any] | None)
include_relations (bool)
include_subjects (bool)
include_full_content (bool)
return_sources (bool)
collections (List[str] | None)
granularity (str | None)
graph_depth (int | None)
use_graph (bool | None)
embedding_preset (str | None)
retrieval_depth (str | None)
chat_history (List[BaseMessage] | None)
progress_callback (Callable[[Dict[str, Any]], None] | None)
preamble_callback (Callable[[Dict[str, Any] | None, Dict[str, Any]], None] | None)
token_callback (Callable[[str], None] | None)
final_answer_callback (Callable[[str], None] | None)
lalandre_rag.agentic.graph¶
Source: packages/lalandre_rag/lalandre_rag/agentic/graph.py
Pydantic Graph orchestration for RAG planning phases.
- class lalandre_rag.agentic.graph.LoadConversationContext[source]¶
Bases: BaseNode[PlanningGraphState, AgenticPlanningDeps, PlanningEarlyExit | PlanningContext]
Bootstrap node for future conversation-memory loading.
- async run(ctx)[source]¶
Advance to the question-decomposition phase.
- Parameters:
ctx (GraphRunContext[PlanningGraphState, AgenticPlanningDeps])
- Return type:
- class lalandre_rag.agentic.graph.DecomposeQuestion[source]¶
Bases: BaseNode[PlanningGraphState, AgenticPlanningDeps, PlanningEarlyExit | PlanningContext]
Placeholder decomposition node for complex multi-part questions.
- async run(ctx)[source]¶
Advance to the routing phase.
- Parameters:
ctx (GraphRunContext[PlanningGraphState, AgenticPlanningDeps])
- Return type:
- class lalandre_rag.agentic.graph.RouteQuestion[source]¶
Bases: BaseNode[PlanningGraphState, AgenticPlanningDeps, PlanningEarlyExit | PlanningContext]
Route the user question toward the appropriate retrieval profile.
- async run(ctx)[source]¶
Compute routing metadata and transition to planning.
- Parameters:
ctx (GraphRunContext[PlanningGraphState, AgenticPlanningDeps])
- Return type:
- class lalandre_rag.agentic.graph.PlanRetrieval[source]¶
Bases: BaseNode[PlanningGraphState, AgenticPlanningDeps, PlanningEarlyExit | PlanningContext]
Build the retrieval plan and complementary-query strategy.
- async run(ctx)[source]¶
Resolve the planning step and transition to retrieval.
- Parameters:
ctx (GraphRunContext[PlanningGraphState, AgenticPlanningDeps])
- Return type:
- class lalandre_rag.agentic.graph.RunRetrieval[source]¶
Bases: BaseNode[PlanningGraphState, AgenticPlanningDeps, PlanningEarlyExit | PlanningContext]
Execute retrieval and context enrichment for the planned query.
- async run(ctx)[source]¶
Run retrieval and either finish early or evaluate sufficiency.
- Parameters:
ctx (GraphRunContext[PlanningGraphState, AgenticPlanningDeps])
- Return type:
- class lalandre_rag.agentic.graph.EvaluateEvidence[source]¶
Bases: BaseNode[PlanningGraphState, AgenticPlanningDeps, PlanningEarlyExit | PlanningContext]
Assess whether retrieved evidence is sufficient for answering.
- async run(ctx)[source]¶
Evaluate retrieval quality before optional graph augmentation.
- Parameters:
ctx (GraphRunContext[PlanningGraphState, AgenticPlanningDeps])
- Return type:
- class lalandre_rag.agentic.graph.MaybeFetchGraphSupport[source]¶
Bases: BaseNode[PlanningGraphState, AgenticPlanningDeps, PlanningEarlyExit | PlanningContext]
Optionally augment retrieval context with graph-derived support.
- async run(ctx)[source]¶
Fetch graph support when the current plan allows it.
- Parameters:
ctx (GraphRunContext[PlanningGraphState, AgenticPlanningDeps])
- Return type:
- class lalandre_rag.agentic.graph.CompressContext[source]¶
Bases: BaseNode[PlanningGraphState, AgenticPlanningDeps, PlanningEarlyExit | PlanningContext]
Finalize and optionally compress context before generation.
- async run(ctx)[source]¶
Produce the terminal planning artifact for downstream generation.
- Parameters:
ctx (GraphRunContext[PlanningGraphState, AgenticPlanningDeps])
- Return type:
End[PlanningEarlyExit | PlanningContext]
- lalandre_rag.agentic.graph.run_planning_graph(*, deps)[source]¶
Execute the planning graph synchronously and return the terminal artifact.
- Parameters:
deps (AgenticPlanningDeps)
- Return type:
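`run_planning_graph` terminates with either a `PlanningEarlyExit` or a `PlanningContext`, so callers typically branch on the artifact type. A sketch of that dispatch, using trimmed stand-in dataclasses (not the real classes, whose full field lists appear below):

```python
from dataclasses import dataclass
from typing import Any, List, Optional

# Stand-ins for the real terminal artifacts; fields trimmed for illustration.
@dataclass
class PlanningEarlyExit:
    kind: str
    clarification_question: Optional[str] = None

@dataclass
class PlanningContext:
    retrieval_query: str
    context_slices: List[Any]

def handle_terminal(artifact) -> str:
    # run_planning_graph returns either an early exit or a full planning context.
    if isinstance(artifact, PlanningEarlyExit):
        return artifact.clarification_question or f"early-exit:{artifact.kind}"
    return f"generate from {len(artifact.context_slices)} slices"

print(handle_terminal(PlanningEarlyExit(kind="conversational")))
print(handle_terminal(PlanningContext("mifid scope", ["s1", "s2"])))
```

Early exits carry timing and gating metadata so the caller can still emit trace events; a `PlanningContext` feeds straight into generation.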
lalandre_rag.agentic.models¶
Source: packages/lalandre_rag/lalandre_rag/agentic/models.py
Pydantic and dataclass models used by the planning runtime.
- class lalandre_rag.agentic.models.ComplementaryQueryOutput(*, query, level_hint=None)[source]¶
Bases: BaseModel
Structured follow-up retrieval produced by the planner.
Create a new model by parsing and validating input data from keyword arguments.
Raises ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
query (str)
level_hint (str | None)
- classmethod clean_query(value)[source]¶
Normalize and validate a complementary query string.
- Parameters:
value (Any)
- Return type:
str
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model; should be a dictionary conforming to pydantic's ConfigDict.
- class lalandre_rag.agentic.models.RoutingIntentOutput(*, profile, granularity=None, top_k=10, include_relations_hint=False, execution_mode='hybrid', rationale='LLM parser selected retrieval profile.', use_graph=False, normalized_query=None, intent_label=None, confidence=None, output_validation_retries=0)[source]¶
Bases: BaseModel
Structured output returned by the routing agent.
- Parameters:
profile (Literal['contextual_default', 'citation_precision', 'relationship_focus', 'global_overview'])
granularity (Literal['subdivisions', 'chunks', 'all', 'auto'] | None)
top_k (int)
include_relations_hint (bool)
execution_mode (Literal['hybrid', 'global'])
rationale (str)
use_graph (bool)
normalized_query (str | None)
intent_label (str | None)
confidence (float | None)
output_validation_retries (int)
- model_config: ClassVar[ConfigDict] = {'extra': 'ignore'}¶
- classmethod normalize_granularity(value)[source]¶
Normalize planner granularity hints to supported values.
- Parameters:
value (Any)
- Return type:
str | None
- classmethod clean_optional_text(value)[source]¶
Trim optional text fields and coerce blanks to None.
- Parameters:
value (Any)
- Return type:
str | None
- classmethod clean_rationale(value)[source]¶
Normalize routing rationales and inject a default fallback.
- Parameters:
value (Any)
- Return type:
str
- class lalandre_rag.agentic.models.DecompositionOutput(*, sub_questions=<factory>, synthesize=False, output_validation_retries=0)[source]¶
Bases: BaseModel
Structured output returned by the decomposition agent.
- Parameters:
sub_questions (List[str])
synthesize (bool)
output_validation_retries (int)
- model_config: ClassVar[ConfigDict] = {'extra': 'ignore'}¶
- class lalandre_rag.agentic.models.RetrievalPlannerOutput(*, primary_query='', intent_class='documentary', skip_retrieval=False, needs_complementary=False, complementary_queries=<factory>, needs_compression=False, clarification_question=None, strict_grounding_requested=False, rationale='', output_validation_retries=0)[source]¶
Bases: BaseModel
Structured retrieval plan returned by the planner agent.
- Parameters:
primary_query (str)
intent_class (Literal['conversational', 'documentary'])
skip_retrieval (bool)
needs_complementary (bool)
complementary_queries (List[ComplementaryQueryOutput])
needs_compression (bool)
clarification_question (str | None)
strict_grounding_requested (bool)
rationale (str)
output_validation_retries (int)
- model_config: ClassVar[ConfigDict] = {'extra': 'ignore'}¶
- classmethod clean_primary_query(value)[source]¶
Normalize the planner primary query field.
- Parameters:
value (Any)
- Return type:
str
- classmethod normalize_intent_class(value)[source]¶
Restrict the planner intent class to supported values.
- Parameters:
value (Any)
- Return type:
str
- classmethod clean_clarification_question(value)[source]¶
Normalize optional clarification prompts.
- Parameters:
value (Any)
- Return type:
str | None
- classmethod clean_rationale(value)[source]¶
Normalize planner rationales to a stripped string.
- Parameters:
value (Any)
- Return type:
str
- class lalandre_rag.agentic.models.RetrievalRefinementOutput(*, refined_query, rationale='', output_validation_retries=0)[source]¶
Bases: BaseModel
Structured refined query returned by the corrective agent.
- Parameters:
refined_query (str)
rationale (str)
output_validation_retries (int)
- model_config: ClassVar[ConfigDict] = {'extra': 'ignore'}¶
- class lalandre_rag.agentic.models.RetrievalEvaluationOutput(*, status='SUFFICIENT', gap=None, output_validation_retries=0)[source]¶
Bases: BaseModel
Structured sufficiency evaluation returned by the CRAG evaluator.
- Parameters:
status (Literal['SUFFICIENT', 'PARTIAL', 'INSUFFICIENT'])
gap (str | None)
output_validation_retries (int)
- model_config: ClassVar[ConfigDict] = {'extra': 'ignore'}¶
- class lalandre_rag.agentic.models.GraphSupportDecision(*, use_graph=False, use_cypher=False, rationale='')[source]¶
Bases: BaseModel
Structured graph support decision reserved for future graph/Cypher routing.
- Parameters:
use_graph (bool)
use_cypher (bool)
rationale (str)
- model_config: ClassVar[ConfigDict] = {}¶
- class lalandre_rag.agentic.models.PhaseTraceEvent(*, phase, status, label, detail=None, count=None, duration_ms=None, meta=<factory>, tool=None)[source]¶
Bases: BaseModel
Single trace event emitted by the planning runtime.
- Parameters:
phase (str)
status (str)
label (str)
detail (str | None)
count (int | None)
duration_ms (float | None)
meta (Dict[str, Any])
tool (str | None)
- model_config: ClassVar[ConfigDict] = {}¶
- class lalandre_rag.agentic.models.PlanningEarlyExit(kind, routing_ms, planner_ms, retrieval_ms=0.0, intent_class='documentary', clarification_question=None, strict_grounding_requested=False, agentic_rationale='', agentic_meta=<factory>, best_score=0.0, gate_threshold=0.0, candidates_dropped=0)[source]¶
Bases: object
Signals that the planning pipeline hit a terminal condition.
- Parameters:
kind (str)
routing_ms (float)
planner_ms (float)
retrieval_ms (float)
intent_class (Literal['conversational', 'documentary'])
clarification_question (str | None)
strict_grounding_requested (bool)
agentic_rationale (str)
agentic_meta (Dict[str, Any])
best_score (float)
gate_threshold (float)
candidates_dropped (int)
- class lalandre_rag.agentic.models.PlanningContext(context_slices, graph_fetch, retrieval_plan, agentic_plan, agentic_meta, retrieval_query, effective_top_k, effective_granularity, effective_include_relations, community_meta, routing_ms, planner_ms, retrieval_ms, context_enrichment_ms, graph_enrichment_ms, complementary_ms, compression_ms, retrieval_stats=None)[source]¶
Bases: object
Artifacts produced by the planning graph and consumed by generation.
- Parameters:
context_slices (List[Any])
graph_fetch (Any | None)
retrieval_plan (Any)
agentic_plan (Any | None)
agentic_meta (Dict[str, Any])
retrieval_query (str)
effective_top_k (int)
effective_granularity (str | None)
effective_include_relations (bool)
community_meta (Dict[str, Any])
routing_ms (float)
planner_ms (float)
retrieval_ms (float)
context_enrichment_ms (float)
graph_enrichment_ms (float)
complementary_ms (float)
compression_ms (float)
retrieval_stats (Any)
- class lalandre_rag.agentic.models.PlanningGraphState(question, top_k, score_threshold, filters, include_relations, include_subjects, collections, granularity, graph_depth, use_graph, embedding_preset, planner_run_id, planner_path=<factory>, trace_events=<factory>, output_validation_retries=0, decompose_ms=0.0, routing_ms=0.0, planner_ms=0.0, retrieval_ms=0.0, context_enrichment_ms=0.0, graph_enrichment_ms=0.0, complementary_ms=0.0, compression_ms=0.0, effective_top_k=0, effective_granularity=None, effective_include_relations=False, decomposition_result=None, retrieval_plan=None, agentic_plan=None, retrieval_query=None, retrieval_results=<factory>, context_slices=<factory>, community_meta=<factory>, graph_fetch=None, retrieval_depth=None, planning_future=None, graph_prefetch_future=None, retrieval_stats=None, agentic_meta=<factory>, early_exit=None)[source]¶
Bases: object
Mutable planning state shared across graph nodes.
- Parameters:
question (str)
top_k (int)
score_threshold (float | None)
filters (Dict[str, Any] | None)
include_relations (bool)
include_subjects (bool)
collections (List[str] | None)
granularity (str | None)
graph_depth (int | None)
use_graph (bool | None)
embedding_preset (str | None)
planner_run_id (str)
planner_path (List[str])
trace_events (List[PhaseTraceEvent])
output_validation_retries (int)
decompose_ms (float)
routing_ms (float)
planner_ms (float)
retrieval_ms (float)
context_enrichment_ms (float)
graph_enrichment_ms (float)
complementary_ms (float)
compression_ms (float)
effective_top_k (int)
effective_granularity (str | None)
effective_include_relations (bool)
decomposition_result (Any)
retrieval_plan (Any | None)
agentic_plan (Any | None)
retrieval_query (str | None)
retrieval_results (List[Any])
context_slices (List[Any])
community_meta (Dict[str, Any])
graph_fetch (Any | None)
retrieval_depth (str | None)
planning_future (Any | None)
graph_prefetch_future (Any | None)
retrieval_stats (Any)
agentic_meta (Dict[str, Any])
early_exit (PlanningEarlyExit | None)
lalandre_rag.agentic.runtime¶
Source: packages/lalandre_rag/lalandre_rag/agentic/runtime.py
Concrete planning runtime for the PydanticAI-driven RAG pipeline.
- class lalandre_rag.agentic.runtime.AgenticComplementaryQuery(query, level_hint=None)[source]¶
Bases: object
A targeted follow-up query proposed by the planner.
- Parameters:
query (str)
level_hint (str | None)
- class lalandre_rag.agentic.runtime.AgenticRetrievalPlan(primary_query, intent_class='documentary', skip_retrieval=False, needs_complementary=False, complementary_queries=<factory>, needs_compression=False, clarification_question=None, strict_grounding_requested=False, rationale='', planning_ms=0.0, planner_used=False, output_validation_retries=0)[source]¶
Bases: object
Planner decision for retrieval/refinement phases.
- Parameters:
primary_query (str)
intent_class (str)
skip_retrieval (bool)
needs_complementary (bool)
complementary_queries (List[AgenticComplementaryQuery])
needs_compression (bool)
clarification_question (str | None)
strict_grounding_requested (bool)
rationale (str)
planning_ms (float)
planner_used (bool)
output_validation_retries (int)
- class lalandre_rag.agentic.runtime.DecomposedQuery(sub_questions=<factory>, synthesize=False, decomposed=False, decompose_ms=0.0, output_validation_retries=0)[source]¶
Bases: object
Structured decomposition used by the planning graph.
- Parameters:
sub_questions (List[str])
synthesize (bool)
decomposed (bool)
decompose_ms (float)
output_validation_retries (int)
- class lalandre_rag.agentic.runtime.EvalResult(status, gap_hint, eval_ms, fallback=False, output_validation_retries=0)[source]¶
Bases:
objectSufficiency evaluation for CRAG correction.
- Parameters:
status (str)
gap_hint (str | None)
eval_ms (float)
fallback (bool)
output_validation_retries (int)
- lalandre_rag.agentic.runtime.decompose_query(question, llm, *, heuristic_only=True, max_sub_questions=3)[source]¶
Decompose a complex question into independent sub-questions.
- Parameters:
question (str)
llm (Any)
heuristic_only (bool)
max_sub_questions (int)
- Return type:
- lalandre_rag.agentic.runtime.evaluate_retrieval(question, results, llm)[source]¶
Evaluate whether current retrieval evidence is sufficient.
- Parameters:
question (str)
results (List[RetrievalResult])
llm (Any)
- Return type:
- lalandre_rag.agentic.runtime.plan_retrieval(question, llm)[source]¶
Run the planner LLM to decide the retrieval strategy.
- Parameters:
question (str)
llm (Any)
- Return type:
lalandre_rag.agentic.tools¶
Source: packages/lalandre_rag/lalandre_rag/agentic/tools.py
PydanticAI tools and adapters for structured planning outputs.
- lalandre_rag.agentic.tools.run_intent_parser_agent(*, question, top_k, requested_granularity, generate_text, model_name)[source]¶
Run the structured intent parser agent for one question.
- Parameters:
question (str)
top_k (int)
requested_granularity (str | None)
generate_text (Callable[[str], str])
model_name (str)
- Return type:
tuple[RoutingIntentOutput, int]
- lalandre_rag.agentic.tools.run_decomposition_agent(*, question, llm, model_name='planner:decompose')[source]¶
Run the decomposition agent for one complex question.
- Parameters:
question (str)
llm (Any)
model_name (str)
- Return type:
tuple[DecompositionOutput, int]
- lalandre_rag.agentic.tools.run_planner_agent(*, question, llm, model_name='planner:retrieve')[source]¶
Run the retrieval planner agent.
- Parameters:
question (str)
llm (Any)
model_name (str)
- Return type:
tuple[RetrievalPlannerOutput, int]
- lalandre_rag.agentic.tools.run_refinement_agent(*, question, gap_hint, llm, model_name='planner:refine')[source]¶
Run the corrective refinement agent for weak retrieval results.
- Parameters:
question (str)
gap_hint (str)
llm (Any)
model_name (str)
- Return type:
tuple[RetrievalRefinementOutput, int]
- lalandre_rag.agentic.tools.run_evaluation_agent(*, question, context_preview, llm, model_name='planner:evaluate')[source]¶
Run the sufficiency evaluator on a preview of retrieved context.
- Parameters:
question (str)
context_preview (str)
llm (Any)
model_name (str)
- Return type:
tuple[RetrievalEvaluationOutput, int]
lalandre_rag.citation_sanitizer¶
Source: packages/lalandre_rag/lalandre_rag/citation_sanitizer.py
Normalize malformed citation tags emitted by the main LLM.
The RAG prompt instructs the LLM to use strict native tags like [S1],
[G2, L2], [R3], [C1], [CM4]. In practice the LLM often slips
in extra material between the brackets, e.g.:
[S2, Annex I C(4) ; RTS 2 Annex III §13.1]
[G9, article 25(2)]
[G7, considérant 71]
These ad-hoc forms are not recognized by the front-end regex (which only matches the strict format) and break the prose_rewriter’s integrity check (it counts strict tags only). This module rewrites them back to the strict form before any post-processing runs:
[S2, Annex I C(4) ; RTS 2 Annex III §13.1] → [S2]
[G9, article 25(2)] → [G9]
[S1, L1] → [S1, L1] (preserved; already valid)
[G2, L2 ; article 25] → [G2, L2] (level kept, article precision dropped)
The article precision is not lost — the prompt instructs the LLM to write it
in the surrounding prose (« L'article 25 [G9, L2] »). The sanitizer just
strips it from inside the brackets where it doesn’t belong.
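The rewrite rule above (keep the tag id and an optional level hint, drop everything else inside the brackets) can be sketched with a single regex substitution. This is an illustrative sketch, not the module's actual implementation; the real sanitizer's regex and tag inventory may differ:

```python
import re

# Sketch of the sanitization rule; the real module may be more defensive.
# Group 1: tag id (S1, G2, R3, C1, CM4, ...); group 3: optional level hint (L2).
TAG = re.compile(r"\[([SGRC]M?\d+)(,\s*(L\d+))?[^\]]*\]")

def sanitize(text: str) -> str:
    # Keep the tag id and an optional level hint; drop any extra material
    # the LLM slipped in between the brackets.
    def repl(m: re.Match) -> str:
        level = f", {m.group(3)}" if m.group(3) else ""
        return f"[{m.group(1)}{level}]"
    return TAG.sub(repl, text)

print(sanitize("[S2, Annex I C(4) ; RTS 2 Annex III §13.1]"))  # [S2]
print(sanitize("[G2, L2 ; article 25]"))  # [G2, L2]
print(sanitize("[S1, L1]"))  # [S1, L1]
```

Already-valid tags pass through unchanged, so the substitution is idempotent and safe to run before every post-processing pass.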
lalandre_rag.graph¶
Source: packages/lalandre_rag/lalandre_rag/graph/__init__.py
Graph RAG utilities — ranking, context budget, map-reduce, service, helpers.
lalandre_rag.graph.community¶
Source: packages/lalandre_rag/lalandre_rag/graph/community.py
Community-aware context enrichment for Graph RAG (Level 3).
Communities are stored as :Community nodes in Neo4j, linked to Acts via BELONGS_TO relationships. This module queries Neo4j directly — no JSON files on disk.
Usage:
enricher = CommunityContextEnricher(neo4j_repo)
community_block = enricher.build_context(seed_act_ids, max_communities=4)
lalandre_rag.graph.context_budget¶
Source: packages/lalandre_rag/lalandre_rag/graph/context_budget.py
Token-budget-aware context builder for Graph RAG.
Instead of blindly truncating acts by position, this module manages a character budget split across three zones:
Semantic zone (60 %): content from Qdrant vector matches
Graph zone (30 %): act titles and descriptions from Neo4j expansion
Relation zone (10 %): relationship descriptions
Each zone is filled with the highest-ranked items first, so the LLM always receives the most relevant content regardless of total volume.
Usage:
budget = GraphContextBudget(max_chars=20000)
context = budget.build(
semantic_results=semantic_results,
ranked_nodes=ranked_nodes,
ranked_relationships=ranked_relationships,
)
- class lalandre_rag.graph.context_budget.BudgetAllocation(semantic_chars, graph_chars, relation_chars)[source]¶
Bases: object
Character-budget allocation across context zones.
- Parameters:
semantic_chars (int)
graph_chars (int)
relation_chars (int)
- property total: int¶
Return the total allocated character budget across all zones.
- class lalandre_rag.graph.context_budget.ContextBuildResult(combined_context, source_id_map, semantic_count, graph_nodes_used, relationships_used, budget_allocation, graph_node_refs=<factory>, relationship_refs=<factory>, chars_used=<factory>)[source]¶
Bases: object
Output of the context builder.
- Parameters:
combined_context (str)
source_id_map (Dict[Tuple[str, int | None, int | None], str])
semantic_count (int)
graph_nodes_used (int)
relationships_used (int)
budget_allocation (BudgetAllocation)
graph_node_refs (List[Dict[str, Any]])
relationship_refs (List[Dict[str, Any]])
chars_used (Dict[str, int])
- class lalandre_rag.graph.context_budget.GraphContextBudget(max_chars=20000, semantic_share=0.60, graph_share=0.30, relation_share=0.10, min_chars_per_source=200)[source]¶
Bases: object
Build LLM context from scored and ranked graph results.
- Parameters:
max_chars (int) – Total character budget for the whole context block.
semantic_share (float) – Fraction reserved for semantic search content.
graph_share (float) – Fraction reserved for graph-expanded act information.
relation_share (float) – Fraction reserved for relationship descriptions.
min_chars_per_source (int) – Minimum chars reserved for each semantic source.
- build(*, semantic_results, ranked_nodes, ranked_relationships)[source]¶
Assemble the full context string from ranked results.
Spills unused budget from one zone to the next (semantic → graph → relation).
- Parameters:
semantic_results (List[Any])
ranked_nodes (List[Dict[str, Any]])
ranked_relationships (List[Dict[str, Any]])
- Return type:
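The 60/30/10 split with spill-over can be sketched as follows. This is a simplified model that assumes the character counts available per zone are known up front; the real builder fills each zone item by item from the ranked results:

```python
from typing import Dict

# Sketch of the documented 60/30/10 split with spill-over
# (semantic -> graph -> relation); not the real builder.
def allocate(max_chars: int, available: Dict[str, int]) -> Dict[str, int]:
    shares = {"semantic": 0.60, "graph": 0.30, "relation": 0.10}
    alloc, carry = {}, 0
    for zone, share in shares.items():
        budget = int(max_chars * share) + carry
        alloc[zone] = min(budget, available.get(zone, 0))
        carry = budget - alloc[zone]  # unused chars spill to the next zone
    return alloc

print(allocate(20000, {"semantic": 9000, "graph": 8000, "relation": 5000}))
# {'semantic': 9000, 'graph': 8000, 'relation': 3000}
```

With a sparse semantic zone, its unused 3000 characters spill into the graph zone, which in turn passes its surplus to relations, so the total budget is never wasted on an underfilled zone.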
lalandre_rag.graph.helpers¶
Source: packages/lalandre_rag/lalandre_rag/graph/helpers.py
Graph-mode helper functions for the RAG service.
Contains NL→Cypher prompt building, Cypher extraction from LLM output, and Cypher-row context formatting. These are domain-level utilities that live in the package layer, not in the service layer.
- lalandre_rag.graph.helpers.build_nl_to_cypher_prompt(*, question, max_graph_depth, row_limit)[source]¶
Return the system prompt that translates a natural-language question into Cypher.
- Parameters:
question (str)
max_graph_depth (int)
row_limit (int)
- Return type:
str
- lalandre_rag.graph.helpers.normalize_cypher_candidate(candidate)[source]¶
Strip common model artifacts around an otherwise valid Cypher query.
- Parameters:
candidate (str)
- Return type:
str
- lalandre_rag.graph.helpers.extract_cypher(text)[source]¶
Attempt to extract a Cypher query from raw LLM text.
Tries, in order:
1. A JSON object with a "cypher" key.
2. A fenced ```cypher code block.
3. Bare Cypher starting with a keyword (MATCH, WITH, …).
- Parameters:
text (str)
- Return type:
str | None
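The three-step fallback order can be sketched as below. This is a simplified illustration (the keyword list is abridged and the real helper is more defensive about model artifacts):

```python
import json
import re
from typing import Optional

# Simplified sketch of the documented extraction order.
def extract_cypher(text: str) -> Optional[str]:
    # 1. JSON object with a "cypher" key
    try:
        obj = json.loads(text)
        if isinstance(obj, dict) and obj.get("cypher"):
            return obj["cypher"].strip()
    except ValueError:
        pass
    # 2. Fenced ```cypher code block
    fence = re.search(r"```(?:cypher)?\s*(.*?)```", text, re.DOTALL)
    if fence:
        return fence.group(1).strip()
    # 3. Bare Cypher starting with a keyword (abridged list)
    if re.match(r"\s*(MATCH|WITH|RETURN|CALL)\b", text, re.IGNORECASE):
        return text.strip()
    return None

print(extract_cypher('{"cypher": "MATCH (a:Act) RETURN a LIMIT 5"}'))
```

Trying JSON first matters: a JSON payload that happens to contain a fenced block would otherwise be mis-parsed by the later, looser patterns.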
lalandre_rag.graph.map_reduce¶
Source: packages/lalandre_rag/lalandre_rag/graph/map_reduce.py
Map-Reduce generation for Graph RAG.
When the assembled context exceeds a configured size threshold, a single LLM call can time out or produce degraded answers. This module splits the context into chunks, runs parallel “map” calls to produce partial summaries, then merges them into a final answer with a “reduce” call.
Pipeline:
context_chunks ──► LLM map (parallel) ──► partial summaries
│
question ──────────────────────────────────────► LLM reduce ──► answer
Usage:
answer = await map_reduce_generate(
context=long_context,
question=question,
llm=llm_chain,
chunk_chars=6000,
map_timeout=15.0,
reduce_timeout=20.0,
)
- async lalandre_rag.graph.map_reduce.map_reduce_generate(*, context, question, llm, chunk_chars=None, map_timeout=None, reduce_timeout=None, max_parallel=None)[source]¶
Map-reduce generation pipeline.
1. Split the context into chunks of chunk_chars.
2. Run up to max_parallel map calls concurrently.
3. Merge with a single reduce call.
Falls back to the concatenated map summaries if the reduce step fails. All parameters default to values from config.graph.
- Parameters:
context (str)
question (str)
llm (Any)
chunk_chars (int | None)
map_timeout (float | None)
reduce_timeout (float | None)
max_parallel (int | None)
- Return type:
str
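The chunk-split, parallel-map, and reduce-fallback flow can be sketched with an async stub LLM. Timeouts, parallelism limits, and the config.graph defaults are omitted, and the prompts are placeholders, so this is an illustration of the shape, not the real pipeline:

```python
import asyncio
from typing import List

def split_chunks(context: str, chunk_chars: int) -> List[str]:
    # Naive fixed-width split; the real module may split on boundaries.
    return [context[i:i + chunk_chars] for i in range(0, len(context), chunk_chars)]

async def map_reduce_sketch(context: str, question: str, llm, chunk_chars: int = 6000) -> str:
    chunks = split_chunks(context, chunk_chars)
    # Map phase: one summary per chunk, run concurrently.
    partials = await asyncio.gather(*(llm(f"Summarize for: {question}\n{c}") for c in chunks))
    try:
        # Reduce phase: merge partial summaries into one answer.
        return await llm(f"Answer {question!r} from:\n" + "\n".join(partials))
    except Exception:
        # Fallback: concatenated map summaries, as documented above.
        return "\n".join(partials)

async def fake_llm(prompt: str) -> str:
    return f"<{len(prompt)} chars summarized>"

print(asyncio.run(map_reduce_sketch("x" * 10, "q", fake_llm, chunk_chars=4)))
```

The map calls share no state, which is what makes them safe to run in parallel; only the reduce step sees all partial summaries at once.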
lalandre_rag.graph.neo4j_adapter¶
Source: packages/lalandre_rag/lalandre_rag/graph/neo4j_adapter.py
Optional Neo4j GraphRAG integration for graph-mode retrieval.
This module prefers official Neo4j GraphRAG retrievers when the dependency is installed, while keeping the existing Lalandre graph pipeline as a fallback.
- class lalandre_rag.graph.neo4j_adapter.Text2CypherSearchOutput(generated_cypher, rows, metadata)[source]¶
Bases: object
Result payload for official Text2Cypher retrieval.
- Parameters:
generated_cypher (str)
rows (List[Dict[str, Any]])
metadata (Dict[str, Any])
- class lalandre_rag.graph.neo4j_adapter.Neo4jGraphRAGAdapter(*, neo4j_driver, neo4j_database, qdrant_client, qdrant_collection_name, llm_provider, llm_model, llm_temperature, llm_max_tokens, llm_api_key, mistral_api_key, llm_base_url, key_pool, read_only_validator, row_serializer)[source]¶
Bases: object
Bridge between Lalandre graph mode and official Neo4j GraphRAG retrievers.
- Parameters:
neo4j_driver (Driver)
neo4j_database (str | None)
qdrant_client (Any)
qdrant_collection_name (str)
llm_provider (str)
llm_model (str)
llm_temperature (float)
llm_max_tokens (int)
llm_api_key (str | None)
mistral_api_key (str | None)
llm_base_url (str | None)
key_pool (APIKeyPool | None)
read_only_validator (Callable[[str], str])
row_serializer (Callable[[Any], Any])
- is_available()[source]¶
Return whether the adapter is ready to serve official GraphRAG calls.
- Return type:
bool
lalandre_rag.graph.ranker¶
Source: packages/lalandre_rag/lalandre_rag/graph/ranker.py
Graph node ranking for Graph RAG.
Scores and ranks graph-expanded nodes by relevance to avoid sending noise to the LLM. Three signals are combined:
Hop distance – nodes closer to the seed acts score higher.
Semantic overlap – nodes that also appear in the Qdrant results get a boost (they matched the query both semantically and structurally).
Relation type weight – AMENDS / IMPLEMENTS (strong legal ties) outweigh CITES / DEROGATES (weaker references).
Usage:
ranked = rank_graph_nodes(
    graph_context=graph_context,
    relationships=relationships,
    semantic_act_ids=semantic_act_ids,
    seed_act_ids=seed_act_ids,
)
# ranked is sorted best-first; slice to your budget
- lalandre_rag.graph.ranker.rank_graph_nodes(*, graph_context, relationships, semantic_act_ids, seed_act_ids, max_depth=5, hop_decay=0.5, semantic_boost=0.3, relation_weight_factor=0.25)[source]¶
Score and rank graph-expanded nodes.
Each node receives a normalized composite score in [0, 1] based on hop distance, semantic overlap, and incident relation strength.
- Parameters:
graph_context (List[Dict[str, Any]]) – Graph-expanded act nodes to score.
relationships (List[Dict[str, Any]]) – Graph relationships connecting the candidate nodes.
semantic_act_ids (Set[int]) – Act identifiers also returned by semantic search.
seed_act_ids (Set[int]) – Seed act identifiers used to start graph expansion.
max_depth (int) – Maximum BFS depth used to estimate hop distance.
hop_decay (float) – Exponential decay applied to hop distance.
semantic_boost (float) – Non-negative weight applied to semantic overlap.
relation_weight_factor (float) – Non-negative weight applied to relation strength.
- Returns:
The input nodes enriched with ranking metadata and sorted best-first.
- Return type:
List[Dict[str, Any]]
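A per-node version of the composite score could be sketched as below. The hop_decay, semantic_boost, and relation_weight_factor defaults mirror the signature above; hop_distance and relation_strength are assumed precomputed here, whereas the real function derives them from the relationships via BFS and relation-type weights.

```python
from typing import Any, Dict, Set

def score_node_sketch(
    node: Dict[str, Any],
    *,
    hop_distance: int,
    semantic_act_ids: Set[int],
    relation_strength: float,  # 0..1, derived from incident relation types
    hop_decay: float = 0.5,
    semantic_boost: float = 0.3,
    relation_weight_factor: float = 0.25,
) -> float:
    """Composite relevance score clamped to [0, 1]."""
    base = hop_decay ** hop_distance  # nodes closer to the seeds score higher
    boost = semantic_boost if node.get("act_id") in semantic_act_ids else 0.0
    relation = relation_weight_factor * relation_strength
    return min(1.0, base + boost + relation)
```

Exponential hop decay keeps distant expansion noise out of the context while semantic overlap can rescue a structurally distant but topically relevant node.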
lalandre_rag.graph.service¶
Source: packages/lalandre_rag/lalandre_rag/graph/service.py
Graph service for RAG.
Combines semantic search with graph traversal for enhanced regulatory context retrieval.
- class lalandre_rag.graph.service.GraphRAGService(neo4j_repo, qdrant_repo, embedding_service, key_pool=None)[source]¶
Bases: object
Graph-Enhanced Retrieval-Augmented Generation Service
This service implements the Graph RAG approach by:
1. Using semantic search (Qdrant) to find relevant subdivisions
2. Enriching results with act-level graph context (Neo4j) to capture relationships
3. Providing regulatory ecosystem understanding at the act level
Key capabilities:
- Semantic search (subdivision-level) with graph expansion (act-level)
- Regulatory path discovery between acts
- Temporal relationship tracking (amendments, repeals, etc.)
- Full regulatory context retrieval
- Note: Graph operations focus on act-level relationships for performance.
Subdivision details are retrieved from Qdrant/PostgreSQL.
Initialize Graph RAG service
- Parameters:
neo4j_repo (Neo4jRepository) – Neo4j repository for graph operations
qdrant_repo (QdrantRepository) – Qdrant repository for semantic search
embedding_service (EmbeddingService) – Service for generating embeddings
key_pool (APIKeyPool | None)
- supports_official_text2cypher()[source]¶
Return whether the official Neo4j Text2Cypher adapter is available.
- Return type:
bool
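The three-step flow described above can be sketched with the storage clients abstracted as plain callables (embed, semantic_search, and expand_graph are stand-ins for the embedding service, Qdrant, and Neo4j — not the real repository APIs):

```python
from typing import Any, Callable, Dict, List, Sequence, Set

def graph_rag_query_sketch(
    question: str,
    *,
    embed: Callable[[str], Sequence[float]],
    semantic_search: Callable[[Sequence[float]], List[Dict[str, Any]]],
    expand_graph: Callable[[Set[int]], List[Dict[str, Any]]],
) -> Dict[str, Any]:
    """Semantic hits at subdivision level seed act-level graph expansion."""
    vector = embed(question)                    # 1. embed the question
    hits = semantic_search(vector)              # 2. subdivision-level hits (Qdrant)
    seed_act_ids = {h["act_id"] for h in hits}  # acts behind the hits
    graph = expand_graph(seed_act_ids)          # 3. act-level context (Neo4j)
    return {"semantic": hits, "graph": graph, "seed_act_ids": seed_act_ids}
```

Working at act level for graph operations keeps the traversal cheap, matching the performance note above; subdivision content stays in Qdrant/PostgreSQL.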
lalandre_rag.graph.source_payloads¶
Source: packages/lalandre_rag/lalandre_rag/graph/source_payloads.py
Helpers to serialize graph-derived evidence into a consistent payload.
- lalandre_rag.graph.source_payloads.build_graph_node_source_item(*, node, source_id, sequence_order)[source]¶
Serialize a ranked graph node into a user-facing evidence item.
- Parameters:
node (Dict[str, Any])
source_id (str)
sequence_order (int)
- Return type:
Dict[str, Any]
- lalandre_rag.graph.source_payloads.build_graph_edge_source_item(*, relationship, source_id, sequence_order, start_celex, end_celex)[source]¶
Serialize a ranked graph relationship into a user-facing evidence item.
- Parameters:
relationship (Dict[str, Any])
source_id (str)
sequence_order (int)
start_celex (str)
end_celex (str)
- Return type:
Dict[str, Any]
- lalandre_rag.graph.source_payloads.build_cypher_row_source_item(*, row, row_index, include_full_content, content_preview_chars, query_id, graph_query_strategy, generated_cypher)[source]¶
Serialize a Cypher row into a concrete evidence item without fake scoring.
- Parameters:
row (Dict[str, Any])
row_index (int)
include_full_content (bool)
content_preview_chars (int)
query_id (str)
graph_query_strategy (str)
generated_cypher (str | None)
- Return type:
Dict[str, Any]
lalandre_rag.linker_factory¶
Source: packages/lalandre_rag/lalandre_rag/linker_factory.py
Build a LegalEntityLinker wired for the rag-service runtime.
The linker is shared with the extraction pipeline. For RAG use, we additionally
populate act_id on each ActAliasEntry (so resolutions carry the act
primary key) and supply an article_lookup callable that maps
(act_id, article_number) to a subdivision id using a small in-memory cache
seeded from the database.
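The article_lookup wiring can be sketched with functools.lru_cache; db_fetch here is a hypothetical callable hitting the subdivisions table, and the cache bound corresponds to article_cache_size:

```python
from functools import lru_cache
from typing import Callable, Optional

def make_article_lookup_sketch(
    db_fetch: Callable[[int, str], Optional[int]],
    cache_size: int = 4096,
) -> Callable[[int, str], Optional[int]]:
    """Bounded LRU cache over (act_id, article_number) -> subdivision id."""
    @lru_cache(maxsize=cache_size)
    def lookup(act_id: int, article_number: str) -> Optional[int]:
        return db_fetch(act_id, article_number)  # only runs on a cache miss
    return lookup
```

Repeated resolutions of the same article then cost a dictionary lookup instead of a database round-trip.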
- lalandre_rag.linker_factory.build_linker(pg_repo, *, fuzzy_threshold, fuzzy_min_gap, fuzzy_limit, min_alias_chars, article_cache_size=4096)[source]¶
Construct a LegalEntityLinker seeded from the acts table.
The returned linker carries act_id on every alias entry and is wired with an article_lookup callable that queries subdivisions on demand (LRU-cached, bounded).
- Parameters:
pg_repo (PostgresRepository)
fuzzy_threshold (float)
fuzzy_min_gap (float)
fuzzy_limit (int)
min_alias_chars (int)
article_cache_size (int)
- Return type:
- lalandre_rag.linker_factory.build_external_detector(linker)[source]¶
Construct the optional NER-backed ExternalDetector for prose linking.
Reads NER_SERVICE_URL from the environment. When unset (or empty), returns None so the regex+fuzzy linker keeps its V1 behaviour with zero overhead. When set, builds a small HTTP client and wraps it in an adapter that resolves NER spans through the same LegalEntityLinker.
- Parameters:
linker (LegalEntityLinker)
- Return type:
Callable[[str], Sequence[ExternalDetection]] | None
lalandre_rag.llm¶
Source: packages/lalandre_rag/lalandre_rag/llm/__init__.py
LLM factory utilities for RAG.
lalandre_rag.llm.factory¶
Source: packages/lalandre_rag/lalandre_rag/llm/factory.py
Factory utilities for RAG LLM clients.
- class lalandre_rag.llm.factory.RAGLLMClients(provider, model_name, chat_llm, llamaindex_llm)[source]¶
Bases: object
Bundled LLM clients used by RAG modes.
- Parameters:
provider (str)
model_name (str)
chat_llm (Any)
llamaindex_llm (LLM | None)
- lalandre_rag.llm.factory.build_rag_llm_clients(*, provider, model_name, temperature, max_tokens, timeout_seconds, base_url, mistral_base_url, context_window, api_key, mistral_api_key, key_pool=None)[source]¶
Build provider-specific clients for RAG.
When key_pool is provided and contains >1 key, multiple underlying clients are created and dispatched through the shared pool.
Supported providers: mistral, openai_compatible.
- Parameters:
provider (str)
model_name (str)
temperature (float)
max_tokens (int)
timeout_seconds (float)
base_url (str | None)
mistral_base_url (str)
context_window (int)
api_key (str | None)
mistral_api_key (str | None)
key_pool (APIKeyPool | None)
- Return type:
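The multi-key dispatch described above can be sketched as a round-robin over per-key clients. build_client is a hypothetical client factory, and the real APIKeyPool may apply a different dispatch policy (e.g. rate-limit aware):

```python
import itertools
from typing import Any, Callable, List

def make_pooled_dispatch_sketch(
    api_keys: List[str],
    build_client: Callable[[str], Any],
) -> Callable[[], Any]:
    """Round-robin over one client per key; one key degenerates to one client."""
    clients = [build_client(key) for key in api_keys]
    cycle = itertools.cycle(clients)
    return lambda: next(cycle)
```

With a single key this collapses to always returning the same client, matching the documented single-key behaviour.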
lalandre_rag.models¶
Source: packages/lalandre_rag/lalandre_rag/models/__init__.py
Shared API models for the RAG service.
lalandre_rag.models.api¶
Source: packages/lalandre_rag/lalandre_rag/models/api.py
Shared API models for the RAG service.
These models are the single source of truth for request/response schemas shared between the rag-service and the api-gateway.
- class lalandre_rag.models.api.QueryMetadata[source]¶
Bases: TypedDict
Known fields in QueryResponse.metadata. Additional mode-specific fields may be present.
- class lalandre_rag.models.api.SourcesResponse(*, total, documents, acts=None, graph_nodes=None, graph_edges=None, cypher_rows=None, community_reports=None, graph_query=None)[source]¶
Bases: BaseModel
Structured sources format from ResponseBuilder
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
total (int)
documents (List[Dict[str, Any]])
acts (Dict[str, Any] | None)
graph_nodes (List[Dict[str, Any]] | None)
graph_edges (List[Dict[str, Any]] | None)
cypher_rows (List[Dict[str, Any]] | None)
community_reports (List[Dict[str, Any]] | None)
graph_query (Dict[str, Any] | None)
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class lalandre_rag.models.api.QueryResponse(*, query_id, question, answer, mode, sources=None, metadata, conversation_id=None, message_id=None)[source]¶
Bases: BaseModel
Response model for RAG queries
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
query_id (str)
question (str)
answer (str)
mode (str)
sources (SourcesResponse | None)
metadata (Dict[str, Any])
conversation_id (str | None)
message_id (str | None)
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class lalandre_rag.models.api.SearchRequest(*, query=None, query_embedding=None, top_k=None, mode=None, score_threshold=None, granularity=None, embedding_preset=None, include_full_content=False, filters=None)[source]¶
Bases: BaseModel
Search request (semantic / lexical / hybrid).
top_k, mode and granularity are optional — when omitted the rag-service resolves them from SearchConfig defaults.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
query (str | None)
query_embedding (List[float] | None)
top_k (int | None)
mode (str | None)
score_threshold (float | None)
granularity (str | None)
embedding_preset (str | None)
include_full_content (bool)
filters (Dict[str, Any] | None)
- validate_query()[source]¶
Ensure at least a text query or a precomputed embedding is provided.
- Return type:
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class lalandre_rag.models.api.SearchResult(*, celex, subdivision_id, chunk_id=None, chunk_index=None, content, score, metadata, trace=None)[source]¶
Bases: BaseModel
Search result item
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
celex (str | None)
subdivision_id (int)
chunk_id (int | None)
chunk_index (int | None)
content (str)
score (float)
metadata (Dict[str, Any])
trace (Dict[str, Any] | None)
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class lalandre_rag.models.api.SearchResponse(*, search_id, results, total, mode)[source]¶
Bases: BaseModel
Search response
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
search_id (str)
results (List[SearchResult])
total (int)
mode (str)
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
lalandre_rag.modes¶
Source: packages/lalandre_rag/lalandre_rag/modes/__init__.py
RAG query mode package.
lalandre_rag.modes.hybrid_generation¶
Source: packages/lalandre_rag/lalandre_rag/modes/hybrid_generation.py
Generation strategies for HybridMode — standard and global modes, QA chain execution.
Extracted from hybrid_mode.py to keep the orchestrator focused on the pipeline.
- class lalandre_rag.modes.hybrid_generation.SourceArtifacts(acts, documents, validation_sources, payload, build_ms)[source]¶
Bases: object
Prepared source payloads shared by sync and streaming generation paths.
- Parameters:
acts (Dict[str, Any])
documents (List[Dict[str, Any]])
validation_sources (List[Dict[str, Any]])
payload (Dict[str, Any] | None)
build_ms (float)
- lalandre_rag.modes.hybrid_generation.run_rag_chain(*, rag_prompt, llm, question, context, graph_context='', chat_history=None)[source]¶
Run the blocking QA chain and return the generated answer text.
- Parameters:
rag_prompt (ChatPromptTemplate)
llm (Any)
question (str)
context (str)
graph_context (str)
chat_history (List[BaseMessage] | None)
- Return type:
str
- lalandre_rag.modes.hybrid_generation.stream_rag_chain(*, rag_prompt, llm, question, context, graph_context='', chat_history=None)[source]¶
Stream answer chunks from the QA chain.
- Parameters:
rag_prompt (ChatPromptTemplate)
llm (Any)
question (str)
context (str)
graph_context (str)
chat_history (List[BaseMessage] | None)
- Return type:
Iterator[str]
- lalandre_rag.modes.hybrid_generation.query_standard_mode(*, question, context_slices, llm, rag_prompt, include_relations, include_subjects, include_full_content, return_sources, graph_fetch=None, chat_history=None, progress_callback=None, preamble_callback=None, token_callback=None, final_answer_callback=None, entity_linker=None, external_detector=None, lightweight_llm=None, cypher_documents=None, cypher_query_meta=None)[source]¶
Run the standard hybrid QA path with optional graph support.
- Parameters:
question (str)
context_slices (List[ContextSlice])
llm (Any)
rag_prompt (ChatPromptTemplate)
include_relations (bool)
include_subjects (bool)
include_full_content (bool)
return_sources (bool)
graph_fetch (GraphFetchResult | None)
chat_history (List[BaseMessage] | None)
progress_callback (Callable[[Dict[str, Any]], None] | None)
preamble_callback (Callable[[Dict[str, Any] | None, Dict[str, Any]], None] | None)
token_callback (Callable[[str], None] | None)
final_answer_callback (Callable[[str], None] | None)
entity_linker (LegalEntityLinker | None)
external_detector (Callable[[str], Any] | None)
lightweight_llm (Any)
cypher_documents (List[Dict[str, Any]] | None)
cypher_query_meta (Dict[str, Any] | None)
- Return type:
Dict[str, Any]
- lalandre_rag.modes.hybrid_generation.query_global_mode(*, question, context_slices, llm, rag_prompt, include_full_content, include_subjects, return_sources, graph_fetch=None, chat_history=None, progress_callback=None, preamble_callback=None, token_callback=None, final_answer_callback=None, entity_linker=None, external_detector=None, lightweight_llm=None, cypher_documents=None, cypher_query_meta=None)[source]¶
Run the global GraphRAG path with community reporting.
- Parameters:
question (str)
context_slices (List[ContextSlice])
llm (Any)
rag_prompt (ChatPromptTemplate)
include_full_content (bool)
include_subjects (bool)
return_sources (bool)
graph_fetch (GraphFetchResult | None)
chat_history (List[BaseMessage] | None)
progress_callback (Callable[[Dict[str, Any]], None] | None)
preamble_callback (Callable[[Dict[str, Any] | None, Dict[str, Any]], None] | None)
token_callback (Callable[[str], None] | None)
final_answer_callback (Callable[[str], None] | None)
entity_linker (LegalEntityLinker | None)
external_detector (Callable[[str], Any] | None)
lightweight_llm (Any)
cypher_documents (List[Dict[str, Any]] | None)
cypher_query_meta (Dict[str, Any] | None)
- Return type:
Dict[str, Any]
lalandre_rag.modes.hybrid_graph¶
Source: packages/lalandre_rag/lalandre_rag/modes/hybrid_graph.py
Graph enrichment for HybridMode — Neo4j expansion and community context.
Extracted from hybrid_mode.py to keep the orchestrator focused on the pipeline.
- lalandre_rag.modes.hybrid_graph.fetch_graph_context(*, act_ids, graph_rag_service, community_enricher, max_depth=None)[source]¶
Fetch graph data for acts found in retrieval results.
Uses act_ids from hybrid retrieval (semantic + BM25) as seeds for Neo4j graph traversal. Returns structured data for ranking/budgeting. Non-fatal: returns None on failure.
- Parameters:
act_ids (Set[int])
graph_rag_service (GraphRAGService)
community_enricher (CommunityContextEnricher | None)
max_depth (int | None)
- Return type:
GraphFetchResult | None
lalandre_rag.modes.hybrid_helpers¶
Source: packages/lalandre_rag/lalandre_rag/modes/hybrid_helpers.py
Helpers for HybridMode — context assembly, source building, metadata, and citation.
Extracted from hybrid_mode.py to keep the orchestrator focused on the pipeline.
- lalandre_rag.modes.hybrid_helpers.emit_progress(callback, *, phase, status, label, detail=None, count=None, duration_ms=None, meta=None)[source]¶
Emit a structured progress event when a callback is configured.
- Parameters:
callback (Callable[[Dict[str, Any]], None] | None)
phase (str)
status (str)
label (str)
detail (str | None)
count (int | None)
duration_ms (float | None)
meta (Dict[str, Any] | None)
- Return type:
None
- lalandre_rag.modes.hybrid_helpers.build_source_context(*, context_slices, max_context_chars, min_chars_per_source, max_sources)[source]¶
Build the source-context block and return (context_text, refs, remaining_chars).
- Parameters:
context_slices (List[ContextSlice])
max_context_chars (int)
min_chars_per_source (int)
max_sources (int)
- Return type:
tuple[str, List[Dict[str, Any]], int]
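A greedy character-budgeting pass along these lines could look as follows. The [S1]..[Sn] tags match the citation convention used elsewhere in this service; the exact excerpt formatting and the refs payload of the real helper are not shown, so treat this as a sketch only:

```python
from typing import Dict, List, Tuple

def build_source_context_sketch(
    context_slices: List[Dict[str, str]],
    *,
    max_context_chars: int,
    min_chars_per_source: int,
    max_sources: int,
) -> Tuple[str, int]:
    """Greedy budgeting: tag each slice [S1]..[Sn], stop when budget runs out."""
    parts: List[str] = []
    remaining = max_context_chars
    for index, item in enumerate(context_slices[:max_sources], start=1):
        if remaining < min_chars_per_source:
            break  # not enough budget left for a useful excerpt
        text = item["content"][:remaining]
        parts.append(f"[S{index}] {text}")
        remaining -= len(text)
    return "\n\n".join(parts), remaining
```

Returning the unused budget lets a caller hand the remainder to the graph-context builder, as the shared-budget functions below suggest.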
- lalandre_rag.modes.hybrid_helpers.build_relation_summary(*, context_slices, line_limit)[source]¶
Build a compact relation-signals block for the LLM context.
- Parameters:
context_slices (List[ContextSlice])
line_limit (int)
- Return type:
str
- lalandre_rag.modes.hybrid_helpers.format_reports_block(reports)[source]¶
Render community reports into a text block for the LLM context.
- Parameters:
reports (List[CommunityReport])
- Return type:
str
- lalandre_rag.modes.hybrid_helpers.attach_citation_validation(*, response, answer, sources)[source]¶
Validate mixed source citations in the answer and attach results to metadata.
- Parameters:
response (Dict[str, Any])
answer (str)
sources (List[Dict[str, Any]])
- Return type:
None
- lalandre_rag.modes.hybrid_helpers.build_plan_metadata(*, retrieval_plan, requested_top_k, effective_top_k, requested_granularity, effective_granularity, requested_include_relations, effective_include_relations, retrieval_query, original_question)[source]¶
Serialize a retrieval plan to an audit-friendly metadata dict.
- Parameters:
retrieval_plan (RetrievalPlan)
requested_top_k (int)
effective_top_k (int)
requested_granularity (str | None)
effective_granularity (str | None)
requested_include_relations (bool)
effective_include_relations (bool)
retrieval_query (str)
original_question (str)
- Return type:
Dict[str, Any]
- class lalandre_rag.modes.hybrid_helpers.GraphFetchResult(nodes, relationships, seed_act_ids, expanded_act_ids, community_block='', community_meta=<factory>, duration_ms=0.0)[source]¶
Bases: object
Raw output from Neo4j graph expansion (before ranking).
- Parameters:
nodes (List[Dict[str, Any]])
relationships (List[Dict[str, Any]])
seed_act_ids (Set[int])
expanded_act_ids (Set[int])
community_block (str)
community_meta (Dict[str, Any])
duration_ms (float)
- lalandre_rag.modes.hybrid_helpers.build_ranked_graph_context(*, fetch_result, semantic_results, max_context_chars, graph_acts_limit, graph_relationships_limit, hop_decay, semantic_boost, relation_weight_factor, budget_semantic_share, budget_graph_share, budget_relation_share, min_chars_per_source, max_depth)[source]¶
Rank graph nodes/relationships and build budget-aware context.
Returns (combined_context_str, metadata_dict, graph_node_refs, relationship_refs).
- Parameters:
fetch_result (GraphFetchResult)
semantic_results (List[Any])
max_context_chars (int)
graph_acts_limit (int)
graph_relationships_limit (int)
hop_decay (float)
semantic_boost (float)
relation_weight_factor (float)
budget_semantic_share (float)
budget_graph_share (float)
budget_relation_share (float)
min_chars_per_source (int)
max_depth (int)
- Return type:
Tuple[str, Dict[str, Any], List[Dict[str, Any]], List[Dict[str, Any]]]
lalandre_rag.modes.hybrid_mode¶
Source: packages/lalandre_rag/lalandre_rag/modes/hybrid_mode.py
Hybrid mode for QA with deterministic routing and context budgeting.
Pipeline orchestrator — delegates generation to hybrid_generation and graph enrichment to hybrid_graph.
- class lalandre_rag.modes.hybrid_mode.HybridMode(retrieval_service, context_service, llm, rag_prompt, graph_rag_service=None, lightweight_llm=None, key_pool=None, entity_linker=None, external_detector=None)[source]¶
Bases: object
MODE 3: retrieval + generation. Includes a global community-aware path for broad queries.
- Parameters:
retrieval_service (RetrievalService)
context_service (ContextService)
llm (Any)
rag_prompt (ChatPromptTemplate)
graph_rag_service (GraphRAGService | None)
lightweight_llm (Any)
key_pool (APIKeyPool | None)
entity_linker (LegalEntityLinker | None)
external_detector (Callable[[str], Any] | None)
- query(question, top_k=10, score_threshold=None, filters=None, include_relations=False, include_subjects=True, include_full_content=False, return_sources=True, collections=None, granularity=None, chat_history=None, graph_depth=None, use_graph=None, embedding_preset=None, retrieval_depth=None, cypher_documents=None, cypher_query_meta=None)[source]¶
Run the full hybrid pipeline and return a policy-compliant response.
- Parameters:
question (str)
top_k (int)
score_threshold (float | None)
filters (Dict[str, Any] | None)
include_relations (bool)
include_subjects (bool)
include_full_content (bool)
return_sources (bool)
collections (List[str] | None)
granularity (str | None)
chat_history (List[BaseMessage] | None)
graph_depth (int | None)
use_graph (bool | None)
embedding_preset (str | None)
retrieval_depth (str | None)
cypher_documents (List[Dict[str, Any]] | None)
cypher_query_meta (Dict[str, Any] | None)
- Return type:
Dict[str, Any]
- stream_query(question, top_k=10, score_threshold=None, filters=None, include_relations=False, include_subjects=True, include_full_content=False, return_sources=True, collections=None, granularity=None, chat_history=None, graph_depth=None, use_graph=None, embedding_preset=None, retrieval_depth=None, cypher_documents=None, cypher_query_meta=None)[source]¶
Stream query with live progress events emitted from a worker thread.
- Parameters:
question (str)
top_k (int)
score_threshold (float | None)
filters (Dict[str, Any] | None)
include_relations (bool)
include_subjects (bool)
include_full_content (bool)
return_sources (bool)
collections (List[str] | None)
granularity (str | None)
chat_history (List[BaseMessage] | None)
graph_depth (int | None)
use_graph (bool | None)
embedding_preset (str | None)
retrieval_depth (str | None)
cypher_documents (List[Dict[str, Any]] | None)
cypher_query_meta (Dict[str, Any] | None)
- Return type:
Iterator[Dict[str, Any] | str]
lalandre_rag.modes.llm_mode¶
Source: packages/lalandre_rag/lalandre_rag/modes/llm_mode.py
LLM-only mode: pure LLM generation without retrieval.
- class lalandre_rag.modes.llm_mode.LLMMode(llm)[source]¶
Bases: object
MODE 2: Pure LLM (100% generation). Generate an answer using only LLM knowledge (no retrieval).
Initialize LLM only mode
- Parameters:
llm (Any) – LLM client
lalandre_rag.modes.summarize_mode¶
Source: packages/lalandre_rag/lalandre_rag/modes/summarize_mode.py
Summarize mode: generate summaries of documents related to a topic.
- class lalandre_rag.modes.summarize_mode.SummarizeMode(retrieval_service, context_service, llamaindex_adapter, citation_llm=None, act_summary_service=None, question_summary_service=None)[source]¶
Bases: object
MODE 4: Summarization. Generate summaries using TreeSummarize for hierarchical processing.
Initialize summarize mode
- Parameters:
retrieval_service (RetrievalService) – Service for document retrieval
context_service (ContextService) – Service for context enrichment
llamaindex_adapter (LlamaIndexAdapter | None) – LlamaIndex adapter (optional)
citation_llm (Any)
act_summary_service (ActSummaryService | None)
question_summary_service (QuestionSummaryService | None)
- summarize_canonical(*, celex, question)[source]¶
Return a cached canonical summary response when one is available.
- Parameters:
celex (str)
question (str)
- Return type:
Dict[str, Any] | None
- summarize_question(*, topic, top_k, score_threshold, filters, include_relations, include_full_content)[source]¶
Run question summarization, optionally augmented with canonical memory.
- Parameters:
topic (str)
top_k (int)
score_threshold (float | None)
filters (Dict[str, Any] | None)
include_relations (bool)
include_full_content (bool)
- Return type:
Dict[str, Any]
- summarize(topic, top_k=10, score_threshold=None, filters=None, include_relations=True, include_full_content=False)[source]¶
Generate a summary of documents related to a topic
- Parameters:
topic (str) – Topic or question to summarize
top_k (int) – Number of documents to retrieve
filters (Dict[str, Any] | None) – Metadata filters
include_relations (bool) – Include relations in context
score_threshold (float | None)
include_full_content (bool)
- Returns:
Dictionary with summary and sources
- Return type:
Dict[str, Any]
- class lalandre_rag.modes.summarize_mode.CompareMode(retrieval_service, context_service, llamaindex_adapter, citation_llm=None, question_summary_service=None)[source]¶
Bases: object
MODE 5: Comparison. Compare multiple legal documents.
Initialize compare mode
- Parameters:
retrieval_service (RetrievalService) – Service for document retrieval
context_service (ContextService) – Service for context enrichment
llamaindex_adapter (LlamaIndexAdapter | None) – LlamaIndex adapter (optional)
citation_llm (Any)
question_summary_service (QuestionSummaryService | None)
- compare(comparison_question, celex_list=None, top_k=10, score_threshold=None, include_full_content=False)[source]¶
Compare multiple legal documents
- Parameters:
comparison_question (str) – What to compare
celex_list (List[str] | None) – Optional list of specific CELEX to compare
top_k (int) – Number of documents if CELEX not specified
score_threshold (float | None)
include_full_content (bool)
- Returns:
Dictionary with comparison and sources
- Return type:
Dict[str, Any]
lalandre_rag.ner_external¶
Source: packages/lalandre_rag/lalandre_rag/ner_external.py
Glue between the NER service and prose_linker’s external_detector hook.
The NER service returns free-text spans like ("directive 2014/65/UE", 12, 32,
"directive", 0.91). prose_linker needs ExternalDetection instances
already resolved to an internal act_id. This module bridges the two:
1. Call the NER service (or any other zero-shot detector) to find candidate spans the regex layer might miss (paraphrases, fuzzy mentions).
2. Run each candidate through LegalEntityLinker to resolve to an act_id. Spans the linker cannot resolve (or resolves with low confidence / fallback method) are dropped — never link a span we cannot back with a chunk.
3. Return the resolved spans as ExternalDetection for the linker to merge.
The factory build_ner_external_detector is what callers use; it returns
None when no NER service URL is configured, so the rest of the pipeline
keeps the regex-only behaviour with zero overhead.
- lalandre_rag.ner_external.build_ner_external_detector(ner_client, linker, *, min_span_score=0.5, min_link_score=0.85)[source]¶
Return an ExternalDetector callable backed by the NER service.
- Parameters:
ner_client (NerClient) – Configured client for the NER service.
linker (LegalEntityLinker) – Same linker used for regex-based resolution; reused here to translate NER text spans into internal act_id values.
min_span_score (float) – Drop NER spans below this confidence threshold.
min_link_score (float) – Drop linker resolutions below this score.
- Return type:
Callable[[str], Sequence[ExternalDetection]]
The returned callable is safe to invoke on every answer: errors are swallowed and an empty list is returned, so the regex layer always wins by default.
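The filter–resolve–drop pipeline can be sketched as below. ner_call and resolve are hypothetical stand-ins for the NerClient and LegalEntityLinker interfaces (the real signatures are not shown in this inventory); the two thresholds mirror min_span_score and min_link_score above:

```python
from typing import Callable, List, NamedTuple, Optional, Sequence, Tuple

class DetectionSketch(NamedTuple):
    start: int
    end: int
    act_id: int

def build_detector_sketch(
    ner_call: Callable[[str], List[Tuple[str, int, int, float]]],
    resolve: Callable[[str], Optional[Tuple[int, float]]],
    *,
    min_span_score: float = 0.5,
    min_link_score: float = 0.85,
) -> Callable[[str], Sequence[DetectionSketch]]:
    """Confidence-filter NER spans, resolve each via the linker, drop the rest."""
    def detect(text: str) -> Sequence[DetectionSketch]:
        try:
            spans = ner_call(text)  # (surface, start, end, score) tuples
        except Exception:
            return []  # errors are swallowed: the regex layer wins by default
        resolved_spans: List[DetectionSketch] = []
        for surface, start, end, score in spans:
            if score < min_span_score:
                continue  # low-confidence NER span
            resolved = resolve(surface)
            if resolved is None:
                continue  # never link a span we cannot back internally
            act_id, link_score = resolved
            if link_score < min_link_score:
                continue  # low-confidence linker resolution
            resolved_spans.append(DetectionSketch(start, end, act_id))
        return resolved_spans
    return detect
```

Swallowing detector errors keeps the hook strictly additive: a NER outage degrades to regex-only linking rather than failing the answer.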
lalandre_rag.prompts¶
Source: packages/lalandre_rag/lalandre_rag/prompts/__init__.py
Centralized prompt loaders for lalandre_rag.
All prompt text lives in prompts/ to keep code and content separated.
- lalandre_rag.prompts.get_langchain_prompt(prompt_type, *, with_history=False)[source]¶
Return the LangChain chat prompt for the given prompt_type.
When with_history is True, a MessagesPlaceholder("chat_history") is inserted between the system and human messages so that conversation history can be injected at invocation time. The placeholder is marked optional=True so that callers without history can simply omit the key (or pass an empty list) and the prompt remains unchanged.
- Parameters:
prompt_type (str)
with_history (bool)
- Return type:
ChatPromptTemplate
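The message layout produced by the optional history placeholder can be shown without pulling in LangChain. This dependency-free sketch mirrors the described behaviour: history is spliced between the system and human messages, and an omitted or empty history leaves the prompt unchanged.

```python
from typing import List, Optional, Tuple

Message = Tuple[str, str]  # (role, content)

def render_prompt_sketch(
    system: str,
    human: str,
    *,
    chat_history: Optional[List[Message]] = None,
) -> List[Message]:
    """History slots between the system and human messages; empty -> unchanged."""
    messages: List[Message] = [("system", system)]
    messages.extend(chat_history or [])  # the optional placeholder behaviour
    messages.append(("human", human))
    return messages
```

In the real prompt this corresponds to LangChain's MessagesPlaceholder with optional=True, which drops out of the rendered prompt when the key is absent.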
- lalandre_rag.prompts.get_llamaindex_prompt(prompt_type)[source]¶
Return the LlamaIndex prompt template for summary/comparison (with fallback).
- Parameters:
prompt_type (str)
- Return type:
PromptTemplate
- lalandre_rag.prompts.render_llm_only_prompt(*, question)[source]¶
Prompt used by LLM-only mode (no retrieval).
- Parameters:
question (str)
- Return type:
str
- lalandre_rag.prompts.render_planner_prompt(*, question)[source]¶
Prompt for the retrieval planner that decides multi-step strategy.
- Parameters:
question (str)
- Return type:
str
- lalandre_rag.prompts.render_compressor_prompt(*, celex, title, level, fragments, max_chars)[source]¶
Prompt for context compression of multiple fragments from one act.
- Parameters:
celex (str)
title (str)
level (str)
fragments (str)
max_chars (int)
- Return type:
str
- lalandre_rag.prompts.render_nl_to_cypher_prompt(*, question, max_graph_depth, row_limit)[source]¶
System prompt to translate natural language to Cypher (graph_helpers).
- Parameters:
question (str)
max_graph_depth (int)
row_limit (int)
- Return type:
str
lalandre_rag.prose_linker¶
Source: packages/lalandre_rag/lalandre_rag/prose_linker.py
Post-process LLM responses to make regulatory references clickable.
Uses the shared LegalEntityLinker to detect explicit identifiers (CELEX, EU
refs, national authority refs) and combined article N du <act> patterns in
the final answer text, then wraps each resolved reference in a markdown link
pointing to the library route (/library/acts/:act_id[#sub-:subdivision_id]).
Existing markdown links and citation tags ([S1], [G1], [R1],
[C1], [CM1]) are preserved — we never rewrite content inside those
regions.
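One way to implement the "never rewrite inside existing links or citation tags" rule is to compute the protected spans first and skip any match that falls inside one. This sketch assumes a simplified tag grammar and takes the reference pattern and target URL from the caller; the real module resolves each reference through the LegalEntityLinker instead:

```python
import re
from typing import List, Tuple

# existing markdown links, or citation tags like [S1] / [CM3]
PROTECTED = re.compile(r"\[[^\]]*\]\([^)]*\)|\[(?:S|G|R|C|CM)\d+\]")

def protected_spans_sketch(text: str) -> List[Tuple[int, int]]:
    """Spans that must never be rewritten."""
    return [m.span() for m in PROTECTED.finditer(text)]

def link_outside_protected_sketch(text: str, pattern: "re.Pattern", url: str) -> str:
    """Wrap pattern matches as markdown links, skipping protected regions."""
    spans = protected_spans_sketch(text)

    def repl(m: "re.Match") -> str:
        if any(start <= m.start() < end for start, end in spans):
            return m.group(0)  # inside a protected region: leave untouched
        return f"[{m.group(0)}]({url})"

    return pattern.sub(repl, text)
```

Computing protected spans up front guarantees idempotence: running the linker twice never double-wraps a reference that is already a link.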
- lalandre_rag.prose_linker.link_prose(text, linker, *, min_score=0.85, resolve_articles=True, allowed_act_ids=None, external_detector=None)[source]¶
Return text with regulatory references wrapped as markdown links.
Each detected reference is linked to its library page (and optionally to a specific article subdivision). Existing markdown links and citation tags are preserved unmodified. Fallback resolutions (unvalidated) and generic targets are never linked.
If allowed_act_ids is provided (non-None), only references whose resolved act_id is in the set are linked. Mentions of acts not in the RAG source set remain as plain text — this prevents the UI from promising a “click to see the passage” that leads to an empty panel. Pass None (default) to disable the filter.
If external_detector is provided, its detections are merged with the regex+fuzzy ones from this module. External detections take precedence on overlap. Designed to accept a locally-hosted third-party detector (e.g. Ref2Link) without coupling this module to it.
- Parameters:
text (str)
linker (LegalEntityLinker)
min_score (float)
resolve_articles (bool)
allowed_act_ids (Set[int] | None)
external_detector (Callable[[str], Sequence[ExternalDetection]] | None)
- Return type:
str
- class lalandre_rag.prose_linker.ExternalDetection(start, end, act_id, subdivision_id=None, eli=None)[source]¶
Bases:
object
A legal-reference span detected by an external detector.
Used to plug a third-party detector (e.g. a locally-hosted Ref2Link service) next to our native regex+fuzzy engine. The external source must return spans resolved to an internal act_id — translation from their identifier space (ELI URI, CELEX, …) to our DB id stays the caller’s responsibility.
- Parameters:
start (int)
end (int)
act_id (int)
subdivision_id (int | None)
eli (str | None)
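As an illustration of the protected-region rule above, here is a minimal sketch of wrapping detected spans while leaving existing markdown links and citation tags untouched. The function name, the regexes, and the URL values are assumptions for illustration, not the module’s real internals:

```python
import re

# Assumed tag/link shapes, for illustration only.
CITATION_TAG = re.compile(r"\[(?:S|G|R|C|CM)\d+\]")
MD_LINK = re.compile(r"\[[^\]]*\]\([^)]*\)")

def wrap_references(text, detections):
    """Wrap (start, end, url) spans as markdown links, skipping any span
    that overlaps an existing markdown link or citation tag."""
    protected = [m.span() for m in MD_LINK.finditer(text)]
    protected += [m.span() for m in CITATION_TAG.finditer(text)]

    def overlaps(start, end):
        return any(start < p_end and end > p_start for p_start, p_end in protected)

    out, cursor = [], 0
    for start, end, url in sorted(detections):
        if overlaps(start, end):
            continue  # never rewrite inside protected regions
        out.append(text[cursor:start])
        out.append(f"[{text[start:end]}]({url})")
        cursor = end
    out.append(text[cursor:])
    return "".join(out)
```

The real link_prose additionally resolves references through the LegalEntityLinker and applies the min_score and allowed_act_ids filters before any wrapping happens.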
lalandre_rag.prose_rewriter¶
Source: packages/lalandre_rag/lalandre_rag/prose_rewriter.py
Post-process a chatbot answer that contains bullets into flowing prose.
Safety rails (every failure falls back to the original answer):
Skip rewriting when the answer is already mostly prose.
Reject the rewrite if any native citation tag such as
[S1], [G1], [R1], [C1], or [CM1], with an optional L1/L2/L3 suffix, is altered or lost.
Reject the rewrite if its length drifts too far from the original answer.
Catch any LLM exception silently.
This is best-effort: the return value is always a valid answer, and citations are preserved with the same multiplicity as the input.
- lalandre_rag.prose_rewriter.rewrite_to_prose(answer, llm, *, max_bullet_ratio=0.10)[source]¶
Rewrite a bullet-heavy answer into flowing prose.
Returns answer unchanged if:
llm is None or answer is empty/whitespace.
The bullet ratio is below max_bullet_ratio (nothing to rewrite).
The system prompt is missing.
The LLM call fails or returns an unusable payload.
The rewritten output is out of bounds in length.
The rewritten output does not preserve the native citation tags with identical multiplicity.
- Parameters:
answer (str)
llm (Any)
max_bullet_ratio (float)
- Return type:
str
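The citation-preservation rail described above can be sketched as a multiset comparison. The exact tag regex, including how the L1/L2/L3 suffix is written, is an assumption here:

```python
import re
from collections import Counter

# Assumed tag syntax: [S1], [G1], [R1], [C1], [CM1], optional /L1../L3 suffix.
TAG = re.compile(r"\[(?:S|G|R|C|CM)\d+(?:/L[1-3])?\]")

def citations_preserved(original, rewritten):
    """True iff every native citation tag appears in the rewrite with
    identical multiplicity (order may change)."""
    return Counter(TAG.findall(original)) == Counter(TAG.findall(rewritten))
```

When this check fails, rewrite_to_prose keeps the original answer, which is what makes the rewrite safe to run opportunistically.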
lalandre_rag.response¶
Source: packages/lalandre_rag/lalandre_rag/response/__init__.py
Response building — builder, factories, fallbacks.
lalandre_rag.response.builder¶
Source: packages/lalandre_rag/lalandre_rag/response/builder.py
Response Builder Centralized builder for unified response format across all RAG modes
- class lalandre_rag.response.builder.SourcesBlock(*, total=0, documents=<factory>, acts=<factory>)[source]¶
Bases:
BaseModel
Validated sources block in a RAG response.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
total (int)
documents (List[Dict[str, Any]])
acts (Dict[str, Any])
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class lalandre_rag.response.builder.RAGResponse(*, mode, query, answer=None, sources=<factory>, metadata=<factory>)[source]¶
Bases:
BaseModel
Validated unified response format for all RAG modes.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
mode (str)
query (str)
answer (str | None)
sources (SourcesBlock)
metadata (Dict[str, Any])
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- lalandre_rag.response.builder.format_doc_location(chunk_id, chunk_index, subdivision_type, subdivision_id)[source]¶
Format the location of a document slice (chunk or subdivision).
E.g. “chunk 42:3” or “article 12”. Used to build source headers in all modes.
- Parameters:
chunk_id (int | None)
chunk_index (int | None)
subdivision_type (str)
subdivision_id (int)
- Return type:
str
- lalandre_rag.response.builder.format_source_header(source_id, celex, location, title, regulatory_level=None)[source]¶
Format the standard header inserted into the LLM context.
E.g. “[S1 | CELEX: 32016R0679 | L1 | article 5] Règlement général…”
- Parameters:
source_id (str)
celex (str)
location (str)
title (str)
regulatory_level (str | None)
- Return type:
str
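A minimal sketch of these two helpers, shaped to reproduce the documented example outputs. The exact precedence between chunk and subdivision fields, and the omission rule for regulatory_level, are assumptions:

```python
def format_doc_location(chunk_id, chunk_index, subdivision_type, subdivision_id):
    # Chunk slices render as "chunk <id>:<index>", subdivisions as
    # "<type> <id>" (precedence rule assumed, not taken from the source).
    if chunk_id is not None:
        return f"chunk {chunk_id}:{chunk_index}"
    return f"{subdivision_type} {subdivision_id}"

def format_source_header(source_id, celex, location, title, regulatory_level=None):
    # Field order follows the documented example:
    # "[S1 | CELEX: 32016R0679 | L1 | article 5] <title>"
    parts = [source_id, f"CELEX: {celex}"]
    if regulatory_level:
        parts.append(regulatory_level)
    parts.append(location)
    return f"[{' | '.join(parts)}] {title}"
```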
- class lalandre_rag.response.builder.ResponseBuilder(mode, query, _answer=None, _sources=<factory>, _metadata=<factory>, _acts=<factory>)[source]¶
Bases:
object
Builder for the unified response format
- Usage:
builder = ResponseBuilder(mode="search", query="test")
builder.set_answer(None)
builder.set_sources([{"celex": "123", "title": "Test"}])
builder.add_metadata("warning", "Test warning")
response = builder.build()
- Parameters:
mode (str)
query (str)
_answer (str | None)
_sources (List[Dict[str, Any]])
_metadata (Dict[str, Any])
_acts (Dict[str, Any])
- set_sources(documents)[source]¶
Replace all source documents.
- Parameters:
documents (List[Dict[str, Any]])
- Return type:
- add_metadata(key, value)[source]¶
Add a metadata entry.
- Parameters:
key (str)
value (Any)
- Return type:
- lalandre_rag.response.builder.build_source_trace(metadata)[source]¶
Extract a compact, traceable subset of metadata for sources. Avoids duplicating large payload fields while preserving retrieval provenance.
- Parameters:
metadata (Dict[str, Any] | None)
- Return type:
Dict[str, Any]
- lalandre_rag.response.builder.build_source_document(doc, *, include_relations=False, include_subjects=False, include_full_content=True, include_content_preview=False, content_preview_length=None, include_snippet=False, snippet_length=None, content_used=None, content_truncated=None, source_id=None)[source]¶
Build a standardized source document payload from a context slice.
This keeps response formatting consistent across RAG modes while reusing upstream metadata where possible.
- Parameters:
doc (Any)
include_relations (bool)
include_subjects (bool)
include_full_content (bool)
include_content_preview (bool)
content_preview_length (int | None)
include_snippet (bool)
snippet_length (int | None)
content_used (str | None)
content_truncated (bool | None)
source_id (str | None)
- Return type:
Dict[str, Any]
- lalandre_rag.response.builder.build_act_context(act)[source]¶
Build a normalized act context payload.
- Parameters:
act (Any)
- Return type:
Dict[str, Any]
lalandre_rag.response.factories¶
Source: packages/lalandre_rag/lalandre_rag/response/factories.py
Response factory functions for each RAG mode.
- lalandre_rag.response.factories.create_llm_only_response(query, answer, include_warning=True)[source]¶
Factory for llm_only-mode response.
- Parameters:
query (str)
answer (str)
include_warning (bool)
- Return type:
Dict[str, Any]
- lalandre_rag.response.factories.create_rag_response(query, answer, documents, context_summary=None, acts=None)[source]¶
Factory for rag-mode (hybrid RAG) response.
- Parameters:
query (str)
answer (str)
documents (List[Dict[str, Any]])
context_summary (Dict[str, Any] | None)
acts (Dict[str, Dict[str, Any]] | None)
- Return type:
Dict[str, Any]
- lalandre_rag.response.factories.create_summarize_response(query, answer, documents, acts=None)[source]¶
Factory for summarize-mode response.
- Parameters:
query (str)
answer (str)
documents (List[Dict[str, Any]])
acts (Dict[str, Dict[str, Any]] | None)
- Return type:
Dict[str, Any]
- lalandre_rag.response.factories.create_compare_response(query, answer, documents, documents_compared, acts=None)[source]¶
Factory for compare-mode response.
- Parameters:
query (str)
answer (str)
documents (List[Dict[str, Any]])
documents_compared (List[str])
acts (Dict[str, Dict[str, Any]] | None)
- Return type:
Dict[str, Any]
- lalandre_rag.response.factories.create_empty_response(mode, query, empty_message=None)[source]¶
Factory for an empty response (no results).
If empty_message is None, a default message adapted to the mode is used. Pass an empty string to display no message.
- Parameters:
mode (str)
query (str)
empty_message (str | None)
- Return type:
Dict[str, Any]
- lalandre_rag.response.factories.validate_response_format(response)[source]¶
Validate that a response respects the unified format.
With Pydantic models built via ResponseBuilder.build(), validation is already guaranteed at construction time. This function remains for external callers that pass raw dicts.
- Parameters:
response (Dict[str, Any] | RAGResponse)
- Return type:
bool
lalandre_rag.response.fallbacks¶
Source: packages/lalandre_rag/lalandre_rag/response/fallbacks.py
Fallback answer builders for degraded-mode responses.
- lalandre_rag.response.fallbacks.flatten_source_items(sources)[source]¶
Flatten every evidence list carried in a sources payload.
- Parameters:
sources (Dict[str, Any] | None)
- Return type:
List[Dict[str, Any]]
- lalandre_rag.response.fallbacks.build_retrieval_fallback_answer(*, mode, question, documents, reason)[source]¶
Build a user-facing fallback answer when LLM generation fails.
Lists up to 3 retrieved documents so the user still gets value.
- Parameters:
mode (str)
question (str)
documents (List[Dict[str, Any]])
reason (str)
- Return type:
str
- lalandre_rag.response.fallbacks.build_no_source_blocked_answer(mode)[source]¶
Return the deterministic fail-closed answer for sourced modes.
- Parameters:
mode (str)
- Return type:
str
- lalandre_rag.response.fallbacks.build_invalid_citation_blocked_answer(mode)[source]¶
Return the fail-closed answer when sources exist but citations are invalid.
- Parameters:
mode (str)
- Return type:
str
- lalandre_rag.response.fallbacks.describe_citation_validation_failure(validation)[source]¶
Return a user-facing explanation for the current citation-validation failure.
- Parameters:
validation (Dict[str, Any] | None)
- Return type:
str
- lalandre_rag.response.fallbacks.create_blocked_sourced_response(*, mode, query, reason, answer=None, metadata=None, sources=None)[source]¶
Return a fail-closed sourced-mode response, preserving sources when available.
- Parameters:
mode (str)
query (str)
reason (str)
answer (str | None)
metadata (Dict[str, Any] | None)
sources (Dict[str, Any] | None)
- Return type:
Dict[str, Any]
- lalandre_rag.response.fallbacks.normalize_sources_payload(sources)[source]¶
Normalize empty source payloads to None and keep non-empty blocks coherent.
- Parameters:
sources (Dict[str, Any] | None)
- Return type:
Dict[str, Any] | None
- lalandre_rag.response.fallbacks.merge_sources_payload(base_sources, extra_sources)[source]¶
Merge two source payloads while preserving all evidence families.
- Parameters:
base_sources (Dict[str, Any] | None)
extra_sources (Dict[str, Any] | None)
- Return type:
Dict[str, Any] | None
- lalandre_rag.response.fallbacks.extract_source_ids(sources)[source]¶
Collect available source IDs from a source-doc list.
- Parameters:
sources (List[Dict[str, Any]])
- Return type:
List[str]
- lalandre_rag.response.fallbacks.repair_citations_once(*, mode, question, draft_answer, sources, llm)[source]¶
Try a single citation-repair pass. Returns None on failure.
- Parameters:
mode (str)
question (str)
draft_answer (str)
sources (List[Dict[str, Any]])
llm (Any)
- Return type:
str | None
- lalandre_rag.response.fallbacks.enforce_cited_answer(*, mode, question, draft_answer, sources, llm)[source]¶
Validate citations without rewriting the draft.
Preserves the streamed answer verbatim so the UI never sees its text flash and get replaced. Validation results are still returned so callers can surface citation quality in metadata, but no LLM repair pass runs and the answer is never blanked out.
- Parameters:
mode (str)
question (str)
draft_answer (str)
sources (List[Dict[str, Any]])
llm (Any)
- Return type:
Dict[str, Any]
lalandre_rag.response.policy¶
Source: packages/lalandre_rag/lalandre_rag/response/policy.py
Adaptive response policy for RAG outputs.
- class lalandre_rag.response.policy.ResponsePolicyDecision(state, reason, label, intent_class, evidence_grade, citation_status, can_use_sources, should_run_cypher, clarification_question=None)[source]¶
Bases:
object
Final policy decision for a RAG response.
- Parameters:
state (Literal['llm_only', 'grounded', 'weakly_grounded', 'clarify', 'hard_block'])
reason (str)
label (str)
intent_class (Literal['conversational', 'documentary'])
evidence_grade (Literal['none', 'weak', 'sufficient'])
citation_status (Literal['not_applicable', 'valid', 'repaired', 'invalid'])
can_use_sources (bool)
should_run_cypher (bool)
clarification_question (str | None)
- lalandre_rag.response.policy.is_anchored_legal_question(*, question, retrieval_profile=None)[source]¶
Return whether the user question is anchored enough for strict fail-closed behavior.
- Parameters:
question (str)
retrieval_profile (str | None)
- Return type:
bool
- lalandre_rag.response.policy.infer_intent_class(*, intent_class, skip_retrieval=False)[source]¶
Infer the high-level intent class used by the response policy.
- Parameters:
intent_class (str | None)
skip_retrieval (bool)
- Return type:
Literal[‘conversational’, ‘documentary’]
- lalandre_rag.response.policy.infer_evidence_grade(*, has_sources, crag_meta=None)[source]¶
Infer evidence strength from retrieval availability and CRAG metadata.
- Parameters:
has_sources (bool)
crag_meta (Dict[str, Any] | None)
- Return type:
Literal[‘none’, ‘weak’, ‘sufficient’]
- lalandre_rag.response.policy.infer_citation_status(*, validation, repaired=False)[source]¶
Infer citation validity from the validation payload.
- Parameters:
validation (Dict[str, Any] | None)
repaired (bool)
- Return type:
Literal[‘not_applicable’, ‘valid’, ‘repaired’, ‘invalid’]
- lalandre_rag.response.policy.decide_pre_generation(*, intent_class, evidence_grade, question, retrieval_profile=None, clarification_question=None, strict_grounding_requested=False)[source]¶
Choose the policy branch before answer generation starts.
- Parameters:
intent_class (Literal['conversational', 'documentary'])
evidence_grade (Literal['none', 'weak', 'sufficient'])
question (str)
retrieval_profile (str | None)
clarification_question (str | None)
strict_grounding_requested (bool)
- Return type:
- lalandre_rag.response.policy.decide_post_generation(*, intent_class, evidence_grade, citation_status, question, has_sources, retrieval_profile=None, clarification_question=None, strict_grounding_requested=False)[source]¶
Choose the policy branch after generation and citation validation.
- Parameters:
intent_class (Literal['conversational', 'documentary'])
evidence_grade (Literal['none', 'weak', 'sufficient'])
citation_status (Literal['not_applicable', 'valid', 'repaired', 'invalid'])
question (str)
has_sources (bool)
retrieval_profile (str | None)
clarification_question (str | None)
strict_grounding_requested (bool)
- Return type:
- lalandre_rag.response.policy.flatten_policy_sources(sources)[source]¶
Flatten every supported source list into one homogeneous sequence.
- Parameters:
sources (Dict[str, Any] | None)
- Return type:
List[Dict[str, Any]]
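The flattening step can be sketched as collecting every list-valued evidence family from the payload. Treating any list of dicts as an evidence family is an assumption; the real helper knows the supported family names:

```python
def flatten_sources(sources):
    """Collect every list-valued evidence family in a sources payload
    into one flat, homogeneous list of dicts."""
    if not sources:
        return []
    flat = []
    for value in sources.values():
        if isinstance(value, list):
            # Keep only dict-shaped evidence items; skip scalars like "total".
            flat.extend(item for item in value if isinstance(item, dict))
    return flat
```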
lalandre_rag.response.source_builder¶
Source: packages/lalandre_rag/lalandre_rag/response/source_builder.py
Build final source document lists from enriched context references.
- lalandre_rag.response.source_builder.build_sources(*, refs, include_relations, include_subjects, include_full_content)[source]¶
Build final source documents from context refs.
- Parameters:
refs (List[Dict[str, Any]])
include_relations (bool)
include_subjects (bool)
include_full_content (bool)
- Return type:
List[Dict[str, Any]]
lalandre_rag.retrieval¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/__init__.py
Retrieval Service Module Combines semantic search (Qdrant) and lexical search (PostgreSQL BM25) Implements Reciprocal Rank Fusion and weighted score combination
lalandre_rag.retrieval.bm25_search¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/bm25_search.py
BM25 Lexical Search Service PostgreSQL full-text search with BM25-like ranking (ts_rank_cd)
- class lalandre_rag.retrieval.bm25_search.BM25SearchService(pg_repo, language='french', payload_builder=None)[source]¶
Bases:
object
BM25-based lexical search using PostgreSQL full-text search
Uses PostgreSQL’s ts_rank_cd (Cover Density Ranking) which provides BM25-like scoring that considers:
- Term frequency (TF)
- Document length normalization
- Cover density (proximity of terms)
Responsibilities:
- Execute BM25 search via PostgreSQL
- Convert PostgreSQL results to RetrievalResult format
- Apply filters and language configuration
- Manage full-text search indexes
Does NOT:
- Fuse with semantic results (handled by RetrievalService)
- Generate embeddings
- Access Qdrant
Initialize BM25 search service
- Parameters:
pg_repo (PostgresRepository) – PostgreSQL repository for text search
language (str) – PostgreSQL text search language configuration
payload_builder (PayloadBuilder | None)
- search(query, top_k=None, filters=None, language=None, target='subdivisions')[source]¶
Execute BM25 lexical search
- Parameters:
query (str) – Search query text
top_k (int | None) – Number of results to return (default: config.search.default_limit)
filters (Dict[str, Any] | None) – Optional metadata filters (e.g., {“act_id”: 123, “celex”: “32016R0679”})
language (str | None) – Override default language (default: “french”)
target (str) – “subdivisions” or “chunks”
- Returns:
List of RetrievalResult objects sorted by BM25 score
- Return type:
list[RetrievalResult]
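The underlying PostgreSQL query has roughly the following shape. The table and column names (subdivisions, search_vector) are assumptions, and the real service builds its statement through the repository with filters applied:

```python
# Illustrative only: identifiers are assumed, not taken from the actual schema.
BM25_QUERY = """
SELECT subdivision_id,
       ts_rank_cd(search_vector, plainto_tsquery(%(language)s, %(query)s)) AS score
FROM subdivisions
WHERE search_vector @@ plainto_tsquery(%(language)s, %(query)s)
ORDER BY score DESC
LIMIT %(top_k)s
"""
```

ts_rank_cd and plainto_tsquery are standard PostgreSQL full-text primitives; the language parameter selects the text search configuration (here, typically french).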
lalandre_rag.retrieval.context¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/context/__init__.py
Context Service Enriches retrieval results with full metadata and relationships
lalandre_rag.retrieval.context.community_reports¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/context/community_reports.py
Deterministic community report builder for graph-aware global RAG mode.
- class lalandre_rag.retrieval.context.community_reports.ActMeta[source]¶
Bases:
TypedDict
Minimal act metadata used while assembling community reports.
- class lalandre_rag.retrieval.context.community_reports.RelationRow[source]¶
Bases:
TypedDict
Normalized relation row used by the report builder.
- class lalandre_rag.retrieval.context.community_reports.RelationTypeCount[source]¶
Bases:
TypedDict
Relation-type histogram entry for one community.
- class lalandre_rag.retrieval.context.community_reports.CentralAct[source]¶
Bases:
TypedDict
Central act description used in community summaries.
- class lalandre_rag.retrieval.context.community_reports.CommunityReport(community_id, act_ids, celexes, relation_count, top_relation_types, central_acts, evidences, summary)[source]¶
Bases:
object
Compact summary of one connected component in the relation graph.
- Parameters:
community_id (str)
act_ids (List[int])
celexes (List[str])
relation_count (int)
top_relation_types (List[RelationTypeCount])
central_acts (List[CentralAct])
evidences (List[str])
summary (str)
- class lalandre_rag.retrieval.context.community_reports.CommunityReportBuilder(*, max_reports=6, min_cluster_size=2, max_evidence_per_report=3, top_relation_types_limit=5, central_acts_limit=3)[source]¶
Bases:
object
Build deterministic community reports from context slices and act relations.
The algorithm is intentionally lightweight:
- keep only relations between acts present in the retrieved context,
- build connected components,
- summarize each component with relation distribution and pivot acts.
- Parameters:
max_reports (int)
min_cluster_size (int)
max_evidence_per_report (int)
top_relation_types_limit (int)
central_acts_limit (int)
- build_reports(slices)[source]¶
Build deterministic community reports from enriched context slices.
- Parameters:
slices (List[ContextSlice])
- Return type:
List[CommunityReport]
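The connected-components step of the documented algorithm can be sketched with a small union-find. This is a generic illustration, not the class’s actual code:

```python
def connected_components(act_ids, relations):
    """Group act ids into connected components given (source, target)
    relation pairs; pairs touching acts outside the context are dropped."""
    parent = {a: a for a in act_ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for src, dst in relations:
        if src in parent and dst in parent:  # keep only in-context relations
            parent[find(src)] = find(dst)

    groups = {}
    for a in act_ids:
        groups.setdefault(find(a), []).append(a)
    return sorted(sorted(g) for g in groups.values())
```

Each resulting component would then be summarized with its relation-type histogram and central acts, as described above.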
lalandre_rag.retrieval.context.compressor¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/context/compressor.py
Context Compressor — reduces context size via per-act LLM summarization.
Groups context slices by act, and for acts exceeding a character budget, uses an LLM call to compress the fragments into a dense summary. Preserves the ContextSlice structure so downstream code is unchanged.
- lalandre_rag.retrieval.context.compressor.compress_context(slices, llm, *, budget_chars, max_slices=20)[source]¶
Compress context slices to fit within a character budget.
Strategy:
1. Group slices by act_id
2. For acts with multiple large slices, compress into a single dense slice
3. Keep single/small slices as-is
4. Return compressed slices sorted by original score (best first)
- Parameters:
slices (List[ContextSlice]) – Input context slices (already scored and sorted)
llm (Any) – LangChain-compatible LLM for compression calls
budget_chars (int) – Target total character budget
max_slices (int) – Max slices to keep after compression
- Returns:
Compressed list of ContextSlice objects
- Return type:
List[ContextSlice]
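Steps 1-3 of the strategy (the deterministic part, before any LLM call) can be sketched as follows. Dict-shaped slices and a per-act budget stand in for the real ContextSlice objects and the budget_chars accounting, both of which are assumptions here:

```python
from collections import defaultdict

def group_slices_by_act(slices, per_act_budget):
    """Split slices into those kept verbatim and those queued for LLM
    compression, based on a per-act character budget (simplified: the
    real compressor also re-sorts by score and caps the slice count)."""
    by_act = defaultdict(list)
    for s in slices:
        by_act[s["act_id"]].append(s)

    keep, to_compress = [], {}
    for act_id, group in by_act.items():
        total = sum(len(s["content"]) for s in group)
        if len(group) > 1 and total > per_act_budget:
            to_compress[act_id] = group  # one dense summary per act
        else:
            keep.append(group[0]) if len(group) == 1 else keep.extend(group)
    return keep, to_compress
```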
lalandre_rag.retrieval.context.models¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/context/models.py
Context Models Clean separation between act-level metadata, document metadata, and content slices.
- class lalandre_rag.retrieval.context.models.DocumentMeta(source_kind, subdivision_id, subdivision_type, sequence_order, chunk_id=None, chunk_index=None, char_start=None, char_end=None, payload=None)[source]¶
Bases:
object
Document-level metadata for a retrieved slice.
- Parameters:
source_kind (str)
subdivision_id (int)
subdivision_type (str)
sequence_order (int)
chunk_id (int | None)
chunk_index (int | None)
char_start (int | None)
char_end (int | None)
payload (Dict[str, Any] | None)
- class lalandre_rag.retrieval.context.models.ActContext(act_id, celex, title, act_type, regulatory_level=None, url_eurlex=None, relations=None, subjects=None, adoption_date=None, force_date=None)[source]¶
Bases:
object
Act-level metadata, shared across multiple slices.
- Parameters:
act_id (int)
celex (str)
title (str)
act_type (str)
regulatory_level (str | None)
url_eurlex (str | None)
relations (List[Dict[str, Any]] | None)
subjects (List[str] | None)
adoption_date (str | None)
force_date (str | None)
- class lalandre_rag.retrieval.context.models.ContextSlice(content, score, act, doc, trace=None)[source]¶
Bases:
object
A context slice used by the LLM, with explicit act + doc metadata separation.
- Parameters:
content (str)
score (float)
act (ActContext)
doc (DocumentMeta)
trace (Dict[str, Any] | None)
lalandre_rag.retrieval.context.service¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/context/service.py
Context Service Enriches retrieval results with metadata, relationships, and formatting
- class lalandre_rag.retrieval.context.service.RelationPayload[source]¶
Bases:
TypedDict
Normalized relation payload attached to enriched act contexts.
- class lalandre_rag.retrieval.context.service.ContextService(pg_repo)[source]¶
Bases:
object
Enriches retrieval results into context slices
Responsibilities:
- Enrich results with act metadata (title, type, CELEX)
- Add relationships between documents
- Format context for LLM consumption
- Generate context summaries
Initialize context service
- Parameters:
pg_repo (PostgresRepository) – PostgreSQL repository for metadata queries
- enrich_results(results, include_relations=False, include_subjects=False, hydrate_content=True)[source]¶
Enrich retrieval results with act metadata, optional relations and subjects. Returns context slices with explicit act/doc separation.
- Parameters:
results (List[RetrievalResult])
include_relations (bool)
include_subjects (bool)
hydrate_content (bool)
- Return type:
List[ContextSlice]
lalandre_rag.retrieval.decomposer¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/decomposer.py
Compatibility facade for the agentic decomposition runtime.
lalandre_rag.retrieval.evaluator¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/evaluator.py
Compatibility facade for the agentic evaluation runtime.
lalandre_rag.retrieval.fusion_service¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/fusion_service.py
Result Fusion Service Algorithms for combining search results from multiple sources
- class lalandre_rag.retrieval.fusion_service.ResultFusionService(fusion_method='rrf', lexical_weight=0.3, semantic_weight=0.7, rrf_k=60)[source]¶
Bases:
object
Fusion service for combining search results
Implements multiple fusion algorithms:
- Reciprocal Rank Fusion (RRF) - rank-based fusion
- Weighted Score Fusion - score-based fusion with weights
- Score normalization utilities
Use cases:
- Combine BM25 (lexical) + semantic search
- Combine multiple semantic searches
- Combine Graph RAG + standard RAG
- Multi-stage retrieval pipelines
Responsibilities:
- Implement fusion algorithms (RRF, weighted)
- Deduplicate results by subdivision_id
- Normalize scores to [0, 1] range
- Preserve metadata from all sources
Does NOT:
- Execute searches (uses search services)
- Generate embeddings
- Access databases
Initialize fusion service
- Parameters:
fusion_method (str) – “rrf” or “weighted”
lexical_weight (float) – Weight for lexical scores (weighted method)
semantic_weight (float) – Weight for semantic scores (weighted method)
rrf_k (int) – RRF constant (typically 60)
- fuse(lexical_results, semantic_results, override_lexical_weight=None, override_semantic_weight=None)[source]¶
Fuse lexical and semantic search results.
When override weights are provided the method always uses weighted score fusion regardless of the configured fusion_method. This lets callers apply dynamic weights (e.g. boosted lexical weight for queries containing explicit legal references) without changing the service-level default.
- Parameters:
lexical_results (Sequence[RetrievalResult]) – BM25 or other lexical search results
semantic_results (Sequence[RetrievalResult]) – Vector-based semantic search results
override_lexical_weight (float | None) – Forces weighted fusion with this lexical weight
override_semantic_weight (float | None) – Forces weighted fusion with this semantic weight
- Returns:
Fused and sorted results
- Return type:
List[RetrievalResult]
- weighted_score_fusion(lexical_results, semantic_results, lexical_weight=None, semantic_weight=None)[source]¶
Weighted score fusion
Combines scores using weighted average: combined_score = lexical_weight * lex_score + semantic_weight * sem_score
- Parameters:
lexical_results (Sequence[RetrievalResult]) – Lexical search results (with scores)
semantic_results (Sequence[RetrievalResult]) – Semantic search results (with scores)
lexical_weight (float | None) – Weight for lexical scores (default: instance weight)
semantic_weight (float | None) – Weight for semantic scores (default: instance weight)
- Returns:
Fused results sorted by combined score (descending)
- Return type:
List[RetrievalResult]
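The documented formula combined_score = lexical_weight * lex_score + semantic_weight * sem_score can be sketched over plain score dicts. Treating a document missing from one source as contributing 0 for that component is an assumption about the real service’s handling:

```python
def weighted_fusion(lexical, semantic, lexical_weight=0.3, semantic_weight=0.7):
    """Combine per-document scores with a weighted sum; scores are
    assumed pre-normalized to [0, 1]."""
    combined = {}
    for doc_id, score in lexical.items():
        combined[doc_id] = combined.get(doc_id, 0.0) + lexical_weight * score
    for doc_id, score in semantic.items():
        combined[doc_id] = combined.get(doc_id, 0.0) + semantic_weight * score
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)
```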
- reciprocal_rank_fusion(lexical_results, semantic_results, k=None)[source]¶
Reciprocal Rank Fusion (RRF)
RRF formula: RRF(d) = sum(1 / (k + rank(d))) where k is a constant (typically 60) and rank starts at 1
RRF is score-agnostic and only considers ranking position, making it robust to score distribution differences.
- Parameters:
lexical_results (Sequence[RetrievalResult]) – Lexical search results (pre-sorted)
semantic_results (Sequence[RetrievalResult]) – Semantic search results (pre-sorted)
k (int | None) – RRF constant (default: instance rrf_k, typically 60)
- Returns:
Fused results sorted by RRF score (descending)
- Return type:
List[RetrievalResult]
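The RRF formula above can be sketched directly. Note how the fused order depends only on ranks, never on the raw scores:

```python
def rrf_fuse(lexical_ranking, semantic_ranking, k=60):
    """Reciprocal Rank Fusion: RRF(d) = sum over rankings of 1/(k + rank(d)),
    with rank starting at 1; score-agnostic by design."""
    scores = {}
    for ranking in (lexical_ranking, semantic_ranking):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

With k=60, a document ranked 2nd and 1st beats one ranked 1st and 3rd, which illustrates why RRF favors documents that both sources agree on.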
lalandre_rag.retrieval.metrics¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/metrics.py
Metrics hook interfaces for RAG retrieval instrumentation.
This package remains backend-agnostic. Services register a recorder backend (Prometheus, OTEL, etc.) at startup.
- class lalandre_rag.retrieval.metrics.RetrievalMetricsRecorder[source]¶
Bases:
ABC
Backend-agnostic hook interface for retrieval metrics.
- lalandre_rag.retrieval.metrics.set_retrieval_metrics_recorder(recorder)[source]¶
Register the active retrieval metrics recorder backend.
- Parameters:
recorder (RetrievalMetricsRecorder)
- Return type:
None
lalandre_rag.retrieval.overview¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/overview.py
User-facing retrieval overview helpers.
- lalandre_rag.retrieval.overview.build_retrieval_overview(items, *, effective_granularity, candidate_counts=None, top_acts_limit=3)[source]¶
Aggregate textual evidence into a product-facing hierarchy overview.
- Parameters:
items (Iterable[Any])
effective_granularity (str | None)
candidate_counts (Dict[str, int] | None)
top_acts_limit (int)
- Return type:
Dict[str, Any]
lalandre_rag.retrieval.planner¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/planner.py
Compatibility facade for the agentic planner runtime.
lalandre_rag.retrieval.query_expansion¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/query_expansion.py
Legal query expansion utilities.
Provides deterministic multi-query expansion for EU/FR legal retrieval.
- class lalandre_rag.retrieval.query_expansion.ExpandedQuery(text, weight, strategy)[source]¶
Bases:
object
One expanded query candidate with a weighting hint.
- Parameters:
text (str)
weight (float)
strategy (str)
- class lalandre_rag.retrieval.query_expansion.LegalQueryExpansionService(*, min_query_chars=24)[source]¶
Bases:
object
Deterministic query expansion focused on legal references (EU/France).
The objective is recall improvement while keeping runtime bounded.
- Parameters:
min_query_chars (int)
- expand(query, *, max_variants=3)[source]¶
Expand a query into deterministic variants.
Always returns at least one query (the normalized original).
- Parameters:
query (str)
max_variants (int)
- Return type:
List[ExpandedQuery]
lalandre_rag.retrieval.query_parser¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/query_parser.py
LLM-assisted query parsing for legal retrieval routing.
Architecture note¶
Here, the intent parser uses the main generation LLM (config.generation.*): same provider, same model, key fetched from Vault.
- class lalandre_rag.retrieval.query_parser.ParsedQueryIntent(*, profile, granularity=None, top_k=10, include_relations_hint=False, execution_mode='hybrid', rationale='LLM parser selected retrieval profile.', use_graph=False, normalized_query=None, intent_label=None, confidence=None, output_validation_retries=0)[source]¶
Bases:
BaseModel
Normalized interpretation returned by the intent parser.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
profile (str)
granularity (str | None)
top_k (int)
include_relations_hint (bool)
execution_mode (str)
rationale (str)
use_graph (bool)
normalized_query (str | None)
intent_label (str | None)
confidence (float | None)
output_validation_retries (int)
- model_config: ClassVar[ConfigDict] = {'extra': 'ignore', 'frozen': True}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- classmethod normalize_profile(v)[source]¶
Resolve profile aliases and reject unsupported routing profiles.
- Parameters:
v (Any)
- Return type:
str
- classmethod normalize_granularity(v)[source]¶
Normalize requested granularity and collapse auto to None.
- Parameters:
v (Any)
- Return type:
str | None
- classmethod coerce_top_k(v)[source]¶
Clamp top_k to the configured parser-safe range.
- Parameters:
v (Any)
- Return type:
int
- classmethod coerce_bool(v)[source]¶
Coerce common truthy string values to booleans.
- Parameters:
v (Any)
- Return type:
bool
- classmethod coerce_confidence(v)[source]¶
Normalize optional confidence scores to the [0, 1] range.
- Parameters:
v (Any)
- Return type:
float | None
- classmethod normalize_execution_mode(v)[source]¶
Normalize the execution mode and default to hybrid.
- Parameters:
v (Any)
- Return type:
str
- classmethod clean_normalized_query(v)[source]¶
Trim the optional normalized query field.
- Parameters:
v (Any)
- Return type:
str | None
- classmethod clean_rationale(v)[source]¶
Normalize routing rationales and provide a fallback sentence.
- Parameters:
v (Any)
- Return type:
str
- classmethod clean_intent_label(v)[source]¶
Normalize the optional intent label emitted by the LLM.
- Parameters:
v (Any)
- Return type:
str | None
- apply_cross_field_defaults()[source]¶
Apply derived defaults after model validation succeeds.
- Return type:
- classmethod from_routing_output(output, *, requested_top_k, requested_granularity, output_validation_retries)[source]¶
Convert validated agent output into a normalized immutable intent.
- Parameters:
output (RoutingIntentOutput)
requested_top_k (int)
requested_granularity (str | None)
output_validation_retries (int)
- Return type:
ParsedQueryIntent | None
- class lalandre_rag.retrieval.query_parser.LLMQueryParserClient(*, provider, model, base_url, timeout_seconds, api_key=None, max_output_tokens=180, temperature=0.0, key_pool=None)[source]¶
Bases: object
Intent query parser that uses the main generation LLM.
Degrades gracefully: if parsing fails, the QueryRouter falls back to deterministic heuristics.
Architecture: do NOT configure this on the extraction LLM. Uses config.generation.* (Mistral, OpenAI-compatible).
- Parameters:
provider (str)
model (str)
base_url (str)
timeout_seconds (float)
api_key (str | None)
max_output_tokens (int)
temperature (float)
key_pool (APIKeyPool | None)
- classmethod from_runtime(*, config, settings, key_pool=None)[source]¶
Factory from the runtime config.
Uses config.generation.* as the primary source. search.intent_parser_* can override provider/model/base_url if needed (e.g. to use a smaller model dedicated to routing). No longer falls back to extraction.llm_* (reserved for the extraction LLM).
- Parameters:
config (Any)
settings (Any)
key_pool (APIKeyPool | None)
- Return type:
LLMQueryParserClient | None
- parse(*, question, top_k, requested_granularity)[source]¶
Parse one user question into a normalized routing intent.
- Parameters:
question (str)
top_k (int)
requested_granularity (str | None)
- Return type:
ParsedQueryIntent | None
lalandre_rag.retrieval.query_router¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/query_router.py
Query routing heuristics for hybrid legal retrieval.
- class lalandre_rag.retrieval.query_router.RetrievalPlan(profile, granularity, top_k, include_relations_hint, rationale, use_graph=False, execution_mode='hybrid', routing_source='heuristic', search_query=None, intent_label=None, parser_confidence=None)[source]¶
Bases: object
Resolved retrieval strategy for a user question.
- Parameters:
profile (str)
granularity (str | None)
top_k (int)
include_relations_hint (bool)
rationale (str)
use_graph (bool)
execution_mode (str)
routing_source (str)
search_query (str | None)
intent_label (str | None)
parser_confidence (float | None)
- class lalandre_rag.retrieval.query_router.QueryRouter(*, intent_parser=None)[source]¶
Bases: object
Lightweight router for selecting retrieval settings by query intent.
By default, routing is heuristic-only (fast and deterministic). An optional LLM parser can be injected to classify intent and normalize user queries.
- Parameters:
intent_parser (LLMQueryParserClient | None)
lalandre_rag.retrieval.query_utils¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/query_utils.py
Query pre-processing utilities for retrieval.
Small, pure helpers for preparing user queries before they hit the search backends (BM25, semantic, etc.).
- lalandre_rag.retrieval.query_utils.truncate_lexical_query(query, max_chars=None)[source]¶
Truncate query for BM25-based search modes.
Preserves full words by cutting at the last whitespace boundary. Returns the original query unchanged if it is short enough.
If max_chars is not provided, uses search.max_lexical_query_chars from the central config.
- Parameters:
query (str)
max_chars (int | None)
- Return type:
str
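The word-boundary behavior described above can be sketched as follows. The default of 512 characters is an illustrative assumption; the real helper reads search.max_lexical_query_chars from the central config.

```python
def truncate_lexical_query(query: str, max_chars: int = 512) -> str:
    """Truncate a query for BM25 search without cutting a word in half."""
    if len(query) <= max_chars:
        # Short enough: return unchanged.
        return query
    cut = query[:max_chars]
    # Drop the trailing partial word by cutting at the last whitespace boundary.
    last_space = cut.rfind(" ")
    return cut[:last_space] if last_space > 0 else cut
```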
lalandre_rag.retrieval.rerank_service¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/rerank_service.py
Rerank Service Cross-encoder reranking — local (in-process) or via dedicated HTTP service.
- class lalandre_rag.retrieval.rerank_service.RerankConfig(model_name, device, batch_size, max_candidates, max_chars, enabled=True, cache_dir=None, rerank_service_url=None, service_timeout_seconds=10.0, fallback_to_skip=True, circuit_failure_threshold=2, circuit_cooldown_seconds=30.0)[source]¶
Bases: object
Runtime configuration for the retrieval reranker.
- Parameters:
model_name (str)
device (str)
batch_size (int)
max_candidates (int)
max_chars (int)
enabled (bool)
cache_dir (str | None)
rerank_service_url (str | None)
service_timeout_seconds (float)
fallback_to_skip (bool)
circuit_failure_threshold (int)
circuit_cooldown_seconds (float)
- class lalandre_rag.retrieval.rerank_service.RerankService(config)[source]¶
Bases: object
Cross-encoder reranker.
Two modes:
- HTTP (when rerank_service_url is set): calls the dedicated rerank-service.
- Local (fallback): loads the CrossEncoder in-process via sentence-transformers.
If the HTTP service is unreachable and fallback_to_skip is True, results are returned without reranking.
Includes a circuit breaker: after circuit_failure_threshold consecutive HTTP failures, reranking is skipped for circuit_cooldown_seconds.
- Parameters:
config (RerankConfig)
- rerank(query, results, top_k=None)[source]¶
Rerank retrieval results using a cross-encoder.
- Parameters:
query (str)
results (list[RetrievalResult])
top_k (int | None)
- Return type:
list[RetrievalResult]
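The circuit-breaker behavior described above (open after N consecutive failures, skip calls during a cooldown, then allow a probe) can be sketched in isolation. Class and method names here are illustrative, not the service's internals; the defaults mirror circuit_failure_threshold=2 and circuit_cooldown_seconds=30.0.

```python
import time

class CircuitBreaker:
    """Skip calls after N consecutive failures, for a cooldown period."""

    def __init__(self, failure_threshold: int = 2, cooldown_seconds: float = 30.0):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self._failures = 0
        self._opened_at: float | None = None

    def allow(self) -> bool:
        """Return False while the circuit is open (cooling down)."""
        if self._opened_at is None:
            return True
        if time.monotonic() - self._opened_at >= self.cooldown_seconds:
            # Half-open: cooldown elapsed, allow one probe call.
            self._opened_at = None
            self._failures = 0
            return True
        return False

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = time.monotonic()

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None
```

When allow() returns False, rerank() would return the fused results unreranked, matching the fallback_to_skip contract.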
lalandre_rag.retrieval.result¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/result.py
Core retrieval result dataclass.
Extracted from service.py so that sub-modules (semantic_search, bm25_search, fusion_service, rerank_service) can import it without circular dependencies.
- class lalandre_rag.retrieval.result.RetrievalResult(content, score, subdivision_id, act_id, celex, subdivision_type, sequence_order, metadata)[source]¶
Bases: object
Single retrieval result with content and metadata.
- Parameters:
content (str)
score (float)
subdivision_id (int)
act_id (int)
celex (str | None)
subdivision_type (str)
sequence_order (int)
metadata (Dict[str, Any])
- class lalandre_rag.retrieval.result.RetrievalStats(candidates_after_fusion=0, candidates_after_threshold=0, candidates_after_rerank=0, candidates_after_adaptive_cutoff=0, candidates_returned=0, adaptive_cutoff_applied=False, effective_score_threshold=None, fusion_lexical_weight=None, fusion_semantic_weight=None, query_variants_count=0, cache_hit=False, embedding_ms=0.0, semantic_search_ms=0.0, lexical_search_ms=0.0, parallel_search_ms=0.0, fusion_ms=0.0, rerank_ms=0.0, total_retrieve_ms=0.0)[source]¶
Bases: object
Statistics from the last retrieve() call, for audit/traceability.
- Parameters:
candidates_after_fusion (int)
candidates_after_threshold (int)
candidates_after_rerank (int)
candidates_after_adaptive_cutoff (int)
candidates_returned (int)
adaptive_cutoff_applied (bool)
effective_score_threshold (float | None)
fusion_lexical_weight (float | None)
fusion_semantic_weight (float | None)
query_variants_count (int)
cache_hit (bool)
embedding_ms (float)
semantic_search_ms (float)
lexical_search_ms (float)
parallel_search_ms (float)
fusion_ms (float)
rerank_ms (float)
total_retrieve_ms (float)
lalandre_rag.retrieval.result_cache¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/result_cache.py
Redis-backed result cache for the retrieval service.
- class lalandre_rag.retrieval.result_cache.RetrievalCache(redis_client, ttl)[source]¶
Bases: object
Thin wrapper around Redis for caching retrieval results.
- Parameters:
redis_client (Any)
ttl (int)
- property enabled: bool¶
Return whether Redis caching is currently active.
- static cache_key(query, top_k, score_threshold, filters, granularity, collections, embedding_preset=None)[source]¶
Build a stable cache key for one retrieval request shape.
- Parameters:
query (str)
top_k (int)
score_threshold (float | None)
filters (Dict[str, Any] | None)
granularity (str | None)
collections (List[str] | None)
embedding_preset (str | None)
- Return type:
str
- get(key)[source]¶
Fetch cached retrieval results for key, if present.
- Parameters:
key (str)
- Return type:
List[RetrievalResult] | None
- set(key, results)[source]¶
Store retrieval results for key with the configured TTL.
- Parameters:
key (str)
results (List[RetrievalResult])
- Return type:
None
lalandre_rag.retrieval.search_config¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/search_config.py
Resolved search configuration.
Centralizes the verbose config-resolution logic that was inlined in RetrievalService.__init__.
- class lalandre_rag.retrieval.search_config.ResolvedSearchConfig(search_language, candidate_multiplier, min_candidates, max_candidates, hnsw_ef, exact_search, per_collection_oversampling, query_expansion_enabled, query_expansion_max_variants, query_expansion_min_query_chars, lexical_weight, semantic_weight, fusion_method, dynamic_fusion_enabled, lexical_boost_factor, lexical_boost_max, result_cache_ttl)[source]¶
Bases: object
All search-related parameters resolved from config + caller overrides.
- Parameters:
search_language (str)
candidate_multiplier (float)
min_candidates (int)
max_candidates (int)
hnsw_ef (int | None)
exact_search (bool)
per_collection_oversampling (float)
query_expansion_enabled (bool)
query_expansion_max_variants (int)
query_expansion_min_query_chars (int)
lexical_weight (float)
semantic_weight (float)
fusion_method (str)
dynamic_fusion_enabled (bool)
lexical_boost_factor (float)
lexical_boost_max (float)
result_cache_ttl (int)
- classmethod from_overrides(*, search_language=None, candidate_multiplier=None, min_candidates=None, max_candidates=None, hnsw_ef=None, exact_search=None, semantic_per_collection_oversampling=None, query_expansion_enabled=None, query_expansion_max_variants=None, query_expansion_min_query_chars=None, lexical_weight=None, semantic_weight=None, fusion_method=None, dynamic_fusion_enabled=None, result_cache_ttl=None)[source]¶
Resolve all search parameters from config defaults + explicit overrides.
- Parameters:
search_language (str | None)
candidate_multiplier (float | None)
min_candidates (int | None)
max_candidates (int | None)
hnsw_ef (int | None)
exact_search (bool | None)
semantic_per_collection_oversampling (float | None)
query_expansion_enabled (bool | None)
query_expansion_max_variants (int | None)
query_expansion_min_query_chars (int | None)
lexical_weight (float | None)
semantic_weight (float | None)
fusion_method (str | None)
dynamic_fusion_enabled (bool | None)
result_cache_ttl (int | None)
- Return type:
lalandre_rag.retrieval.semantic_search¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/semantic_search.py
Semantic Search Service Vector-based search using Qdrant embeddings
- class lalandre_rag.retrieval.semantic_search.SemanticSearchService(qdrant_repos, score_threshold=None, hnsw_ef=None, exact_search=None, per_collection_oversampling=None)[source]¶
Bases: object
Semantic search using Qdrant vector database.
Uses embedding-based similarity search to find semantically related documents regardless of exact keyword matches.
Responsibilities:
- Execute vector search via Qdrant
- Support multiple collections (chunks, acts)
- Apply metadata filters
- Convert Qdrant results to RetrievalResult format
Does NOT:
- Generate embeddings (uses EmbeddingService)
- Fuse with lexical results (handled by FusionService)
- Execute lexical search
Initialize semantic search service
- Parameters:
qdrant_repos (Dict[str, QdrantRepository]) – Dictionary of collection_name -> QdrantRepository Expected keys typically include ‘chunks’ and ‘acts’
score_threshold (float | None) – Minimum similarity score (0-1, None = no filtering)
hnsw_ef (int | None)
exact_search (bool | None)
per_collection_oversampling (float | None)
- search(query_vector, top_k=None, filters=None, collections=None, score_threshold=None, hnsw_ef=None, exact_search=None)[source]¶
Execute semantic search in one or multiple collections
- Parameters:
query_vector (List[float]) – Query embedding vector
top_k (int | None) – Number of results (default: config.search.default_limit)
filters (Dict[str, Any] | None) – Metadata filters (e.g., {“celex”: “32016R0679”})
collections (List[str] | None) – Collections to search (default: [“chunks”])
score_threshold (float | None) – Override instance score threshold
hnsw_ef (int | None)
exact_search (bool | None)
- Returns:
List of RetrievalResult objects sorted by similarity score
- Return type:
list[RetrievalResult]
lalandre_rag.retrieval.service¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/service.py
Document Retrieval Service Orchestrates semantic + lexical search, fusion, reranking, and caching. Strategy implementations live in retrieval/strategies/.
- class lalandre_rag.retrieval.service.RetrievalService(qdrant_repos, pg_repo, embedding_service, reranker=None, payload_builder=None, search_language=None, candidate_multiplier=None, min_candidates=None, max_candidates=None, semantic_per_collection_oversampling=None, hnsw_ef=None, exact_search=None, query_expansion_enabled=None, query_expansion_max_variants=None, query_expansion_min_query_chars=None, lexical_weight=None, semantic_weight=None, fusion_method=None, dynamic_fusion_enabled=None, redis_client=None, result_cache_ttl=None, preset_embedding_services=None)[source]¶
Bases: SemanticStrategyMixin, LexicalStrategyMixin, HybridStrategyMixin
Unified retrieval service combining semantic and lexical search.
Responsibilities:
- Execute hybrid search (retrieve) across multiple collections
- Delegate semantic/lexical/hybrid-precomputed searches to strategy mixins
- Fuse results using RRF or weighted scores
- Rerank, deduplicate, and cache results
Does NOT:
- Modify Qdrant collections
- Generate embeddings (uses EmbeddingService)
- Enrich context (uses ContextService)
- Parameters:
qdrant_repos (Dict[str, QdrantRepository])
pg_repo (PostgresRepository)
embedding_service (EmbeddingService)
reranker (RerankService | None)
payload_builder (PayloadBuilder | None)
search_language (str | None)
candidate_multiplier (float | None)
min_candidates (int | None)
max_candidates (int | None)
semantic_per_collection_oversampling (float | None)
hnsw_ef (int | None)
exact_search (bool | None)
query_expansion_enabled (bool | None)
query_expansion_max_variants (int | None)
query_expansion_min_query_chars (int | None)
lexical_weight (float | None)
semantic_weight (float | None)
fusion_method (str | None)
dynamic_fusion_enabled (bool | None)
redis_client (Any | None)
result_cache_ttl (int | None)
preset_embedding_services (Dict[str, EmbeddingService] | None)
- retrieve(query, top_k=10, score_threshold=None, filters=None, collections=None, granularity=None, embedding_preset=None)[source]¶
Hybrid search (semantic + BM25) with fusion, reranking, and caching.
- Parameters:
query (str) – Search query
top_k (int) – Number of final results to return
score_threshold (float | None) – Minimum score threshold (post-fusion)
filters (Dict[str, Any] | None) – Metadata filters (act_id, celex, etc.)
collections (List[str] | None) – Specific collections to search
granularity (str | None) – ‘subdivisions’, ‘chunks’, or ‘all’ (overrides collections)
embedding_preset (str | None) – Route semantic search to this preset’s collections/embedding service
- Return type:
List[RetrievalResult]
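The fusion step mentioned above (RRF or weighted scores) can be illustrated with Reciprocal Rank Fusion, which scores each document by summing 1/(k + rank) over the rankings it appears in. This is a textbook sketch, not the service's fusion code; the conventional constant k=60 is an assumption.

```python
from collections import defaultdict

def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top hit in each list contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.__getitem__, reverse=True)
```

Because RRF only uses ranks, it sidesteps the problem of BM25 and cosine scores living on incompatible scales.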
lalandre_rag.retrieval.strategies¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/strategies/__init__.py
Retrieval strategy mixins (semantic, lexical, hybrid).
lalandre_rag.retrieval.strategies.hybrid¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/strategies/hybrid.py
Hybrid search strategy mixin for RetrievalService (pre-computed embeddings).
- lalandre_rag.retrieval.strategies.hybrid.has_explicit_legal_reference(query)[source]¶
Return True when the query contains a strong legal-reference cue.
- Parameters:
query (str)
- Return type:
bool
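A cue detector like this typically matches citation patterns such as article numbers, directive/regulation identifiers, or CELEX codes. The regex below is an illustrative assumption (the real patterns live in strategies/hybrid.py and are likely broader):

```python
import re

# Hypothetical cue patterns: "Article 17", "Directive 2006/123",
# "Regulation (EU) 2016/679", or a CELEX code like 32016R0679.
_LEGAL_REF = re.compile(
    r"\b(article\s+\d+"
    r"|directive\s+\d{4}/\d+"
    r"|regulation\s+\(?(?:eu|ec)\)?\s*\d{4}/\d+"
    r"|3\d{4}[lr]\d{4})\b",
    re.IGNORECASE,
)

def has_explicit_legal_reference(query: str) -> bool:
    """Return True when the query names a specific article, directive, or regulation."""
    return _LEGAL_REF.search(query) is not None
```

Such a cue usually shifts fusion weight toward the lexical side, since exact identifiers are what BM25 matches best.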
- class lalandre_rag.retrieval.strategies.hybrid.HybridStrategyMixin[source]¶
Bases: object
Provides hybrid_with_embedding(), _fuse_results(), and _resolve_fusion_override().
- hybrid_with_embedding(query, query_vector, top_k=10, score_threshold=None, filters=None, collections=None, granularity=None, embedding_preset=None)[source]¶
Execute hybrid search with a pre-computed query embedding.
Avoids redundant embedding computation when the caller already has the vector.
- Parameters:
query (str)
query_vector (List[float])
top_k (int)
score_threshold (float | None)
filters (Dict[str, Any] | None)
collections (List[str] | None)
granularity (str | None)
embedding_preset (str | None)
- Return type:
List[RetrievalResult]
lalandre_rag.retrieval.strategies.lexical¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/strategies/lexical.py
Lexical (BM25) search strategy mixin for RetrievalService.
- class lalandre_rag.retrieval.strategies.lexical.LexicalStrategyMixin[source]¶
Bases: object
Provides lexical_only() and the underlying BM25 + expansion helpers.
- lexical_only(query, top_k=10, score_threshold=None, filters=None)[source]¶
Execute lexical-only search using BM25 (no semantic component).
BM25 scores are normalized into [0, 1] before thresholding so that score_threshold stays comparable across modes.
- Parameters:
query (str)
top_k (int)
score_threshold (float | None)
filters (Dict[str, Any] | None)
- Return type:
List[RetrievalResult]
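One common way to map raw BM25 scores into [0, 1], as the note above requires, is min-max normalization over the candidate set. This is a sketch of that choice, not necessarily the mixin's exact scheme (the all-tied fallback to 1.0 is an assumption):

```python
def normalize_bm25_scores(scores: list[float]) -> list[float]:
    """Min-max normalize raw BM25 scores into [0, 1]."""
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All candidates tied: treat them all as maximally relevant.
        return [1.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]
```

After normalization, a single score_threshold can filter both lexical and semantic candidates on the same scale.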
lalandre_rag.retrieval.strategies.semantic¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/strategies/semantic.py
Semantic search strategy mixin for RetrievalService.
- class lalandre_rag.retrieval.strategies.semantic.SemanticStrategyMixin[source]¶
Bases: object
Provides semantic_only() and the underlying multi-collection + expansion helpers.
- semantic_only(query=None, query_vector=None, top_k=10, score_threshold=None, filters=None, collections=None, granularity=None, embedding_preset=None)[source]¶
Execute semantic-only search (no lexical component).
Supports both text queries (embedded on the fly) and pre-computed vectors.
- Parameters:
query (str | None)
query_vector (List[float] | None)
top_k (int)
score_threshold (float | None)
filters (Dict[str, Any] | None)
collections (List[str] | None)
granularity (str | None)
embedding_preset (str | None)
- Return type:
List[RetrievalResult]
lalandre_rag.retrieval.trace¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/trace.py
Retrieval trace utilities Centralized definition of trace keys and extraction helper.
lalandre_rag.scoring¶
Source: packages/lalandre_rag/lalandre_rag/scoring/__init__.py
Shared score helpers and source score contract.
- lalandre_rag.scoring.coerce_finite_float(value)[source]¶
Return a finite float when possible, otherwise None.
- Parameters:
value (Any)
- Return type:
float | None
- lalandre_rag.scoring.clamp_unit_interval(value, *, default=0.0)[source]¶
Clamp a value to the normalized 0..1 range.
- Parameters:
value (Any)
default (float)
- Return type:
float
- lalandre_rag.scoring.non_negative(value, *, default=0.0)[source]¶
Clamp a value to the non-negative range.
- Parameters:
value (Any)
default (float)
- Return type:
float
- lalandre_rag.scoring.normalize_by_max(value, max_value, *, default=0.0)[source]¶
Normalize a non-negative value by a positive maximum.
- Parameters:
value (Any)
max_value (Any)
default (float)
- Return type:
float
- lalandre_rag.scoring.round_score(value, digits=4)[source]¶
Round a finite score after clamping it to a numeric value.
- Parameters:
value (Any)
digits (int)
- Return type:
float
lalandre_rag.service¶
Source: packages/lalandre_rag/lalandre_rag/service.py
RAG (Retrieval-Augmented Generation) Service High-level orchestration of retrieval, context enrichment, and LLM generation
- class lalandre_rag.service.RAGService(retrieval_service, context_service, llm_model=None, temperature=None, max_tokens=None, api_key=None, graph_rag_service=None, key_pool=None, act_summary_service=None, entity_linker=None, external_detector=None)[source]¶
Bases: object
Complete RAG pipeline for legal document querying.
Architecture:
- Delegates LLM-only to LLMMode
- Delegates hybrid RAG to HybridMode
- Delegates summarization to SummarizeMode
- Delegates comparison to CompareMode
This is a thin orchestration layer that initializes modes and delegates requests.
Initialize RAG service
- Parameters:
retrieval_service (RetrievalService) – Service for document retrieval
context_service (ContextService) – Service for context enrichment
llm_model (str | None) – LLM model name (default from config)
temperature (float | None) – LLM temperature (default from config)
max_tokens (int | None) – Maximum tokens for generation (default from config)
api_key (str | None) – Optional provider API key override
graph_rag_service (GraphRAGService | None)
key_pool (APIKeyPool | None)
act_summary_service (ActSummaryService | None)
entity_linker (LegalEntityLinker | None)
external_detector (Callable[[str], Any] | None)
- query_llm_only(question, include_warning=True)[source]¶
MODE 2: Pure LLM (100% Generation)
Delegates to LLMMode.
- Parameters:
question (str) – User question
include_warning (bool) – Include warning about no document grounding
- Returns:
Dictionary with LLM answer (no sources)
- Return type:
Dict[str, Any]
- stream_query_llm_only(question)[source]¶
Stream LLM-only answer token by token.
- Parameters:
question (str)
- Return type:
Iterator[str]
- query(question, top_k=10, score_threshold=None, filters=None, include_relations=False, include_subjects=False, return_sources=True, include_full_content=False, collections=None, granularity=None, chat_history=None, graph_depth=None, use_graph=None, embedding_preset=None, cypher_documents=None, cypher_query_meta=None)[source]¶
MODE 3: Hybrid RAG (default)
Delegates to HybridMode.
- Parameters:
question (str) – User question
top_k (int) – Number of documents to retrieve
filters (Dict[str, Any] | None) – Metadata filters
include_relations (bool) – Include act relations in context
include_subjects (bool) – Include subject classifications
return_sources (bool) – Return source documents in response
collections (List[str] | None) – Specific collections to search
granularity (str | None) – Quick selector
chat_history (List[BaseMessage] | None) – Optional conversation history as LangChain messages
score_threshold (float | None)
include_full_content (bool)
graph_depth (int | None)
use_graph (bool | None)
embedding_preset (str | None)
cypher_documents (List[Dict[str, Any]] | None)
cypher_query_meta (Dict[str, Any] | None)
- Returns:
Dictionary with answer, sources, and metadata
- Return type:
Dict[str, Any]
- stream_query(question, top_k=10, score_threshold=None, filters=None, include_relations=False, include_subjects=False, return_sources=True, include_full_content=False, collections=None, granularity=None, chat_history=None, graph_depth=None, use_graph=None, embedding_preset=None, retrieval_depth=None, cypher_documents=None, cypher_query_meta=None)[source]¶
Stream hybrid RAG answer: yields preamble dict then string tokens.
- Parameters:
question (str)
top_k (int)
score_threshold (float | None)
filters (Dict[str, Any] | None)
include_relations (bool)
include_subjects (bool)
return_sources (bool)
include_full_content (bool)
collections (List[str] | None)
granularity (str | None)
chat_history (List[BaseMessage] | None)
graph_depth (int | None)
use_graph (bool | None)
embedding_preset (str | None)
retrieval_depth (str | None)
cypher_documents (List[Dict[str, Any]] | None)
cypher_query_meta (Dict[str, Any] | None)
- Return type:
Iterator[Dict[str, Any] | str]
- summarize(topic, top_k=10, score_threshold=None, filters=None, include_relations=True, include_full_content=False)[source]¶
MODE 4: Summarization
Delegates to SummarizeMode.
- Parameters:
topic (str) – Topic or question to summarize
top_k (int) – Number of documents to retrieve
filters (Dict[str, Any] | None) – Metadata filters
include_relations (bool) – Include relations in context
score_threshold (float | None)
include_full_content (bool)
- Returns:
Dictionary with summary and sources
- Return type:
Dict[str, Any]
- summarize_canonical(*, celex, question)[source]¶
Return a canonical-summary response when one exists for celex.
- Parameters:
celex (str)
question (str)
- Return type:
Dict[str, Any] | None
- compare(comparison_question, celex_list=None, top_k=10, score_threshold=None, include_full_content=False)[source]¶
MODE 5: Comparison
Delegates to CompareMode.
- Parameters:
comparison_question (str) – What to compare
celex_list (List[str] | None) – Optional list of specific CELEX to compare
top_k (int) – Number of documents if CELEX not specified
score_threshold (float | None)
include_full_content (bool)
- Returns:
Dictionary with comparison and sources
- Return type:
Dict[str, Any]
lalandre_rag.summaries¶
Source: packages/lalandre_rag/lalandre_rag/summaries/__init__.py
Canonical act summaries: storage, generation, and prompt augmentation.
lalandre_rag.summaries.agent¶
Source: packages/lalandre_rag/lalandre_rag/summaries/agent.py
PydanticAI agent for structured canonical summary generation.
- class lalandre_rag.summaries.agent.CanonicalSummaryOutput(*, summary, output_validation_retries=0)[source]¶
Bases: BaseModel
Structured summary output validated by PydanticAI.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
summary (str)
output_validation_retries (int)
- model_config: ClassVar[ConfigDict] = {'extra': 'ignore'}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- lalandre_rag.summaries.agent.run_summary_agent(*, prompt, generate_text, model_name)[source]¶
Run the canonical summary agent and return validated output + retry count.
- Parameters:
prompt (str)
generate_text (Callable[[str], str])
model_name (str)
- Return type:
tuple[CanonicalSummaryOutput, int]
lalandre_rag.summaries.generator¶
Source: packages/lalandre_rag/lalandre_rag/summaries/generator.py
Canonical summary generation: LLM-based and deterministic fallback.
- class lalandre_rag.summaries.generator.CanonicalSummaryGenerator(*, llm_client, prompt_version=CANONICAL_SUMMARY_PROMPT_VERSION, model_id=None)[source]¶
Bases: object
Generate stable, reusable act summaries from structured act content.
- Parameters:
llm_client (Optional[JSONHTTPLLMClient])
prompt_version (str)
model_id (Optional[str])
lalandre_rag.summaries.models¶
Source: packages/lalandre_rag/lalandre_rag/summaries/models.py
Data models, constants, and helpers for canonical act summaries.
- class lalandre_rag.summaries.models.CanonicalSummarySnapshot(act_id, celex, language, status, is_stale, summary, generated_at, prompt_version, model_id, source_version_id, error_text, trace)[source]¶
Bases: object
Canonical summary state returned by the summary service.
- Parameters:
act_id (int)
celex (str)
language (str)
status (str)
is_stale (bool)
summary (str | None)
generated_at (datetime | None)
prompt_version (str | None)
model_id (str | None)
source_version_id (int | None)
error_text (str | None)
trace (Dict[str, Any])
- property available: bool¶
Return whether the snapshot contains a ready-to-use summary.
- class lalandre_rag.summaries.models.SummaryTraceRecorder[source]¶
Bases: object
Build structured trace payloads for summary generation and lookup.
- static lookup(*, status, is_stale, reason=None)[source]¶
Build trace metadata for summary lookup operations.
- Parameters:
status (str)
is_stale (bool)
reason (str | None)
- Return type:
Dict[str, Any]
- static generation(*, mode, context_chars, subdivisions_used, model_id, prompt_version, extra=None)[source]¶
Build trace metadata for summary generation operations.
- Parameters:
mode (str)
context_chars (int)
subdivisions_used (int)
model_id (str)
prompt_version (str)
extra (Dict[str, Any] | None)
- Return type:
Dict[str, Any]
lalandre_rag.summaries.service¶
Source: packages/lalandre_rag/lalandre_rag/summaries/service.py
Services for reading, refreshing, and augmenting canonical act summaries.
- class lalandre_rag.summaries.service.ActSummaryService(*, pg_repo, generator=None, prompt_version=CANONICAL_SUMMARY_PROMPT_VERSION, model_id=None)[source]¶
Bases: object
Read and refresh canonical act summaries.
- Parameters:
pg_repo (Any)
generator (Optional[CanonicalSummaryGenerator])
prompt_version (str)
model_id (Optional[str])
- static build_runtime_model_id()[source]¶
Build the model identifier recorded for generated summaries.
- Return type:
str
- get_canonical_summary_by_celex(celex)[source]¶
Fetch the current canonical summary snapshot for one act.
- Parameters:
celex (str)
- Return type:
CanonicalSummarySnapshot | None
- class lalandre_rag.summaries.service.QuestionSummaryService(act_summary_service)[source]¶
Bases: object
Augment personalized summarize/compare prompts with canonical summaries.
- Parameters:
act_summary_service (Optional[ActSummaryService])