RAG API¶
Note
This page is generated automatically from the repository’s maintained Python module inventory.
Retrieval, graph augmentation, summaries, response building, and orchestration.
lalandre_rag¶
Source: packages/lalandre_rag/lalandre_rag/__init__.py
RAG Service Module: high-level orchestration of retrieval, context enrichment, and LLM generation.
lalandre_rag.adapters¶
Source: packages/lalandre_rag/lalandre_rag/adapters/__init__.py
RAG Adapters: adapters for different LLM frameworks.
lalandre_rag.adapters.llamaindex¶
Source: packages/lalandre_rag/lalandre_rag/adapters/llamaindex.py
LlamaIndex Adapter: utilities for using LlamaIndex with context slices.
- class lalandre_rag.adapters.llamaindex.LlamaIndexAdapter(llama_llm)[source]¶
Bases: object
Adapter for using LlamaIndex with context slice objects.
Provides:
- Document-to-node conversion
- TreeSummarize for long documents
- Multi-document comparison
Initialize LlamaIndex adapter
- Parameters:
llama_llm (LLM) – LlamaIndex-compatible LLM client
- static context_slice_key(doc)[source]¶
Return the stable lookup key used for source identifiers.
- Parameters:
doc (ContextSlice)
- Return type:
Tuple[str, int, int | None]
- context_slices_to_nodes(context_slices, source_id_map=None)[source]¶
Convert ContextSlice objects to LlamaIndex NodeWithScore
- Parameters:
context_slices (List[ContextSlice]) – List of context slices
source_id_map (Dict[Tuple[str, int, int | None], str] | None)
- Returns:
List of LlamaIndex nodes with scores
- Return type:
List[NodeWithScore]
- summarize(topic, context_slices, source_id_map=None)[source]¶
Use LlamaIndex TreeSummarize for hierarchical summarization
Better suited to long documents: it summarizes in chunks, then combines the partial summaries.
- Parameters:
topic (str) – Topic to summarize
context_slices (List[ContextSlice]) – Context slices to summarize
source_id_map (Dict[Tuple[str, int, int | None], str] | None)
- Returns:
Summary text
- Return type:
str
- compare(comparison_question, context_slices, celex_list, source_id_map=None)[source]¶
Use LlamaIndex for intelligent multi-document comparison
Groups documents by CELEX and compares systematically
- Parameters:
comparison_question (str) – Question for comparison
context_slices (List[ContextSlice]) – Context slices to compare
celex_list (List[str]) – List of CELEX codes being compared
source_id_map (Dict[Tuple[str, int, int | None], str] | None)
- Returns:
Comparison text
- Return type:
str
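The `source_id_map` parameter shared by these methods keys source identifiers by the `(str, int, int | None)` tuple returned from `context_slice_key`. A minimal sketch of that keying scheme, using a hypothetical stand-in dataclass rather than the real `ContextSlice` (field names here are assumptions for illustration):

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Illustrative stand-in for the real ContextSlice; field names are assumed.
@dataclass
class ContextSlice:
    celex: str
    act_id: int
    chunk_id: Optional[int]
    text: str
    score: float

def context_slice_key(doc: ContextSlice) -> Tuple[str, int, Optional[int]]:
    # Mirrors the (str, int, int | None) key shape documented above.
    return (doc.celex, doc.act_id, doc.chunk_id)

slices = [ContextSlice("32014L0065", 1, 0, "Article 25 ...", 0.91)]
source_id_map = {context_slice_key(s): f"S{i + 1}" for i, s in enumerate(slices)}
print(source_id_map[("32014L0065", 1, 0)])  # S1
```

Because the key is a static function of the slice, the same map can be reused across `context_slices_to_nodes`, `summarize`, and `compare` calls for one question.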
lalandre_rag.agentic¶
Source: packages/lalandre_rag/lalandre_rag/agentic/__init__.py
Typed agentic orchestration package for RAG planning.
lalandre_rag.agentic.deps¶
Source: packages/lalandre_rag/lalandre_rag/agentic/deps.py
Dependency container for the PydanticAI planning runtime.
- class lalandre_rag.agentic.deps.RetrievalServiceProtocol(*args, **kwargs)[source]¶
Bases: Protocol
Protocol for the retrieval service consumed by the planning graph.
- class lalandre_rag.agentic.deps.ContextServiceProtocol(*args, **kwargs)[source]¶
Bases: Protocol
Protocol for retrieval-result enrichment services.
- class lalandre_rag.agentic.deps.QueryRouterProtocol(*args, **kwargs)[source]¶
Bases: Protocol
Protocol for deterministic or LLM-assisted retrieval routing.
- class lalandre_rag.agentic.deps.GraphRAGServiceProtocol(*args, **kwargs)[source]¶
Bases: Protocol
Protocol for optional graph-retrieval services.
- class lalandre_rag.agentic.deps.CommunityEnricherProtocol(*args, **kwargs)[source]¶
Bases: Protocol
Protocol for community-summary enrichers used by global mode.
- class lalandre_rag.agentic.deps.AgenticPlanningDeps(retrieval_service, context_service, llm, lightweight_llm, rag_prompt, query_router, graph_rag_service, community_enricher, question, top_k, score_threshold, filters, include_relations, include_subjects, include_full_content, return_sources, collections, granularity, graph_depth, use_graph, embedding_preset, retrieval_depth, chat_history, progress_callback=None, preamble_callback=None, token_callback=None, final_answer_callback=None)[source]¶
Bases: object
Runtime dependencies injected into the planning graph.
- Parameters:
retrieval_service (RetrievalServiceProtocol)
context_service (ContextServiceProtocol)
llm (Any)
lightweight_llm (Any)
rag_prompt (ChatPromptTemplate)
query_router (QueryRouterProtocol)
graph_rag_service (GraphRAGServiceProtocol | None)
community_enricher (CommunityEnricherProtocol | None)
question (str)
top_k (int)
score_threshold (float | None)
filters (Dict[str, Any] | None)
include_relations (bool)
include_subjects (bool)
include_full_content (bool)
return_sources (bool)
collections (List[str] | None)
granularity (str | None)
graph_depth (int | None)
use_graph (bool | None)
embedding_preset (str | None)
retrieval_depth (str | None)
chat_history (List[BaseMessage] | None)
progress_callback (Callable[[Dict[str, Any]], None] | None)
preamble_callback (Callable[[Dict[str, Any] | None, Dict[str, Any]], None] | None)
token_callback (Callable[[str], None] | None)
final_answer_callback (Callable[[str], None] | None)
lalandre_rag.agentic.graph¶
Source: packages/lalandre_rag/lalandre_rag/agentic/graph.py
Pydantic Graph orchestration for RAG planning phases.
- class lalandre_rag.agentic.graph.LoadConversationContext[source]¶
Bases: BaseNode[PlanningGraphState, AgenticPlanningDeps, PlanningEarlyExit | PlanningContext]
Bootstrap node for future conversation-memory loading.
- async run(ctx)[source]¶
Advance to the question-decomposition phase.
- Parameters:
ctx (GraphRunContext[PlanningGraphState, AgenticPlanningDeps])
- Return type:
- class lalandre_rag.agentic.graph.DecomposeQuestion[source]¶
Bases: BaseNode[PlanningGraphState, AgenticPlanningDeps, PlanningEarlyExit | PlanningContext]
Placeholder decomposition node for complex multi-part questions.
- async run(ctx)[source]¶
Advance to the routing phase.
- Parameters:
ctx (GraphRunContext[PlanningGraphState, AgenticPlanningDeps])
- Return type:
- class lalandre_rag.agentic.graph.RouteQuestion[source]¶
Bases: BaseNode[PlanningGraphState, AgenticPlanningDeps, PlanningEarlyExit | PlanningContext]
Route the user question toward the appropriate retrieval profile.
- async run(ctx)[source]¶
Compute routing metadata and transition to planning.
- Parameters:
ctx (GraphRunContext[PlanningGraphState, AgenticPlanningDeps])
- Return type:
- class lalandre_rag.agentic.graph.PlanRetrieval[source]¶
Bases: BaseNode[PlanningGraphState, AgenticPlanningDeps, PlanningEarlyExit | PlanningContext]
Build the retrieval plan and complementary-query strategy.
- async run(ctx)[source]¶
Resolve the planning step and transition to retrieval.
- Parameters:
ctx (GraphRunContext[PlanningGraphState, AgenticPlanningDeps])
- Return type:
- class lalandre_rag.agentic.graph.RunRetrieval[source]¶
Bases: BaseNode[PlanningGraphState, AgenticPlanningDeps, PlanningEarlyExit | PlanningContext]
Execute retrieval and context enrichment for the planned query.
- async run(ctx)[source]¶
Run retrieval and either finish early or evaluate sufficiency.
- Parameters:
ctx (GraphRunContext[PlanningGraphState, AgenticPlanningDeps])
- Return type:
- class lalandre_rag.agentic.graph.EvaluateEvidence[source]¶
Bases: BaseNode[PlanningGraphState, AgenticPlanningDeps, PlanningEarlyExit | PlanningContext]
Assess whether retrieved evidence is sufficient for answering.
- async run(ctx)[source]¶
Evaluate retrieval quality before optional graph augmentation.
- Parameters:
ctx (GraphRunContext[PlanningGraphState, AgenticPlanningDeps])
- Return type:
- class lalandre_rag.agentic.graph.MaybeFetchGraphSupport[source]¶
Bases: BaseNode[PlanningGraphState, AgenticPlanningDeps, PlanningEarlyExit | PlanningContext]
Optionally augment retrieval context with graph-derived support.
- async run(ctx)[source]¶
Fetch graph support when the current plan allows it.
- Parameters:
ctx (GraphRunContext[PlanningGraphState, AgenticPlanningDeps])
- Return type:
- class lalandre_rag.agentic.graph.CompressContext[source]¶
Bases: BaseNode[PlanningGraphState, AgenticPlanningDeps, PlanningEarlyExit | PlanningContext]
Finalize and optionally compress context before generation.
- async run(ctx)[source]¶
Produce the terminal planning artifact for downstream generation.
- Parameters:
ctx (GraphRunContext[PlanningGraphState, AgenticPlanningDeps])
- Return type:
End[PlanningEarlyExit | PlanningContext]
- lalandre_rag.agentic.graph.run_planning_graph(*, deps)[source]¶
Execute the planning graph synchronously and return the terminal artifact.
- Parameters:
deps (AgenticPlanningDeps)
- Return type:
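`run_planning_graph` terminates with either a `PlanningEarlyExit` or a `PlanningContext`, so callers typically branch on the artifact type. A sketch of that dispatch, using trimmed stand-in dataclasses (not the real classes, whose full field lists appear below):

```python
from dataclasses import dataclass
from typing import Any, List, Optional

# Stand-ins for the real terminal artifacts; fields trimmed for illustration.
@dataclass
class PlanningEarlyExit:
    kind: str
    clarification_question: Optional[str] = None

@dataclass
class PlanningContext:
    retrieval_query: str
    context_slices: List[Any]

def handle_terminal(artifact) -> str:
    # run_planning_graph returns either an early exit or a full planning context.
    if isinstance(artifact, PlanningEarlyExit):
        return artifact.clarification_question or f"early-exit:{artifact.kind}"
    return f"generate from {len(artifact.context_slices)} slices"

print(handle_terminal(PlanningEarlyExit(kind="conversational")))
print(handle_terminal(PlanningContext("mifid scope", ["s1", "s2"])))
```

Early exits carry timing and gating metadata so the caller can still emit trace events; a `PlanningContext` feeds straight into generation.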
lalandre_rag.agentic.models¶
Source: packages/lalandre_rag/lalandre_rag/agentic/models.py
Pydantic and dataclass models used by the planning runtime.
- class lalandre_rag.agentic.models.ComplementaryQueryOutput(*, query, level_hint=None)[source]¶
Bases: BaseModel
Structured follow-up retrieval produced by the planner.
Create a new model by parsing and validating input data from keyword arguments.
Raises ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
query (str)
level_hint (str | None)
- classmethod clean_query(value)[source]¶
Normalize and validate a complementary query string.
- Parameters:
value (Any)
- Return type:
str
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model; should be a dictionary conforming to pydantic's ConfigDict.
- class lalandre_rag.agentic.models.RoutingIntentOutput(*, profile, granularity=None, top_k=10, include_relations_hint=False, execution_mode='hybrid', rationale='LLM parser selected retrieval profile.', use_graph=False, normalized_query=None, intent_label=None, confidence=None, output_validation_retries=0)[source]¶
Bases: BaseModel
Structured output returned by the routing agent.
- Parameters:
profile (Literal['contextual_default', 'citation_precision', 'relationship_focus', 'global_overview'])
granularity (Literal['subdivisions', 'chunks', 'all', 'auto'] | None)
top_k (int)
include_relations_hint (bool)
execution_mode (Literal['hybrid', 'global'])
rationale (str)
use_graph (bool)
normalized_query (str | None)
intent_label (str | None)
confidence (float | None)
output_validation_retries (int)
- model_config: ClassVar[ConfigDict] = {'extra': 'ignore'}¶
- classmethod normalize_granularity(value)[source]¶
Normalize planner granularity hints to supported values.
- Parameters:
value (Any)
- Return type:
str | None
- classmethod clean_optional_text(value)[source]¶
Trim optional text fields and coerce blanks to None.
- Parameters:
value (Any)
- Return type:
str | None
- classmethod clean_rationale(value)[source]¶
Normalize routing rationales and inject a default fallback.
- Parameters:
value (Any)
- Return type:
str
- class lalandre_rag.agentic.models.DecompositionOutput(*, sub_questions=<factory>, synthesize=False, output_validation_retries=0)[source]¶
Bases: BaseModel
Structured output returned by the decomposition agent.
- Parameters:
sub_questions (List[str])
synthesize (bool)
output_validation_retries (int)
- model_config: ClassVar[ConfigDict] = {'extra': 'ignore'}¶
- class lalandre_rag.agentic.models.RetrievalPlannerOutput(*, primary_query='', intent_class='documentary', skip_retrieval=False, needs_complementary=False, complementary_queries=<factory>, needs_compression=False, clarification_question=None, strict_grounding_requested=False, rationale='', output_validation_retries=0)[source]¶
Bases: BaseModel
Structured retrieval plan returned by the planner agent.
- Parameters:
primary_query (str)
intent_class (Literal['conversational', 'documentary'])
skip_retrieval (bool)
needs_complementary (bool)
complementary_queries (List[ComplementaryQueryOutput])
needs_compression (bool)
clarification_question (str | None)
strict_grounding_requested (bool)
rationale (str)
output_validation_retries (int)
- model_config: ClassVar[ConfigDict] = {'extra': 'ignore'}¶
- classmethod clean_primary_query(value)[source]¶
Normalize the planner primary query field.
- Parameters:
value (Any)
- Return type:
str
- classmethod normalize_intent_class(value)[source]¶
Restrict the planner intent class to supported values.
- Parameters:
value (Any)
- Return type:
str
- classmethod clean_clarification_question(value)[source]¶
Normalize optional clarification prompts.
- Parameters:
value (Any)
- Return type:
str | None
- classmethod clean_rationale(value)[source]¶
Normalize planner rationales to a stripped string.
- Parameters:
value (Any)
- Return type:
str
- class lalandre_rag.agentic.models.RetrievalRefinementOutput(*, refined_query, rationale='', output_validation_retries=0)[source]¶
Bases: BaseModel
Structured refined query returned by the corrective agent.
- Parameters:
refined_query (str)
rationale (str)
output_validation_retries (int)
- model_config: ClassVar[ConfigDict] = {'extra': 'ignore'}¶
- class lalandre_rag.agentic.models.RetrievalEvaluationOutput(*, status='SUFFICIENT', gap=None, output_validation_retries=0)[source]¶
Bases: BaseModel
Structured sufficiency evaluation returned by the CRAG evaluator.
- Parameters:
status (Literal['SUFFICIENT', 'PARTIAL', 'INSUFFICIENT'])
gap (str | None)
output_validation_retries (int)
- model_config: ClassVar[ConfigDict] = {'extra': 'ignore'}¶
- class lalandre_rag.agentic.models.GraphSupportDecision(*, use_graph=False, use_cypher=False, rationale='')[source]¶
Bases: BaseModel
Structured graph support decision reserved for future graph/Cypher routing.
- Parameters:
use_graph (bool)
use_cypher (bool)
rationale (str)
- model_config: ClassVar[ConfigDict] = {}¶
- class lalandre_rag.agentic.models.PhaseTraceEvent(*, phase, status, label, detail=None, count=None, duration_ms=None, meta=<factory>, tool=None)[source]¶
Bases: BaseModel
Single trace event emitted by the planning runtime.
- Parameters:
phase (str)
status (str)
label (str)
detail (str | None)
count (int | None)
duration_ms (float | None)
meta (Dict[str, Any])
tool (str | None)
- model_config: ClassVar[ConfigDict] = {}¶
- class lalandre_rag.agentic.models.PlanningEarlyExit(kind, routing_ms, planner_ms, retrieval_ms=0.0, intent_class='documentary', clarification_question=None, strict_grounding_requested=False, agentic_rationale='', agentic_meta=<factory>, best_score=0.0, gate_threshold=0.0, candidates_dropped=0)[source]¶
Bases: object
Signals that the planning pipeline hit a terminal condition.
- Parameters:
kind (str)
routing_ms (float)
planner_ms (float)
retrieval_ms (float)
intent_class (Literal['conversational', 'documentary'])
clarification_question (str | None)
strict_grounding_requested (bool)
agentic_rationale (str)
agentic_meta (Dict[str, Any])
best_score (float)
gate_threshold (float)
candidates_dropped (int)
- class lalandre_rag.agentic.models.PlanningContext(context_slices, graph_fetch, retrieval_plan, agentic_plan, agentic_meta, retrieval_query, effective_top_k, effective_granularity, effective_include_relations, community_meta, routing_ms, planner_ms, retrieval_ms, context_enrichment_ms, graph_enrichment_ms, complementary_ms, compression_ms, retrieval_stats=None)[source]¶
Bases: object
Artifacts produced by the planning graph and consumed by generation.
- Parameters:
context_slices (List[Any])
graph_fetch (Any | None)
retrieval_plan (Any)
agentic_plan (Any | None)
agentic_meta (Dict[str, Any])
retrieval_query (str)
effective_top_k (int)
effective_granularity (str | None)
effective_include_relations (bool)
community_meta (Dict[str, Any])
routing_ms (float)
planner_ms (float)
retrieval_ms (float)
context_enrichment_ms (float)
graph_enrichment_ms (float)
complementary_ms (float)
compression_ms (float)
retrieval_stats (Any)
- class lalandre_rag.agentic.models.PlanningGraphState(question, top_k, score_threshold, filters, include_relations, include_subjects, collections, granularity, graph_depth, use_graph, embedding_preset, planner_run_id, planner_path=<factory>, trace_events=<factory>, output_validation_retries=0, decompose_ms=0.0, routing_ms=0.0, planner_ms=0.0, retrieval_ms=0.0, context_enrichment_ms=0.0, graph_enrichment_ms=0.0, complementary_ms=0.0, compression_ms=0.0, effective_top_k=0, effective_granularity=None, effective_include_relations=False, decomposition_result=None, retrieval_plan=None, agentic_plan=None, retrieval_query=None, retrieval_results=<factory>, context_slices=<factory>, community_meta=<factory>, graph_fetch=None, retrieval_depth=None, planning_future=None, graph_prefetch_future=None, retrieval_stats=None, agentic_meta=<factory>, early_exit=None)[source]¶
Bases: object
Mutable planning state shared across graph nodes.
- Parameters:
question (str)
top_k (int)
score_threshold (float | None)
filters (Dict[str, Any] | None)
include_relations (bool)
include_subjects (bool)
collections (List[str] | None)
granularity (str | None)
graph_depth (int | None)
use_graph (bool | None)
embedding_preset (str | None)
planner_run_id (str)
planner_path (List[str])
trace_events (List[PhaseTraceEvent])
output_validation_retries (int)
decompose_ms (float)
routing_ms (float)
planner_ms (float)
retrieval_ms (float)
context_enrichment_ms (float)
graph_enrichment_ms (float)
complementary_ms (float)
compression_ms (float)
effective_top_k (int)
effective_granularity (str | None)
effective_include_relations (bool)
decomposition_result (Any)
retrieval_plan (Any | None)
agentic_plan (Any | None)
retrieval_query (str | None)
retrieval_results (List[Any])
context_slices (List[Any])
community_meta (Dict[str, Any])
graph_fetch (Any | None)
retrieval_depth (str | None)
planning_future (Any | None)
graph_prefetch_future (Any | None)
retrieval_stats (Any)
agentic_meta (Dict[str, Any])
early_exit (PlanningEarlyExit | None)
lalandre_rag.agentic.runtime¶
Source: packages/lalandre_rag/lalandre_rag/agentic/runtime.py
Concrete planning runtime for the PydanticAI-driven RAG pipeline.
- class lalandre_rag.agentic.runtime.AgenticComplementaryQuery(query, level_hint=None)[source]¶
Bases: object
A targeted follow-up query proposed by the planner.
- Parameters:
query (str)
level_hint (str | None)
- class lalandre_rag.agentic.runtime.AgenticRetrievalPlan(primary_query, intent_class='documentary', skip_retrieval=False, needs_complementary=False, complementary_queries=<factory>, needs_compression=False, clarification_question=None, strict_grounding_requested=False, rationale='', planning_ms=0.0, planner_used=False, output_validation_retries=0)[source]¶
Bases: object
Planner decision for retrieval/refinement phases.
- Parameters:
primary_query (str)
intent_class (str)
skip_retrieval (bool)
needs_complementary (bool)
complementary_queries (List[AgenticComplementaryQuery])
needs_compression (bool)
clarification_question (str | None)
strict_grounding_requested (bool)
rationale (str)
planning_ms (float)
planner_used (bool)
output_validation_retries (int)
- class lalandre_rag.agentic.runtime.DecomposedQuery(sub_questions=<factory>, synthesize=False, decomposed=False, decompose_ms=0.0, output_validation_retries=0)[source]¶
Bases: object
Structured decomposition used by the planning graph.
- Parameters:
sub_questions (List[str])
synthesize (bool)
decomposed (bool)
decompose_ms (float)
output_validation_retries (int)
- class lalandre_rag.agentic.runtime.EvalResult(status, gap_hint, eval_ms, fallback=False, output_validation_retries=0)[source]¶
Bases:
objectSufficiency evaluation for CRAG correction.
- Parameters:
status (str)
gap_hint (str | None)
eval_ms (float)
fallback (bool)
output_validation_retries (int)
- lalandre_rag.agentic.runtime.decompose_query(question, llm, *, heuristic_only=True, max_sub_questions=3)[source]¶
Decompose a complex question into independent sub-questions.
- Parameters:
question (str)
llm (Any)
heuristic_only (bool)
max_sub_questions (int)
- Return type:
- lalandre_rag.agentic.runtime.evaluate_retrieval(question, results, llm)[source]¶
Evaluate whether current retrieval evidence is sufficient.
- Parameters:
question (str)
results (List[RetrievalResult])
llm (Any)
- Return type:
- lalandre_rag.agentic.runtime.plan_retrieval(question, llm)[source]¶
Run the planner LLM to decide the retrieval strategy.
- Parameters:
question (str)
llm (Any)
- Return type:
lalandre_rag.agentic.tools¶
Source: packages/lalandre_rag/lalandre_rag/agentic/tools.py
PydanticAI tools and adapters for structured planning outputs.
- lalandre_rag.agentic.tools.run_intent_parser_agent(*, question, top_k, requested_granularity, generate_text, model_name)[source]¶
Run the structured intent parser agent for one question.
- Parameters:
question (str)
top_k (int)
requested_granularity (str | None)
generate_text (Callable[[str], str])
model_name (str)
- Return type:
tuple[RoutingIntentOutput, int]
- lalandre_rag.agentic.tools.run_decomposition_agent(*, question, llm, model_name='planner:decompose')[source]¶
Run the decomposition agent for one complex question.
- Parameters:
question (str)
llm (Any)
model_name (str)
- Return type:
tuple[DecompositionOutput, int]
- lalandre_rag.agentic.tools.run_planner_agent(*, question, llm, model_name='planner:retrieve')[source]¶
Run the retrieval planner agent.
- Parameters:
question (str)
llm (Any)
model_name (str)
- Return type:
tuple[RetrievalPlannerOutput, int]
- lalandre_rag.agentic.tools.run_refinement_agent(*, question, gap_hint, llm, model_name='planner:refine')[source]¶
Run the corrective refinement agent for weak retrieval results.
- Parameters:
question (str)
gap_hint (str)
llm (Any)
model_name (str)
- Return type:
tuple[RetrievalRefinementOutput, int]
- lalandre_rag.agentic.tools.run_evaluation_agent(*, question, context_preview, llm, model_name='planner:evaluate')[source]¶
Run the sufficiency evaluator on a preview of retrieved context.
- Parameters:
question (str)
context_preview (str)
llm (Any)
model_name (str)
- Return type:
tuple[RetrievalEvaluationOutput, int]
lalandre_rag.citation_sanitizer¶
Source: packages/lalandre_rag/lalandre_rag/citation_sanitizer.py
Normalize malformed citation tags emitted by the main LLM.
The RAG prompt instructs the LLM to use strict native tags like [S1],
[G2, L2], [R3], [C1], [CM4]. In practice the LLM often slips
in extra material between the brackets, e.g.:
[S2, Annex I C(4) ; RTS 2 Annex III §13.1]
[G9, article 25(2)]
[G7, considérant 71]
These ad-hoc forms are not recognized by the front-end regex (which only matches the strict format) and break the prose_rewriter’s integrity check (it counts strict tags only). This module rewrites them back to the strict form before any post-processing runs:
[S2, Annex I C(4) ; RTS 2 Annex III §13.1] → [S2]
[G9, article 25(2)] → [G9]
[S1, L1] → [S1, L1] (preserved; already valid)
[G2, L2 ; article 25] → [G2, L2] (level kept, article precision dropped)
The article precision is not lost — the prompt instructs the LLM to write it
in the surrounding prose (« L'article 25 [G9, L2] »). The sanitizer just
strips it from inside the brackets where it doesn’t belong.
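The rewrite rule above (keep the tag id and an optional level hint, drop everything else inside the brackets) can be sketched with a single regex substitution. This is an illustrative sketch, not the module's actual implementation; the real sanitizer's regex and tag inventory may differ:

```python
import re

# Sketch of the sanitization rule; the real module may be more defensive.
# Group 1: tag id (S1, G2, R3, C1, CM4, ...); group 3: optional level hint (L2).
TAG = re.compile(r"\[([SGRC]M?\d+)(,\s*(L\d+))?[^\]]*\]")

def sanitize(text: str) -> str:
    # Keep the tag id and an optional level hint; drop any extra material
    # the LLM slipped in between the brackets.
    def repl(m: re.Match) -> str:
        level = f", {m.group(3)}" if m.group(3) else ""
        return f"[{m.group(1)}{level}]"
    return TAG.sub(repl, text)

print(sanitize("[S2, Annex I C(4) ; RTS 2 Annex III §13.1]"))  # [S2]
print(sanitize("[G2, L2 ; article 25]"))  # [G2, L2]
print(sanitize("[S1, L1]"))  # [S1, L1]
```

Already-valid tags pass through unchanged, so the substitution is idempotent and safe to run before every post-processing pass.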
lalandre_rag.graph¶
Source: packages/lalandre_rag/lalandre_rag/graph/__init__.py
Graph RAG utilities — ranking, context budget, map-reduce, service, helpers.
lalandre_rag.graph.community¶
Source: packages/lalandre_rag/lalandre_rag/graph/community.py
Community-aware context enrichment for Graph RAG (Level 3).
Communities are stored as :Community nodes in Neo4j, linked to Acts via BELONGS_TO relationships. This module queries Neo4j directly — no JSON files on disk.
Usage:
enricher = CommunityContextEnricher(neo4j_repo)
community_block = enricher.build_context(seed_act_ids, max_communities=4)
lalandre_rag.graph.context_budget¶
Source: packages/lalandre_rag/lalandre_rag/graph/context_budget.py
Token-budget-aware context builder for Graph RAG.
Instead of blindly truncating acts by position, this module manages a character budget split across three zones:
Semantic zone (60 %): content from Qdrant vector matches
Graph zone (30 %): act titles and descriptions from Neo4j expansion
Relation zone (10 %): relationship descriptions
Each zone is filled with the highest-ranked items first, so the LLM always receives the most relevant content regardless of total volume.
Usage:
budget = GraphContextBudget(max_chars=20000)
context = budget.build(
semantic_results=semantic_results,
ranked_nodes=ranked_nodes,
ranked_relationships=ranked_relationships,
)
- class lalandre_rag.graph.context_budget.BudgetAllocation(semantic_chars, graph_chars, relation_chars)[source]¶
Bases: object
Character-budget allocation across context zones.
- Parameters:
semantic_chars (int)
graph_chars (int)
relation_chars (int)
- property total: int¶
Return the total allocated character budget across all zones.
- class lalandre_rag.graph.context_budget.ContextBuildResult(combined_context, source_id_map, semantic_count, graph_nodes_used, relationships_used, budget_allocation, graph_node_refs=<factory>, relationship_refs=<factory>, chars_used=<factory>)[source]¶
Bases: object
Output of the context builder.
- Parameters:
combined_context (str)
source_id_map (Dict[Tuple[str, int | None, int | None], str])
semantic_count (int)
graph_nodes_used (int)
relationships_used (int)
budget_allocation (BudgetAllocation)
graph_node_refs (List[Dict[str, Any]])
relationship_refs (List[Dict[str, Any]])
chars_used (Dict[str, int])
- class lalandre_rag.graph.context_budget.GraphContextBudget(max_chars=20000, semantic_share=0.60, graph_share=0.30, relation_share=0.10, min_chars_per_source=200)[source]¶
Bases: object
Build LLM context from scored and ranked graph results.
- Parameters:
max_chars (int) – Total character budget for the whole context block.
semantic_share (float) – Fraction reserved for semantic search content.
graph_share (float) – Fraction reserved for graph-expanded act information.
relation_share (float) – Fraction reserved for relationship descriptions.
min_chars_per_source (int) – Minimum chars reserved for each semantic source.
- build(*, semantic_results, ranked_nodes, ranked_relationships)[source]¶
Assemble the full context string from ranked results.
Spills unused budget from one zone to the next (semantic → graph → relation).
- Parameters:
semantic_results (List[Any])
ranked_nodes (List[Dict[str, Any]])
ranked_relationships (List[Dict[str, Any]])
- Return type:
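The 60/30/10 split with spill-over can be sketched as follows. This is a simplified model that assumes the character counts available per zone are known up front; the real builder fills each zone item by item from the ranked results:

```python
from typing import Dict

# Sketch of the documented 60/30/10 split with spill-over
# (semantic -> graph -> relation); not the real builder.
def allocate(max_chars: int, available: Dict[str, int]) -> Dict[str, int]:
    shares = {"semantic": 0.60, "graph": 0.30, "relation": 0.10}
    alloc, carry = {}, 0
    for zone, share in shares.items():
        budget = int(max_chars * share) + carry
        alloc[zone] = min(budget, available.get(zone, 0))
        carry = budget - alloc[zone]  # unused chars spill to the next zone
    return alloc

print(allocate(20000, {"semantic": 9000, "graph": 8000, "relation": 5000}))
# {'semantic': 9000, 'graph': 8000, 'relation': 3000}
```

With a sparse semantic zone, its unused 3000 characters spill into the graph zone, which in turn passes its surplus to relations, so the total budget is never wasted on an underfilled zone.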
lalandre_rag.graph.helpers¶
Source: packages/lalandre_rag/lalandre_rag/graph/helpers.py
Graph-mode helper functions for the RAG service.
Contains NL→Cypher prompt building, Cypher extraction from LLM output, and Cypher-row context formatting. These are domain-level utilities that live in the package layer, not in the service layer.
- lalandre_rag.graph.helpers.build_nl_to_cypher_prompt(*, question, max_graph_depth, row_limit)[source]¶
Return the system prompt that translates a natural-language question into Cypher.
- Parameters:
question (str)
max_graph_depth (int)
row_limit (int)
- Return type:
str
- lalandre_rag.graph.helpers.normalize_cypher_candidate(candidate)[source]¶
Strip common model artifacts around an otherwise valid Cypher query.
- Parameters:
candidate (str)
- Return type:
str
- lalandre_rag.graph.helpers.extract_cypher(text)[source]¶
Attempt to extract a Cypher query from raw LLM text.
Tries, in order:
1. A JSON object with a "cypher" key.
2. A fenced ```cypher code block.
3. Bare Cypher starting with a keyword (MATCH, WITH, …).
- Parameters:
text (str)
- Return type:
str | None
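The three-step fallback order can be sketched as below. This is a simplified illustration (the keyword list is abridged and the real helper is more defensive about model artifacts):

```python
import json
import re
from typing import Optional

# Simplified sketch of the documented extraction order.
def extract_cypher(text: str) -> Optional[str]:
    # 1. JSON object with a "cypher" key
    try:
        obj = json.loads(text)
        if isinstance(obj, dict) and obj.get("cypher"):
            return obj["cypher"].strip()
    except ValueError:
        pass
    # 2. Fenced ```cypher code block
    fence = re.search(r"```(?:cypher)?\s*(.*?)```", text, re.DOTALL)
    if fence:
        return fence.group(1).strip()
    # 3. Bare Cypher starting with a keyword (abridged list)
    if re.match(r"\s*(MATCH|WITH|RETURN|CALL)\b", text, re.IGNORECASE):
        return text.strip()
    return None

print(extract_cypher('{"cypher": "MATCH (a:Act) RETURN a LIMIT 5"}'))
```

Trying JSON first matters: a JSON payload that happens to contain a fenced block would otherwise be mis-parsed by the later, looser patterns.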
lalandre_rag.graph.map_reduce¶
Source: packages/lalandre_rag/lalandre_rag/graph/map_reduce.py
Map-Reduce generation for Graph RAG.
When the assembled context exceeds a configured size threshold, a single LLM call can time out or produce degraded answers. This module splits the context into chunks, runs parallel “map” calls to produce partial summaries, then merges them into a final answer with a “reduce” call.
Pipeline:
context_chunks ──► LLM map (parallel) ──► partial summaries
│
question ──────────────────────────────────────► LLM reduce ──► answer
Usage:
answer = await map_reduce_generate(
context=long_context,
question=question,
llm=llm_chain,
chunk_chars=6000,
map_timeout=15.0,
reduce_timeout=20.0,
)
- async lalandre_rag.graph.map_reduce.map_reduce_generate(*, context, question, llm, chunk_chars=None, map_timeout=None, reduce_timeout=None, max_parallel=None)[source]¶
Map-reduce generation pipeline.
1. Split the context into chunks of chunk_chars.
2. Run up to max_parallel map calls concurrently.
3. Merge with a single reduce call.
Falls back to the concatenated map summaries if the reduce step fails. All parameters default to values from config.graph.
- Parameters:
context (str)
question (str)
llm (Any)
chunk_chars (int | None)
map_timeout (float | None)
reduce_timeout (float | None)
max_parallel (int | None)
- Return type:
str
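The chunk-split, parallel-map, and reduce-fallback flow can be sketched with an async stub LLM. Timeouts, parallelism limits, and the config.graph defaults are omitted, and the prompts are placeholders, so this is an illustration of the shape, not the real pipeline:

```python
import asyncio
from typing import List

def split_chunks(context: str, chunk_chars: int) -> List[str]:
    # Naive fixed-width split; the real module may split on boundaries.
    return [context[i:i + chunk_chars] for i in range(0, len(context), chunk_chars)]

async def map_reduce_sketch(context: str, question: str, llm, chunk_chars: int = 6000) -> str:
    chunks = split_chunks(context, chunk_chars)
    # Map phase: one summary per chunk, run concurrently.
    partials = await asyncio.gather(*(llm(f"Summarize for: {question}\n{c}") for c in chunks))
    try:
        # Reduce phase: merge partial summaries into one answer.
        return await llm(f"Answer {question!r} from:\n" + "\n".join(partials))
    except Exception:
        # Fallback: concatenated map summaries, as documented above.
        return "\n".join(partials)

async def fake_llm(prompt: str) -> str:
    return f"<{len(prompt)} chars summarized>"

print(asyncio.run(map_reduce_sketch("x" * 10, "q", fake_llm, chunk_chars=4)))
```

The map calls share no state, which is what makes them safe to run in parallel; only the reduce step sees all partial summaries at once.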
lalandre_rag.graph.neo4j_adapter¶
Source: packages/lalandre_rag/lalandre_rag/graph/neo4j_adapter.py
Optional Neo4j GraphRAG integration for graph-mode retrieval.
This module prefers official Neo4j GraphRAG retrievers when the dependency is installed, while keeping the existing Lalandre graph pipeline as a fallback.
- class lalandre_rag.graph.neo4j_adapter.Text2CypherSearchOutput(generated_cypher, rows, metadata)[source]¶
Bases: object
Result payload for official Text2Cypher retrieval.
- Parameters:
generated_cypher (str)
rows (List[Dict[str, Any]])
metadata (Dict[str, Any])
- class lalandre_rag.graph.neo4j_adapter.Neo4jGraphRAGAdapter(*, neo4j_driver, neo4j_database, qdrant_client, qdrant_collection_name, llm_provider, llm_model, llm_temperature, llm_max_tokens, llm_api_key, mistral_api_key, llm_base_url, key_pool, read_only_validator, row_serializer)[source]¶
Bases: object
Bridge between Lalandre graph mode and official Neo4j GraphRAG retrievers.
- Parameters:
neo4j_driver (Driver)
neo4j_database (str | None)
qdrant_client (Any)
qdrant_collection_name (str)
llm_provider (str)
llm_model (str)
llm_temperature (float)
llm_max_tokens (int)
llm_api_key (str | None)
mistral_api_key (str | None)
llm_base_url (str | None)
key_pool (APIKeyPool | None)
read_only_validator (Callable[[str], str])
row_serializer (Callable[[Any], Any])
- is_available()[source]¶
Return whether the adapter is ready to serve official GraphRAG calls.
- Return type:
bool
lalandre_rag.graph.ranker¶
Source: packages/lalandre_rag/lalandre_rag/graph/ranker.py
Graph node ranking for Graph RAG.
Scores and ranks graph-expanded nodes by relevance to avoid sending noise to the LLM. Three signals are combined:
Hop distance – nodes closer to the seed acts score higher.
Semantic overlap – nodes that also appear in the Qdrant results get a boost (they matched the query both semantically and structurally).
Relation type weight – AMENDS / IMPLEMENTS (strong legal ties) outweigh CITES / DEROGATES (weaker references).
Usage:
ranked = rank_graph_nodes(
    graph_context=graph_context,
    relationships=relationships,
    semantic_act_ids=semantic_act_ids,
    seed_act_ids=seed_act_ids,
)
# ranked is sorted best-first; slice to your budget
- lalandre_rag.graph.ranker.rank_graph_nodes(*, graph_context, relationships, semantic_act_ids, seed_act_ids, max_depth=5, hop_decay=0.5, semantic_boost=0.3, relation_weight_factor=0.25)[source]¶
Score and rank graph-expanded nodes.
Each node receives a normalized composite score in [0, 1] based on hop distance, semantic overlap, and incident relation strength.
- Parameters:
graph_context (List[Dict[str, Any]]) – Graph-expanded act nodes to score.
relationships (List[Dict[str, Any]]) – Graph relationships connecting the candidate nodes.
semantic_act_ids (Set[int]) – Act identifiers also returned by semantic search.
seed_act_ids (Set[int]) – Seed act identifiers used to start graph expansion.
max_depth (int) – Maximum BFS depth used to estimate hop distance.
hop_decay (float) – Exponential decay applied to hop distance.
semantic_boost (float) – Non-negative weight applied to semantic overlap.
relation_weight_factor (float) – Non-negative weight applied to relation strength.
- Returns:
The input nodes enriched with ranking metadata and sorted best-first.
- Return type:
List[Dict[str, Any]]
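A per-node version of the composite score could be sketched as below. The hop_decay, semantic_boost, and relation_weight_factor defaults mirror the signature above; hop_distance and relation_strength are assumed precomputed here, whereas the real function derives them from the relationships via BFS and relation-type weights.

```python
from typing import Any, Dict, Set

def score_node_sketch(
    node: Dict[str, Any],
    *,
    hop_distance: int,
    semantic_act_ids: Set[int],
    relation_strength: float,  # 0..1, derived from incident relation types
    hop_decay: float = 0.5,
    semantic_boost: float = 0.3,
    relation_weight_factor: float = 0.25,
) -> float:
    """Composite relevance score clamped to [0, 1]."""
    base = hop_decay ** hop_distance  # nodes closer to the seeds score higher
    boost = semantic_boost if node.get("act_id") in semantic_act_ids else 0.0
    relation = relation_weight_factor * relation_strength
    return min(1.0, base + boost + relation)
```

Exponential hop decay keeps distant expansion noise out of the context while semantic overlap can rescue a structurally distant but topically relevant node.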
lalandre_rag.graph.service¶
Source: packages/lalandre_rag/lalandre_rag/graph/service.py
Graph service for RAG.
Combines semantic search with graph traversal for enhanced regulatory context retrieval.
- class lalandre_rag.graph.service.GraphRAGService(neo4j_repo, qdrant_repo, embedding_service, key_pool=None)[source]¶
Bases: object
Graph-Enhanced Retrieval-Augmented Generation Service
This service implements the Graph RAG approach by:
1. Using semantic search (Qdrant) to find relevant subdivisions
2. Enriching results with act-level graph context (Neo4j) to capture relationships
3. Providing regulatory ecosystem understanding at the act level
Key capabilities:
- Semantic search (subdivision-level) with graph expansion (act-level)
- Regulatory path discovery between acts
- Temporal relationship tracking (amendments, repeals, etc.)
- Full regulatory context retrieval
- Note: Graph operations focus on act-level relationships for performance.
Subdivision details are retrieved from Qdrant/PostgreSQL.
Initialize Graph RAG service
- Parameters:
neo4j_repo (Neo4jRepository) – Neo4j repository for graph operations
qdrant_repo (QdrantRepository) – Qdrant repository for semantic search
embedding_service (EmbeddingService) – Service for generating embeddings
key_pool (APIKeyPool | None)
- supports_official_text2cypher()[source]¶
Return whether the official Neo4j Text2Cypher adapter is available.
- Return type:
bool
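The three-step flow described above can be sketched with the storage clients abstracted as plain callables (embed, semantic_search, and expand_graph are stand-ins for the embedding service, Qdrant, and Neo4j — not the real repository APIs):

```python
from typing import Any, Callable, Dict, List, Sequence, Set

def graph_rag_query_sketch(
    question: str,
    *,
    embed: Callable[[str], Sequence[float]],
    semantic_search: Callable[[Sequence[float]], List[Dict[str, Any]]],
    expand_graph: Callable[[Set[int]], List[Dict[str, Any]]],
) -> Dict[str, Any]:
    """Semantic hits at subdivision level seed act-level graph expansion."""
    vector = embed(question)                    # 1. embed the question
    hits = semantic_search(vector)              # 2. subdivision-level hits (Qdrant)
    seed_act_ids = {h["act_id"] for h in hits}  # acts behind the hits
    graph = expand_graph(seed_act_ids)          # 3. act-level context (Neo4j)
    return {"semantic": hits, "graph": graph, "seed_act_ids": seed_act_ids}
```

Working at act level for graph operations keeps the traversal cheap, matching the performance note above; subdivision content stays in Qdrant/PostgreSQL.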
lalandre_rag.graph.source_payloads¶
Source: packages/lalandre_rag/lalandre_rag/graph/source_payloads.py
Helpers to serialize graph-derived evidence into a consistent payload.
- lalandre_rag.graph.source_payloads.build_graph_node_source_item(*, node, source_id, sequence_order)[source]¶
Serialize a ranked graph node into a user-facing evidence item.
- Parameters:
node (Dict[str, Any])
source_id (str)
sequence_order (int)
- Return type:
Dict[str, Any]
- lalandre_rag.graph.source_payloads.build_graph_edge_source_item(*, relationship, source_id, sequence_order, start_celex, end_celex)[source]¶
Serialize a ranked graph relationship into a user-facing evidence item.
- Parameters:
relationship (Dict[str, Any])
source_id (str)
sequence_order (int)
start_celex (str)
end_celex (str)
- Return type:
Dict[str, Any]
- lalandre_rag.graph.source_payloads.build_cypher_row_source_item(*, row, row_index, include_full_content, content_preview_chars, query_id, graph_query_strategy, generated_cypher)[source]¶
Serialize a Cypher row into a concrete evidence item without fake scoring.
- Parameters:
row (Dict[str, Any])
row_index (int)
include_full_content (bool)
content_preview_chars (int)
query_id (str)
graph_query_strategy (str)
generated_cypher (str | None)
- Return type:
Dict[str, Any]
lalandre_rag.linker_factory¶
Source: packages/lalandre_rag/lalandre_rag/linker_factory.py
Build a LegalEntityLinker wired for the rag-service runtime.
The linker is shared with the extraction pipeline. For RAG use, we additionally
populate act_id on each ActAliasEntry (so resolutions carry the act
primary key) and supply an article_lookup callable that maps
(act_id, article_number) to a subdivision id using a small in-memory cache
seeded from the database.
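The article_lookup wiring can be sketched with functools.lru_cache; db_fetch here is a hypothetical callable hitting the subdivisions table, and the cache bound corresponds to article_cache_size:

```python
from functools import lru_cache
from typing import Callable, Optional

def make_article_lookup_sketch(
    db_fetch: Callable[[int, str], Optional[int]],
    cache_size: int = 4096,
) -> Callable[[int, str], Optional[int]]:
    """Bounded LRU cache over (act_id, article_number) -> subdivision id."""
    @lru_cache(maxsize=cache_size)
    def lookup(act_id: int, article_number: str) -> Optional[int]:
        return db_fetch(act_id, article_number)  # only runs on a cache miss
    return lookup
```

Repeated resolutions of the same article then cost a dictionary lookup instead of a database round-trip.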
- lalandre_rag.linker_factory.build_linker(pg_repo, *, fuzzy_threshold, fuzzy_min_gap, fuzzy_limit, min_alias_chars, article_cache_size=4096)[source]¶
Construct a LegalEntityLinker seeded from the acts table.
The returned linker carries act_id on every alias entry and is wired with an article_lookup callable that queries subdivisions on demand (LRU-cached, bounded).
- Parameters:
pg_repo (PostgresRepository)
fuzzy_threshold (float)
fuzzy_min_gap (float)
fuzzy_limit (int)
min_alias_chars (int)
article_cache_size (int)
- Return type:
- lalandre_rag.linker_factory.build_external_detector(linker)[source]¶
Construct the optional NER-backed ExternalDetector for prose linking.
Reads NER_SERVICE_URL from the environment. When unset (or empty), returns None so the regex+fuzzy linker keeps its V1 behaviour with zero overhead. When set, builds a small HTTP client and wraps it in an adapter that resolves NER spans through the same LegalEntityLinker.
- Parameters:
linker (LegalEntityLinker)
- Return type:
Callable[[str], Sequence[ExternalDetection]] | None
lalandre_rag.llm¶
Source: packages/lalandre_rag/lalandre_rag/llm/__init__.py
LLM factory utilities for RAG.
lalandre_rag.llm.factory¶
Source: packages/lalandre_rag/lalandre_rag/llm/factory.py
Factory utilities for RAG LLM clients.
- class lalandre_rag.llm.factory.RAGLLMClients(provider, model_name, chat_llm, llamaindex_llm)[source]¶
Bases: object
Bundled LLM clients used by RAG modes.
- Parameters:
provider (str)
model_name (str)
chat_llm (Any)
llamaindex_llm (LLM | None)
- lalandre_rag.llm.factory.build_rag_llm_clients(*, provider, model_name, temperature, max_tokens, timeout_seconds, base_url, mistral_base_url, context_window, api_key, mistral_api_key, key_pool=None)[source]¶
Build provider-specific clients for RAG.
When key_pool is provided and contains >1 key, multiple underlying clients are created and dispatched through the shared pool.
Supported providers: mistral, openai_compatible.
- Parameters:
provider (str)
model_name (str)
temperature (float)
max_tokens (int)
timeout_seconds (float)
base_url (str | None)
mistral_base_url (str)
context_window (int)
api_key (str | None)
mistral_api_key (str | None)
key_pool (APIKeyPool | None)
- Return type:
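The multi-key dispatch described above can be sketched as a round-robin over per-key clients. build_client is a hypothetical client factory, and the real APIKeyPool may apply a different dispatch policy (e.g. rate-limit aware):

```python
import itertools
from typing import Any, Callable, List

def make_pooled_dispatch_sketch(
    api_keys: List[str],
    build_client: Callable[[str], Any],
) -> Callable[[], Any]:
    """Round-robin over one client per key; one key degenerates to one client."""
    clients = [build_client(key) for key in api_keys]
    cycle = itertools.cycle(clients)
    return lambda: next(cycle)
```

With a single key this collapses to always returning the same client, matching the documented single-key behaviour.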
lalandre_rag.models¶
Source: packages/lalandre_rag/lalandre_rag/models/__init__.py
Shared API models for the RAG service.
lalandre_rag.models.api¶
Source: packages/lalandre_rag/lalandre_rag/models/api.py
Shared API models for the RAG service.
These models are the single source of truth for request/response schemas shared between the rag-service and the api-gateway.
- class lalandre_rag.models.api.QueryMetadata[source]¶
Bases: TypedDict
Known fields in QueryResponse.metadata. Additional mode-specific fields may be present.
- class lalandre_rag.models.api.SourcesResponse(*, total, documents, acts=None, graph_nodes=None, graph_edges=None, cypher_rows=None, community_reports=None, graph_query=None)[source]¶
Bases: BaseModel
Structured sources format from ResponseBuilder
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
total (int)
documents (List[Dict[str, Any]])
acts (Dict[str, Any] | None)
graph_nodes (List[Dict[str, Any]] | None)
graph_edges (List[Dict[str, Any]] | None)
cypher_rows (List[Dict[str, Any]] | None)
community_reports (List[Dict[str, Any]] | None)
graph_query (Dict[str, Any] | None)
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class lalandre_rag.models.api.QueryResponse(*, query_id, question, answer, mode, sources=None, metadata, conversation_id=None, message_id=None)[source]¶
Bases: BaseModel
Response model for RAG queries
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
query_id (str)
question (str)
answer (str)
mode (str)
sources (SourcesResponse | None)
metadata (Dict[str, Any])
conversation_id (str | None)
message_id (str | None)
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class lalandre_rag.models.api.SearchRequest(*, query=None, query_embedding=None, top_k=None, mode=None, score_threshold=None, granularity=None, embedding_preset=None, include_full_content=False, filters=None)[source]¶
Bases: BaseModel
Search request (semantic / lexical / hybrid).
top_k, mode and granularity are optional — when omitted the rag-service resolves them from SearchConfig defaults.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
query (str | None)
query_embedding (List[float] | None)
top_k (int | None)
mode (str | None)
score_threshold (float | None)
granularity (str | None)
embedding_preset (str | None)
include_full_content (bool)
filters (Dict[str, Any] | None)
- validate_query()[source]¶
Ensure at least a text query or a precomputed embedding is provided.
- Return type:
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class lalandre_rag.models.api.SearchResult(*, celex, subdivision_id, chunk_id=None, chunk_index=None, content, score, metadata, trace=None)[source]¶
Bases: BaseModel
Search result item
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
celex (str | None)
subdivision_id (int)
chunk_id (int | None)
chunk_index (int | None)
content (str)
score (float)
metadata (Dict[str, Any])
trace (Dict[str, Any] | None)
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class lalandre_rag.models.api.SearchResponse(*, search_id, results, total, mode)[source]¶
Bases: BaseModel
Search response
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
search_id (str)
results (List[SearchResult])
total (int)
mode (str)
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
lalandre_rag.modes¶
Source: packages/lalandre_rag/lalandre_rag/modes/__init__.py
RAG query mode package.
lalandre_rag.modes.hybrid_generation¶
Source: packages/lalandre_rag/lalandre_rag/modes/hybrid_generation.py
Generation strategies for HybridMode — standard and global modes, QA chain execution.
Extracted from hybrid_mode.py to keep the orchestrator focused on the pipeline.
- class lalandre_rag.modes.hybrid_generation.SourceArtifacts(acts, documents, validation_sources, payload, build_ms)[source]¶
Bases: object
Prepared source payloads shared by sync and streaming generation paths.
- Parameters:
acts (Dict[str, Any])
documents (List[Dict[str, Any]])
validation_sources (List[Dict[str, Any]])
payload (Dict[str, Any] | None)
build_ms (float)
- lalandre_rag.modes.hybrid_generation.run_rag_chain(*, rag_prompt, llm, question, context, graph_context='', chat_history=None)[source]¶
Run the blocking QA chain and return the generated answer text.
- Parameters:
rag_prompt (ChatPromptTemplate)
llm (Any)
question (str)
context (str)
graph_context (str)
chat_history (List[BaseMessage] | None)
- Return type:
str
- lalandre_rag.modes.hybrid_generation.stream_rag_chain(*, rag_prompt, llm, question, context, graph_context='', chat_history=None)[source]¶
Stream answer chunks from the QA chain.
- Parameters:
rag_prompt (ChatPromptTemplate)
llm (Any)
question (str)
context (str)
graph_context (str)
chat_history (List[BaseMessage] | None)
- Return type:
Iterator[str]
- lalandre_rag.modes.hybrid_generation.query_standard_mode(*, question, context_slices, llm, rag_prompt, include_relations, include_subjects, include_full_content, return_sources, graph_fetch=None, chat_history=None, progress_callback=None, preamble_callback=None, token_callback=None, final_answer_callback=None, entity_linker=None, external_detector=None, lightweight_llm=None, cypher_documents=None, cypher_query_meta=None)[source]¶
Run the standard hybrid QA path with optional graph support.
- Parameters:
question (str)
context_slices (List[ContextSlice])
llm (Any)
rag_prompt (ChatPromptTemplate)
include_relations (bool)
include_subjects (bool)
include_full_content (bool)
return_sources (bool)
graph_fetch (GraphFetchResult | None)
chat_history (List[BaseMessage] | None)
progress_callback (Callable[[Dict[str, Any]], None] | None)
preamble_callback (Callable[[Dict[str, Any] | None, Dict[str, Any]], None] | None)
token_callback (Callable[[str], None] | None)
final_answer_callback (Callable[[str], None] | None)
entity_linker (LegalEntityLinker | None)
external_detector (Callable[[str], Any] | None)
lightweight_llm (Any)
cypher_documents (List[Dict[str, Any]] | None)
cypher_query_meta (Dict[str, Any] | None)
- Return type:
Dict[str, Any]
- lalandre_rag.modes.hybrid_generation.query_global_mode(*, question, context_slices, llm, rag_prompt, include_full_content, include_subjects, return_sources, graph_fetch=None, chat_history=None, progress_callback=None, preamble_callback=None, token_callback=None, final_answer_callback=None, entity_linker=None, external_detector=None, lightweight_llm=None, cypher_documents=None, cypher_query_meta=None)[source]¶
Run the global GraphRAG path with community reporting.
- Parameters:
question (str)
context_slices (List[ContextSlice])
llm (Any)
rag_prompt (ChatPromptTemplate)
include_full_content (bool)
include_subjects (bool)
return_sources (bool)
graph_fetch (GraphFetchResult | None)
chat_history (List[BaseMessage] | None)
progress_callback (Callable[[Dict[str, Any]], None] | None)
preamble_callback (Callable[[Dict[str, Any] | None, Dict[str, Any]], None] | None)
token_callback (Callable[[str], None] | None)
final_answer_callback (Callable[[str], None] | None)
entity_linker (LegalEntityLinker | None)
external_detector (Callable[[str], Any] | None)
lightweight_llm (Any)
cypher_documents (List[Dict[str, Any]] | None)
cypher_query_meta (Dict[str, Any] | None)
- Return type:
Dict[str, Any]
lalandre_rag.modes.hybrid_graph¶
Source: packages/lalandre_rag/lalandre_rag/modes/hybrid_graph.py
Graph enrichment for HybridMode — Neo4j expansion and community context.
Extracted from hybrid_mode.py to keep the orchestrator focused on the pipeline.
- lalandre_rag.modes.hybrid_graph.fetch_graph_context(*, act_ids, graph_rag_service, community_enricher, max_depth=None)[source]¶
Fetch graph data for acts found in retrieval results.
Uses act_ids from hybrid retrieval (semantic + BM25) as seeds for Neo4j graph traversal. Returns structured data for ranking/budgeting. Non-fatal: returns None on failure.
- Parameters:
act_ids (Set[int])
graph_rag_service (GraphRAGService)
community_enricher (CommunityContextEnricher | None)
max_depth (int | None)
- Return type:
GraphFetchResult | None
lalandre_rag.modes.hybrid_helpers¶
Source: packages/lalandre_rag/lalandre_rag/modes/hybrid_helpers.py
Helpers for HybridMode — context assembly, source building, metadata, and citation.
Extracted from hybrid_mode.py to keep the orchestrator focused on the pipeline.
- lalandre_rag.modes.hybrid_helpers.emit_progress(callback, *, phase, status, label, detail=None, count=None, duration_ms=None, meta=None)[source]¶
Emit a structured progress event when a callback is configured.
- Parameters:
callback (Callable[[Dict[str, Any]], None] | None)
phase (str)
status (str)
label (str)
detail (str | None)
count (int | None)
duration_ms (float | None)
meta (Dict[str, Any] | None)
- Return type:
None
- lalandre_rag.modes.hybrid_helpers.build_source_context(*, context_slices, max_context_chars, min_chars_per_source, max_sources)[source]¶
Build the source-context block and return (context_text, refs, remaining_chars).
- Parameters:
context_slices (List[ContextSlice])
max_context_chars (int)
min_chars_per_source (int)
max_sources (int)
- Return type:
tuple[str, List[Dict[str, Any]], int]
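A greedy character-budgeting pass along these lines could look as follows. The [S1]..[Sn] tags match the citation convention used elsewhere in this service; the exact excerpt formatting and the refs payload of the real helper are not shown, so treat this as a sketch only:

```python
from typing import Dict, List, Tuple

def build_source_context_sketch(
    context_slices: List[Dict[str, str]],
    *,
    max_context_chars: int,
    min_chars_per_source: int,
    max_sources: int,
) -> Tuple[str, int]:
    """Greedy budgeting: tag each slice [S1]..[Sn], stop when budget runs out."""
    parts: List[str] = []
    remaining = max_context_chars
    for index, item in enumerate(context_slices[:max_sources], start=1):
        if remaining < min_chars_per_source:
            break  # not enough budget left for a useful excerpt
        text = item["content"][:remaining]
        parts.append(f"[S{index}] {text}")
        remaining -= len(text)
    return "\n\n".join(parts), remaining
```

Returning the unused budget lets a caller hand the remainder to the graph-context builder, as the shared-budget functions below suggest.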
- lalandre_rag.modes.hybrid_helpers.build_relation_summary(*, context_slices, line_limit)[source]¶
Build a compact relation-signals block for the LLM context.
- Parameters:
context_slices (List[ContextSlice])
line_limit (int)
- Return type:
str
- lalandre_rag.modes.hybrid_helpers.format_reports_block(reports)[source]¶
Render community reports into a text block for the LLM context.
- Parameters:
reports (List[CommunityReport])
- Return type:
str
- lalandre_rag.modes.hybrid_helpers.attach_citation_validation(*, response, answer, sources)[source]¶
Validate mixed source citations in the answer and attach results to metadata.
- Parameters:
response (Dict[str, Any])
answer (str)
sources (List[Dict[str, Any]])
- Return type:
None
- lalandre_rag.modes.hybrid_helpers.build_plan_metadata(*, retrieval_plan, requested_top_k, effective_top_k, requested_granularity, effective_granularity, requested_include_relations, effective_include_relations, retrieval_query, original_question)[source]¶
Serialize a retrieval plan to an audit-friendly metadata dict.
- Parameters:
retrieval_plan (RetrievalPlan)
requested_top_k (int)
effective_top_k (int)
requested_granularity (str | None)
effective_granularity (str | None)
requested_include_relations (bool)
effective_include_relations (bool)
retrieval_query (str)
original_question (str)
- Return type:
Dict[str, Any]
- class lalandre_rag.modes.hybrid_helpers.GraphFetchResult(nodes, relationships, seed_act_ids, expanded_act_ids, community_block='', community_meta=<factory>, duration_ms=0.0)[source]¶
Bases: object
Raw output from Neo4j graph expansion (before ranking).
- Parameters:
nodes (List[Dict[str, Any]])
relationships (List[Dict[str, Any]])
seed_act_ids (Set[int])
expanded_act_ids (Set[int])
community_block (str)
community_meta (Dict[str, Any])
duration_ms (float)
- lalandre_rag.modes.hybrid_helpers.build_ranked_graph_context(*, fetch_result, semantic_results, max_context_chars, graph_acts_limit, graph_relationships_limit, hop_decay, semantic_boost, relation_weight_factor, budget_semantic_share, budget_graph_share, budget_relation_share, min_chars_per_source, max_depth)[source]¶
Rank graph nodes/relationships and build budget-aware context.
Returns (combined_context_str, metadata_dict, graph_node_refs, relationship_refs).
- Parameters:
fetch_result (GraphFetchResult)
semantic_results (List[Any])
max_context_chars (int)
graph_acts_limit (int)
graph_relationships_limit (int)
hop_decay (float)
semantic_boost (float)
relation_weight_factor (float)
budget_semantic_share (float)
budget_graph_share (float)
budget_relation_share (float)
min_chars_per_source (int)
max_depth (int)
- Return type:
Tuple[str, Dict[str, Any], List[Dict[str, Any]], List[Dict[str, Any]]]
lalandre_rag.modes.hybrid_mode¶
Source: packages/lalandre_rag/lalandre_rag/modes/hybrid_mode.py
Hybrid mode for QA with deterministic routing and context budgeting.
Pipeline orchestrator — delegates generation to hybrid_generation and graph enrichment to hybrid_graph.
- class lalandre_rag.modes.hybrid_mode.HybridMode(retrieval_service, context_service, llm, rag_prompt, graph_rag_service=None, lightweight_llm=None, key_pool=None, entity_linker=None, external_detector=None)[source]¶
Bases: object
MODE 3: retrieval + generation. Includes a global community-aware path for broad queries.
- Parameters:
retrieval_service (RetrievalService)
context_service (ContextService)
llm (Any)
rag_prompt (ChatPromptTemplate)
graph_rag_service (GraphRAGService | None)
lightweight_llm (Any)
key_pool (APIKeyPool | None)
entity_linker (LegalEntityLinker | None)
external_detector (Callable[[str], Any] | None)
- query(question, top_k=10, score_threshold=None, filters=None, include_relations=False, include_subjects=True, include_full_content=False, return_sources=True, collections=None, granularity=None, chat_history=None, graph_depth=None, use_graph=None, embedding_preset=None, retrieval_depth=None, cypher_documents=None, cypher_query_meta=None)[source]¶
Run the full hybrid pipeline and return a policy-compliant response.
- Parameters:
question (str)
top_k (int)
score_threshold (float | None)
filters (Dict[str, Any] | None)
include_relations (bool)
include_subjects (bool)
include_full_content (bool)
return_sources (bool)
collections (List[str] | None)
granularity (str | None)
chat_history (List[BaseMessage] | None)
graph_depth (int | None)
use_graph (bool | None)
embedding_preset (str | None)
retrieval_depth (str | None)
cypher_documents (List[Dict[str, Any]] | None)
cypher_query_meta (Dict[str, Any] | None)
- Return type:
Dict[str, Any]
- stream_query(question, top_k=10, score_threshold=None, filters=None, include_relations=False, include_subjects=True, include_full_content=False, return_sources=True, collections=None, granularity=None, chat_history=None, graph_depth=None, use_graph=None, embedding_preset=None, retrieval_depth=None, cypher_documents=None, cypher_query_meta=None)[source]¶
Stream query with live progress events emitted from a worker thread.
- Parameters:
question (str)
top_k (int)
score_threshold (float | None)
filters (Dict[str, Any] | None)
include_relations (bool)
include_subjects (bool)
include_full_content (bool)
return_sources (bool)
collections (List[str] | None)
granularity (str | None)
chat_history (List[BaseMessage] | None)
graph_depth (int | None)
use_graph (bool | None)
embedding_preset (str | None)
retrieval_depth (str | None)
cypher_documents (List[Dict[str, Any]] | None)
cypher_query_meta (Dict[str, Any] | None)
- Return type:
Iterator[Dict[str, Any] | str]
lalandre_rag.modes.llm_mode¶
Source: packages/lalandre_rag/lalandre_rag/modes/llm_mode.py
LLM-only mode: pure LLM generation without retrieval.
- class lalandre_rag.modes.llm_mode.LLMMode(llm)[source]¶
Bases: object
MODE 2: Pure LLM (100% generation). Generate an answer using only LLM knowledge (no retrieval).
Initialize LLM only mode
- Parameters:
llm (Any) – LLM client
lalandre_rag.modes.summarize_mode¶
Source: packages/lalandre_rag/lalandre_rag/modes/summarize_mode.py
Summarize mode: generate summaries of documents related to a topic.
- class lalandre_rag.modes.summarize_mode.SummarizeMode(retrieval_service, context_service, llamaindex_adapter, citation_llm=None, act_summary_service=None, question_summary_service=None)[source]¶
Bases: object
MODE 4: Summarization. Generate summaries using TreeSummarize for hierarchical processing.
Initialize summarize mode
- Parameters:
retrieval_service (RetrievalService) – Service for document retrieval
context_service (ContextService) – Service for context enrichment
llamaindex_adapter (LlamaIndexAdapter | None) – LlamaIndex adapter (optional)
citation_llm (Any)
act_summary_service (ActSummaryService | None)
question_summary_service (QuestionSummaryService | None)
- summarize_canonical(*, celex, question)[source]¶
Return a cached canonical summary response when one is available.
- Parameters:
celex (str)
question (str)
- Return type:
Dict[str, Any] | None
- summarize_question(*, topic, top_k, score_threshold, filters, include_relations, include_full_content)[source]¶
Run question summarization, optionally augmented with canonical memory.
- Parameters:
topic (str)
top_k (int)
score_threshold (float | None)
filters (Dict[str, Any] | None)
include_relations (bool)
include_full_content (bool)
- Return type:
Dict[str, Any]
- summarize(topic, top_k=10, score_threshold=None, filters=None, include_relations=True, include_full_content=False)[source]¶
Generate a summary of documents related to a topic
- Parameters:
topic (str) – Topic or question to summarize
top_k (int) – Number of documents to retrieve
filters (Dict[str, Any] | None) – Metadata filters
include_relations (bool) – Include relations in context
score_threshold (float | None)
include_full_content (bool)
- Returns:
Dictionary with summary and sources
- Return type:
Dict[str, Any]
- class lalandre_rag.modes.summarize_mode.CompareMode(retrieval_service, context_service, llamaindex_adapter, citation_llm=None, question_summary_service=None)[source]¶
Bases: object
MODE 5: Comparison. Compare multiple legal documents.
Initialize compare mode
- Parameters:
retrieval_service (RetrievalService) – Service for document retrieval
context_service (ContextService) – Service for context enrichment
llamaindex_adapter (LlamaIndexAdapter | None) – LlamaIndex adapter (optional)
citation_llm (Any)
question_summary_service (QuestionSummaryService | None)
- compare(comparison_question, celex_list=None, top_k=10, score_threshold=None, include_full_content=False)[source]¶
Compare multiple legal documents
- Parameters:
comparison_question (str) – What to compare
celex_list (List[str] | None) – Optional list of specific CELEX to compare
top_k (int) – Number of documents if CELEX not specified
score_threshold (float | None)
include_full_content (bool)
- Returns:
Dictionary with comparison and sources
- Return type:
Dict[str, Any]
lalandre_rag.ner_external¶
Source: packages/lalandre_rag/lalandre_rag/ner_external.py
Glue between the NER service and prose_linker’s external_detector hook.
The NER service returns free-text spans like ("directive 2014/65/UE", 12, 32,
"directive", 0.91). prose_linker needs ExternalDetection instances
already resolved to an internal act_id. This module bridges the two:
1. Call the NER service (or any other zero-shot detector) to find candidate spans the regex layer might miss (paraphrases, fuzzy mentions).
2. Run each candidate through LegalEntityLinker to resolve to an act_id. Spans the linker cannot resolve (or resolves with low confidence / fallback method) are dropped — never link a span we cannot back with a chunk.
3. Return the resolved spans as ExternalDetection for the linker to merge.
The factory build_ner_external_detector is what callers use; it returns
None when no NER service URL is configured, so the rest of the pipeline
keeps the regex-only behaviour with zero overhead.
- lalandre_rag.ner_external.build_ner_external_detector(ner_client, linker, *, min_span_score=0.5, min_link_score=0.85)[source]¶
Return an ExternalDetector callable backed by the NER service.
- Parameters:
ner_client (NerClient) – Configured client for the NER service.
linker (LegalEntityLinker) – Same linker used for regex-based resolution; reused here to translate NER text spans into internal act_id values.
min_span_score (float) – Drop NER spans below this confidence threshold.
min_link_score (float) – Drop linker resolutions below this score.
- Return type:
Callable[[str], Sequence[ExternalDetection]]
The returned callable is safe to invoke on every answer: errors are swallowed and an empty list is returned, so the regex layer always wins by default.
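The filter–resolve–drop pipeline can be sketched as below. ner_call and resolve are hypothetical stand-ins for the NerClient and LegalEntityLinker interfaces (the real signatures are not shown in this inventory); the two thresholds mirror min_span_score and min_link_score above:

```python
from typing import Callable, List, NamedTuple, Optional, Sequence, Tuple

class DetectionSketch(NamedTuple):
    start: int
    end: int
    act_id: int

def build_detector_sketch(
    ner_call: Callable[[str], List[Tuple[str, int, int, float]]],
    resolve: Callable[[str], Optional[Tuple[int, float]]],
    *,
    min_span_score: float = 0.5,
    min_link_score: float = 0.85,
) -> Callable[[str], Sequence[DetectionSketch]]:
    """Confidence-filter NER spans, resolve each via the linker, drop the rest."""
    def detect(text: str) -> Sequence[DetectionSketch]:
        try:
            spans = ner_call(text)  # (surface, start, end, score) tuples
        except Exception:
            return []  # errors are swallowed: the regex layer wins by default
        resolved_spans: List[DetectionSketch] = []
        for surface, start, end, score in spans:
            if score < min_span_score:
                continue  # low-confidence NER span
            resolved = resolve(surface)
            if resolved is None:
                continue  # never link a span we cannot back internally
            act_id, link_score = resolved
            if link_score < min_link_score:
                continue  # low-confidence linker resolution
            resolved_spans.append(DetectionSketch(start, end, act_id))
        return resolved_spans
    return detect
```

Swallowing detector errors keeps the hook strictly additive: a NER outage degrades to regex-only linking rather than failing the answer.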
lalandre_rag.prompts¶
Source: packages/lalandre_rag/lalandre_rag/prompts/__init__.py
Centralized prompt loaders for lalandre_rag.
All prompt text lives in prompts/ to keep code and content separated.
- lalandre_rag.prompts.get_langchain_prompt(prompt_type, *, with_history=False)[source]¶
Return the LangChain chat prompt for the given prompt_type.
When with_history is True, a MessagesPlaceholder("chat_history") is inserted between the system and human messages so that conversation history can be injected at invocation time. The placeholder is marked optional=True so that callers without history can simply omit the key (or pass an empty list) and the prompt remains unchanged.
- Parameters:
prompt_type (str)
with_history (bool)
- Return type:
ChatPromptTemplate
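The message layout produced by the optional history placeholder can be shown without pulling in LangChain. This dependency-free sketch mirrors the described behaviour: history is spliced between the system and human messages, and an omitted or empty history leaves the prompt unchanged.

```python
from typing import List, Optional, Tuple

Message = Tuple[str, str]  # (role, content)

def render_prompt_sketch(
    system: str,
    human: str,
    *,
    chat_history: Optional[List[Message]] = None,
) -> List[Message]:
    """History slots between the system and human messages; empty -> unchanged."""
    messages: List[Message] = [("system", system)]
    messages.extend(chat_history or [])  # the optional placeholder behaviour
    messages.append(("human", human))
    return messages
```

In the real prompt this corresponds to LangChain's MessagesPlaceholder with optional=True, which drops out of the rendered prompt when the key is absent.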
- lalandre_rag.prompts.get_llamaindex_prompt(prompt_type)[source]¶
Return the LlamaIndex prompt template for summary/comparison (with fallback).
- Parameters:
prompt_type (str)
- Return type:
PromptTemplate
- lalandre_rag.prompts.render_llm_only_prompt(*, question)[source]¶
Prompt used by LLM-only mode (no retrieval).
- Parameters:
question (str)
- Return type:
str
- lalandre_rag.prompts.render_planner_prompt(*, question)[source]¶
Prompt for the retrieval planner that decides multi-step strategy.
- Parameters:
question (str)
- Return type:
str
- lalandre_rag.prompts.render_compressor_prompt(*, celex, title, level, fragments, max_chars)[source]¶
Prompt for context compression of multiple fragments from one act.
- Parameters:
celex (str)
title (str)
level (str)
fragments (str)
max_chars (int)
- Return type:
str
- lalandre_rag.prompts.render_nl_to_cypher_prompt(*, question, max_graph_depth, row_limit)[source]¶
System prompt to translate natural language to Cypher (graph_helpers).
- Parameters:
question (str)
max_graph_depth (int)
row_limit (int)
- Return type:
str
lalandre_rag.prose_linker¶
Source: packages/lalandre_rag/lalandre_rag/prose_linker.py
Post-process LLM responses to make regulatory references clickable.
Uses the shared LegalEntityLinker to detect explicit identifiers (CELEX, EU
refs, national authority refs) and combined article N du <act> patterns in
the final answer text, then wraps each resolved reference in a markdown link
pointing to the library route (/library/acts/:act_id[#sub-:subdivision_id]).
Existing markdown links and citation tags ([S1], [G1], [R1],
[C1], [CM1]) are preserved — we never rewrite content inside those
regions.
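One way to implement the "never rewrite inside existing links or citation tags" rule is to compute the protected spans first and skip any match that falls inside one. This sketch assumes a simplified tag grammar and takes the reference pattern and target URL from the caller; the real module resolves each reference through the LegalEntityLinker instead:

```python
import re
from typing import List, Tuple

# existing markdown links, or citation tags like [S1] / [CM3]
PROTECTED = re.compile(r"\[[^\]]*\]\([^)]*\)|\[(?:S|G|R|C|CM)\d+\]")

def protected_spans_sketch(text: str) -> List[Tuple[int, int]]:
    """Spans that must never be rewritten."""
    return [m.span() for m in PROTECTED.finditer(text)]

def link_outside_protected_sketch(text: str, pattern: "re.Pattern", url: str) -> str:
    """Wrap pattern matches as markdown links, skipping protected regions."""
    spans = protected_spans_sketch(text)

    def repl(m: "re.Match") -> str:
        if any(start <= m.start() < end for start, end in spans):
            return m.group(0)  # inside a protected region: leave untouched
        return f"[{m.group(0)}]({url})"

    return pattern.sub(repl, text)
```

Computing protected spans up front guarantees idempotence: running the linker twice never double-wraps a reference that is already a link.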
- lalandre_rag.prose_linker.link_prose(text, linker, *, min_score=0.85, resolve_articles=True, allowed_act_ids=None, external_detector=None)[source]¶
Return text with regulatory references wrapped as markdown links.
Each detected reference is linked to its library page (and optionally to a specific article subdivision). Existing markdown links and citation tags are preserved unmodified. Fallback resolutions (unvalidated) and generic targets are never linked.
If allowed_act_ids is provided (non-None), only references whose resolved act_id is in the set are linked. Mentions of acts not in the RAG source set remain as plain text — this prevents the UI from promising a “click to see the passage” that leads to an empty panel. Pass None (default) to disable the filter.
If external_detector is provided, its detections are merged with the regex+fuzzy ones from this module. External detections take precedence on overlap. Designed to accept a locally-hosted third-party detector (e.g. Ref2Link) without coupling this module to it.
- Parameters:
text (str)
linker (LegalEntityLinker)
min_score (float)
resolve_articles (bool)
allowed_act_ids (Set[int] | None)
external_detector (Callable[[str], Sequence[ExternalDetection]] | None)
- Return type:
str
- class lalandre_rag.prose_linker.ExternalDetection(start, end, act_id, subdivision_id=None, eli=None)[source]¶
Bases:
object
A legal-reference span detected by an external detector.
Used to plug a third-party detector (e.g. a locally-hosted Ref2Link service) next to our native regex+fuzzy engine. The external source must return spans resolved to an internal act_id — translation from their identifier space (ELI URI, CELEX, …) to our DB id stays the caller’s responsibility.
- Parameters:
start (int)
end (int)
act_id (int)
subdivision_id (int | None)
eli (str | None)
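As an illustration of the protected-region rule above, here is a minimal sketch of wrapping detected spans while leaving existing markdown links and citation tags untouched. The function name, the regexes, and the URL values are assumptions for illustration, not the module’s real internals:

```python
import re

# Assumed tag/link shapes, for illustration only.
CITATION_TAG = re.compile(r"\[(?:S|G|R|C|CM)\d+\]")
MD_LINK = re.compile(r"\[[^\]]*\]\([^)]*\)")

def wrap_references(text, detections):
    """Wrap (start, end, url) spans as markdown links, skipping any span
    that overlaps an existing markdown link or citation tag."""
    protected = [m.span() for m in MD_LINK.finditer(text)]
    protected += [m.span() for m in CITATION_TAG.finditer(text)]

    def overlaps(start, end):
        return any(start < p_end and end > p_start for p_start, p_end in protected)

    out, cursor = [], 0
    for start, end, url in sorted(detections):
        if overlaps(start, end):
            continue  # never rewrite inside protected regions
        out.append(text[cursor:start])
        out.append(f"[{text[start:end]}]({url})")
        cursor = end
    out.append(text[cursor:])
    return "".join(out)
```

The real link_prose additionally resolves references through the LegalEntityLinker and applies the min_score and allowed_act_ids filters before any wrapping happens.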
lalandre_rag.prose_rewriter¶
Source: packages/lalandre_rag/lalandre_rag/prose_rewriter.py
Post-process a chatbot answer that contains bullets into flowing prose.
Safety rails (every failure falls back to the original answer):
Skip rewriting when the answer is already mostly prose.
Reject the rewrite if any native citation tag such as
[S1], [G1], [R1], [C1], or [CM1], with an optional L1/L2/L3 suffix, is altered or lost.
Reject the rewrite if its length drifts too far from the original answer.
Catch any LLM exception silently.
This is best-effort: the return value is always a valid answer, and citations are preserved with the same multiplicity as the input.
- lalandre_rag.prose_rewriter.rewrite_to_prose(answer, llm, *, max_bullet_ratio=0.10)[source]¶
Rewrite a bullet-heavy answer into flowing prose.
Returns answer unchanged if:
llm is None or answer is empty/whitespace.
The bullet ratio is below max_bullet_ratio (nothing to rewrite).
The system prompt is missing.
The LLM call fails or returns an unusable payload.
The rewritten output is out of bounds in length.
The rewritten output does not preserve the native citation tags with identical multiplicity.
- Parameters:
answer (str)
llm (Any)
max_bullet_ratio (float)
- Return type:
str
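The citation-preservation rail described above can be sketched as a multiset comparison. The exact tag regex, including how the L1/L2/L3 suffix is written, is an assumption here:

```python
import re
from collections import Counter

# Assumed tag syntax: [S1], [G1], [R1], [C1], [CM1], optional /L1../L3 suffix.
TAG = re.compile(r"\[(?:S|G|R|C|CM)\d+(?:/L[1-3])?\]")

def citations_preserved(original, rewritten):
    """True iff every native citation tag appears in the rewrite with
    identical multiplicity (order may change)."""
    return Counter(TAG.findall(original)) == Counter(TAG.findall(rewritten))
```

When this check fails, rewrite_to_prose keeps the original answer, which is what makes the rewrite safe to run opportunistically.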
lalandre_rag.response¶
Source: packages/lalandre_rag/lalandre_rag/response/__init__.py
Response building — builder, factories, fallbacks.
lalandre_rag.response.builder¶
Source: packages/lalandre_rag/lalandre_rag/response/builder.py
Response Builder Centralized builder for unified response format across all RAG modes
- class lalandre_rag.response.builder.SourcesBlock(*, total=0, documents=<factory>, acts=<factory>)[source]¶
Bases:
BaseModel
Validated sources block in a RAG response.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
total (int)
documents (List[Dict[str, Any]])
acts (Dict[str, Any])
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class lalandre_rag.response.builder.RAGResponse(*, mode, query, answer=None, sources=<factory>, metadata=<factory>)[source]¶
Bases:
BaseModel
Validated unified response format for all RAG modes.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
mode (str)
query (str)
answer (str | None)
sources (SourcesBlock)
metadata (Dict[str, Any])
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- lalandre_rag.response.builder.format_doc_location(chunk_id, chunk_index, subdivision_type, subdivision_id)[source]¶
Format the location of a document slice (chunk or subdivision).
E.g. “chunk 42:3” or “article 12”. Used to build source headers in all modes.
- Parameters:
chunk_id (int | None)
chunk_index (int | None)
subdivision_type (str)
subdivision_id (int)
- Return type:
str
- lalandre_rag.response.builder.format_source_header(source_id, celex, location, title, regulatory_level=None)[source]¶
Format the standard header inserted into the LLM context.
E.g. “[S1 | CELEX: 32016R0679 | L1 | article 5] Règlement général…”
- Parameters:
source_id (str)
celex (str)
location (str)
title (str)
regulatory_level (str | None)
- Return type:
str
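A minimal sketch of these two helpers, shaped to reproduce the documented example outputs. The exact precedence between chunk and subdivision fields, and the omission rule for regulatory_level, are assumptions:

```python
def format_doc_location(chunk_id, chunk_index, subdivision_type, subdivision_id):
    # Chunk slices render as "chunk <id>:<index>", subdivisions as
    # "<type> <id>" (precedence rule assumed, not taken from the source).
    if chunk_id is not None:
        return f"chunk {chunk_id}:{chunk_index}"
    return f"{subdivision_type} {subdivision_id}"

def format_source_header(source_id, celex, location, title, regulatory_level=None):
    # Field order follows the documented example:
    # "[S1 | CELEX: 32016R0679 | L1 | article 5] <title>"
    parts = [source_id, f"CELEX: {celex}"]
    if regulatory_level:
        parts.append(regulatory_level)
    parts.append(location)
    return f"[{' | '.join(parts)}] {title}"
```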
- class lalandre_rag.response.builder.ResponseBuilder(mode, query, _answer=None, _sources=<factory>, _metadata=<factory>, _acts=<factory>)[source]¶
Bases:
object
Builder for the unified response format
- Usage:
builder = ResponseBuilder(mode="search", query="test")
builder.set_answer(None)
builder.set_sources([{"celex": "123", "title": "Test"}])
builder.add_metadata("warning", "Test warning")
response = builder.build()
- Parameters:
mode (str)
query (str)
_answer (str | None)
_sources (List[Dict[str, Any]])
_metadata (Dict[str, Any])
_acts (Dict[str, Any])
- set_sources(documents)[source]¶
Replace all source documents.
- Parameters:
documents (List[Dict[str, Any]])
- Return type:
- add_metadata(key, value)[source]¶
Add a metadata entry.
- Parameters:
key (str)
value (Any)
- Return type:
- lalandre_rag.response.builder.build_source_trace(metadata)[source]¶
Extract a compact, traceable subset of metadata for sources. Avoids duplicating large payload fields while preserving retrieval provenance.
- Parameters:
metadata (Dict[str, Any] | None)
- Return type:
Dict[str, Any]
- lalandre_rag.response.builder.build_source_document(doc, *, include_relations=False, include_subjects=False, include_full_content=True, include_content_preview=False, content_preview_length=None, include_snippet=False, snippet_length=None, content_used=None, content_truncated=None, source_id=None)[source]¶
Build a standardized source document payload from a context slice.
This keeps response formatting consistent across RAG modes while reusing upstream metadata where possible.
- Parameters:
doc (Any)
include_relations (bool)
include_subjects (bool)
include_full_content (bool)
include_content_preview (bool)
content_preview_length (int | None)
include_snippet (bool)
snippet_length (int | None)
content_used (str | None)
content_truncated (bool | None)
source_id (str | None)
- Return type:
Dict[str, Any]
- lalandre_rag.response.builder.build_act_context(act)[source]¶
Build a normalized act context payload.
- Parameters:
act (Any)
- Return type:
Dict[str, Any]
lalandre_rag.response.factories¶
Source: packages/lalandre_rag/lalandre_rag/response/factories.py
Response factory functions for each RAG mode.
- lalandre_rag.response.factories.create_llm_only_response(query, answer, include_warning=True)[source]¶
Factory for llm_only-mode response.
- Parameters:
query (str)
answer (str)
include_warning (bool)
- Return type:
Dict[str, Any]
- lalandre_rag.response.factories.create_rag_response(query, answer, documents, context_summary=None, acts=None)[source]¶
Factory for rag-mode (hybrid RAG) response.
- Parameters:
query (str)
answer (str)
documents (List[Dict[str, Any]])
context_summary (Dict[str, Any] | None)
acts (Dict[str, Dict[str, Any]] | None)
- Return type:
Dict[str, Any]
- lalandre_rag.response.factories.create_summarize_response(query, answer, documents, acts=None)[source]¶
Factory for summarize-mode response.
- Parameters:
query (str)
answer (str)
documents (List[Dict[str, Any]])
acts (Dict[str, Dict[str, Any]] | None)
- Return type:
Dict[str, Any]
- lalandre_rag.response.factories.create_compare_response(query, answer, documents, documents_compared, acts=None)[source]¶
Factory for compare-mode response.
- Parameters:
query (str)
answer (str)
documents (List[Dict[str, Any]])
documents_compared (List[str])
acts (Dict[str, Dict[str, Any]] | None)
- Return type:
Dict[str, Any]
- lalandre_rag.response.factories.create_empty_response(mode, query, empty_message=None)[source]¶
Factory for an empty response (no results).
If empty_message is None, a default message adapted to the mode is used. Pass an empty string to display no message.
- Parameters:
mode (str)
query (str)
empty_message (str | None)
- Return type:
Dict[str, Any]
- lalandre_rag.response.factories.validate_response_format(response)[source]¶
Validate that a response respects the unified format.
With Pydantic models built via ResponseBuilder.build(), validation is already guaranteed at construction time. This function remains for external callers that pass raw dicts.
- Parameters:
response (Dict[str, Any] | RAGResponse)
- Return type:
bool
lalandre_rag.response.fallbacks¶
Source: packages/lalandre_rag/lalandre_rag/response/fallbacks.py
Fallback answer builders for degraded-mode responses.
- lalandre_rag.response.fallbacks.flatten_source_items(sources)[source]¶
Flatten every evidence list carried in a sources payload.
- Parameters:
sources (Dict[str, Any] | None)
- Return type:
List[Dict[str, Any]]
- lalandre_rag.response.fallbacks.build_retrieval_fallback_answer(*, mode, question, documents, reason)[source]¶
Build a user-facing fallback answer when LLM generation fails.
Lists up to 3 retrieved documents so the user still gets value.
- Parameters:
mode (str)
question (str)
documents (List[Dict[str, Any]])
reason (str)
- Return type:
str
- lalandre_rag.response.fallbacks.build_no_source_blocked_answer(mode)[source]¶
Return the deterministic fail-closed answer for sourced modes.
- Parameters:
mode (str)
- Return type:
str
- lalandre_rag.response.fallbacks.build_invalid_citation_blocked_answer(mode)[source]¶
Return the fail-closed answer when sources exist but citations are invalid.
- Parameters:
mode (str)
- Return type:
str
- lalandre_rag.response.fallbacks.describe_citation_validation_failure(validation)[source]¶
Return a user-facing explanation for the current citation-validation failure.
- Parameters:
validation (Dict[str, Any] | None)
- Return type:
str
- lalandre_rag.response.fallbacks.create_blocked_sourced_response(*, mode, query, reason, answer=None, metadata=None, sources=None)[source]¶
Return a fail-closed sourced-mode response, preserving sources when available.
- Parameters:
mode (str)
query (str)
reason (str)
answer (str | None)
metadata (Dict[str, Any] | None)
sources (Dict[str, Any] | None)
- Return type:
Dict[str, Any]
- lalandre_rag.response.fallbacks.normalize_sources_payload(sources)[source]¶
Normalize empty source payloads to None and keep non-empty blocks coherent.
- Parameters:
sources (Dict[str, Any] | None)
- Return type:
Dict[str, Any] | None
- lalandre_rag.response.fallbacks.merge_sources_payload(base_sources, extra_sources)[source]¶
Merge two source payloads while preserving all evidence families.
- Parameters:
base_sources (Dict[str, Any] | None)
extra_sources (Dict[str, Any] | None)
- Return type:
Dict[str, Any] | None
- lalandre_rag.response.fallbacks.extract_source_ids(sources)[source]¶
Collect available source IDs from a source-doc list.
- Parameters:
sources (List[Dict[str, Any]])
- Return type:
List[str]
- lalandre_rag.response.fallbacks.repair_citations_once(*, mode, question, draft_answer, sources, llm)[source]¶
Try a single citation-repair pass. Returns None on failure.
- Parameters:
mode (str)
question (str)
draft_answer (str)
sources (List[Dict[str, Any]])
llm (Any)
- Return type:
str | None
- lalandre_rag.response.fallbacks.enforce_cited_answer(*, mode, question, draft_answer, sources, llm)[source]¶
Validate citations without rewriting the draft.
Preserves the streamed answer verbatim so the UI never sees its text flash and get replaced. Validation results are still returned so callers can surface citation quality in metadata, but no LLM repair pass runs and the answer is never blanked out.
- Parameters:
mode (str)
question (str)
draft_answer (str)
sources (List[Dict[str, Any]])
llm (Any)
- Return type:
Dict[str, Any]
lalandre_rag.response.policy¶
Source: packages/lalandre_rag/lalandre_rag/response/policy.py
Adaptive response policy for RAG outputs.
- class lalandre_rag.response.policy.ResponsePolicyDecision(state, reason, label, intent_class, evidence_grade, citation_status, can_use_sources, should_run_cypher, clarification_question=None)[source]¶
Bases:
object
Final policy decision for a RAG response.
- Parameters:
state (Literal['llm_only', 'grounded', 'weakly_grounded', 'clarify', 'hard_block'])
reason (str)
label (str)
intent_class (Literal['conversational', 'documentary'])
evidence_grade (Literal['none', 'weak', 'sufficient'])
citation_status (Literal['not_applicable', 'valid', 'repaired', 'invalid'])
can_use_sources (bool)
should_run_cypher (bool)
clarification_question (str | None)
- lalandre_rag.response.policy.is_anchored_legal_question(*, question, retrieval_profile=None)[source]¶
Return whether the user question is anchored enough for strict fail-closed behavior.
- Parameters:
question (str)
retrieval_profile (str | None)
- Return type:
bool
- lalandre_rag.response.policy.infer_intent_class(*, intent_class, skip_retrieval=False)[source]¶
Infer the high-level intent class used by the response policy.
- Parameters:
intent_class (str | None)
skip_retrieval (bool)
- Return type:
Literal[‘conversational’, ‘documentary’]
- lalandre_rag.response.policy.infer_evidence_grade(*, has_sources, crag_meta=None)[source]¶
Infer evidence strength from retrieval availability and CRAG metadata.
- Parameters:
has_sources (bool)
crag_meta (Dict[str, Any] | None)
- Return type:
Literal[‘none’, ‘weak’, ‘sufficient’]
- lalandre_rag.response.policy.infer_citation_status(*, validation, repaired=False)[source]¶
Infer citation validity from the validation payload.
- Parameters:
validation (Dict[str, Any] | None)
repaired (bool)
- Return type:
Literal[‘not_applicable’, ‘valid’, ‘repaired’, ‘invalid’]
- lalandre_rag.response.policy.decide_pre_generation(*, intent_class, evidence_grade, question, retrieval_profile=None, clarification_question=None, strict_grounding_requested=False)[source]¶
Choose the policy branch before answer generation starts.
- Parameters:
intent_class (Literal['conversational', 'documentary'])
evidence_grade (Literal['none', 'weak', 'sufficient'])
question (str)
retrieval_profile (str | None)
clarification_question (str | None)
strict_grounding_requested (bool)
- Return type:
- lalandre_rag.response.policy.decide_post_generation(*, intent_class, evidence_grade, citation_status, question, has_sources, retrieval_profile=None, clarification_question=None, strict_grounding_requested=False)[source]¶
Choose the policy branch after generation and citation validation.
- Parameters:
intent_class (Literal['conversational', 'documentary'])
evidence_grade (Literal['none', 'weak', 'sufficient'])
citation_status (Literal['not_applicable', 'valid', 'repaired', 'invalid'])
question (str)
has_sources (bool)
retrieval_profile (str | None)
clarification_question (str | None)
strict_grounding_requested (bool)
- Return type:
- lalandre_rag.response.policy.flatten_policy_sources(sources)[source]¶
Flatten every supported source list into one homogeneous sequence.
- Parameters:
sources (Dict[str, Any] | None)
- Return type:
List[Dict[str, Any]]
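The flattening step can be sketched as collecting every list-valued evidence family from the payload. Treating any list of dicts as an evidence family is an assumption; the real helper knows the supported family names:

```python
def flatten_sources(sources):
    """Collect every list-valued evidence family in a sources payload
    into one flat, homogeneous list of dicts."""
    if not sources:
        return []
    flat = []
    for value in sources.values():
        if isinstance(value, list):
            # Keep only dict-shaped evidence items; skip scalars like "total".
            flat.extend(item for item in value if isinstance(item, dict))
    return flat
```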
lalandre_rag.response.source_builder¶
Source: packages/lalandre_rag/lalandre_rag/response/source_builder.py
Build final source document lists from enriched context references.
- lalandre_rag.response.source_builder.build_sources(*, refs, include_relations, include_subjects, include_full_content)[source]¶
Build final source documents from context refs.
- Parameters:
refs (List[Dict[str, Any]])
include_relations (bool)
include_subjects (bool)
include_full_content (bool)
- Return type:
List[Dict[str, Any]]
lalandre_rag.retrieval¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/__init__.py
Retrieval Service Module Combines semantic search (Qdrant) and lexical search (PostgreSQL BM25) Implements Reciprocal Rank Fusion and weighted score combination
lalandre_rag.retrieval.bm25_search¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/bm25_search.py
BM25 Lexical Search Service PostgreSQL full-text search with BM25-like ranking (ts_rank_cd)
- class lalandre_rag.retrieval.bm25_search.BM25SearchService(pg_repo, language='french', payload_builder=None)[source]¶
Bases:
object
BM25-based lexical search using PostgreSQL full-text search
Uses PostgreSQL’s ts_rank_cd (Cover Density Ranking) which provides BM25-like scoring that considers:
- Term frequency (TF)
- Document length normalization
- Cover density (proximity of terms)
Responsibilities:
- Execute BM25 search via PostgreSQL
- Convert PostgreSQL results to RetrievalResult format
- Apply filters and language configuration
- Manage full-text search indexes
Does NOT:
- Fuse with semantic results (handled by RetrievalService)
- Generate embeddings
- Access Qdrant
Initialize BM25 search service
- Parameters:
pg_repo (PostgresRepository) – PostgreSQL repository for text search
language (str) – PostgreSQL text search language configuration
payload_builder (PayloadBuilder | None)
- search(query, top_k=None, filters=None, language=None, target='subdivisions')[source]¶
Execute BM25 lexical search
- Parameters:
query (str) – Search query text
top_k (int | None) – Number of results to return (default: config.search.default_limit)
filters (Dict[str, Any] | None) – Optional metadata filters (e.g., {“act_id”: 123, “celex”: “32016R0679”})
language (str | None) – Override default language (default: “french”)
target (str) – “subdivisions” or “chunks”
- Returns:
List of RetrievalResult objects sorted by BM25 score
- Return type:
list[RetrievalResult]
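The underlying PostgreSQL query has roughly the following shape. The table and column names (subdivisions, search_vector) are assumptions, and the real service builds its statement through the repository with filters applied:

```python
# Illustrative only: identifiers are assumed, not taken from the actual schema.
BM25_QUERY = """
SELECT subdivision_id,
       ts_rank_cd(search_vector, plainto_tsquery(%(language)s, %(query)s)) AS score
FROM subdivisions
WHERE search_vector @@ plainto_tsquery(%(language)s, %(query)s)
ORDER BY score DESC
LIMIT %(top_k)s
"""
```

ts_rank_cd and plainto_tsquery are standard PostgreSQL full-text primitives; the language parameter selects the text search configuration (here, typically french).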
lalandre_rag.retrieval.context¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/context/__init__.py
Context Service Enriches retrieval results with full metadata and relationships
lalandre_rag.retrieval.context.community_reports¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/context/community_reports.py
Deterministic community report builder for graph-aware global RAG mode.
- class lalandre_rag.retrieval.context.community_reports.ActMeta[source]¶
Bases:
TypedDict
Minimal act metadata used while assembling community reports.
- class lalandre_rag.retrieval.context.community_reports.RelationRow[source]¶
Bases:
TypedDict
Normalized relation row used by the report builder.
- class lalandre_rag.retrieval.context.community_reports.RelationTypeCount[source]¶
Bases:
TypedDict
Relation-type histogram entry for one community.
- class lalandre_rag.retrieval.context.community_reports.CentralAct[source]¶
Bases:
TypedDict
Central act description used in community summaries.
- class lalandre_rag.retrieval.context.community_reports.CommunityReport(community_id, act_ids, celexes, relation_count, top_relation_types, central_acts, evidences, summary)[source]¶
Bases:
object
Compact summary of one connected component in the relation graph.
- Parameters:
community_id (str)
act_ids (List[int])
celexes (List[str])
relation_count (int)
top_relation_types (List[RelationTypeCount])
central_acts (List[CentralAct])
evidences (List[str])
summary (str)
- class lalandre_rag.retrieval.context.community_reports.CommunityReportBuilder(*, max_reports=6, min_cluster_size=2, max_evidence_per_report=3, top_relation_types_limit=5, central_acts_limit=3)[source]¶
Bases:
object
Build deterministic community reports from context slices and act relations.
The algorithm is intentionally lightweight:
- keep only relations between acts present in the retrieved context,
- build connected components,
- summarize each component with relation distribution and pivot acts.
- Parameters:
max_reports (int)
min_cluster_size (int)
max_evidence_per_report (int)
top_relation_types_limit (int)
central_acts_limit (int)
- build_reports(slices)[source]¶
Build deterministic community reports from enriched context slices.
- Parameters:
slices (List[ContextSlice])
- Return type:
List[CommunityReport]
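The connected-components step of the documented algorithm can be sketched with a small union-find. This is a generic illustration, not the class’s actual code:

```python
def connected_components(act_ids, relations):
    """Group act ids into connected components given (source, target)
    relation pairs; pairs touching acts outside the context are dropped."""
    parent = {a: a for a in act_ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for src, dst in relations:
        if src in parent and dst in parent:  # keep only in-context relations
            parent[find(src)] = find(dst)

    groups = {}
    for a in act_ids:
        groups.setdefault(find(a), []).append(a)
    return sorted(sorted(g) for g in groups.values())
```

Each resulting component would then be summarized with its relation-type histogram and central acts, as described above.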
lalandre_rag.retrieval.context.compressor¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/context/compressor.py
Context Compressor — reduces context size via per-act LLM summarization.
Groups context slices by act, and for acts exceeding a character budget, uses an LLM call to compress the fragments into a dense summary. Preserves the ContextSlice structure so downstream code is unchanged.
- lalandre_rag.retrieval.context.compressor.compress_context(slices, llm, *, budget_chars, max_slices=20)[source]¶
Compress context slices to fit within a character budget.
Strategy:
1. Group slices by act_id
2. For acts with multiple large slices, compress into a single dense slice
3. Keep single/small slices as-is
4. Return compressed slices sorted by original score (best first)
- Parameters:
slices (List[ContextSlice]) – Input context slices (already scored and sorted)
llm (Any) – LangChain-compatible LLM for compression calls
budget_chars (int) – Target total character budget
max_slices (int) – Max slices to keep after compression
- Returns:
Compressed list of ContextSlice objects
- Return type:
List[ContextSlice]
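Steps 1-3 of the strategy (the deterministic part, before any LLM call) can be sketched as follows. Dict-shaped slices and a per-act budget stand in for the real ContextSlice objects and the budget_chars accounting, both of which are assumptions here:

```python
from collections import defaultdict

def group_slices_by_act(slices, per_act_budget):
    """Split slices into those kept verbatim and those queued for LLM
    compression, based on a per-act character budget (simplified: the
    real compressor also re-sorts by score and caps the slice count)."""
    by_act = defaultdict(list)
    for s in slices:
        by_act[s["act_id"]].append(s)

    keep, to_compress = [], {}
    for act_id, group in by_act.items():
        total = sum(len(s["content"]) for s in group)
        if len(group) > 1 and total > per_act_budget:
            to_compress[act_id] = group  # one dense summary per act
        else:
            keep.append(group[0]) if len(group) == 1 else keep.extend(group)
    return keep, to_compress
```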
lalandre_rag.retrieval.context.models¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/context/models.py
Context Models Clean separation between act-level metadata, document metadata, and content slices.
- class lalandre_rag.retrieval.context.models.DocumentMeta(source_kind, subdivision_id, subdivision_type, sequence_order, chunk_id=None, chunk_index=None, char_start=None, char_end=None, payload=None)[source]¶
Bases:
object
Document-level metadata for a retrieved slice.
- Parameters:
source_kind (str)
subdivision_id (int)
subdivision_type (str)
sequence_order (int)
chunk_id (int | None)
chunk_index (int | None)
char_start (int | None)
char_end (int | None)
payload (Dict[str, Any] | None)
- class lalandre_rag.retrieval.context.models.ActContext(act_id, celex, title, act_type, regulatory_level=None, url_eurlex=None, relations=None, subjects=None, adoption_date=None, force_date=None)[source]¶
Bases:
object
Act-level metadata, shared across multiple slices.
- Parameters:
act_id (int)
celex (str)
title (str)
act_type (str)
regulatory_level (str | None)
url_eurlex (str | None)
relations (List[Dict[str, Any]] | None)
subjects (List[str] | None)
adoption_date (str | None)
force_date (str | None)
- class lalandre_rag.retrieval.context.models.ContextSlice(content, score, act, doc, trace=None)[source]¶
Bases:
object
A context slice used by the LLM, with explicit act + doc metadata separation.
- Parameters:
content (str)
score (float)
act (ActContext)
doc (DocumentMeta)
trace (Dict[str, Any] | None)
lalandre_rag.retrieval.context.service¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/context/service.py
Context Service Enriches retrieval results with metadata, relationships, and formatting
- class lalandre_rag.retrieval.context.service.RelationPayload[source]¶
Bases:
TypedDict
Normalized relation payload attached to enriched act contexts.
- class lalandre_rag.retrieval.context.service.ContextService(pg_repo)[source]¶
Bases:
object
Enriches retrieval results into context slices
Responsibilities:
- Enrich results with act metadata (title, type, CELEX)
- Add relationships between documents
- Format context for LLM consumption
- Generate context summaries
Initialize context service
- Parameters:
pg_repo (PostgresRepository) – PostgreSQL repository for metadata queries
- enrich_results(results, include_relations=False, include_subjects=False, hydrate_content=True)[source]¶
Enrich retrieval results with act metadata, optional relations and subjects. Returns context slices with explicit act/doc separation.
- Parameters:
results (List[RetrievalResult])
include_relations (bool)
include_subjects (bool)
hydrate_content (bool)
- Return type:
List[ContextSlice]
lalandre_rag.retrieval.decomposer¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/decomposer.py
Compatibility facade for the agentic decomposition runtime.
lalandre_rag.retrieval.evaluator¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/evaluator.py
Compatibility facade for the agentic evaluation runtime.
lalandre_rag.retrieval.fusion_service¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/fusion_service.py
Result Fusion Service Algorithms for combining search results from multiple sources
- class lalandre_rag.retrieval.fusion_service.ResultFusionService(fusion_method='rrf', lexical_weight=0.3, semantic_weight=0.7, rrf_k=60)[source]¶
Bases:
object
Fusion service for combining search results
Implements multiple fusion algorithms:
- Reciprocal Rank Fusion (RRF) - rank-based fusion
- Weighted Score Fusion - score-based fusion with weights
- Score normalization utilities
Use cases:
- Combine BM25 (lexical) + semantic search
- Combine multiple semantic searches
- Combine Graph RAG + standard RAG
- Multi-stage retrieval pipelines
Responsibilities:
- Implement fusion algorithms (RRF, weighted)
- Deduplicate results by subdivision_id
- Normalize scores to [0, 1] range
- Preserve metadata from all sources
Does NOT:
- Execute searches (uses search services)
- Generate embeddings
- Access databases
Initialize fusion service
- Parameters:
fusion_method (str) – “rrf” or “weighted”
lexical_weight (float) – Weight for lexical scores (weighted method)
semantic_weight (float) – Weight for semantic scores (weighted method)
rrf_k (int) – RRF constant (typically 60)
- fuse(lexical_results, semantic_results, override_lexical_weight=None, override_semantic_weight=None)[source]¶
Fuse lexical and semantic search results.
When override weights are provided the method always uses weighted score fusion regardless of the configured fusion_method. This lets callers apply dynamic weights (e.g. boosted lexical weight for queries containing explicit legal references) without changing the service-level default.
- Parameters:
lexical_results (Sequence[RetrievalResult]) – BM25 or other lexical search results
semantic_results (Sequence[RetrievalResult]) – Vector-based semantic search results
override_lexical_weight (float | None) – Forces weighted fusion with this lexical weight
override_semantic_weight (float | None) – Forces weighted fusion with this semantic weight
- Returns:
Fused and sorted results
- Return type:
List[RetrievalResult]
- weighted_score_fusion(lexical_results, semantic_results, lexical_weight=None, semantic_weight=None)[source]¶
Weighted score fusion
Combines scores using weighted average: combined_score = lexical_weight * lex_score + semantic_weight * sem_score
- Parameters:
lexical_results (Sequence[RetrievalResult]) – Lexical search results (with scores)
semantic_results (Sequence[RetrievalResult]) – Semantic search results (with scores)
lexical_weight (float | None) – Weight for lexical scores (default: instance weight)
semantic_weight (float | None) – Weight for semantic scores (default: instance weight)
- Returns:
Fused results sorted by combined score (descending)
- Return type:
List[RetrievalResult]
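The documented formula combined_score = lexical_weight * lex_score + semantic_weight * sem_score can be sketched over plain score dicts. Treating a document missing from one source as contributing 0 for that component is an assumption about the real service’s handling:

```python
def weighted_fusion(lexical, semantic, lexical_weight=0.3, semantic_weight=0.7):
    """Combine per-document scores with a weighted sum; scores are
    assumed pre-normalized to [0, 1]."""
    combined = {}
    for doc_id, score in lexical.items():
        combined[doc_id] = combined.get(doc_id, 0.0) + lexical_weight * score
    for doc_id, score in semantic.items():
        combined[doc_id] = combined.get(doc_id, 0.0) + semantic_weight * score
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)
```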
- reciprocal_rank_fusion(lexical_results, semantic_results, k=None)[source]¶
Reciprocal Rank Fusion (RRF)
RRF formula: RRF(d) = sum(1 / (k + rank(d))) where k is a constant (typically 60) and rank starts at 1
RRF is score-agnostic and only considers ranking position, making it robust to score distribution differences.
- Parameters:
lexical_results (Sequence[RetrievalResult]) – Lexical search results (pre-sorted)
semantic_results (Sequence[RetrievalResult]) – Semantic search results (pre-sorted)
k (int | None) – RRF constant (default: instance rrf_k, typically 60)
- Returns:
Fused results sorted by RRF score (descending)
- Return type:
List[RetrievalResult]
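The RRF formula above can be sketched directly. Note how the fused order depends only on ranks, never on the raw scores:

```python
def rrf_fuse(lexical_ranking, semantic_ranking, k=60):
    """Reciprocal Rank Fusion: RRF(d) = sum over rankings of 1/(k + rank(d)),
    with rank starting at 1; score-agnostic by design."""
    scores = {}
    for ranking in (lexical_ranking, semantic_ranking):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

With k=60, a document ranked 2nd and 1st beats one ranked 1st and 3rd, which illustrates why RRF favors documents that both sources agree on.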
lalandre_rag.retrieval.metrics¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/metrics.py
Metrics hook interfaces for RAG retrieval instrumentation.
This package remains backend-agnostic. Services register a recorder backend (Prometheus, OTEL, etc.) at startup.
- class lalandre_rag.retrieval.metrics.RetrievalMetricsRecorder[source]¶
Bases:
ABC
Backend-agnostic hook interface for retrieval metrics.
- lalandre_rag.retrieval.metrics.set_retrieval_metrics_recorder(recorder)[source]¶
Register the active retrieval metrics recorder backend.
- Parameters:
recorder (RetrievalMetricsRecorder)
- Return type:
None
lalandre_rag.retrieval.overview¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/overview.py
User-facing retrieval overview helpers.
- lalandre_rag.retrieval.overview.build_retrieval_overview(items, *, effective_granularity, candidate_counts=None, top_acts_limit=3)[source]¶
Aggregate textual evidence into a product-facing hierarchy overview.
- Parameters:
items (Iterable[Any])
effective_granularity (str | None)
candidate_counts (Dict[str, int] | None)
top_acts_limit (int)
- Return type:
Dict[str, Any]
lalandre_rag.retrieval.planner¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/planner.py
Compatibility facade for the agentic planner runtime.
lalandre_rag.retrieval.query_expansion¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/query_expansion.py
Legal query expansion utilities.
Provides deterministic multi-query expansion for EU/FR legal retrieval.
- class lalandre_rag.retrieval.query_expansion.ExpandedQuery(text, weight, strategy)[source]¶
Bases:
object
One expanded query candidate with a weighting hint.
- Parameters:
text (str)
weight (float)
strategy (str)
- class lalandre_rag.retrieval.query_expansion.LegalQueryExpansionService(*, min_query_chars=24)[source]¶
Bases:
object
Deterministic query expansion focused on legal references (EU/France).
The objective is recall improvement while keeping runtime bounded.
- Parameters:
min_query_chars (int)
- expand(query, *, max_variants=3)[source]¶
Expand a query into deterministic variants.
Always returns at least one query (the normalized original).
- Parameters:
query (str)
max_variants (int)
- Return type:
List[ExpandedQuery]
lalandre_rag.retrieval.query_parser¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/query_parser.py
LLM-assisted query parsing for legal retrieval routing.
Architecture note¶
Here, the intent parser uses the main generation LLM (config.generation.*): same provider, same model, key fetched from Vault.
- class lalandre_rag.retrieval.query_parser.ParsedQueryIntent(*, profile, granularity=None, top_k=10, include_relations_hint=False, execution_mode='hybrid', rationale='LLM parser selected retrieval profile.', use_graph=False, normalized_query=None, intent_label=None, confidence=None, output_validation_retries=0)[source]¶
Bases:
BaseModel
Normalized interpretation returned by the intent parser.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
profile (str)
granularity (str | None)
top_k (int)
include_relations_hint (bool)
execution_mode (str)
rationale (str)
use_graph (bool)
normalized_query (str | None)
intent_label (str | None)
confidence (float | None)
output_validation_retries (int)
- model_config: ClassVar[ConfigDict] = {'extra': 'ignore', 'frozen': True}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- classmethod normalize_profile(v)[source]¶
Resolve profile aliases and reject unsupported routing profiles.
- Parameters:
v (Any)
- Return type:
str
- classmethod normalize_granularity(v)[source]¶
Normalize requested granularity and collapse auto to None.
- Parameters:
v (Any)
- Return type:
str | None
- classmethod coerce_top_k(v)[source]¶
Clamp top_k to the configured parser-safe range.
- Parameters:
v (Any)
- Return type:
int
- classmethod coerce_bool(v)[source]¶
Coerce common truthy string values to booleans.
- Parameters:
v (Any)
- Return type:
bool
- classmethod coerce_confidence(v)[source]¶
Normalize optional confidence scores to the [0, 1] range.
- Parameters:
v (Any)
- Return type:
float | None
- classmethod normalize_execution_mode(v)[source]¶
Normalize the execution mode and default to hybrid.
- Parameters:
v (Any)
- Return type:
str
- classmethod clean_normalized_query(v)[source]¶
Trim the optional normalized query field.
- Parameters:
v (Any)
- Return type:
str | None
- classmethod clean_rationale(v)[source]¶
Normalize routing rationales and provide a fallback sentence.
- Parameters:
v (Any)
- Return type:
str
- classmethod clean_intent_label(v)[source]¶
Normalize the optional intent label emitted by the LLM.
- Parameters:
v (Any)
- Return type:
str | None
- apply_cross_field_defaults()[source]¶
Apply derived defaults after model validation succeeds.
- Return type:
- classmethod from_routing_output(output, *, requested_top_k, requested_granularity, output_validation_retries)[source]¶
Convert validated agent output into a normalized immutable intent.
- Parameters:
output (RoutingIntentOutput)
requested_top_k (int)
requested_granularity (str | None)
output_validation_retries (int)
- Return type:
ParsedQueryIntent | None
- class lalandre_rag.retrieval.query_parser.LLMQueryParserClient(*, provider, model, base_url, timeout_seconds, api_key=None, max_output_tokens=180, temperature=0.0, key_pool=None)[source]¶
Bases: object
Intent query parser that uses the main generation LLM.
Degrades gracefully: if parsing fails, the QueryRouter falls back to deterministic heuristics.
Architecture: do NOT configure this on the extraction LLM. Uses config.generation.* (Mistral, OpenAI-compatible).
- Parameters:
provider (str)
model (str)
base_url (str)
timeout_seconds (float)
api_key (str | None)
max_output_tokens (int)
temperature (float)
key_pool (APIKeyPool | None)
- classmethod from_runtime(*, config, settings, key_pool=None)[source]¶
Factory from the runtime config.
Uses config.generation.* as the primary source. search.intent_parser_* can override provider/model/base_url if needed (e.g. to use a smaller model dedicated to routing). No longer falls back to extraction.llm_* (reserved for the extraction LLM).
- Parameters:
config (Any)
settings (Any)
key_pool (APIKeyPool | None)
- Return type:
LLMQueryParserClient | None
- parse(*, question, top_k, requested_granularity)[source]¶
Parse one user question into a normalized routing intent.
- Parameters:
question (str)
top_k (int)
requested_granularity (str | None)
- Return type:
ParsedQueryIntent | None
lalandre_rag.retrieval.query_router¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/query_router.py
Query routing heuristics for hybrid legal retrieval.
- class lalandre_rag.retrieval.query_router.RetrievalPlan(profile, granularity, top_k, include_relations_hint, rationale, use_graph=False, execution_mode='hybrid', routing_source='heuristic', search_query=None, intent_label=None, parser_confidence=None)[source]¶
Bases: object
Resolved retrieval strategy for a user question.
- Parameters:
profile (str)
granularity (str | None)
top_k (int)
include_relations_hint (bool)
rationale (str)
use_graph (bool)
execution_mode (str)
routing_source (str)
search_query (str | None)
intent_label (str | None)
parser_confidence (float | None)
- class lalandre_rag.retrieval.query_router.QueryRouter(*, intent_parser=None)[source]¶
Bases: object
Lightweight router for selecting retrieval settings by query intent.
By default, routing is heuristic-only (fast and deterministic). An optional LLM parser can be injected to classify intent and normalize user queries.
- Parameters:
intent_parser (LLMQueryParserClient | None)
lalandre_rag.retrieval.query_utils¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/query_utils.py
Query pre-processing utilities for retrieval.
Small, pure helpers for preparing user queries before they hit the search backends (BM25, semantic, etc.).
- lalandre_rag.retrieval.query_utils.truncate_lexical_query(query, max_chars=None)[source]¶
Truncate query for BM25-based search modes.
Preserves full words by cutting at the last whitespace boundary. Returns the original query unchanged if it is short enough.
If max_chars is not provided, uses search.max_lexical_query_chars from the central config.
- Parameters:
query (str)
max_chars (int | None)
- Return type:
str
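The word-boundary behavior described above can be sketched as follows. The default of 512 characters is an illustrative assumption; the real helper reads search.max_lexical_query_chars from the central config.

```python
def truncate_lexical_query(query: str, max_chars: int = 512) -> str:
    """Truncate a query for BM25 search without cutting a word in half."""
    if len(query) <= max_chars:
        # Short enough: return unchanged.
        return query
    cut = query[:max_chars]
    # Drop the trailing partial word by cutting at the last whitespace boundary.
    last_space = cut.rfind(" ")
    return cut[:last_space] if last_space > 0 else cut
```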
lalandre_rag.retrieval.rerank_service¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/rerank_service.py
Rerank Service Cross-encoder reranking — local (in-process) or via dedicated HTTP service.
- class lalandre_rag.retrieval.rerank_service.RerankConfig(model_name, device, batch_size, max_candidates, max_chars, enabled=True, cache_dir=None, rerank_service_url=None, service_timeout_seconds=10.0, fallback_to_skip=True, circuit_failure_threshold=2, circuit_cooldown_seconds=30.0)[source]¶
Bases: object
Runtime configuration for the retrieval reranker.
- Parameters:
model_name (str)
device (str)
batch_size (int)
max_candidates (int)
max_chars (int)
enabled (bool)
cache_dir (str | None)
rerank_service_url (str | None)
service_timeout_seconds (float)
fallback_to_skip (bool)
circuit_failure_threshold (int)
circuit_cooldown_seconds (float)
- class lalandre_rag.retrieval.rerank_service.RerankService(config)[source]¶
Bases: object
Cross-encoder reranker.
Two modes:
- HTTP (when rerank_service_url is set): calls the dedicated rerank-service.
- Local (fallback): loads the CrossEncoder in-process via sentence-transformers.
If the HTTP service is unreachable and fallback_to_skip is True, results are returned without reranking.
Includes a circuit breaker: after circuit_failure_threshold consecutive HTTP failures, reranking is skipped for circuit_cooldown_seconds.
- Parameters:
config (RerankConfig)
- rerank(query, results, top_k=None)[source]¶
Rerank retrieval results using a cross-encoder.
- Parameters:
query (str)
results (list[RetrievalResult])
top_k (int | None)
- Return type:
list[RetrievalResult]
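The circuit-breaker behavior described above (open after N consecutive failures, skip calls during a cooldown, then allow a probe) can be sketched in isolation. Class and method names here are illustrative, not the service's internals; the defaults mirror circuit_failure_threshold=2 and circuit_cooldown_seconds=30.0.

```python
import time

class CircuitBreaker:
    """Skip calls after N consecutive failures, for a cooldown period."""

    def __init__(self, failure_threshold: int = 2, cooldown_seconds: float = 30.0):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self._failures = 0
        self._opened_at: float | None = None

    def allow(self) -> bool:
        """Return False while the circuit is open (cooling down)."""
        if self._opened_at is None:
            return True
        if time.monotonic() - self._opened_at >= self.cooldown_seconds:
            # Half-open: cooldown elapsed, allow one probe call.
            self._opened_at = None
            self._failures = 0
            return True
        return False

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = time.monotonic()

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None
```

When allow() returns False, rerank() would return the fused results unreranked, matching the fallback_to_skip contract.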
lalandre_rag.retrieval.result¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/result.py
Core retrieval result dataclass.
Extracted from service.py so that sub-modules (semantic_search, bm25_search, fusion_service, rerank_service) can import it without circular dependencies.
- class lalandre_rag.retrieval.result.RetrievalResult(content, score, subdivision_id, act_id, celex, subdivision_type, sequence_order, metadata)[source]¶
Bases: object
Single retrieval result with content and metadata.
- Parameters:
content (str)
score (float)
subdivision_id (int)
act_id (int)
celex (str | None)
subdivision_type (str)
sequence_order (int)
metadata (Dict[str, Any])
- class lalandre_rag.retrieval.result.RetrievalStats(candidates_after_fusion=0, candidates_after_threshold=0, candidates_after_rerank=0, candidates_after_adaptive_cutoff=0, candidates_returned=0, adaptive_cutoff_applied=False, effective_score_threshold=None, fusion_lexical_weight=None, fusion_semantic_weight=None, query_variants_count=0, cache_hit=False, embedding_ms=0.0, semantic_search_ms=0.0, lexical_search_ms=0.0, parallel_search_ms=0.0, fusion_ms=0.0, rerank_ms=0.0, total_retrieve_ms=0.0)[source]¶
Bases: object
Statistics from the last retrieve() call, for audit/traceability.
- Parameters:
candidates_after_fusion (int)
candidates_after_threshold (int)
candidates_after_rerank (int)
candidates_after_adaptive_cutoff (int)
candidates_returned (int)
adaptive_cutoff_applied (bool)
effective_score_threshold (float | None)
fusion_lexical_weight (float | None)
fusion_semantic_weight (float | None)
query_variants_count (int)
cache_hit (bool)
embedding_ms (float)
semantic_search_ms (float)
lexical_search_ms (float)
parallel_search_ms (float)
fusion_ms (float)
rerank_ms (float)
total_retrieve_ms (float)
lalandre_rag.retrieval.result_cache¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/result_cache.py
Redis-backed result cache for the retrieval service.
- class lalandre_rag.retrieval.result_cache.RetrievalCache(redis_client, ttl)[source]¶
Bases: object
Thin wrapper around Redis for caching retrieval results.
- Parameters:
redis_client (Any)
ttl (int)
- property enabled: bool¶
Return whether Redis caching is currently active.
- static cache_key(query, top_k, score_threshold, filters, granularity, collections, embedding_preset=None)[source]¶
Build a stable cache key for one retrieval request shape.
- Parameters:
query (str)
top_k (int)
score_threshold (float | None)
filters (Dict[str, Any] | None)
granularity (str | None)
collections (List[str] | None)
embedding_preset (str | None)
- Return type:
str
- get(key)[source]¶
Fetch cached retrieval results for key, if present.
- Parameters:
key (str)
- Return type:
List[RetrievalResult] | None
- set(key, results)[source]¶
Store retrieval results for key with the configured TTL.
- Parameters:
key (str)
results (List[RetrievalResult])
- Return type:
None
lalandre_rag.retrieval.search_config¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/search_config.py
Resolved search configuration.
Centralizes the verbose config-resolution logic that was inlined in RetrievalService.__init__.
- class lalandre_rag.retrieval.search_config.ResolvedSearchConfig(search_language, candidate_multiplier, min_candidates, max_candidates, hnsw_ef, exact_search, per_collection_oversampling, query_expansion_enabled, query_expansion_max_variants, query_expansion_min_query_chars, lexical_weight, semantic_weight, fusion_method, dynamic_fusion_enabled, lexical_boost_factor, lexical_boost_max, result_cache_ttl)[source]¶
Bases: object
All search-related parameters resolved from config + caller overrides.
- Parameters:
search_language (str)
candidate_multiplier (float)
min_candidates (int)
max_candidates (int)
hnsw_ef (int | None)
exact_search (bool)
per_collection_oversampling (float)
query_expansion_enabled (bool)
query_expansion_max_variants (int)
query_expansion_min_query_chars (int)
lexical_weight (float)
semantic_weight (float)
fusion_method (str)
dynamic_fusion_enabled (bool)
lexical_boost_factor (float)
lexical_boost_max (float)
result_cache_ttl (int)
- classmethod from_overrides(*, search_language=None, candidate_multiplier=None, min_candidates=None, max_candidates=None, hnsw_ef=None, exact_search=None, semantic_per_collection_oversampling=None, query_expansion_enabled=None, query_expansion_max_variants=None, query_expansion_min_query_chars=None, lexical_weight=None, semantic_weight=None, fusion_method=None, dynamic_fusion_enabled=None, result_cache_ttl=None)[source]¶
Resolve all search parameters from config defaults + explicit overrides.
- Parameters:
search_language (str | None)
candidate_multiplier (float | None)
min_candidates (int | None)
max_candidates (int | None)
hnsw_ef (int | None)
exact_search (bool | None)
semantic_per_collection_oversampling (float | None)
query_expansion_enabled (bool | None)
query_expansion_max_variants (int | None)
query_expansion_min_query_chars (int | None)
lexical_weight (float | None)
semantic_weight (float | None)
fusion_method (str | None)
dynamic_fusion_enabled (bool | None)
result_cache_ttl (int | None)
- Return type:
lalandre_rag.retrieval.semantic_search¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/semantic_search.py
Semantic Search Service Vector-based search using Qdrant embeddings
- class lalandre_rag.retrieval.semantic_search.SemanticSearchService(qdrant_repos, score_threshold=None, hnsw_ef=None, exact_search=None, per_collection_oversampling=None)[source]¶
Bases: object
Semantic search using Qdrant vector database.
Uses embedding-based similarity search to find semantically related documents regardless of exact keyword matches.
Responsibilities:
- Execute vector search via Qdrant
- Support multiple collections (chunks, acts)
- Apply metadata filters
- Convert Qdrant results to RetrievalResult format
Does NOT:
- Generate embeddings (uses EmbeddingService)
- Fuse with lexical results (handled by FusionService)
- Execute lexical search
Initialize semantic search service
- Parameters:
qdrant_repos (Dict[str, QdrantRepository]) – Dictionary of collection_name -> QdrantRepository Expected keys typically include ‘chunks’ and ‘acts’
score_threshold (float | None) – Minimum similarity score (0-1, None = no filtering)
hnsw_ef (int | None)
exact_search (bool | None)
per_collection_oversampling (float | None)
- search(query_vector, top_k=None, filters=None, collections=None, score_threshold=None, hnsw_ef=None, exact_search=None)[source]¶
Execute semantic search in one or multiple collections
- Parameters:
query_vector (List[float]) – Query embedding vector
top_k (int | None) – Number of results (default: config.search.default_limit)
filters (Dict[str, Any] | None) – Metadata filters (e.g., {“celex”: “32016R0679”})
collections (List[str] | None) – Collections to search (default: [“chunks”])
score_threshold (float | None) – Override instance score threshold
hnsw_ef (int | None)
exact_search (bool | None)
- Returns:
List of RetrievalResult objects sorted by similarity score
- Return type:
list[RetrievalResult]
lalandre_rag.retrieval.service¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/service.py
Document Retrieval Service Orchestrates semantic + lexical search, fusion, reranking, and caching. Strategy implementations live in retrieval/strategies/.
- class lalandre_rag.retrieval.service.RetrievalService(qdrant_repos, pg_repo, embedding_service, reranker=None, payload_builder=None, search_language=None, candidate_multiplier=None, min_candidates=None, max_candidates=None, semantic_per_collection_oversampling=None, hnsw_ef=None, exact_search=None, query_expansion_enabled=None, query_expansion_max_variants=None, query_expansion_min_query_chars=None, lexical_weight=None, semantic_weight=None, fusion_method=None, dynamic_fusion_enabled=None, redis_client=None, result_cache_ttl=None, preset_embedding_services=None)[source]¶
Bases: SemanticStrategyMixin, LexicalStrategyMixin, HybridStrategyMixin
Unified retrieval service combining semantic and lexical search.
Responsibilities:
- Execute hybrid search (retrieve) across multiple collections
- Delegate semantic/lexical/hybrid-precomputed searches to strategy mixins
- Fuse results using RRF or weighted scores
- Rerank, deduplicate, and cache results
Does NOT:
- Modify Qdrant collections
- Generate embeddings (uses EmbeddingService)
- Enrich context (uses ContextService)
- Parameters:
qdrant_repos (Dict[str, QdrantRepository])
pg_repo (PostgresRepository)
embedding_service (EmbeddingService)
reranker (RerankService | None)
payload_builder (PayloadBuilder | None)
search_language (str | None)
candidate_multiplier (float | None)
min_candidates (int | None)
max_candidates (int | None)
semantic_per_collection_oversampling (float | None)
hnsw_ef (int | None)
exact_search (bool | None)
query_expansion_enabled (bool | None)
query_expansion_max_variants (int | None)
query_expansion_min_query_chars (int | None)
lexical_weight (float | None)
semantic_weight (float | None)
fusion_method (str | None)
dynamic_fusion_enabled (bool | None)
redis_client (Any | None)
result_cache_ttl (int | None)
preset_embedding_services (Dict[str, EmbeddingService] | None)
- retrieve(query, top_k=10, score_threshold=None, filters=None, collections=None, granularity=None, embedding_preset=None)[source]¶
Hybrid search (semantic + BM25) with fusion, reranking, and caching.
- Parameters:
query (str) – Search query
top_k (int) – Number of final results to return
score_threshold (float | None) – Minimum score threshold (post-fusion)
filters (Dict[str, Any] | None) – Metadata filters (act_id, celex, etc.)
collections (List[str] | None) – Specific collections to search
granularity (str | None) – ‘subdivisions’, ‘chunks’, or ‘all’ (overrides collections)
embedding_preset (str | None) – Route semantic search to this preset’s collections/embedding service
- Return type:
List[RetrievalResult]
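The fusion step mentioned above (RRF or weighted scores) can be illustrated with Reciprocal Rank Fusion, which scores each document by summing 1/(k + rank) over the rankings it appears in. This is a textbook sketch, not the service's fusion code; the conventional constant k=60 is an assumption.

```python
from collections import defaultdict

def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top hit in each list contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.__getitem__, reverse=True)
```

Because RRF only uses ranks, it sidesteps the problem of BM25 and cosine scores living on incompatible scales.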
lalandre_rag.retrieval.strategies¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/strategies/__init__.py
Retrieval strategy mixins (semantic, lexical, hybrid).
lalandre_rag.retrieval.strategies.hybrid¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/strategies/hybrid.py
Hybrid search strategy mixin for RetrievalService (pre-computed embeddings).
- lalandre_rag.retrieval.strategies.hybrid.has_explicit_legal_reference(query)[source]¶
Return True when the query contains a strong legal-reference cue.
- Parameters:
query (str)
- Return type:
bool
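A cue detector like this typically matches citation patterns such as article numbers, directive/regulation identifiers, or CELEX codes. The regex below is an illustrative assumption (the real patterns live in strategies/hybrid.py and are likely broader):

```python
import re

# Hypothetical cue patterns: "Article 17", "Directive 2006/123",
# "Regulation (EU) 2016/679", or a CELEX code like 32016R0679.
_LEGAL_REF = re.compile(
    r"\b(article\s+\d+"
    r"|directive\s+\d{4}/\d+"
    r"|regulation\s+\(?(?:eu|ec)\)?\s*\d{4}/\d+"
    r"|3\d{4}[lr]\d{4})\b",
    re.IGNORECASE,
)

def has_explicit_legal_reference(query: str) -> bool:
    """Return True when the query names a specific article, directive, or regulation."""
    return _LEGAL_REF.search(query) is not None
```

Such a cue usually shifts fusion weight toward the lexical side, since exact identifiers are what BM25 matches best.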
- class lalandre_rag.retrieval.strategies.hybrid.HybridStrategyMixin[source]¶
Bases: object
Provides hybrid_with_embedding(), _fuse_results(), and _resolve_fusion_override().
- hybrid_with_embedding(query, query_vector, top_k=10, score_threshold=None, filters=None, collections=None, granularity=None, embedding_preset=None)[source]¶
Execute hybrid search with a pre-computed query embedding.
Avoids redundant embedding computation when the caller already has the vector.
- Parameters:
query (str)
query_vector (List[float])
top_k (int)
score_threshold (float | None)
filters (Dict[str, Any] | None)
collections (List[str] | None)
granularity (str | None)
embedding_preset (str | None)
- Return type:
List[RetrievalResult]
lalandre_rag.retrieval.strategies.lexical¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/strategies/lexical.py
Lexical (BM25) search strategy mixin for RetrievalService.
- class lalandre_rag.retrieval.strategies.lexical.LexicalStrategyMixin[source]¶
Bases: object
Provides lexical_only() and the underlying BM25 + expansion helpers.
- lexical_only(query, top_k=10, score_threshold=None, filters=None)[source]¶
Execute lexical-only search using BM25 (no semantic component).
BM25 scores are normalized into [0, 1] before thresholding so that score_threshold stays comparable across modes.
- Parameters:
query (str)
top_k (int)
score_threshold (float | None)
filters (Dict[str, Any] | None)
- Return type:
List[RetrievalResult]
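One common way to map raw BM25 scores into [0, 1], as the note above requires, is min-max normalization over the candidate set. This is a sketch of that choice, not necessarily the mixin's exact scheme (the all-tied fallback to 1.0 is an assumption):

```python
def normalize_bm25_scores(scores: list[float]) -> list[float]:
    """Min-max normalize raw BM25 scores into [0, 1]."""
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All candidates tied: treat them all as maximally relevant.
        return [1.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]
```

After normalization, a single score_threshold can filter both lexical and semantic candidates on the same scale.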
lalandre_rag.retrieval.strategies.semantic¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/strategies/semantic.py
Semantic search strategy mixin for RetrievalService.
- class lalandre_rag.retrieval.strategies.semantic.SemanticStrategyMixin[source]¶
Bases: object
Provides semantic_only() and the underlying multi-collection + expansion helpers.
- semantic_only(query=None, query_vector=None, top_k=10, score_threshold=None, filters=None, collections=None, granularity=None, embedding_preset=None)[source]¶
Execute semantic-only search (no lexical component).
Supports both text queries (embedded on the fly) and pre-computed vectors.
- Parameters:
query (str | None)
query_vector (List[float] | None)
top_k (int)
score_threshold (float | None)
filters (Dict[str, Any] | None)
collections (List[str] | None)
granularity (str | None)
embedding_preset (str | None)
- Return type:
List[RetrievalResult]
lalandre_rag.retrieval.trace¶
Source: packages/lalandre_rag/lalandre_rag/retrieval/trace.py
Retrieval trace utilities Centralized definition of trace keys and extraction helper.
lalandre_rag.scoring¶
Source: packages/lalandre_rag/lalandre_rag/scoring/__init__.py
Shared score helpers and source score contract.
- lalandre_rag.scoring.coerce_finite_float(value)[source]¶
Return a finite float when possible, otherwise None.
- Parameters:
value (Any)
- Return type:
float | None
- lalandre_rag.scoring.clamp_unit_interval(value, *, default=0.0)[source]¶
Clamp a value to the normalized 0..1 range.
- Parameters:
value (Any)
default (float)
- Return type:
float
- lalandre_rag.scoring.non_negative(value, *, default=0.0)[source]¶
Clamp a value to the non-negative range.
- Parameters:
value (Any)
default (float)
- Return type:
float
- lalandre_rag.scoring.normalize_by_max(value, max_value, *, default=0.0)[source]¶
Normalize a non-negative value by a positive maximum.
- Parameters:
value (Any)
max_value (Any)
default (float)
- Return type:
float
- lalandre_rag.scoring.round_score(value, digits=4)[source]¶
Round a finite score after clamping it to a numeric value.
- Parameters:
value (Any)
digits (int)
- Return type:
float
lalandre_rag.service¶
Source: packages/lalandre_rag/lalandre_rag/service.py
RAG (Retrieval-Augmented Generation) Service High-level orchestration of retrieval, context enrichment, and LLM generation
- class lalandre_rag.service.RAGService(retrieval_service, context_service, llm_model=None, temperature=None, max_tokens=None, api_key=None, graph_rag_service=None, key_pool=None, act_summary_service=None, entity_linker=None, external_detector=None)[source]¶
Bases: object
Complete RAG pipeline for legal document querying.
Architecture:
- Delegates LLM-only to LLMMode
- Delegates hybrid RAG to HybridMode
- Delegates summarization to SummarizeMode
- Delegates comparison to CompareMode
This is a thin orchestration layer that initializes modes and delegates requests.
Initialize RAG service
- Parameters:
retrieval_service (RetrievalService) – Service for document retrieval
context_service (ContextService) – Service for context enrichment
llm_model (str | None) – LLM model name (default from config)
temperature (float | None) – LLM temperature (default from config)
max_tokens (int | None) – Maximum tokens for generation (default from config)
api_key (str | None) – Optional provider API key override
graph_rag_service (GraphRAGService | None)
key_pool (APIKeyPool | None)
act_summary_service (ActSummaryService | None)
entity_linker (LegalEntityLinker | None)
external_detector (Callable[[str], Any] | None)
- query_llm_only(question, include_warning=True)[source]¶
MODE 2: Pure LLM (100% Generation)
Delegates to LLMMode.
- Parameters:
question (str) – User question
include_warning (bool) – Include warning about no document grounding
- Returns:
Dictionary with LLM answer (no sources)
- Return type:
Dict[str, Any]
- stream_query_llm_only(question)[source]¶
Stream LLM-only answer token by token.
- Parameters:
question (str)
- Return type:
Iterator[str]
- query(question, top_k=10, score_threshold=None, filters=None, include_relations=False, include_subjects=False, return_sources=True, include_full_content=False, collections=None, granularity=None, chat_history=None, graph_depth=None, use_graph=None, embedding_preset=None, cypher_documents=None, cypher_query_meta=None)[source]¶
MODE 3: Hybrid RAG (default)
Delegates to HybridMode.
- Parameters:
question (str) – User question
top_k (int) – Number of documents to retrieve
filters (Dict[str, Any] | None) – Metadata filters
include_relations (bool) – Include act relations in context
include_subjects (bool) – Include subject classifications
return_sources (bool) – Return source documents in response
collections (List[str] | None) – Specific collections to search
granularity (str | None) – Quick selector
chat_history (List[BaseMessage] | None) – Optional conversation history as LangChain messages
score_threshold (float | None)
include_full_content (bool)
graph_depth (int | None)
use_graph (bool | None)
embedding_preset (str | None)
cypher_documents (List[Dict[str, Any]] | None)
cypher_query_meta (Dict[str, Any] | None)
- Returns:
Dictionary with answer, sources, and metadata
- Return type:
Dict[str, Any]
- stream_query(question, top_k=10, score_threshold=None, filters=None, include_relations=False, include_subjects=False, return_sources=True, include_full_content=False, collections=None, granularity=None, chat_history=None, graph_depth=None, use_graph=None, embedding_preset=None, retrieval_depth=None, cypher_documents=None, cypher_query_meta=None)[source]¶
Stream hybrid RAG answer: yields preamble dict then string tokens.
- Parameters:
question (str)
top_k (int)
score_threshold (float | None)
filters (Dict[str, Any] | None)
include_relations (bool)
include_subjects (bool)
return_sources (bool)
include_full_content (bool)
collections (List[str] | None)
granularity (str | None)
chat_history (List[BaseMessage] | None)
graph_depth (int | None)
use_graph (bool | None)
embedding_preset (str | None)
retrieval_depth (str | None)
cypher_documents (List[Dict[str, Any]] | None)
cypher_query_meta (Dict[str, Any] | None)
- Return type:
Iterator[Dict[str, Any] | str]
- summarize(topic, top_k=10, score_threshold=None, filters=None, include_relations=True, include_full_content=False)[source]¶
MODE 4: Summarization
Delegates to SummarizeMode.
- Parameters:
topic (str) – Topic or question to summarize
top_k (int) – Number of documents to retrieve
filters (Dict[str, Any] | None) – Metadata filters
include_relations (bool) – Include relations in context
score_threshold (float | None)
include_full_content (bool)
- Returns:
Dictionary with summary and sources
- Return type:
Dict[str, Any]
- summarize_canonical(*, celex, question)[source]¶
Return a canonical-summary response when one exists for celex.
- Parameters:
celex (str)
question (str)
- Return type:
Dict[str, Any] | None
- compare(comparison_question, celex_list=None, top_k=10, score_threshold=None, include_full_content=False)[source]¶
MODE 5: Comparison
Delegates to CompareMode.
- Parameters:
comparison_question (str) – What to compare
celex_list (List[str] | None) – Optional list of specific CELEX to compare
top_k (int) – Number of documents if CELEX not specified
score_threshold (float | None)
include_full_content (bool)
- Returns:
Dictionary with comparison and sources
- Return type:
Dict[str, Any]
lalandre_rag.summaries¶
Source: packages/lalandre_rag/lalandre_rag/summaries/__init__.py
Canonical act summaries: storage, generation, and prompt augmentation.
lalandre_rag.summaries.agent¶
Source: packages/lalandre_rag/lalandre_rag/summaries/agent.py
PydanticAI agent for structured canonical summary generation.
- class lalandre_rag.summaries.agent.CanonicalSummaryOutput(*, summary, output_validation_retries=0)[source]¶
Bases: BaseModel
Structured summary output validated by PydanticAI.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
summary (str)
output_validation_retries (int)
- model_config: ClassVar[ConfigDict] = {'extra': 'ignore'}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- lalandre_rag.summaries.agent.run_summary_agent(*, prompt, generate_text, model_name)[source]¶
Run the canonical summary agent and return validated output + retry count.
- Parameters:
prompt (str)
generate_text (Callable[[str], str])
model_name (str)
- Return type:
tuple[CanonicalSummaryOutput, int]
lalandre_rag.summaries.generator¶
Source: packages/lalandre_rag/lalandre_rag/summaries/generator.py
Canonical summary generation: LLM-based and deterministic fallback.
- class lalandre_rag.summaries.generator.CanonicalSummaryGenerator(*, llm_client, prompt_version=CANONICAL_SUMMARY_PROMPT_VERSION, model_id=None)[source]¶
Bases: object
Generate stable, reusable act summaries from structured act content.
- Parameters:
llm_client (Optional[JSONHTTPLLMClient])
prompt_version (str)
model_id (Optional[str])
lalandre_rag.summaries.models¶
Source: packages/lalandre_rag/lalandre_rag/summaries/models.py
Data models, constants, and helpers for canonical act summaries.
- class lalandre_rag.summaries.models.CanonicalSummarySnapshot(act_id, celex, language, status, is_stale, summary, generated_at, prompt_version, model_id, source_version_id, error_text, trace)[source]¶
Bases: object
Canonical summary state returned by the summary service.
- Parameters:
act_id (int)
celex (str)
language (str)
status (str)
is_stale (bool)
summary (str | None)
generated_at (datetime | None)
prompt_version (str | None)
model_id (str | None)
source_version_id (int | None)
error_text (str | None)
trace (Dict[str, Any])
- property available: bool¶
Return whether the snapshot contains a ready-to-use summary.
- class lalandre_rag.summaries.models.SummaryTraceRecorder[source]¶
Bases: object
Build structured trace payloads for summary generation and lookup.
- static lookup(*, status, is_stale, reason=None)[source]¶
Build trace metadata for summary lookup operations.
- Parameters:
status (str)
is_stale (bool)
reason (str | None)
- Return type:
Dict[str, Any]
- static generation(*, mode, context_chars, subdivisions_used, model_id, prompt_version, extra=None)[source]¶
Build trace metadata for summary generation operations.
- Parameters:
mode (str)
context_chars (int)
subdivisions_used (int)
model_id (str)
prompt_version (str)
extra (Dict[str, Any] | None)
- Return type:
Dict[str, Any]
lalandre_rag.summaries.service¶
Source: packages/lalandre_rag/lalandre_rag/summaries/service.py
Services for reading, refreshing, and augmenting canonical act summaries.
- class lalandre_rag.summaries.service.ActSummaryService(*, pg_repo, generator=None, prompt_version=CANONICAL_SUMMARY_PROMPT_VERSION, model_id=None)[source]¶
Bases: object
Read and refresh canonical act summaries.
- Parameters:
pg_repo (Any)
generator (Optional[CanonicalSummaryGenerator])
prompt_version (str)
model_id (Optional[str])
- static build_runtime_model_id()[source]¶
Build the model identifier recorded for generated summaries.
- Return type:
str
- get_canonical_summary_by_celex(celex)[source]¶
Fetch the current canonical summary snapshot for one act.
- Parameters:
celex (str)
- Return type:
CanonicalSummarySnapshot | None
- class lalandre_rag.summaries.service.QuestionSummaryService(act_summary_service)[source]¶
Bases: object
Augment personalized summarize/compare prompts with canonical summaries.
- Parameters:
act_summary_service (Optional[ActSummaryService])