Extraction API

Note

This page is generated automatically from the repository’s maintained Python module inventory.

Relationship extraction, entity linking, and extraction-specific metrics.

lalandre_extraction

Source: packages/lalandre_extraction/lalandre_extraction/__init__.py

Extracts legal relationships from regulatory texts (GraphRAG-style).

lalandre_extraction.llm

Source: packages/lalandre_extraction/lalandre_extraction/llm/__init__.py

LLM-based relation extraction: structured agent, models, and HTTP client.

lalandre_extraction.llm.agent

Source: packages/lalandre_extraction/lalandre_extraction/llm/agent.py

PydanticAI agent for structured relation extraction.

lalandre_extraction.llm.agent.run_extraction_agent(*, prompt, generate_text, model_name, min_evidence_chars=8, min_rationale_chars=24)[source]

Run the extraction agent and return validated output + retry count.

Parameters:
  • prompt (str)

  • generate_text (Callable[[str], str])

  • model_name (str)

  • min_evidence_chars (int)

  • min_rationale_chars (int)

Return type:

tuple[ExtractionOutput, int]
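The signature suggests a validate-and-retry loop around a plain `Callable[[str], str]`. The following is a minimal standard-library sketch of that pattern, not the package's PydanticAI implementation. The helper `run_with_retries`, the `max_retries` knob, the `{"relations": [...]}` JSON shape, and the rule that short evidence or rationale fails validation are all assumptions made for illustration.

```python
import json
from typing import Callable


def run_with_retries(
    *,
    prompt: str,
    generate_text: Callable[[str], str],
    min_evidence_chars: int = 8,
    min_rationale_chars: int = 24,
    max_retries: int = 2,  # hypothetical knob; not a documented parameter
) -> tuple[list[dict], int]:
    """Call generate_text until its output validates; return (relations, retries)."""
    retries = 0
    while True:
        raw = generate_text(prompt)
        try:
            relations = json.loads(raw)["relations"]  # assumed JSON shape
            for rel in relations:
                # Assumed rule: fields shorter than the thresholds fail validation.
                if len(rel["text_evidence"].strip()) < min_evidence_chars:
                    raise ValueError("text_evidence too short")
                if len(rel["relation_rationale"].strip()) < min_rationale_chars:
                    raise ValueError("relation_rationale too short")
            return relations, retries
        except (ValueError, KeyError, TypeError, AttributeError):
            if retries >= max_retries:
                raise
            retries += 1


# Stub generator: the first reply is malformed, the second is valid JSON.
replies = iter([
    "not json",
    json.dumps({"relations": [{
        "target_reference": "Directive 2014/65/EU",
        "relation_type": "cites",
        "text_evidence": "as defined in Directive 2014/65/EU",
        "relation_rationale": "The clause points to a definition in the directive.",
    }]}),
])
relations, retries = run_with_retries(
    prompt="...", generate_text=lambda _prompt: next(replies)
)
print(retries, relations[0]["relation_type"])
```

Returning the retry count alongside the validated output lets a caller track how often the model needed a second attempt, which is presumably why the documented return type is a tuple.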

lalandre_extraction.llm.client

Source: packages/lalandre_extraction/lalandre_extraction/llm/client.py

LLM client for GraphRAG-style relation extraction.

Extracts legal relations directly from text chunks using LLM APIs, rotating through an API key pool in round-robin order to spread requests across rate limits.
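As a rough illustration of round-robin key rotation, the sketch below cycles through a fixed list of keys. `RoundRobinKeyPool` and `next_key` are hypothetical names; the real `APIKeyPool` interface is not documented on this page and may differ.

```python
import itertools
import threading


class RoundRobinKeyPool:
    """Hypothetical stand-in for an API key pool; not the package's APIKeyPool."""

    def __init__(self, keys: list[str]) -> None:
        if not keys:
            raise ValueError("at least one API key is required")
        self._cycle = itertools.cycle(keys)
        self._lock = threading.Lock()

    def next_key(self) -> str:
        # The lock keeps the rotation consistent when threads share the pool.
        with self._lock:
            return next(self._cycle)


pool = RoundRobinKeyPool(["key-a", "key-b", "key-c"])
picked = [pool.next_key() for _ in range(5)]
print(picked)
```

Each request takes the next key in turn, so consecutive calls are spread evenly over the keys and no single key absorbs the whole request rate.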

lalandre_extraction.llm.client.load_prompt_template(path=None)[source]

Load a prompt template from path, or fall back to the built-in default.

Parameters:

path (str | None)

Return type:

str

class lalandre_extraction.llm.client.RawExtractedRelation(target_reference, relation_type, text_evidence, relation_rationale='')[source]

Bases: object

Relation as returned by the LLM (before entity resolution).

Parameters:
  • target_reference (str)

  • relation_type (str)

  • text_evidence (str)

  • relation_rationale (str)

class lalandre_extraction.llm.client.ExtractionLLMClient(*, provider, model, base_url, key_pool, timeout_seconds, max_output_tokens, temperature=0.0, min_evidence_chars=8, min_rationale_chars=24, system_prompt='You are an EU/FR legal relation extractor. Return valid JSON only.', min_output_tokens=80)[source]

Bases: object

GraphRAG-style LLM extraction with API key pool round-robin. The model reads raw text chunks and identifies all legal relations directly.

Parameters:
  • provider (str)

  • model (str)

  • base_url (str)

  • key_pool (APIKeyPool)

  • timeout_seconds (float)

  • max_output_tokens (int)

  • temperature (float)

  • min_evidence_chars (int)

  • min_rationale_chars (int)

  • system_prompt (str)

  • min_output_tokens (int)

property provider: str

Return the normalized provider name used by the client.

property model: str

Return the model identifier used by the client.

classmethod from_runtime(*, config, key_pool=None)[source]

Build an extraction client from the application runtime config.

Parameters:
  • config

  • key_pool

Return type:

ExtractionLLMClient | None

extract_relations(chunk, source_celex='')[source]

Extract all legal relations from a text chunk.

Parameters:
  • chunk (str)

  • source_celex (str)

Return type:

List[RawExtractedRelation]

lalandre_extraction.llm.models

Source: packages/lalandre_extraction/lalandre_extraction/llm/models.py

Pydantic models for structured LLM extraction output.

class lalandre_extraction.llm.models.ExtractedRelationItem(*, target_reference, relation_type, text_evidence, relation_rationale='')[source]

Bases: BaseModel

Single relation extracted by the LLM.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • target_reference (str)

  • relation_type (str)

  • text_evidence (str)

  • relation_rationale (str)

model_config: ClassVar[ConfigDict] = {'extra': 'ignore'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

classmethod normalize_relation_type(v)[source]

Normalize the extracted relation type to a lowercase string.

Parameters:

v (Any)

Return type:

str

classmethod clean_text(v)[source]

Strip mandatory text fields emitted by the extraction model.

Parameters:

v (Any)

Return type:

str

classmethod clean_rationale(v)[source]

Normalize optional rationale text and default to an empty string.

Parameters:

v (Any)

Return type:

str
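The three validators and the `extra: 'ignore'` setting can be approximated without Pydantic. This is a standard-library sketch under stated assumptions: `RelationItemSketch` and `from_dict` are hypothetical names, and trimming whitespace is a guess at what "strip" and "normalize" mean here. The real model is a Pydantic `BaseModel` and its exact behaviour may differ.

```python
from dataclasses import dataclass, fields
from typing import Any


def normalize_relation_type(v: Any) -> str:
    # Lowercase string form; trimming whitespace is an assumption.
    return str(v).strip().lower()


def clean_text(v: Any) -> str:
    # Mandatory text fields: coerce to str and trim surrounding whitespace.
    return str(v).strip()


def clean_rationale(v: Any) -> str:
    # Optional field: a missing value becomes an empty string.
    return "" if v is None else str(v).strip()


@dataclass
class RelationItemSketch:
    """Hypothetical stand-in for ExtractedRelationItem."""

    target_reference: str
    relation_type: str
    text_evidence: str
    relation_rationale: str = ""

    def __post_init__(self) -> None:
        self.target_reference = clean_text(self.target_reference)
        self.relation_type = normalize_relation_type(self.relation_type)
        self.text_evidence = clean_text(self.text_evidence)
        self.relation_rationale = clean_rationale(self.relation_rationale)

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "RelationItemSketch":
        # Mimics extra='ignore': unknown keys are dropped, not rejected.
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in data.items() if k in known})


item = RelationItemSketch.from_dict({
    "target_reference": "  Regulation (EU) 2016/679 ",
    "relation_type": "AMENDS",
    "text_evidence": " amends Regulation (EU) 2016/679 ",
    "relation_rationale": None,
    "unexpected_field": 1,
})
print(item)
```

Ignoring extra keys is a reasonable choice for LLM output, where the model may add fields the schema never asked for.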

class lalandre_extraction.llm.models.ExtractionOutput(*, relations=<factory>, output_validation_retries=0)[source]

Bases: BaseModel

Structured output from the extraction agent.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • relations

  • output_validation_retries

model_config: ClassVar[ConfigDict] = {'extra': 'ignore'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

lalandre_extraction.metrics

Source: packages/lalandre_extraction/lalandre_extraction/metrics.py

Metrics hook interfaces for extraction LLM instrumentation.

class lalandre_extraction.metrics.ExtractionMetricsRecorder[source]

Bases: ABC

Backend-agnostic hook interface for extraction LLM metrics.

abstractmethod observe_call(*, provider, model, outcome, duration_seconds)[source]

Record one extraction LLM call outcome and latency.

Parameters:
  • provider (str)

  • model (str)

  • outcome (str)

  • duration_seconds (float)

Return type:

None

abstractmethod observe_error(*, provider, model, error_type)[source]

Record one extraction LLM error bucket.

Parameters:
  • provider (str)

  • model (str)

  • error_type (str)

Return type:

None

abstractmethod observe_json_parse(*, provider, model, parse_mode)[source]

Record how raw model output was parsed into structured relations.

Parameters:
  • provider (str)

  • model (str)

  • parse_mode (str)

Return type:

None

abstractmethod observe_relations(*, provider, model, count)[source]

Record the number of relations produced by a successful call.

Parameters:
  • provider (str)

  • model (str)

  • count (int)

Return type:

None
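A backend only needs to implement the four abstract hooks. The sketch below keeps counters in memory. To stay self-contained it redeclares a local stand-in for the abstract base that mirrors the documented signatures; in real use the base would be imported from `lalandre_extraction.metrics`. `InMemoryRecorder` and the label values (`"success"`, `"strict"`, the provider and model names) are assumptions.

```python
from abc import ABC, abstractmethod
from collections import Counter


class ExtractionMetricsRecorder(ABC):
    """Local stand-in that mirrors the documented abstract interface."""

    @abstractmethod
    def observe_call(self, *, provider: str, model: str, outcome: str,
                     duration_seconds: float) -> None: ...

    @abstractmethod
    def observe_error(self, *, provider: str, model: str, error_type: str) -> None: ...

    @abstractmethod
    def observe_json_parse(self, *, provider: str, model: str, parse_mode: str) -> None: ...

    @abstractmethod
    def observe_relations(self, *, provider: str, model: str, count: int) -> None: ...


class InMemoryRecorder(ExtractionMetricsRecorder):
    """Hypothetical backend that keeps counters in process memory."""

    def __init__(self) -> None:
        self.calls: Counter = Counter()
        self.errors: Counter = Counter()
        self.parse_modes: Counter = Counter()
        self.relation_total = 0
        self.seconds_total = 0.0

    def observe_call(self, *, provider, model, outcome, duration_seconds):
        self.calls[(provider, model, outcome)] += 1
        self.seconds_total += duration_seconds

    def observe_error(self, *, provider, model, error_type):
        self.errors[(provider, model, error_type)] += 1

    def observe_json_parse(self, *, provider, model, parse_mode):
        self.parse_modes[(provider, model, parse_mode)] += 1

    def observe_relations(self, *, provider, model, count):
        self.relation_total += count


labels = {"provider": "example-provider", "model": "example-model"}
recorder = InMemoryRecorder()
recorder.observe_call(**labels, outcome="success", duration_seconds=0.42)
recorder.observe_json_parse(**labels, parse_mode="strict")
recorder.observe_relations(**labels, count=3)
print(recorder.calls, recorder.relation_total)
```

The same shape maps onto a Prometheus or StatsD backend by swapping the counters for that client's counter and histogram types.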

lalandre_extraction.metrics.set_extraction_metrics_recorder(recorder)[source]

Register the active extraction metrics recorder backend.

Parameters:

recorder (ExtractionMetricsRecorder)

Return type:

None

lalandre_extraction.metrics.observe_extraction_llm_call(*, provider, model, outcome, duration_seconds)[source]

Forward one extraction-call latency observation to the active recorder.

Parameters:
  • provider (str)

  • model (str)

  • outcome (str)

  • duration_seconds (float)

Return type:

None

lalandre_extraction.metrics.observe_extraction_llm_error(*, provider, model, error_type)[source]

Forward one extraction error observation to the active recorder.

Parameters:
  • provider (str)

  • model (str)

  • error_type (str)

Return type:

None

lalandre_extraction.metrics.observe_extraction_llm_relations(*, provider, model, count)[source]

Forward one extracted-relations count observation to the active recorder.

Parameters:
  • provider (str)

  • model (str)

  • count (int)

Return type:

None
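The module-level functions above follow a register-then-forward pattern. The sketch below shows one way such a pattern can work, together with a timing wrapper that reports outcome and latency. It is an illustration only: `set_recorder`, `observe_call`, `timed_call`, `ListRecorder`, the `"success"`/`"error"` outcome labels, and the no-op behaviour when nothing is registered are assumptions, not the package's code.

```python
import time
from typing import Callable, Optional, Protocol


class CallRecorder(Protocol):
    def observe_call(self, *, provider: str, model: str, outcome: str,
                     duration_seconds: float) -> None: ...


_recorder: Optional[CallRecorder] = None


def set_recorder(recorder: CallRecorder) -> None:
    """Register the active backend (hypothetical analogue of the setter)."""
    global _recorder
    _recorder = recorder


def observe_call(*, provider: str, model: str, outcome: str,
                 duration_seconds: float) -> None:
    # Assumed behaviour: silently do nothing when no backend is registered.
    if _recorder is not None:
        _recorder.observe_call(provider=provider, model=model, outcome=outcome,
                               duration_seconds=duration_seconds)


def timed_call(fn: Callable[[], str], *, provider: str, model: str) -> str:
    """Run fn, then report its outcome and latency to the active recorder."""
    start = time.perf_counter()
    outcome = "error"
    try:
        result = fn()
        outcome = "success"
        return result
    finally:
        observe_call(provider=provider, model=model, outcome=outcome,
                     duration_seconds=time.perf_counter() - start)


class ListRecorder:
    """Hypothetical backend that appends each observation to a list."""

    def __init__(self) -> None:
        self.events: list[dict] = []

    def observe_call(self, **labels) -> None:
        self.events.append(labels)


backend = ListRecorder()
set_recorder(backend)
reply = timed_call(lambda: "ok", provider="example-provider", model="example-model")
print(reply, backend.events[0]["outcome"])
```

Keeping the forwarders as free functions means call sites never hold a reference to a concrete backend, so the backend can be swapped at startup or in tests.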

lalandre_extraction.prompts

Source: packages/lalandre_extraction/lalandre_extraction/prompts/__init__.py

Prompt assets bundled with the extraction package.

lalandre_extraction.relation_extractor

Source: packages/lalandre_extraction/lalandre_extraction/relation_extractor.py

Regulatory Relation Extractor

GraphRAG-style LLM-first extraction of legal relationships between acts. The LLM reads text chunks directly and identifies all relations. Entity linking resolves references to canonical CELEX identifiers post-extraction.

class lalandre_extraction.relation_extractor.ExtractedRelation(*, source_celex, target_celex, relation_type, confidence, text_evidence, relation_description=None, extraction_method='llm_extraction', effect_date=None, source_subdivision=None, target_subdivision=None, raw_target_reference=None, resolution_method=None, resolution_score=None)[source]

Bases: BaseModel

A legal relationship extracted from text.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • source_celex (str)

  • target_celex (str)

  • relation_type (RelationType)

  • confidence (float)

  • text_evidence (str)

  • relation_description (str | None)

  • extraction_method (str)

  • effect_date (datetime | None)

  • source_subdivision (str | None)

  • target_subdivision (str | None)

  • raw_target_reference (str | None)

  • resolution_method (str | None)

  • resolution_score (float | None)

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class lalandre_extraction.relation_extractor.RegulatoryRelationExtractor(max_chunk_size=None, entity_linker=None, validation_enabled=None, min_evidence_chars=None, llm_client=None)[source]

Bases: object

LLM-first extraction pipeline (GraphRAG-style):
  • text chunking

  • LLM extraction per chunk

  • entity linking (post-LLM resolution)

  • confidence scoring

  • validation & merge
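The pipeline stages named above can be sketched end to end with stand-in callables. Everything here is illustrative: `chunk_text`, `run_pipeline`, the `Relation` dataclass, the paragraph-packing chunker, the toy confidence rule, and the keep-highest-confidence merge are assumptions, not the extractor's actual logic.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Relation:
    source_celex: str
    target_celex: str
    relation_type: str
    confidence: float
    text_evidence: str


def chunk_text(text: str, max_chunk_size: int) -> list[str]:
    """Pack blank-line-separated paragraphs into chunks of bounded size."""
    chunks, current = [], ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chunk_size:
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def run_pipeline(
    text: str,
    source_celex: str,
    extract: Callable[[str], list[tuple[str, str, str]]],  # (reference, type, evidence)
    link: Callable[[str], Optional[str]],                  # reference -> CELEX or None
    max_chunk_size: int = 60,
    min_confidence: float = 0.5,
) -> list[Relation]:
    merged: dict[tuple[str, str], Relation] = {}
    for chunk in chunk_text(text, max_chunk_size):
        for reference, relation_type, evidence in extract(chunk):
            target = link(reference)  # entity linking after extraction
            if target is None:
                continue
            # Toy scoring: evidence quoting the reference verbatim scores higher.
            confidence = 0.9 if reference in evidence else 0.6
            key = (target, relation_type)
            best = merged.get(key)
            if best is None or confidence > best.confidence:
                merged[key] = Relation(source_celex, target, relation_type,
                                       confidence, evidence)
    return [r for r in merged.values() if r.confidence >= min_confidence]


REF = "Directive 2014/65/EU"
text = (f"Article 1 refers to {REF}.\n\n"
        f"Article 2 applies {REF} as amended.")
result = run_pipeline(
    text,
    source_celex="32014R0600",
    extract=lambda chunk: [(REF, "cites", chunk)] if REF in chunk else [],
    link={REF: "32014L0065"}.get,
)
print(result)
```

Both chunks mention the same directive, so the merge step collapses the two raw hits into a single relation keyed by target and relation type.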

Parameters:
  • max_chunk_size

  • entity_linker

  • validation_enabled

  • min_evidence_chars

  • llm_client

set_entity_linker(entity_linker)[source]

Replace the entity linker used for post-LLM reference resolution.

Parameters:

entity_linker (LegalEntityLinker | None)

Return type:

None

extract_relations(text, source_celex, min_confidence=None)[source]

Extract, resolve, merge, and validate relations from one act text.

Parameters:
  • text (str)

  • source_celex (str)

  • min_confidence (float | None)

Return type:

List[ExtractedRelation]