Core API

Note

This page is generated automatically from the repository’s maintained Python module inventory.

Shared runtime configuration, repositories, HTTP helpers, queue primitives, and utilities.

`lalandre_core`

Source: packages/lalandre_core/lalandre_core/__init__.py

Core shared utilities for Lalandre services.

`lalandre_core.config`

Source: packages/lalandre_core/lalandre_core/config.py

Configuration management for the project.

class lalandre_core.config.EnvSettings(_case_sensitive=None, _nested_model_default_partial_update=None, _env_prefix=None, _env_prefix_target=None, _env_file=PosixPath('.'), _env_file_encoding=None, _env_ignore_empty=None, _env_nested_delimiter=None, _env_nested_max_split=None, _env_parse_none_str=None, _env_parse_enums=None, _cli_prog_name=None, _cli_parse_args=None, _cli_settings_source=None, _cli_parse_none_str=None, _cli_hide_none_type=None, _cli_avoid_json=None, _cli_enforce_required=None, _cli_use_class_docs_for_groups=None, _cli_exit_on_error=None, _cli_prefix=None, _cli_flag_prefix_char=None, _cli_implicit_flags=None, _cli_ignore_unknown_args=None, _cli_kebab_case=None, _cli_shortcuts=None, _secrets_dir=None, _build_sources=None, *, APP_CONFIG_FILE=None, APP_CONFIG_OVERRIDE_FILE=None, GATEWAY_ALLOWED_ORIGINS=None, DB_PASSWORD=None, QDRANT_API_KEY=None, NEO4J_PASSWORD=None, LLM_API_KEY=None, SEARCH_INTENT_PARSER_API_KEY=None, MISTRAL_API_KEY=None, MISTRAL_API_KEY_2=None, MISTRAL_API_KEY_3=None, MISTRAL_API_KEY_4=None, MISTRAL_API_KEY_5=None, MISTRAL_API_KEY_6=None, MISTRAL_API_KEY_7=None, MISTRAL_API_KEY_8=None, MISTRAL_API_KEY_9=None, MISTRAL_API_KEY_10=None)

Bases: BaseSettings

Environment-backed settings loaded before the YAML application config.

Create a new model by parsing and validating input data from keyword arguments.

Raises `pydantic_core.ValidationError` if the input data cannot be validated to form a valid model.

`self` is explicitly positional-only to allow `self` as a field name.

Parameters:

_case_sensitive (bool | None)
_nested_model_default_partial_update (bool | None)
_env_prefix (str | None)
_env_prefix_target (EnvPrefixTarget | None)
_env_file (DotenvType | None)
_env_file_encoding (str | None)
_env_ignore_empty (bool | None)
_env_nested_delimiter (str | None)
_env_nested_max_split (int | None)
_env_parse_none_str (str | None)
_env_parse_enums (bool | None)
_cli_prog_name (str | None)
_cli_parse_args (bool | list[str] | tuple[str, ...] | None)
_cli_settings_source (CliSettingsSource[Any] | None)
_cli_parse_none_str (str | None)
_cli_hide_none_type (bool | None)
_cli_avoid_json (bool | None)
_cli_enforce_required (bool | None)
_cli_use_class_docs_for_groups (bool | None)
_cli_exit_on_error (bool | None)
_cli_prefix (str | None)
_cli_flag_prefix_char (str | None)
_cli_implicit_flags (bool | Literal['dual', 'toggle'] | None)
_cli_ignore_unknown_args (bool | None)
_cli_kebab_case (bool | Literal['all', 'no_enums'] | None)
_cli_shortcuts (Mapping[str, str | list[str]] | None)
_secrets_dir (PathType | None)
_build_sources (tuple[tuple[PydanticBaseSettingsSource, ...], dict[str, Any]] | None)
APP_CONFIG_FILE (str | None)
APP_CONFIG_OVERRIDE_FILE (str | None)
GATEWAY_ALLOWED_ORIGINS (str | None)
DB_PASSWORD (str | None)
QDRANT_API_KEY (str | None)
NEO4J_PASSWORD (str | None)
LLM_API_KEY (str | None)
SEARCH_INTENT_PARSER_API_KEY (str | None)
MISTRAL_API_KEY (str | None)
MISTRAL_API_KEY_2 (str | None)
MISTRAL_API_KEY_3 (str | None)
MISTRAL_API_KEY_4 (str | None)
MISTRAL_API_KEY_5 (str | None)
MISTRAL_API_KEY_6 (str | None)
MISTRAL_API_KEY_7 (str | None)
MISTRAL_API_KEY_8 (str | None)
MISTRAL_API_KEY_9 (str | None)
MISTRAL_API_KEY_10 (str | None)

model_config: ClassVar[SettingsConfigDict] = {'arbitrary_types_allowed': True, 'case_sensitive': False, 'cli_avoid_json': False, 'cli_enforce_required': False, 'cli_exit_on_error': True, 'cli_flag_prefix_char': '-', 'cli_hide_none_type': False, 'cli_ignore_unknown_args': False, 'cli_implicit_flags': False, 'cli_kebab_case': False, 'cli_parse_args': None, 'cli_parse_none_str': None, 'cli_prefix': '', 'cli_prog_name': None, 'cli_shortcuts': None, 'cli_use_class_docs_for_groups': False, 'enable_decoding': True, 'env_file': '.env', 'env_file_encoding': 'utf-8', 'env_ignore_empty': True, 'env_nested_delimiter': None, 'env_nested_max_split': None, 'env_parse_enums': None, 'env_parse_none_str': None, 'env_prefix': '', 'env_prefix_target': 'variable', 'extra': 'ignore', 'json_file': None, 'json_file_encoding': None, 'nested_model_default_partial_update': False, 'protected_namespaces': ('model_validate', 'model_dump', 'settings_customise_sources'), 'secrets_dir': None, 'toml_file': None, 'validate_default': True, 'yaml_config_section': None, 'yaml_file': None, 'yaml_file_encoding': None}

Configuration for the model; should be a dictionary conforming to `pydantic.config.ConfigDict`.

class lalandre_core.config.DatabaseConfig(*, host=None, port=None, database=None, user=None, password=None)

Bases: BaseModel

PostgreSQL database configuration.

Parameters:

host (str | None)
port (int | None)
database (str | None)
user (str | None)
password (str | None)

property connection_string: str

Return a PostgreSQL connection string for the configured database.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to `pydantic.config.ConfigDict`.
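
A minimal sketch of how a `connection_string` property of this kind could assemble a PostgreSQL URI. The `DatabaseSettings` dataclass and the exact URI layout are assumptions made for illustration; the real property in `config.py` may format or validate its values differently.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass
class DatabaseSettings:
    """Hypothetical stand-in for DatabaseConfig (not the real class)."""

    host: str
    port: int
    database: str
    user: str
    password: str

    @property
    def connection_string(self) -> str:
        # Percent-encode the credentials so characters such as "@" or "/"
        # cannot break the URI structure.
        user = quote(self.user, safe="")
        password = quote(self.password, safe="")
        return f"postgresql://{user}:{password}@{self.host}:{self.port}/{self.database}"


settings = DatabaseSettings("localhost", 5432, "lalandre", "app", "p@ss")
print(settings.connection_string)
```

Encoding the credentials is a design choice of this sketch, not something the page states about the library.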

class lalandre_core.config.VectorConfig(*, host=None, port=None, api_key=None, collection_chunks=None, collection_acts=None, vector_size=1024, timeout=30, use_https=False)

Bases: BaseModel

Qdrant configuration.

Parameters:

host (str | None)
port (int | None)
api_key (str | None)
collection_chunks (str | None)
collection_acts (str | None)
vector_size (int)
timeout (int)
use_https (bool)

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to `pydantic.config.ConfigDict`.

class lalandre_core.config.GraphConfig(*, uri=None, user=None, password=None, database=None, max_connection_lifetime=3600, max_connection_pool_size=50, connection_timeout=30, strict_mode=False, acts_limit=10, relationships_limit=20, depth=2, cypher_timeout_seconds=30.0, cypher_max_rows=80, ranking_relation_weights=<factory>, ranking_default_relation_weight=0.3, community_relation_weights=<factory>, community_default_relation_weight=0.5, ranking_hop_decay=0.5, ranking_semantic_boost=0.3, ranking_relation_weight_factor=0.25, budget_semantic_share=0.6, budget_graph_share=0.3, budget_relation_share=0.1, map_reduce_threshold=24000, map_reduce_chunk_chars=5000, map_reduce_max_parallel=3, map_reduce_map_timeout=45.0, map_reduce_reduce_timeout=50.0, expansion_relation_types=<factory>, expansion_max_related_per_node=50, expansion_max_relationships_per_node=100, use_graph_in_rag=True, hybrid_enrichment_depth=2, use_communities_in_rag=True, community_central_act_title_chars=60, community_central_acts_display=3)

Bases: BaseModel

Neo4j configuration.

Parameters:

uri (str | None)
user (str | None)
password (str | None)
database (str | None)
max_connection_lifetime (int)
max_connection_pool_size (int)
connection_timeout (int)
strict_mode (bool)
acts_limit (int)
relationships_limit (int)
depth (int)
cypher_timeout_seconds (float)
cypher_max_rows (int)
ranking_relation_weights (Dict[str, float])
ranking_default_relation_weight (float)
community_relation_weights (Dict[str, float])
community_default_relation_weight (float)
ranking_hop_decay (float)
ranking_semantic_boost (float)
ranking_relation_weight_factor (float)
budget_semantic_share (float)
budget_graph_share (float)
budget_relation_share (float)
map_reduce_threshold (int)
map_reduce_chunk_chars (int)
map_reduce_max_parallel (int)
map_reduce_map_timeout (float)
map_reduce_reduce_timeout (float)
expansion_relation_types (list[str])
expansion_max_related_per_node (int)
expansion_max_relationships_per_node (int)
use_graph_in_rag (bool)
hybrid_enrichment_depth (int)
use_communities_in_rag (bool)
community_central_act_title_chars (int)
community_central_acts_display (int)

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to `pydantic.config.ConfigDict`.

class lalandre_core.config.TokenLimitsConfig(*, embedding_max_input_tokens=8192, chars_per_token=3.3, embedding_safety_ratio=0.9)

Bases: BaseModel

Token limits for API models.

Parameters:

embedding_max_input_tokens (int)
chars_per_token (float)
embedding_safety_ratio (float)

property embedding_max_chars: int

Max characters for embedding based on token limit.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to `pydantic.config.ConfigDict`.
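
The page does not give the formula behind `embedding_max_chars`. One plausible reading, sketched here as an assumption, multiplies the token limit by the average characters per token and by the safety ratio. `TokenLimits` is a hypothetical stand-in that reuses the documented defaults.

```python
from dataclasses import dataclass


@dataclass
class TokenLimits:
    """Hypothetical stand-in for TokenLimitsConfig, using the documented defaults."""

    embedding_max_input_tokens: int = 8192
    chars_per_token: float = 3.3
    embedding_safety_ratio: float = 0.9

    @property
    def embedding_max_chars(self) -> int:
        # Assumed formula: token limit x average chars per token x safety margin,
        # truncated to a whole number of characters.
        return int(
            self.embedding_max_input_tokens
            * self.chars_per_token
            * self.embedding_safety_ratio
        )


limits = TokenLimits()
print(limits.embedding_max_chars)
```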

class lalandre_core.config.EmbeddingPresetConfig(*, preset_id, provider, model_name, device='cpu', label, enabled=True, indexing_enabled=True, queue_name=None, vector_size=1024)

Bases: BaseModel

Named embedding runtime preset used for indexing and query-time routing.

Parameters:

preset_id (str)
provider (str)
model_name (str)
device (str)
label (str)
enabled (bool)
indexing_enabled (bool)
queue_name (str | None)
vector_size (int)

resolved_queue_name()

Return the queue name used by the embedding worker for this preset.

Return type: str

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to `pydantic.config.ConfigDict`.
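
Since `queue_name` is optional, `resolved_queue_name()` presumably falls back to a derived name when none is set. The sketch below illustrates that pattern only: the `Preset` class and the `embed:` prefix are invented for the example and are not taken from the library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Preset:
    """Hypothetical stand-in for EmbeddingPresetConfig."""

    preset_id: str
    queue_name: Optional[str] = None

    def resolved_queue_name(self) -> str:
        # Assumption: an explicit queue_name wins; otherwise a name is
        # derived from the preset ID. The "embed:" prefix is made up here.
        if self.queue_name:
            return self.queue_name
        return f"embed:{self.preset_id}"


print(Preset("mistral").resolved_queue_name())
print(Preset("local", queue_name="custom-queue").resolved_queue_name())
```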

class lalandre_core.config.EmbeddingConfig(*, provider=None, model_name=None, batch_size=None, device=None, cache_dir=None, normalize_embeddings=True, enable_cache=True, cache_max_size=10000, redis_socket_timeout=2, cache_ttl_seconds=604800, retry_min_tokens=64, retry_fallback_threshold=96, retry_reduction_factor=0.7)

Bases: BaseModel

Embedding model configuration.

Parameters:

provider (str | None)
model_name (str | None)
batch_size (int | None)
device (str | None)
cache_dir (str | None)
normalize_embeddings (bool)
enable_cache (bool)
cache_max_size (int)
redis_socket_timeout (int)
cache_ttl_seconds (int)
retry_min_tokens (int)
retry_fallback_threshold (int)
retry_reduction_factor (float)

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to `pydantic.config.ConfigDict`.

class lalandre_core.config.ChunkingEmbeddingConfig(*, provider='mistral', model_name='mistral-embed', device='cpu')

Bases: BaseModel

Embedding runtime used internally by the chunking algorithm.

Parameters:

provider (str)
model_name (str)
device (str)

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to `pydantic.config.ConfigDict`.

class lalandre_core.config.SearchConfig(*, default_limit=10, default_mode='rag', default_search_mode='hybrid', default_granularity='all', default_embedding_preset='mistral', fulltext_language=None, bm25_normalization=32, fusion_method=None, lexical_weight=None, semantic_weight=None, candidate_multiplier=None, min_candidates=None, max_candidates=200, semantic_per_collection_oversampling=1.25, hnsw_ef=None, exact_search=False, query_expansion_enabled=True, query_expansion_max_variants=3, query_expansion_min_query_chars=24, intent_parser_enabled=False, intent_parser_provider=None, intent_parser_model=None, intent_parser_base_url=None, intent_parser_api_key=None, intent_parser_timeout_seconds=20.0, intent_parser_temperature=0.0, intent_parser_max_output_tokens=180, rerank_enabled=True, rerank_model='BAAI/bge-reranker-v2-m3', rerank_device='cpu', rerank_batch_size=4, rerank_max_candidates=5, rerank_max_chars=256, rerank_cache_dir=None, rerank_service_url=None, rerank_service_timeout_seconds=15.0, rerank_fallback_to_skip=True, rerank_circuit_failure_threshold=2, rerank_circuit_cooldown_seconds=30.0, score_threshold_default=0.15, relevance_gate_threshold=0.35, max_lexical_query_chars=200, fts_max_lexemes=12, dynamic_fusion_enabled=True, lexical_boost_factor=1.8, lexical_boost_max=0.75, result_cache_ttl_seconds=300, query_router_broad_query_min_chars=220, query_router_global_overview_min_top_k=10, query_router_citation_precision_min_top_k=7, query_router_relationship_focus_min_top_k=8, query_router_contextual_default_min_top_k=6, fusion_rrf_k=60, query_expansion_max_variants_cap=8, query_expansion_abbreviation_weight=0.96, query_expansion_keyword_focus_weight=0.92, query_expansion_reference_focus_weight=0.9, query_expansion_bilingual_weight=0.88, adaptive_score_drop_threshold=0.15, complementary_max_queries=2, complementary_top_k=5, compression_threshold_ratio=1.3, mmr_enabled=True, mmr_max_per_act=2, crag_enabled=False, crag_max_iterations=1, crag_skip_score_threshold=0.82, max_parallel_workers=4, query_parser_max_top_k=40, intent_parser_min_output_tokens=80, summary_min_chars=50)

Bases: BaseModel

Search configuration.

Parameters:

default_limit (int)
default_mode (str)
default_search_mode (str)
default_granularity (str)
default_embedding_preset (str)
fulltext_language (str | None)
bm25_normalization (int)
fusion_method (str | None)
lexical_weight (float | None)
semantic_weight (float | None)
candidate_multiplier (float | None)
min_candidates (int | None)
max_candidates (int)
semantic_per_collection_oversampling (float)
hnsw_ef (int | None)
exact_search (bool)
query_expansion_enabled (bool)
query_expansion_max_variants (int)
query_expansion_min_query_chars (int)
intent_parser_enabled (bool)
intent_parser_provider (str | None)
intent_parser_model (str | None)
intent_parser_base_url (str | None)
intent_parser_api_key (str | None)
intent_parser_timeout_seconds (float)
intent_parser_temperature (float)
intent_parser_max_output_tokens (int)
rerank_enabled (bool)
rerank_model (str)
rerank_device (str)
rerank_batch_size (int)
rerank_max_candidates (int)
rerank_max_chars (int)
rerank_cache_dir (str | None)
rerank_service_url (str | None)
rerank_service_timeout_seconds (float)
rerank_fallback_to_skip (bool)
rerank_circuit_failure_threshold (int)
rerank_circuit_cooldown_seconds (float)
score_threshold_default (float | None)
relevance_gate_threshold (float | None)
max_lexical_query_chars (int)
fts_max_lexemes (int)
dynamic_fusion_enabled (bool)
lexical_boost_factor (float)
lexical_boost_max (float)
result_cache_ttl_seconds (int)
query_router_broad_query_min_chars (int)
query_router_global_overview_min_top_k (int)
query_router_citation_precision_min_top_k (int)
query_router_relationship_focus_min_top_k (int)
query_router_contextual_default_min_top_k (int)
fusion_rrf_k (int)
query_expansion_max_variants_cap (int)
query_expansion_abbreviation_weight (float)
query_expansion_keyword_focus_weight (float)
query_expansion_reference_focus_weight (float)
query_expansion_bilingual_weight (float)
adaptive_score_drop_threshold (float | None)
complementary_max_queries (int)
complementary_top_k (int)
compression_threshold_ratio (float)
mmr_enabled (bool)
mmr_max_per_act (int)
crag_enabled (bool)
crag_max_iterations (int)
crag_skip_score_threshold (float)
max_parallel_workers (int)
query_parser_max_top_k (int)
intent_parser_min_output_tokens (int)
summary_min_chars (int)

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to `pydantic.config.ConfigDict`.

class lalandre_core.config.GenerationConfig(*, provider='mistral', model_name=None, temperature=None, max_tokens=8000, max_context_chars=20000, summarize_max_context_chars=60000, base_url=None, mistral_base_url='https://api.mistral.ai/v1', context_window=32000, api_key=None, timeout_seconds=45.0, lightweight_model_name=None, key_pool_max=10)

Bases: BaseModel

LLM generation configuration.

Parameters:

provider (str)
model_name (str | None)
temperature (float | None)
max_tokens (int)
max_context_chars (int)
summarize_max_context_chars (int)
base_url (str | None)
mistral_base_url (str)
context_window (int)
api_key (str | None)
timeout_seconds (float)
lightweight_model_name (str | None)
key_pool_max (int)

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to `pydantic.config.ConfigDict`.

class lalandre_core.config.ChunkingConfig(*, min_chunk_size, max_chunk_size, chunk_overlap=0, subdivision_max_chars=30000, extraction_max_chunk_chars=3200, breakpoint_percentile=90.0, breakpoint_max_threshold=1.0, sentence_window_size=1, embedding_batch_size=32, article_level_chunking=True, embedding=<factory>)

Bases: BaseModel

Chunking configuration.

Parameters:

min_chunk_size (int)
max_chunk_size (int)
chunk_overlap (int)
subdivision_max_chars (int)
extraction_max_chunk_chars (int)
breakpoint_percentile (float)
breakpoint_max_threshold (float)
sentence_window_size (int)
embedding_batch_size (int)
article_level_chunking (bool)
embedding (ChunkingEmbeddingConfig)

resolve_max_chunk_size(token_limits)

Cap max_chunk_size so it never exceeds the global embedding token budget.

Each embedding model handles its own per-provider limit at embed time (split + weighted-average for oversized chunks). This guard only prevents chunks from exceeding the largest model’s hard ceiling.

Parameters: token_limits (TokenLimitsConfig)
Return type: int

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to `pydantic.config.ConfigDict`.
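
The capping behaviour described for `resolve_max_chunk_size` can be sketched as a simple minimum. This assumes both values are expressed in characters and that the ceiling comes from `TokenLimitsConfig.embedding_max_chars`; the standalone function and the figures below are illustrative, not the library's implementation.

```python
def resolve_max_chunk_size(max_chunk_size: int, embedding_max_chars: int) -> int:
    """Cap the configured chunk size at the global embedding budget.

    Hypothetical free-function version of the documented method; the
    real one takes a TokenLimitsConfig instance instead of an int.
    """
    return min(max_chunk_size, embedding_max_chars)


# A chunk size above the ceiling is capped; one below it is kept as-is.
capped = resolve_max_chunk_size(30000, 24000)
unchanged = resolve_max_chunk_size(2000, 24000)
print(capped, unchanged)
```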

class lalandre_core.config.ContextBudgetConfig(*, rag_max_sources=10, rag_min_chars_per_source=200, rag_relation_lines=8, global_reports_share=0.45, global_sources_share=0.55, global_max_reports=4, global_min_cluster_size=2, global_max_evidence_per_report=3, global_max_source_docs=7, standard_relation_budget_fraction=0.15, global_graph_budget_fraction=0.1, community_top_relation_types=5, community_central_acts=3, content_preview_chars=200, snippet_preview_chars=300, fallback_preview_chars=180, compression_min_chars=3000, compression_min_budget=500)

Bases: BaseModel

Token/character budgets used to compose RAG context blocks.

Parameters:

rag_max_sources (int)
rag_min_chars_per_source (int)
rag_relation_lines (int)
global_reports_share (float)
global_sources_share (float)
global_max_reports (int)
global_min_cluster_size (int)
global_max_evidence_per_report (int)
global_max_source_docs (int)
standard_relation_budget_fraction (float)
global_graph_budget_fraction (float)
community_top_relation_types (int)
community_central_acts (int)
content_preview_chars (int)
snippet_preview_chars (int)
fallback_preview_chars (int)
compression_min_chars (int)
compression_min_budget (int)

normalized_global_shares()

Return normalized (reports_share, sources_share) ratios for global mode.

Return type: tuple[float, float]

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to `pydantic.config.ConfigDict`.
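
Normalizing the two global shares presumably means rescaling them so they sum to 1. The sketch below shows one way to do that. The clamping of negative values and the even split used when both shares are zero are assumptions of this example, not documented behaviour.

```python
from typing import Tuple


def normalized_global_shares(reports_share: float, sources_share: float) -> Tuple[float, float]:
    """Rescale two non-negative shares so that they sum to 1."""
    reports = max(reports_share, 0.0)
    sources = max(sources_share, 0.0)
    total = reports + sources
    if total <= 0:
        # Assumed fallback: split the budget evenly when nothing is configured.
        return 0.5, 0.5
    return reports / total, sources / total


print(normalized_global_shares(2.0, 6.0))
print(normalized_global_shares(0.0, 0.0))
```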

class lalandre_core.config.GatewayConfig(*, redis_host=None, redis_port=None, rag_service_url=None, embedding_service_url=None, rerank_service_url=None, allowed_origins=None, auto_bootstrap=False, job_ttl_seconds=None, bootstrap_lock_ttl_seconds=None, healthcheck_timeout_seconds=5.0, rag_proxy_timeout_seconds=300.0, rate_limit_query='20/minute', rate_limit_stream='15/minute', rate_limit_search='30/minute', rate_limit_jobs='10/minute', job_chunk_min_content_length=None, job_embed_batch_size=None, job_extract_min_confidence=None, job_extract_skip_existing_default=None)

Bases: BaseModel

API Gateway configuration.

Parameters:

redis_host (str | None)
redis_port (int | None)
rag_service_url (str | None)
embedding_service_url (str | None)
rerank_service_url (str | None)
allowed_origins (list[str] | None)
auto_bootstrap (bool)
job_ttl_seconds (int | None)
bootstrap_lock_ttl_seconds (int | None)
healthcheck_timeout_seconds (float)
rag_proxy_timeout_seconds (float)
rate_limit_query (str)
rate_limit_stream (str)
rate_limit_search (str)
rate_limit_jobs (str)
job_chunk_min_content_length (int | None)
job_embed_batch_size (int | None)
job_extract_min_confidence (float | None)
job_extract_skip_existing_default (bool | None)

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to `pydantic.config.ConfigDict`.

class lalandre_core.config.ExtractionConfidenceConfig(*, base=0.75, non_cites_bonus=0.03, explicit_resolution_bonus=0.1, alias_resolution_bonus=0.05, normalize_fallback_score=0.0, fuzzy_min_factor=0.75, evidence_min_chars=20, evidence_bonus=0.02, max_confidence=0.95)

Bases: BaseModel

Tuning knobs for post-extraction confidence scoring.

Parameters:

base (float)
non_cites_bonus (float)
explicit_resolution_bonus (float)
alias_resolution_bonus (float)
normalize_fallback_score (float)
fuzzy_min_factor (float)
evidence_min_chars (int)
evidence_bonus (float)
max_confidence (float)

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to `pydantic.config.ConfigDict`.

class lalandre_core.config.ExtractionConfig(*, llm_provider='mistral', llm_model='mistral-small-latest', llm_base_url='https://api.mistral.ai/v1', llm_timeout_seconds=120.0, llm_temperature=0.0, llm_max_output_tokens=1024, llm_min_output_tokens=80, llm_system_prompt='You are an EU/FR legal relation extractor. Return valid JSON only.', llm_min_evidence_chars=8, llm_min_rationale_chars=24, llm_max_parallel_chunks=2, llm_chunk_cache_size=256, validation_enabled=True, min_evidence_chars=28, min_description_chars=240, entity_linker_fuzzy_threshold=0.89, entity_linker_fuzzy_min_gap=0.03, entity_linker_fuzzy_limit=2, entity_linker_min_alias_chars=6, confidence=<factory>, max_evidence_chars=420)

Bases: BaseModel

Extraction LLM behavior configuration.

Two-stage filtering:

- `llm_min_*` fields apply during raw LLM output parsing (first pass).
- `min_evidence_chars` applies during post-extraction validation (second pass).

Parameters:

llm_provider (str)
llm_model (str)
llm_base_url (str)
llm_timeout_seconds (float)
llm_temperature (float)
llm_max_output_tokens (int)
llm_min_output_tokens (int)
llm_system_prompt (str)
llm_min_evidence_chars (int)
llm_min_rationale_chars (int)
llm_max_parallel_chunks (int)
llm_chunk_cache_size (int)
validation_enabled (bool)
min_evidence_chars (int)
min_description_chars (int)
entity_linker_fuzzy_threshold (float)
entity_linker_fuzzy_min_gap (float)
entity_linker_fuzzy_limit (int)
entity_linker_min_alias_chars (int)
confidence (ExtractionConfidenceConfig)
max_evidence_chars (int)

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to `pydantic.config.ConfigDict`.
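
The two-stage filtering described for `ExtractionConfig` can be pictured as two successive length checks. This is a sketch under stated assumptions: the thresholds reuse the documented defaults, but the `evidence` and `rationale` item keys and both helper functions are hypothetical.

```python
from typing import Dict, List

LLM_MIN_EVIDENCE_CHARS = 8    # documented default of llm_min_evidence_chars
LLM_MIN_RATIONALE_CHARS = 24  # documented default of llm_min_rationale_chars
MIN_EVIDENCE_CHARS = 28       # documented default of min_evidence_chars


def first_pass(raw_items: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Drop raw LLM items whose evidence or rationale is too short."""
    return [
        item
        for item in raw_items
        if len(item.get("evidence", "")) >= LLM_MIN_EVIDENCE_CHARS
        and len(item.get("rationale", "")) >= LLM_MIN_RATIONALE_CHARS
    ]


def second_pass(items: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Apply the stricter evidence threshold used at validation time."""
    return [item for item in items if len(item.get("evidence", "")) >= MIN_EVIDENCE_CHARS]


raw = [
    {"evidence": "short", "rationale": "x" * 30},
    {"evidence": "amends Article 5", "rationale": "The text states an amendment."},
    {
        "evidence": "This Regulation repeals Directive 95/46/EC.",
        "rationale": "Explicit repeal clause in the text.",
    },
]
parsed = first_pass(raw)
validated = second_pass(parsed)
print(len(raw), len(parsed), len(validated))
```

The first pass is lenient so that borderline items still reach validation, where the stricter threshold applies.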

class lalandre_core.config.WorkersConfig(*, brpop_timeout_seconds=1, chunk_db_commit_batch_size=10, embed_worker_max_batch_size=32, embed_qdrant_upsert_batch_size=1000, auto_embed_reconcile=True, auto_embed_reconcile_interval=300, auto_embed_reconcile_ttl=600, auto_chunk_reconcile=True, auto_chunk_reconcile_interval=300, auto_chunk_reconcile_ttl=600, auto_extract_reconcile=True, auto_extract_reconcile_interval=600, auto_extract_reconcile_ttl=600, extract_metrics_port=9107, embed_metrics_port=9108, chunk_metrics_port=9109, extract_stale_timeout_minutes=60, community_resolution=1.0, community_min_size=2)

Bases: BaseModel

Worker runtime tuning.

Parameters:

brpop_timeout_seconds (int)
chunk_db_commit_batch_size (int)
embed_worker_max_batch_size (int)
embed_qdrant_upsert_batch_size (int)
auto_embed_reconcile (bool)
auto_embed_reconcile_interval (int)
auto_embed_reconcile_ttl (int)
auto_chunk_reconcile (bool)
auto_chunk_reconcile_interval (int)
auto_chunk_reconcile_ttl (int)
auto_extract_reconcile (bool)
auto_extract_reconcile_interval (int)
auto_extract_reconcile_ttl (int)
extract_metrics_port (int)
embed_metrics_port (int)
chunk_metrics_port (int)
extract_stale_timeout_minutes (int)
community_resolution (float)
community_min_size (int)

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to `pydantic.config.ConfigDict`.

class lalandre_core.config.LalandreConfig(*, database=<factory>, vector=<factory>, graph=<factory>, token_limits=<factory>, embedding=<factory>, embedding_presets=<factory>, search=<factory>, generation=<factory>, chunking, context_budget=<factory>, gateway=<factory>, extraction=<factory>, workers=<factory>, models_cache_dir=None)

Bases: BaseModel

Main configuration class.

Parameters:

database (DatabaseConfig)
vector (VectorConfig)
graph (GraphConfig)
token_limits (TokenLimitsConfig)
embedding (EmbeddingConfig)
embedding_presets (list[EmbeddingPresetConfig])
search (SearchConfig)
generation (GenerationConfig)
chunking (ChunkingConfig)
context_budget (ContextBudgetConfig)
gateway (GatewayConfig)
extraction (ExtractionConfig)
workers (WorkersConfig)
models_cache_dir (str | None)

enabled_embedding_presets()

Return the embedding presets that are enabled for runtime use.

Return type: list[EmbeddingPresetConfig]

indexing_enabled_embedding_presets()

Return the embedding presets that are enabled for indexing workflows.

Return type: list[EmbeddingPresetConfig]

get_embedding_preset(preset_id)

Return one embedding preset by ID, or None when it is unknown.

Parameters: preset_id (str | None)
Return type: EmbeddingPresetConfig | None

get_default_embedding_preset()

Return the default enabled embedding preset for query-time operations.

Return type: EmbeddingPresetConfig

classmethod from_env()

Load configuration from environment variables.

Return type: LalandreConfig

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to `pydantic.config.ConfigDict`.

lalandre_core.config.get_env_settings()

Get or create the environment settings instance.

Return type: EnvSettings

lalandre_core.config.get_config()

Get or create global configuration instance.

Return type: LalandreConfig

lalandre_core.config.reset_config()

Invalidate the config singleton so the next call to get_config() reloads from disk.

Return type: None
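
The get-or-create and reset behaviour described for `get_config()` and `reset_config()` follows the usual module-level singleton pattern. The sketch below is an illustration of that pattern, not the module's code: the config is a plain dict and `_load_from_disk` is a hypothetical loader standing in for the YAML and environment parsing.

```python
from typing import Optional

_config: Optional[dict] = None
_load_count = 0


def _load_from_disk() -> dict:
    """Hypothetical loader; counts calls so reloads are observable."""
    global _load_count
    _load_count += 1
    return {"load_number": _load_count}


def get_config() -> dict:
    """Return the cached config, loading it on first use."""
    global _config
    if _config is None:
        _config = _load_from_disk()
    return _config


def reset_config() -> None:
    """Drop the cached instance so the next get_config() reloads."""
    global _config
    _config = None


first = get_config()
second = get_config()  # served from the cached instance
reset_config()
third = get_config()   # triggers a fresh load
print(first is second, third is first)
```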

lalandre_core.config.get_postgres_connection_string()

Get PostgreSQL connection string.

Return type: str

lalandre_core.config.get_gateway_config()

Get API Gateway configuration with required values enforced.

Return type: GatewayConfig

`lalandre_core.embedding_presets`

Source: packages/lalandre_core/lalandre_core/embedding_presets.py

Helpers for embedding preset resolution across services.

lalandre_core.embedding_presets.list_embedding_presets(*, enabled_only=False, indexing_only=False)

Return configured embedding presets.

Parameters:

enabled_only (bool)
indexing_only (bool)

Return type:

list[EmbeddingPresetConfig]

lalandre_core.embedding_presets.get_embedding_preset(preset_id, *, enabled_only=False, indexing_only=False)

Resolve a preset by ID.

Parameters:

preset_id (str | None)
enabled_only (bool)
indexing_only (bool)

Return type:

EmbeddingPresetConfig | None

lalandre_core.embedding_presets.get_default_embedding_preset()

Return the configured default embedding preset.

Return type: EmbeddingPresetConfig

lalandre_core.embedding_presets.get_default_embedding_preset_id()

Return the ID of the default embedding preset.

Return type: str

lalandre_core.embedding_presets.resolve_embedding_preset_or_default(preset_id)

Resolve a preset, falling back to the configured default when missing/invalid.

Parameters: preset_id (str | None)
Return type: EmbeddingPresetConfig

lalandre_core.embedding_presets.resolve_embed_queue_name(preset_id=None)[source]¶

Return the queue name for a preset, defaulting to the configured default preset.

Parameters:: preset_id (str | None)
Return type:: str

lalandre_core.embedding_presets.resolve_worker_embedding_preset(preset_id=None, *, env_var='EMBEDDING_PRESET_ID')[source]¶

Resolve the preset bound to the current embedding worker.

Parameters:

preset_id (str | None)
env_var (str)

Return type:

EmbeddingPresetConfig

`lalandre_core.http`¶

Source: packages/lalandre_core/lalandre_core/http/__init__.py

HTTP helpers shared across Lalandre services.

`lalandre_core.http.llm_client`¶

Source: packages/lalandre_core/lalandre_core/http/llm_client.py

Shared HTTP client for compact JSON-oriented LLM calls.

lalandre_core.http.llm_client.coerce_json_object(value)[source]¶

Safely coerce a runtime value to a JSON object with string keys.

Parameters:: value (Any)
Return type:: Dict[str, Any] | None

class lalandre_core.http.llm_client.JSONHTTPLLMClient(provider, model, base_url, timeout_seconds, max_output_tokens, temperature, api_key=None, system_prompt='Return valid JSON only.', error_preview_chars=240)[source]¶

Bases: object

Thin HTTP client for OpenAI-compatible JSON generation.

Parameters:

provider (str)
model (str)
base_url (str)
timeout_seconds (float)
max_output_tokens (int)
temperature (float)
api_key (str | None)
system_prompt (str)
error_preview_chars (int)

generate(prompt)[source]¶

Generate a JSON-formatted completion payload as raw string.

Parameters:: prompt (str)
Return type:: str

class lalandre_core.http.llm_client.SharedKeyPoolJSONHTTPLLMClient(*, key_pool, clients_by_key)[source]¶

Bases: object

Dispatch JSON HTTP LLM calls through a shared API key pool.

Parameters:

key_pool (APIKeyPool)
clients_by_key (Mapping[str, JSONHTTPLLMClient])

classmethod from_key_pool(*, key_pool, provider, model, base_url, timeout_seconds, max_output_tokens, temperature, system_prompt='Return valid JSON only.', error_preview_chars=240)[source]¶

Build one JSON HTTP client per API key and wrap them in a shared pool.

Parameters:

key_pool (APIKeyPool)
provider (str)
model (str)
base_url (str)
timeout_seconds (float)
max_output_tokens (int)
temperature (float)
system_prompt (str)
error_preview_chars (int)

Return type:

SharedKeyPoolJSONHTTPLLMClient

generate(prompt)[source]¶

Generate one JSON response using the next key selected by the pool.

Parameters:: prompt (str)
Return type:: str

`lalandre_core.http.middleware`¶

Source: packages/lalandre_core/lalandre_core/http/middleware.py

Reusable HTTP instrumentation middleware factory for FastAPI services.

lalandre_core.http.middleware.make_http_instrumentation_middleware(observe_fn)[source]¶

Build a middleware that records per-request latency and status metrics.

Parameters:: observe_fn (Callable[[...], None])
Return type:: Callable[[Request, Callable[[Request], Awaitable[Response]]], Any]
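
As an illustration of the factory shape described above, here is a framework-free sketch. It assumes the request exposes `method` and `url.path` and the response exposes `status_code` (as Starlette objects do); the keyword names passed to `observe_fn` and the name `make_timing_middleware` are hypothetical, since the real observer signature is not documented here.

```python
import asyncio
import time
from types import SimpleNamespace
from typing import Any, Awaitable, Callable


def make_timing_middleware(observe_fn: Callable[..., None]):
    """Return an async middleware that reports latency and status per request."""

    async def middleware(request: Any, call_next: Callable[[Any], Awaitable[Any]]) -> Any:
        started = time.perf_counter()
        status_code = 500  # reported if the handler raises before responding
        try:
            response = await call_next(request)
            status_code = response.status_code
            return response
        finally:
            # Assumed observer fields; adapt to the real metrics helper.
            observe_fn(
                method=request.method,
                path=request.url.path,
                status_code=status_code,
                duration_seconds=time.perf_counter() - started,
            )

    return middleware


# Stand-in request and handler so the sketch runs without a web framework.
observed: list[dict[str, Any]] = []
fake_request = SimpleNamespace(method="GET", url=SimpleNamespace(path="/health"))


async def fake_handler(_request: Any) -> Any:
    return SimpleNamespace(status_code=200)


middleware = make_timing_middleware(lambda **fields: observed.append(fields))
response = asyncio.run(middleware(fake_request, fake_handler))
```

In a FastAPI service, a callable of this shape would typically be registered with `app.middleware("http")`.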

`lalandre_core.linking`¶

Source: packages/lalandre_core/lalandre_core/linking/__init__.py

Entity linking: resolve legal references to canonical CELEX identifiers.

`lalandre_core.linking.entity_linker`¶

Source: packages/lalandre_core/lalandre_core/linking/entity_linker.py

Local entity linking utilities for legal acts (EU/France).

class lalandre_core.linking.entity_linker.ActAliasEntry(celex, title, aliases=(), act_id=None, eli=None, acronyms=())[source]¶

Bases: object

Canonical act entry and its known alias forms.

Parameters:

celex (str)
title (str)
aliases (tuple[str, ...])
act_id (int | None)
eli (str | None)
acronyms (tuple[str, ...])

eli: str | None = None¶: Optional European Legislation Identifier URI for interop with ELI-aware systems.

acronyms: tuple[str, ...] = ()¶: Short acronyms (DORA, CRR, MAR, …) that bypass min_alias_chars.

class lalandre_core.linking.entity_linker.LinkResolution(celex, score, method, matched_text, act_id=None, subdivision_id=None, article_number=None, eli=None)[source]¶

Bases: object

Resolved reference returned by the entity linker.

Parameters:

celex (str)
score (float)
method (str)
matched_text (str)
act_id (int | None)
subdivision_id (int | None)
article_number (str | None)
eli (str | None)

eli: str | None = None¶: Canonical ELI URI of the resolved act (propagated from the matching entry).

class lalandre_core.linking.entity_linker.LegalEntityLinker(entries, *, fuzzy_threshold, fuzzy_min_gap, fuzzy_limit=2, min_alias_chars, article_lookup=None)[source]¶

Bases: object

Resolve legal references to canonical CELEX-like identifiers.

Parameters:

entries (Iterable[ActAliasEntry])
fuzzy_threshold (float)
fuzzy_min_gap (float)
fuzzy_limit (int)
min_alias_chars (int)
article_lookup (Callable[[int, str], int | None] | None)

property alias_count: int¶: Return the number of normalized aliases indexed by the linker.

classmethod derive_acronyms(title)[source]¶

Extract short acronyms from a title’s parenthesised content.

Returns a tuple of strings like ("DORA",) for a title that contains Digital Operational Resilience Regulation (DORA). The caller passes this to ActAliasEntry.acronyms so the linker can match them without applying min_alias_chars.

Parameters:: title (str)
Return type:: tuple[str, …]
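
The acronym extraction described above can be approximated with a regular expression. This is only a sketch: the pattern (2 to 10 uppercase letters or digits, starting with a letter, alone inside parentheses) and the function name `derive_acronyms_sketch` are assumptions, and the real classmethod may apply different rules.

```python
import re

# Assumed acronym shape: "(DORA)", "(CRR)", "(MAR)", ...
_ACRONYM_RE = re.compile(r"\(([A-Z][A-Z0-9]{1,9})\)")


def derive_acronyms_sketch(title: str) -> tuple[str, ...]:
    """Return parenthesised acronyms in order of appearance, without duplicates."""
    seen: dict[str, None] = {}
    for match in _ACRONYM_RE.finditer(title):
        seen.setdefault(match.group(1), None)
    return tuple(seen)


acronyms = derive_acronyms_sketch("Digital Operational Resilience Regulation (DORA)")
```

Note that under this assumed pattern a title such as "Regulation (EU) 2022/2554" would also yield "EU", so a production version would likely need a stop list.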

classmethod derive_aliases(title, *, eli=None, official_journal_reference=None, form_number=None)[source]¶

Derive stable alias candidates from act metadata fields.

Parameters:

title (str)
eli (str | None)
official_journal_reference (str | None)
form_number (str | None)

Return type:

tuple[str, …]

resolve(reference)[source]¶

Resolve a free-text legal reference to a canonical CELEX identifier.

Parameters:: reference (str)
Return type:: LinkResolution | None

resolve_with_article(reference, article_number=None, *, article_lookup=None)[source]¶

Resolve a reference and optionally enrich with a subdivision_id for an article.

If article_number is provided and the act is known, try to resolve the corresponding subdivision id via article_lookup (or the linker’s default one). Falls back to returning the base resolution if the article can’t be resolved.

Parameters:

reference (str)
article_number (str | None)
article_lookup (Callable[[int, str], int | None] | None)

Return type:

LinkResolution | None
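
The fallback behaviour described for resolve_with_article can be sketched as follows. `Resolution` is a hypothetical stand-in carrying a subset of the LinkResolution fields, and `enrich_with_article` is an illustrative helper, not the linker's actual code.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional


@dataclass(frozen=True)
class Resolution:
    """Hypothetical stand-in for LinkResolution (subset of fields)."""

    celex: str
    act_id: Optional[int] = None
    subdivision_id: Optional[int] = None
    article_number: Optional[str] = None


def enrich_with_article(
    base: Optional[Resolution],
    article_number: Optional[str],
    article_lookup: Optional[Callable[[int, str], Optional[int]]],
) -> Optional[Resolution]:
    """Attach a subdivision id when possible, else return the base resolution."""
    if base is None or article_number is None:
        return base
    if base.act_id is None or article_lookup is None:
        return base
    subdivision_id = article_lookup(base.act_id, article_number)
    if subdivision_id is None:
        return base  # article unknown: fall back to the act-level match
    return replace(base, subdivision_id=subdivision_id, article_number=article_number)


# Demo lookup table keyed by (act_id, article_number); values are subdivision ids.
_articles = {(7, "5"): 101}
base = Resolution(celex="32022R2554", act_id=7)
enriched = enrich_with_article(base, "5", lambda a, n: _articles.get((a, n)))
fallback = enrich_with_article(base, "999", lambda a, n: _articles.get((a, n)))
```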

`lalandre_core.linking.heuristics`¶

Source: packages/lalandre_core/lalandre_core/linking/heuristics.py

Shared heuristics and regex patterns for legal entity linking. Centralized here to keep extraction, RAG, and validation rules in sync. Values are package-local (not config-driven) to avoid leaking concerns.

lalandre_core.linking.heuristics.is_generic_target(target)[source]¶

Return whether target is too generic to resolve as a concrete act.

Parameters:: target (str)
Return type:: bool

lalandre_core.linking.heuristics.looks_like_identifier(target)[source]¶

Return whether target resembles an explicit legal identifier.

Parameters:: target (str)
Return type:: bool

`lalandre_core.linking.ner_client`¶

Source: packages/lalandre_core/lalandre_core/linking/ner_client.py

Tiny HTTP client for the dedicated NER service.

The NER service exposes a single POST /detect endpoint that runs GLiNER in zero-shot mode. The client is deliberately minimal: it does one synchronous request per call, surfaces a typed result, and never raises on network errors — it returns an empty span list and lets the caller log/skip.

Designed to be reused outside RAG (extraction pipeline, evaluation scripts).

class lalandre_core.linking.ner_client.NerClient(base_url, *, timeout_seconds=5.0, default_threshold=0.5, default_entity_types=None)[source]¶

Bases: object

Thin HTTP wrapper around ner-service /detect.

Failure modes (network error, non-2xx, malformed JSON) are swallowed and logged at WARNING level; the call returns [] in that case so the caller can degrade gracefully.

Parameters:

base_url (str)
timeout_seconds (float)
default_threshold (float)
default_entity_types (Optional[Iterable[str]])

property base_url: str¶: Return the base URL of the configured NER service.

detect(text, *, entity_types=None, threshold=None)[source]¶

Call POST /detect and return the matched spans (empty on any failure).

Parameters:

text (str)
entity_types (Iterable[str] | None)
threshold (float | None)

Return type:

List[NerSpan]
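
To illustrate the "never raise, return an empty list" contract, here is a standard-library sketch. The request body (`text`, `threshold`) and the response key `spans` are assumptions about the /detect payload, and `detect_spans` with its injectable `opener` is a hypothetical helper rather than the NerClient implementation.

```python
import io
import json
import logging
import urllib.request
from typing import Any, Callable

logger = logging.getLogger("ner_client_sketch")


def detect_spans(
    base_url: str,
    text: str,
    *,
    threshold: float = 0.5,
    timeout_seconds: float = 5.0,
    opener: Callable[..., Any] = urllib.request.urlopen,
) -> list[Any]:
    """POST to /detect and return the spans, or [] on any failure."""
    body = json.dumps({"text": text, "threshold": threshold}).encode("utf-8")
    request = urllib.request.Request(
        base_url.rstrip("/") + "/detect",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with opener(request, timeout=timeout_seconds) as response:
            payload = json.loads(response.read().decode("utf-8"))
        spans = payload["spans"]  # assumed response key
        if not isinstance(spans, list):
            raise TypeError("'spans' is not a list")
        return spans
    except (OSError, ValueError, KeyError, TypeError) as exc:
        # URLError, HTTPError and timeouts are OSError; bad JSON is ValueError.
        logger.warning("NER detect failed: %s", exc)
        return []


def _offline_opener(request: Any, timeout: float) -> Any:
    raise OSError("service unreachable")


def _canned_opener(request: Any, timeout: float) -> Any:
    return io.BytesIO(b'{"spans": [{"text": "DORA", "type": "act"}]}')


spans_when_down = detect_spans("http://ner-service", "Article 5 of DORA", opener=_offline_opener)
spans_when_up = detect_spans("http://ner-service", "Article 5 of DORA", opener=_canned_opener)
```

Injecting the opener keeps the example runnable without a live service; the default performs a real HTTP request.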

class lalandre_core.linking.ner_client.NerSpan(text, start, end, type, score)[source]¶

Bases: object

A single span detected by the NER service.

Parameters:

text (str)
start (int)
end (int)
type (str)
score (float)

`lalandre_core.llm`¶

Source: packages/lalandre_core/lalandre_core/llm/__init__.py

Shared LLM utilities for provider normalization and ChatModel construction.

`lalandre_core.llm.langchain`¶

Source: packages/lalandre_core/lalandre_core/llm/langchain.py

Unified LangChain ChatModel factory.

class lalandre_core.llm.langchain.PooledChatModel(models)[source]¶

Bases: Runnable

Round-robin wrapper over multiple ChatModel instances.

Parameters:: models (List[Any])

invoke(input, config=None, **kwargs)[source]¶

Invoke the next model in the pool with one request.

Parameters:

input (Any)
config (Any)
kwargs (Any)

Return type:

Any

stream(input, config=None, **kwargs)[source]¶

Stream one response from the next model in the pool.

Parameters:

input (Any)
config (Any)
kwargs (Any)

Return type:

Iterator[Any]

batch(inputs, config=None, **kwargs)[source]¶

Process one batch with the next model in the pool.

Parameters:

inputs (List[Any])
config (Any)
kwargs (Any)

Return type:

List[Any]
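
The round-robin behaviour of such a pool can be sketched without LangChain. `RoundRobinPool` below is a hypothetical stand-in in which each "model" is a plain callable; it assumes one model is picked per call and that selection must be thread-safe.

```python
import itertools
import threading
from typing import Any, Callable, Sequence


class RoundRobinPool:
    """Pick the next callable in a fixed rotation for every invoke()."""

    def __init__(self, models: Sequence[Callable[[Any], Any]]) -> None:
        if not models:
            raise ValueError("pool needs at least one model")
        self._cycle = itertools.cycle(models)
        self._lock = threading.Lock()

    def _next_model(self) -> Callable[[Any], Any]:
        # The lock keeps the rotation consistent across threads.
        with self._lock:
            return next(self._cycle)

    def invoke(self, input: Any) -> Any:
        return self._next_model()(input)


pool = RoundRobinPool([lambda x: f"a:{x}", lambda x: f"b:{x}"])
outputs = [pool.invoke(i) for i in range(3)]
```

A key-pool variant, as in SharedKeyPoolChatModel, would presumably replace the fixed rotation with whatever selection policy the pool implements.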

class lalandre_core.llm.langchain.SharedKeyPoolChatModel(*, key_pool, models_by_key)[source]¶

Bases: Runnable

Dispatch each LangChain call through a shared API key pool.

Parameters:

key_pool (APIKeyPool)
models_by_key (Mapping[str, Any])

invoke(input, config=None, **kwargs)[source]¶

Invoke the model selected by the shared API key pool.

Parameters:

input (Any)
config (Any)
kwargs (Any)

Return type:

Any

stream(input, config=None, **kwargs)[source]¶

Stream a response from the model selected by the shared API key pool.

Parameters:

input (Any)
config (Any)
kwargs (Any)

Return type:

Iterator[Any]

batch(inputs, config=None, **kwargs)[source]¶

Execute a batch call through the model selected by the shared API key pool.

Parameters:

inputs (List[Any])
config (Any)
kwargs (Any)

Return type:

List[Any]

lalandre_core.llm.langchain.build_chat_model(*, provider, model, api_key, base_url='', temperature=0.0, max_tokens=None, timeout_seconds=None)[source]¶

Build a LangChain ChatModel (ChatMistralAI or ChatOpenAI).

Returns the raw ChatModel instance — callers can wrap it (e.g. LangchainLLMWrapper, StrOutputParser) as needed.

Parameters:

provider (str)
model (str)
api_key (str)
base_url (str)
temperature (float)
max_tokens (int | None)
timeout_seconds (float | None)

Return type:

Any

`lalandre_core.llm.providers`¶

Source: packages/lalandre_core/lalandre_core/llm/providers.py

Shared LLM provider utilities: normalization, URL resolution, API key resolution.

lalandre_core.llm.providers.normalize_provider(provider)[source]¶

Normalize provider name: strip, lowercase, openai → openai_compatible.

Parameters:: provider (str)
Return type:: str

lalandre_core.llm.providers.normalize_base_url(*, provider, base_url)[source]¶

Normalize the base URL: strip surrounding whitespace and trailing slashes.

Parameters:

provider (str)
base_url (str)

Return type:

str
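
The two normalization rules stated above are simple enough to sketch directly. The function names below are hypothetical, and the sketch ignores the `provider` argument of normalize_base_url because its role is not documented here.

```python
def normalize_provider_sketch(provider: str) -> str:
    """Strip, lowercase, and map "openai" to "openai_compatible"."""
    normalized = provider.strip().lower()
    return "openai_compatible" if normalized == "openai" else normalized


def normalize_base_url_sketch(base_url: str) -> str:
    """Strip surrounding whitespace, then drop trailing slashes."""
    return base_url.strip().rstrip("/")


provider = normalize_provider_sketch("  OpenAI ")
base_url = normalize_base_url_sketch(" https://api.example.com/v1/ ")
```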

lalandre_core.llm.providers.resolve_api_key(*, provider, api_key=None, mistral_api_key=None, allow_empty=False)[source]¶

Resolve API key with priority: mistral_api_key > api_key > error.

Parameters:

provider (str)
api_key (str | None)
mistral_api_key (str | None)
allow_empty (bool)

Return type:

str

`lalandre_core.llm.structured`¶

Source: packages/lalandre_core/lalandre_core/llm/structured.py

Shared helpers for running PydanticAI structured-output agents.

Extracted from lalandre_rag.agentic.tools so that any package (extraction, RAG, summaries) can reuse the same FunctionModel bridge without depending on the RAG layer.

lalandre_core.llm.structured.json_payload_from_text(raw)[source]¶

Extract a JSON object from potentially noisy LLM text.

Parameters:: raw (str)
Return type:: dict[str, Any] | None
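
One common way to pull a JSON object out of noisy model output is to try decoding from each opening brace. The sketch below uses the standard library's `json.JSONDecoder.raw_decode`; it is an assumption about a reasonable strategy, not a description of how json_payload_from_text works internally, and `json_object_from_text` is a hypothetical name.

```python
import json
from typing import Any, Optional


def json_object_from_text(raw: str) -> Optional[dict[str, Any]]:
    """Return the first decodable JSON object embedded in raw, else None."""
    decoder = json.JSONDecoder()
    start = raw.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at index `start`
            # and ignores any trailing text after it.
            value, _end = decoder.raw_decode(raw, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, dict):
            return value
        start = raw.find("{", start + 1)
    return None


payload = json_object_from_text('Sure! Here it is:\n{"intent": "search", "top_k": 5}\nDone.')
```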

lalandre_core.llm.structured.build_structured_prompt(*, messages, agent_info)[source]¶

Build a single text prompt from PydanticAI messages + output schema.

Parameters:

messages (list[Annotated[ModelRequest | ModelResponse, Discriminator(discriminator=kind, custom_error_type=None, custom_error_message=None, custom_error_context=None)]])
agent_info (AgentInfo)

Return type:

str

lalandre_core.llm.structured.to_text_generator(llm_or_generate)[source]¶

Normalize an LLM object or callable into a simple str -> str function.

Parameters:: llm_or_generate (Any)
Return type:: Callable[[str], str]

lalandre_core.llm.structured.run_structured_agent(*, agent, prompt, llm_or_generate, model_name)[source]¶

Run a PydanticAI agent using a FunctionModel bridge to any LLM.

Returns (output, retries) where retries is the number of output-validation retries triggered.

Parameters:

agent (Agent[Any, T])
prompt (str)
llm_or_generate (Any)
model_name (str)

Return type:

tuple[T, int]

`lalandre_core.logging_setup`¶

Source: packages/lalandre_core/lalandre_core/logging_setup.py

Shared logging configuration for Lalandre workers.

lalandre_core.logging_setup.setup_worker_logging()[source]¶

Configure root logging with structlog.

Reads LOG_FORMAT env var: ‘json’ for structured JSON output, anything else for human-readable console output.

Return type:: None

`lalandre_core.models`¶

Source: packages/lalandre_core/lalandre_core/models/__init__.py

Data models

`lalandre_core.models.act_metadata`¶

Source: packages/lalandre_core/lalandre_core/models/act_metadata.py

Pydantic model for key-value metadata attached to one act.

class lalandre_core.models.act_metadata.ActMetadata(*, id=None, act_id, key, value, created_at=None)[source]¶

Bases: BaseModel

Represent one metadata entry linked to a legal act.

Parameters:

id (int | None)
act_id (int)
key (str)
value (str)
created_at (datetime | None)

model_config: ClassVar[ConfigDict] = {'from_attributes': True}¶: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

`lalandre_core.models.act_relations`¶

Source: packages/lalandre_core/lalandre_core/models/act_relations.py

Pydantic model for relationships extracted between legal acts.

class lalandre_core.models.act_relations.ActRelations(*, id=None, source_act_id, target_act_id=None, target_celex=None, relation_type, source_subdivision_id=None, target_subdivision_id=None, effect_date=None, description=None, evidence=None, rationale=None, resolution_method=None, resolution_score=None, target_reference=None, confidence=None, source=None, validated=False, synced_to_neo4j_at=None, is_resolved=True, created_at=None)[source]¶

Bases: BaseModel

Represent one typed relationship between two legal acts.

Parameters:

id (int | None)
source_act_id (int)
target_act_id (int | None)
target_celex (str | None)
relation_type (RelationType)
source_subdivision_id (int | None)
target_subdivision_id (int | None)
effect_date (datetime | None)
description (str | None)
evidence (str | None)
rationale (str | None)
resolution_method (str | None)
resolution_score (float | None)
target_reference (str | None)
confidence (float | None)
source (str | None)
validated (bool | None)
synced_to_neo4j_at (datetime | None)
is_resolved (bool | None)
created_at (datetime | None)

model_config: ClassVar[ConfigDict] = {'from_attributes': True}¶: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

`lalandre_core.models.act_subjects`¶

Source: packages/lalandre_core/lalandre_core/models/act_subjects.py

Pydantic model for the act-to-subject association table.

class lalandre_core.models.act_subjects.ActSubjects(*, act_id, subject_id)[source]¶

Bases: BaseModel

Represent one link between an act and a subject matter.

Parameters:

act_id (int)
subject_id (int)

model_config: ClassVar[ConfigDict] = {'from_attributes': True}¶: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

`lalandre_core.models.acts`¶

Source: packages/lalandre_core/lalandre_core/models/acts.py

Pydantic model for top-level legal act records.

class lalandre_core.models.acts.Acts(*, id=None, celex, eli=None, act_type, title, language, adoption_date=None, force_date=None, end_date=None, official_journal_reference=None, sector=None, level=None, form_number=None, url_eurlex=None, created_at=None, updated_at=None, last_synced_at=None, content_hash=None, sync_status='pending', extracted_at=None, extraction_status='pending')[source]¶

Bases: BaseModel

Represent one legal act stored in the core domain model.

Parameters:

id (int | None)
celex (str)
eli (str | None)
act_type (ActType)
title (str)
language (LanguageCode)
adoption_date (datetime | None)
force_date (datetime | None)
end_date (datetime | None)
official_journal_reference (str | None)
sector (int | None)
level (int | None)
form_number (str | None)
url_eurlex (str | None)
created_at (datetime | None)
updated_at (datetime | None)
last_synced_at (datetime | None)
content_hash (str | None)
sync_status (str | None)
extracted_at (datetime | None)
extraction_status (str | None)

model_config: ClassVar[ConfigDict] = {'from_attributes': True}¶: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

`lalandre_core.models.chunks`¶

Source: packages/lalandre_core/lalandre_core/models/chunks.py

Pydantic model for chunk records derived from subdivisions.

class lalandre_core.models.chunks.Chunks(*, id=None, subdivision_id, chunk_index, content, char_start, char_end, token_count=None, chunk_metadata=None, created_at=None)[source]¶

Bases: BaseModel

Represent one persisted chunk of subdivision content.

Parameters:

id (int | None)
subdivision_id (int)
chunk_index (int)
content (str)
char_start (int)
char_end (int)
token_count (int | None)
chunk_metadata (dict[str, Any] | None)
created_at (datetime | None)

model_config: ClassVar[ConfigDict] = {'from_attributes': True}¶: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

`lalandre_core.models.embedding_state`¶

Source: packages/lalandre_core/lalandre_core/models/embedding_state.py

Pydantic model tracking the embedding status of stored objects.

class lalandre_core.models.embedding_state.EmbeddingState(*, id=None, object_type, object_id, provider, model_name, vector_size, content_hash, embedded_at=None)[source]¶

Bases: BaseModel

Represent one embedding state snapshot for an act, chunk, or subdivision.

Parameters:

id (int | None)
object_type (Literal['subdivision', 'chunk', 'act'])
object_id (int)
provider (str)
model_name (str)
vector_size (int)
content_hash (str)
embedded_at (datetime | None)

model_config: ClassVar[ConfigDict] = {'from_attributes': True}¶: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

`lalandre_core.models.subdivisions`¶

Source: packages/lalandre_core/lalandre_core/models/subdivisions.py

Pydantic model for hierarchical subdivisions inside one act.

class lalandre_core.models.subdivisions.Subdivisions(*, id=None, act_id, version_id=None, parent_id=None, subdivision_type, number=None, title=None, content, content_hash=None, sequence_order, hierarchy_path, depth=0, created_at=None)[source]¶

Bases: BaseModel

Represent one structured subdivision extracted from an act.

Parameters:

id (int | None)
act_id (int)
version_id (int | None)
parent_id (int | None)
subdivision_type (SubdivisionType)
number (str | None)
title (str | None)
content (str)
content_hash (str | None)
sequence_order (int)
hierarchy_path (str)
depth (int)
created_at (datetime | None)

model_config: ClassVar[ConfigDict] = {'from_attributes': True}¶: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

`lalandre_core.models.subject_matters`¶

Source: packages/lalandre_core/lalandre_core/models/subject_matters.py

Pydantic model for EuroVoc subject matter records.

class lalandre_core.models.subject_matters.SubjectMatters(*, id=None, eurovoc_code, label_en, label_fr=None, parent_code=None)[source]¶

Bases: BaseModel

Represent one subject matter entry used to classify acts.

Parameters:

id (int | None)
eurovoc_code (str)
label_en (str)
label_fr (str | None)
parent_code (str | None)

model_config: ClassVar[ConfigDict] = {'from_attributes': True}¶: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

`lalandre_core.models.types`¶

Source: packages/lalandre_core/lalandre_core/models/types/__init__.py

Type enumerations

`lalandre_core.models.types.act_type`¶

Source: packages/lalandre_core/lalandre_core/models/types/act_type.py

Enumeration of legal act categories handled by the platform.

class lalandre_core.models.types.act_type.ActType(*values)[source]¶

Bases: Enum

Enumerate the supported categories of legal acts.

`lalandre_core.models.types.language_code`¶

Source: packages/lalandre_core/lalandre_core/models/types/language_code.py

Enumeration of language codes supported by the core models.

class lalandre_core.models.types.language_code.LanguageCode(*values)[source]¶

Bases: Enum

Enumerate the language codes handled by the platform.

`lalandre_core.models.types.relation_type`¶

Source: packages/lalandre_core/lalandre_core/models/types/relation_type.py

Enumeration of supported relationship types between legal acts.

class lalandre_core.models.types.relation_type.RelationType(*values)[source]¶

Bases: Enum

Types of relationships between legal acts.

`lalandre_core.models.types.subdivision_type`¶

Source: packages/lalandre_core/lalandre_core/models/types/subdivision_type.py

Enumeration of structured subdivision kinds extracted from acts.

class lalandre_core.models.types.subdivision_type.SubdivisionType(*values)[source]¶

Bases: Enum

Enumerate the subdivision types recognized by the ingestion pipeline.

`lalandre_core.models.types.version_type`¶

Source: packages/lalandre_core/lalandre_core/models/types/version_type.py

Enumeration of act version categories stored by the platform.

class lalandre_core.models.types.version_type.VersionType(*values)[source]¶

Bases: Enum

Enumerate the supported categories of act versions.

`lalandre_core.models.versions`¶

Source: packages/lalandre_core/lalandre_core/models/versions.py

Pydantic model for version records associated with one act.

class lalandre_core.models.versions.Versions(*, id=None, act_id, version_number, version_type, version_date, source_url=None, is_current=False, created_at=None)[source]¶

Bases: BaseModel

Represent one dated version of a legal act.

Parameters:

id (int | None)
act_id (int)
version_number (int)
version_type (VersionType)
version_date (datetime)
source_url (str | None)
is_current (bool)
created_at (datetime | None)

model_config: ClassVar[ConfigDict] = {'from_attributes': True}¶: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

`lalandre_core.queue`¶

Source: packages/lalandre_core/lalandre_core/queue/__init__.py

Shared Redis queue helpers for chunking, embedding, and extraction workers.

`lalandre_core.queue.dispatch_all`¶

Source: packages/lalandre_core/lalandre_core/queue/dispatch_all.py

Generic dispatch-all-acts helper shared by workers.

lalandre_core.queue.dispatch_all.dispatch_all_act_jobs(*, runtime, job_id, queue_name, job_type, label, acts, build_params, skip_filter=None, error_label='Processing')[source]¶

Iterate over acts and enqueue one per-act job with progress tracking.

Parameters:

runtime (QueueRuntime) – Queue runtime containing the Redis client and TTL policy.
job_id (str) – Parent job identifier used for status updates.
queue_name (str) – Redis queue that receives the child jobs.
job_type (str) – Job type string for the child jobs, such as "chunk_act".
label (str) – Human-readable label used in log and status messages.
acts (list[Any]) – Iterable of act objects exposing a celex attribute.
build_params (Callable[[Any], dict[str, Any]]) – Callable receiving one CELEX value and returning job params.
skip_filter (Callable[[Any], bool] | None) – Optional predicate returning True when one act should be skipped.
error_label (str) – Verb phrase used in failure messages, such as "Chunking".

Return type:

None
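
The per-act dispatch loop described by these parameters can be sketched as below. This is a simplified illustration: `dispatch_all_sketch` and its `enqueue` callable are hypothetical, it returns counters instead of updating a Redis status hash, and it follows the prose in passing the CELEX value to `build_params`.

```python
from types import SimpleNamespace
from typing import Any, Callable, Optional


def dispatch_all_sketch(
    acts: list[Any],
    *,
    enqueue: Callable[[dict[str, Any]], None],
    build_params: Callable[[Any], dict[str, Any]],
    skip_filter: Optional[Callable[[Any], bool]] = None,
) -> dict[str, int]:
    """Enqueue one job per act and return simple counters."""
    counts = {"enqueued": 0, "skipped": 0, "failed": 0}
    for act in acts:
        try:
            if skip_filter is not None and skip_filter(act):
                counts["skipped"] += 1
                continue
            enqueue(build_params(act.celex))
            counts["enqueued"] += 1
        except Exception:
            # Assumption: one bad act must not abort the whole dispatch run.
            counts["failed"] += 1
    return counts


# Demo with in-memory stand-ins for the acts and the queue.
queued: list[dict[str, Any]] = []
acts = [SimpleNamespace(celex="32022R2554"), SimpleNamespace(celex="32013R0575")]
result = dispatch_all_sketch(
    acts,
    enqueue=queued.append,
    build_params=lambda celex: {"celex": celex},
    skip_filter=lambda act: act.celex == "32013R0575",
)
```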

`lalandre_core.queue.job_queue`¶

Source: packages/lalandre_core/lalandre_core/queue/job_queue.py

Shared Redis queue helpers for worker services.

Centralizes job enqueue/dedup/status operations used by chunking, embedding, and extraction workers.

class lalandre_core.queue.job_queue.QueueRuntime(redis_client, job_ttl_seconds)[source]¶

Bases: object

Base runtime data required by queue helper functions.

Parameters:

redis_client (Any)
job_ttl_seconds (int)

class lalandre_core.queue.job_queue.JobPayload[source]¶

Bases: TypedDict

Typed representation of one serialized Redis job payload.

lalandre_core.queue.job_queue.update_job_status(runtime, job_id, status, message=None, progress=None, ttl=None)[source]¶

Update job status metadata in Redis.

Parameters:

runtime (QueueRuntime)
job_id (str)
status (str)
message (str | None)
progress (int | float | None)
ttl (int | None)

Return type:

None

lalandre_core.queue.job_queue.job_already_queued(runtime, *, queue_name, job_type, celex=None)[source]¶

Check whether a matching job is already queued.

If celex is provided, deduplication is scoped to that CELEX value.

Parameters:

runtime (QueueRuntime)
queue_name (str)
job_type (str)
celex (str | None)

Return type:

bool

lalandre_core.queue.job_queue.enqueue_job(runtime, *, queue_name, job_type, params, dedupe_celex=None)[source]¶

Push a job onto the queue and initialize its status hash, with optional deduplication.

Parameters:

runtime (QueueRuntime)
queue_name (str)
job_type (str)
params (dict[str, Any])
dedupe_celex (str | None)

Return type:

str | None

`lalandre_core.queue.reconcile`¶

Source: packages/lalandre_core/lalandre_core/queue/reconcile.py

Shared Redis lock pattern for worker auto-reconcile.

Each worker calls with_reconcile_lock to acquire a distributed Redis lock, run its domain-specific reconcile check, and release the lock.

lalandre_core.queue.reconcile.with_reconcile_lock(redis_client, lock_key, lock_ttl, action)[source]¶

Acquire a Redis NX lock, execute action, then release.

If the lock is already held, returns silently.
If Redis is unreachable, logs a warning and returns.
The lock is always released in the finally block.

Parameters:

redis_client (Any)
lock_key (str)
lock_ttl (int)
action (Callable[[], None])

Return type:

None
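
The three behaviours listed above map onto a short acquire / run / release routine. The sketch below assumes a redis-py style client whose `set(key, value, nx=True, ex=ttl)` returns a truthy value only when the key was created; `with_lock_sketch` and `InMemoryRedis` are hypothetical names, and the in-memory client exists only so the example runs without a Redis server.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger("reconcile_sketch")


def with_lock_sketch(
    redis_client: Any, lock_key: str, lock_ttl: int, action: Callable[[], None]
) -> None:
    try:
        # SET key value NX EX ttl: succeeds only when the key does not exist yet.
        acquired = redis_client.set(lock_key, "1", nx=True, ex=lock_ttl)
    except Exception as exc:  # e.g. a connection error from the client library
        logger.warning("reconcile lock unavailable: %s", exc)
        return
    if not acquired:
        return  # another worker holds the lock
    try:
        action()
    finally:
        redis_client.delete(lock_key)


class InMemoryRedis:
    """Tiny stand-in implementing only the two calls used above (TTL is ignored)."""

    def __init__(self) -> None:
        self.data: dict[str, str] = {}

    def set(self, key: str, value: str, nx: bool = False, ex: Any = None) -> Any:
        if nx and key in self.data:
            return None
        self.data[key] = value
        return True

    def delete(self, key: str) -> int:
        return 1 if self.data.pop(key, None) is not None else 0


client = InMemoryRedis()
runs: list[str] = []
with_lock_sketch(client, "reconcile:embed", 60, lambda: runs.append("ran"))
```

One caveat of this simple form: if the action outlives the TTL, the unconditional delete can remove a lock that another worker has since acquired. Storing a unique token and deleting only when it still matches avoids that.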

`lalandre_core.queue.worker_config`¶

Source: packages/lalandre_core/lalandre_core/queue/worker_config.py

Shared config accessor helpers for Redis-based workers.

lalandre_core.queue.worker_config.require_gateway_config(field)[source]¶

Read a mandatory gateway config field; raise if None.

Parameters:: field (str)
Return type:: Any

lalandre_core.queue.worker_config.get_reconcile_params(prefix)[source]¶

Return (enabled, ttl, interval) for a worker reconcile loop.

prefix is the worker kind, e.g. "embed", "chunk", "extract".

Parameters:: prefix (str)
Return type:: tuple[bool, int, int]

`lalandre_core.queue.worker_loop`¶

Source: packages/lalandre_core/lalandre_core/queue/worker_loop.py

Worker-loop utilities for Redis-based job workers.

Provides reusable building blocks (functions, not a class hierarchy).

class lalandre_core.queue.worker_loop.BaseRuntimeParams(redis_client, job_ttl_seconds, brpop_timeout_seconds)[source]¶

Bases: object

Common parameters resolved identically by every worker.

Parameters:

redis_client (Any)
job_ttl_seconds (int)
brpop_timeout_seconds (int)

lalandre_core.queue.worker_loop.resolve_base_runtime_params(*, redis_host, redis_port, job_ttl_seconds, brpop_timeout_seconds)[source]¶

Resolve the Redis client and the base tunables shared by all workers.

Parameters:

redis_host (str | None)
redis_port (int | None)
job_ttl_seconds (int | None)
brpop_timeout_seconds (int)

Return type:

BaseRuntimeParams

lalandre_core.queue.worker_loop.parse_job_payload(job_data)[source]¶

Extract (job_id, job_type, params) from a raw job dict.

Returns empty strings when required fields are missing or have the wrong type so that callers can validate cheaply.

Parameters:: job_data (dict[str, Any])
Return type:: tuple[str, str, dict[str, Any]]
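
A defensive parser of this kind might look like the sketch below. The key names `job_id`, `type`, and `params`, the empty-dict fallback for `params`, and the function name are assumptions; only the "empty strings on missing or mistyped fields" rule comes from the description above.

```python
from typing import Any


def parse_job_payload_sketch(job_data: dict[str, Any]) -> tuple[str, str, dict[str, Any]]:
    """Return (job_id, job_type, params) with safe fallbacks."""
    job_id = job_data.get("job_id")
    job_type = job_data.get("type")
    params = job_data.get("params")
    return (
        job_id if isinstance(job_id, str) else "",
        job_type if isinstance(job_type, str) else "",
        params if isinstance(params, dict) else {},
    )


parsed = parse_job_payload_sketch(
    {"job_id": "abc", "type": "chunk_act", "params": {"celex": "32022R2554"}}
)
```

A caller can then validate with a plain truthiness check such as `if not job_id or not job_type`.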

lalandre_core.queue.worker_loop.instrumented_process_job(*, runtime, job_data, dispatch, observe_execution, observe_error)[source]¶

Parse, dispatch, and instrument a single job.

Parameters:

runtime (Any) – Worker runtime instance passed to the dispatched handlers.
job_data (dict[str, Any]) – Raw deserialized job payload fetched from Redis.
dispatch (dict[str, Callable[[Any, str, dict[str, Any]], None]]) – Mapping of job_type to handler(runtime, job_id, params).
observe_execution (Callable[[...], None]) – Observer called in finally with execution metrics.
observe_error (Callable[[...], None]) – Observer called when the handler raises an exception.

Return type:

None
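
The parse, dispatch, and instrument flow can be sketched with a try / except / finally block. Everything below is illustrative: the payload keys, the keyword arguments passed to the observers, the `outcome` labels, and the choice to swallow handler exceptions are assumptions rather than documented behaviour.

```python
import time
from typing import Any, Callable


def process_job_sketch(
    *,
    runtime: Any,
    job_data: dict[str, Any],
    dispatch: dict[str, Callable[[Any, str, dict[str, Any]], None]],
    observe_execution: Callable[..., None],
    observe_error: Callable[..., None],
) -> None:
    job_id = str(job_data.get("job_id", ""))
    job_type = str(job_data.get("type", ""))
    params = job_data.get("params") or {}
    started = time.perf_counter()
    outcome = "success"
    try:
        handler = dispatch.get(job_type)
        if handler is None:
            outcome = "unknown_type"
            return
        handler(runtime, job_id, params)
    except Exception as exc:
        outcome = "error"
        observe_error(job_type=job_type, error=type(exc).__name__)
    finally:
        # Runs on every path, including the early return above.
        observe_execution(
            job_type=job_type,
            outcome=outcome,
            duration_seconds=time.perf_counter() - started,
        )


executions: list[dict[str, Any]] = []
errors: list[dict[str, Any]] = []
handled: list[tuple[str, dict[str, Any]]] = []


def _failing_handler(runtime: Any, job_id: str, params: dict[str, Any]) -> None:
    raise ValueError("bad input")


handlers = {
    "chunk_act": lambda runtime, job_id, params: handled.append((job_id, params)),
    "broken": _failing_handler,
}
for job in (
    {"job_id": "1", "type": "chunk_act", "params": {"celex": "32022R2554"}},
    {"job_id": "2", "type": "broken"},
    {"job_id": "3", "type": "nope"},
):
    process_job_sketch(
        runtime=None,
        job_data=job,
        dispatch=handlers,
        observe_execution=lambda **f: executions.append(f),
        observe_error=lambda **f: errors.append(f),
    )
```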

lalandre_core.queue.worker_loop.run_worker_loop(*, queue_name, worker_name, redis_client, brpop_timeout_seconds, process_job, reconcile_callback=None, reconcile_interval_seconds=0, reconcile_hour_start=22, reconcile_hour_end=24)[source]¶

Generic BRPOP loop shared by all workers.

Parameters:

queue_name (str) – Redis list to BRPOP from.
worker_name (str) – Human-readable worker name used in log messages.
redis_client (Any) – Synchronous Redis client backing the worker loop.
brpop_timeout_seconds (int) – Polling timeout passed to BRPOP.
process_job (Callable[[dict[str, Any]], None]) – Callback invoked with each deserialized payload.
reconcile_callback (Callable[[], None] | None) – Optional reconciliation callback executed periodically.
reconcile_interval_seconds (int) – Seconds between two reconciliation runs.
reconcile_hour_start (int) – UTC hour at which the reconciliation window opens.
reconcile_hour_end (int) – UTC hour at which the reconciliation window closes.

Return type:

None
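To make the polling and reconciliation parameters concrete, here is a bounded sketch of a BRPOP loop against an in-memory stand-in. Several things are assumptions: that payloads are JSON strings, that the reconciliation window is the half-open UTC hour range `[reconcile_hour_start, reconcile_hour_end)`, and the names `FakeRedis`, `drain_queue_sketch`, and `in_reconcile_window`, which are hypothetical. The only Redis behavior relied on is that `brpop` returns a `(queue_name, value)` pair, or `None` on timeout.

```python
import json
from datetime import datetime, timezone
from typing import Any, Callable, Optional


def in_reconcile_window(now: datetime, hour_start: int, hour_end: int) -> bool:
    """Assumed semantics: half-open UTC hour range [hour_start, hour_end)."""
    return hour_start <= now.astimezone(timezone.utc).hour < hour_end


class FakeRedis:
    """In-memory stand-in exposing only the brpop call used below."""

    def __init__(self, items: list[str]) -> None:
        self._items = list(items)

    def brpop(self, keys: str, timeout: int = 0) -> Optional[tuple[str, str]]:
        # Pop from the tail; return None when nothing is queued (timeout).
        return (keys, self._items.pop()) if self._items else None


def drain_queue_sketch(
    redis_client: Any,
    queue_name: str,
    process_job: Callable[[dict[str, Any]], None],
    max_polls: int,
) -> None:
    # Bounded for demonstration; a real worker loop would run until shutdown.
    for _ in range(max_polls):
        popped = redis_client.brpop(queue_name, timeout=1)
        if popped is None:
            continue  # timeout: poll again
        _, raw = popped
        process_job(json.loads(raw))


seen: list[dict[str, Any]] = []
client = FakeRedis(['{"job_id": "1"}', '{"job_id": "2"}'])
drain_queue_sketch(client, "jobs", seen.append, max_polls=3)
```

With the default `reconcile_hour_start=22` and `reconcile_hour_end=24`, the assumed window check would admit 22:00 through 23:59 UTC.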

`lalandre_core.queue.worker_metrics`¶

Source: packages/lalandre_core/lalandre_core/queue/worker_metrics.py

Reusable Prometheus metrics factory for Redis-based workers.

class lalandre_core.queue.worker_metrics.WorkerMetrics(observe_execution, observe_error)[source]¶

Bases: object

Pre-built Prometheus instruments + observe helpers for a worker.

Parameters:

observe_execution (Callable[[...], None])
observe_error (Callable[[...], None])

lalandre_core.queue.worker_metrics.build_worker_metrics(worker_name, valid_job_types)[source]¶

Create Prometheus counters/histograms for a worker and return observe helpers.

Parameters:

worker_name (str) – Short name used in metric names, such as "embedding".
valid_job_types (set[str]) – Whitelist of known job type labels for normalization.

Return type:

WorkerMetrics

`lalandre_core.redis_client`¶

Source: packages/lalandre_core/lalandre_core/redis_client.py

Redis client factory helpers for services.

lalandre_core.redis_client.create_sync_redis_client(*, host, port, decode_responses=True)[source]¶

Build a typed synchronous Redis client used by workers.

Parameters:

host (str)
port (int)
decode_responses (bool)

Return type:

Redis

`lalandre_core.repositories`¶

Source: packages/lalandre_core/lalandre_core/repositories/__init__.py

Repository Base Abstractions

`lalandre_core.repositories.base`¶

Source: packages/lalandre_core/lalandre_core/repositories/base/__init__.py

Base repository abstractions

`lalandre_core.repositories.base.exceptions`¶

Source: packages/lalandre_core/lalandre_core/repositories/base/exceptions.py

Repository exceptions

exception lalandre_core.repositories.base.exceptions.RepositoryError[source]¶

Bases: Exception

Base exception for repository errors

exception lalandre_core.repositories.base.exceptions.DatabaseConnectionError[source]¶

Bases: RepositoryError

Raised when connection to database fails

exception lalandre_core.repositories.base.exceptions.DatabaseOperationError[source]¶

Bases: RepositoryError

Raised when a database operation fails

`lalandre_core.repositories.base.repository`¶

Source: packages/lalandre_core/lalandre_core/repositories/base/repository.py

Base repository abstraction

class lalandre_core.repositories.base.repository.BaseRepository[source]¶

Bases: ABC

Abstract base class for all repositories

abstractmethod close()[source]¶

Close the database connection and clean up resources

abstractmethod health_check()[source]¶

Verify database connectivity and readiness

Return type:: bool
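A concrete repository has to implement both abstract methods before it can be instantiated. The sketch below uses a local stand-in ABC (`BaseRepositorySketch`) so that it runs without the package installed; `InMemoryRepository` is a hypothetical subclass, not part of the library.

```python
from abc import ABC, abstractmethod


class BaseRepositorySketch(ABC):
    """Stand-in mirroring the two abstract methods documented above."""

    @abstractmethod
    def close(self) -> None:
        """Close the connection and release resources."""

    @abstractmethod
    def health_check(self) -> bool:
        """Report whether the backing store is reachable."""


class InMemoryRepository(BaseRepositorySketch):
    """Hypothetical repository with no real database behind it."""

    def __init__(self) -> None:
        self._open = True

    def close(self) -> None:
        self._open = False

    def health_check(self) -> bool:
        return self._open


repo = InMemoryRepository()
was_healthy = repo.health_check()
repo.close()
```

Instantiating the abstract base directly raises `TypeError`, which is what forces every repository to provide both methods.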

`lalandre_core.repositories.common`¶

Source: packages/lalandre_core/lalandre_core/repositories/common/__init__.py

Common repository helpers.

`lalandre_core.repositories.common.payload_builder`¶

Source: packages/lalandre_core/lalandre_core/repositories/common/payload_builder.py

Build Qdrant payloads from JSON schemas.

class lalandre_core.repositories.common.payload_builder.PayloadBuilder(loader=None)[source]¶

Bases: object

Schema-driven payload builder.

Parameters:: loader (PayloadSchemaLoader | None)

build_subdivision_payload(subdivision_data, act_data, version_data=None, metadata=None)[source]¶

Build payload for subdivision embeddings.

Parameters:

subdivision_data (Dict[str, Any])
act_data (Dict[str, Any])
version_data (Dict[str, Any] | None)
metadata (Dict[str, str] | None)

Return type:

Dict[str, Any]

build_chunk_payload(chunk_data, subdivision_data, act_data)[source]¶

Build payload for chunk embeddings.

Parameters:

chunk_data (Dict[str, Any])
subdivision_data (Dict[str, Any])
act_data (Dict[str, Any])

Return type:

Dict[str, Any]

build_act_payload(act_data, full_text, subjects=None, metadata=None)[source]¶

Build payload for whole-act embeddings (one vector per act).

Parameters:

act_data (Dict[str, Any])
full_text (str)
subjects (list[dict[str, Any]] | None)
metadata (Dict[str, str] | None)

Return type:

Dict[str, Any]

`lalandre_core.repositories.common.schema_loader`¶

Source: packages/lalandre_core/lalandre_core/repositories/common/schema_loader.py

Load JSON payload schemas and render payloads.

class lalandre_core.repositories.common.schema_loader.PayloadSchemaLoader(schema_file=None)[source]¶

Bases: object

Loads and applies payload schemas.

Initialize loader (defaults to payload_schemas.json).

Parameters:: schema_file (str | PathLike[str] | None)

get_schema(schema_name)[source]¶

Fetch a schema by name.

Parameters:: schema_name (str)
Return type:: dict[str, object]

build_payload_from_schema(schema_name, context, transformers=None)[source]¶

Render a payload from schema + context.

Parameters:

schema_name (str)
context (dict[str, Any])
transformers (dict[str, Callable[[Any], Any]] | None)

Return type:

dict[str, Any]

`lalandre_core.runtime_values`¶

Source: packages/lalandre_core/lalandre_core/runtime_values.py

Small coercion helpers shared across service entrypoints.

lalandre_core.runtime_values.require_int(value, setting_name)[source]¶

Return an int or raise a clear configuration error.

Parameters:

value (int | None)
setting_name (str)

Return type:

int

lalandre_core.runtime_values.require_float(value, setting_name)[source]¶

Return a float or raise a clear configuration error.

Parameters:

value (float | None)
setting_name (str)

Return type:

float

lalandre_core.runtime_values.require_bool(value, setting_name)[source]¶

Return a bool or raise a clear configuration error.

Parameters:

value (bool | None)
setting_name (str)

Return type:

bool
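These helpers narrow an optional setting to a concrete type at service startup. A minimal sketch of the pattern follows; the exception type (`ValueError`) and the message wording are assumptions, since the documentation only promises "a clear configuration error".

```python
from typing import Optional


def require_int_sketch(value: Optional[int], setting_name: str) -> int:
    """Illustrative stand-in for the require_* family."""
    if value is None:
        # Exception type and message are assumptions for this sketch.
        raise ValueError(f"Missing required setting: {setting_name}")
    return int(value)


port = require_int_sketch(6379, "redis_port")

try:
    require_int_sketch(None, "job_ttl_seconds")
    message = ""
except ValueError as exc:
    message = str(exc)
```

Naming the setting in the error lets an operator fix the configuration without reading a stack trace.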

`lalandre_core.utils`¶

Source: packages/lalandre_core/lalandre_core/utils/__init__.py

Utility functions. Common helpers used across the project.

`lalandre_core.utils.api_key_pool`¶

Source: packages/lalandre_core/lalandre_core/utils/api_key_pool.py

API Key Pool Manager. Distributes API calls across multiple keys using a round-robin strategy.

class lalandre_core.utils.api_key_pool.APIKeyPool(keys)[source]¶

Bases: object

Thread-safe container for API keys with round-robin distribution.

Keys are loaded from environment variables following the pattern: BASE_VAR, BASE_VAR_2, BASE_VAR_3, …, BASE_VAR_{max_keys}

Parameters:: keys (List[str])

classmethod from_env(base_env_var='MISTRAL_API_KEY', max_keys=10, start_index=1)[source]¶

Load keys from environment variables.

Looks for:

{base_env_var} (index 1, main key)
{base_env_var}_2, …, {base_env_var}_{max_keys}

start_index and max_keys control the range: indices [start_index .. max_keys] are loaded. This allows splitting keys between services (e.g. 1-5 for RAG, 6-10 for workers).

Parameters:

base_env_var (str)
max_keys (int)
start_index (int)

Return type:

APIKeyPool

next_key()[source]¶

Return the next key in round-robin order (thread-safe).

Return type:: str
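The naming pattern and the round-robin rotation can be sketched as follows. This is not the library's code: `RoundRobinKeysSketch` is a hypothetical stand-in, `DEMO_API_KEY` is a made-up variable name, and two behaviors are assumptions, namely that unset variables are skipped and that an empty pool raises `ValueError`.

```python
import itertools
import os
import threading


class RoundRobinKeysSketch:
    """Illustrative stand-in for a thread-safe round-robin key pool."""

    def __init__(self, keys: list[str]) -> None:
        if not keys:
            # Assumption: an empty pool is treated as a configuration error.
            raise ValueError("at least one key is required")
        self._cycle = itertools.cycle(keys)
        self._lock = threading.Lock()

    @classmethod
    def from_env(cls, base_env_var: str = "MISTRAL_API_KEY",
                 max_keys: int = 10, start_index: int = 1) -> "RoundRobinKeysSketch":
        # Index 1 maps to the bare name; later indices get a numeric suffix.
        names = [
            base_env_var if index == 1 else f"{base_env_var}_{index}"
            for index in range(start_index, max_keys + 1)
        ]
        # Assumption: variables that are unset or empty are skipped.
        return cls([os.environ[name] for name in names if os.environ.get(name)])

    def next_key(self) -> str:
        with self._lock:  # serialize access to the shared iterator
            return next(self._cycle)


os.environ["DEMO_API_KEY"] = "key-one"
os.environ["DEMO_API_KEY_2"] = "key-two"
pool = RoundRobinKeysSketch.from_env("DEMO_API_KEY", max_keys=3)
order = [pool.next_key() for _ in range(3)]
```

Calling `from_env("DEMO_API_KEY", start_index=2, max_keys=3)` would load only the suffixed variables, which is how the key range can be split between services.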

`lalandre_core.utils.celex_utils`¶

Source: packages/lalandre_core/lalandre_core/utils/celex_utils.py

CELEX Utility Functions

lalandre_core.utils.celex_utils.normalize_celex(celex)[source]¶

Normalize an identifier to the standard CELEX format. Accepts various input formats, removes spaces, and converts EUR-Lex citation forms.

Examples

>>> normalize_celex('32016R0679')
'32016R0679'
>>> normalize_celex(' 32016 R 0679 ')
'32016R0679'
>>> normalize_celex('(UE) 2016/679')
'32016R0679'
>>> normalize_celex('(CE) n° 1219/2011')
'32011R1219'
>>> normalize_celex('Directive 2003/41/CE')
'32003L0041'
>>> normalize_celex('AMF-RG-L1-20250331')
'AMF-RG-L1-20250331'
>>> normalize_celex('AMF-SANCTION-SanctionAMF2026-01-20260112')
'AMF-SAN-2026-01'

Parameters:: celex (str)
Return type:: str

lalandre_core.utils.celex_utils.is_eurlex_celex(celex)[source]¶

Return True iff celex follows the EUR-Lex standard format.

EUR-Lex CELEXes start with a sector digit followed by the 4-digit year and a document-type letter (e.g. 32016R0679). All other sources (AMF-, EBA-, EIOPA-, ESMA-, LEGITEXT…) use alphabetical prefixes.

Parameters:: celex (str)
Return type:: bool

lalandre_core.utils.celex_utils.is_legifrance_celex(celex)[source]¶

Return True iff celex identifies a Légifrance document.

Légifrance CELEXes start with LEGITEXT (e.g. LEGITEXT000006072026 or LEGITEXT000006072026:LEGISCTA000006154980).

Parameters:: celex (str)
Return type:: bool
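The two prefix rules described above can be expressed as short checks. The regular expression below is a guess at the "sector digit, 4-digit year, document-type letter" rule and may be looser or stricter than the library's; both function names are hypothetical.

```python
import re

# Sector digit, four-digit year, then an uppercase document-type letter.
_EURLEX_PREFIX = re.compile(r"^\d\d{4}[A-Z]")


def looks_like_eurlex_celex(celex: str) -> bool:
    """Approximation of the EUR-Lex format rule described above."""
    return bool(_EURLEX_PREFIX.match(celex.strip()))


def looks_like_legifrance_celex(celex: str) -> bool:
    """Légifrance identifiers are described as starting with LEGITEXT."""
    return celex.strip().startswith("LEGITEXT")


samples = ["32016R0679", "AMF-RG-L1-20250331", "LEGITEXT000006072026"]
classified = [
    (looks_like_eurlex_celex(s), looks_like_legifrance_celex(s)) for s in samples
]
```

Because every non-EUR-Lex source uses an alphabetical prefix, checking the first character class is enough to separate the two families in these samples.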

lalandre_core.utils.celex_utils.is_valid_celex(celex)[source]¶

Return True if celex looks like a recognisable CELEX identifier.

Parameters:: celex (str)
Return type:: bool

`lalandre_core.utils.collection_utils`¶

Source: packages/lalandre_core/lalandre_core/utils/collection_utils.py

Collection utilities. Helpers for de-duplicating lists of dictionaries.

lalandre_core.utils.collection_utils.deduplicate_dicts_by_id(items, id_key='id')[source]¶

Return items with duplicates removed based on a single key.

Parameters:

items (Iterable[Dict[str, Any] | None])
id_key (str)

Return type:

List[Dict[str, Any]]

lalandre_core.utils.collection_utils.deduplicate_dicts_by_tuple_key(items, keys)[source]¶

Return items with duplicates removed based on a tuple of keys.

Parameters:

items (Iterable[Dict[str, Any] | None])
keys (Sequence[str])

Return type:

List[Dict[str, Any]]
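Both helpers follow the same "remember what was seen" pattern. In the sketch below, keeping the first occurrence, preserving input order, and skipping `None` entries are assumptions inferred from the signatures, and key values are assumed to be hashable.

```python
from typing import Any, Iterable, Optional, Sequence


def dedupe_by_keys_sketch(
    items: Iterable[Optional[dict[str, Any]]],
    keys: Sequence[str],
) -> list[dict[str, Any]]:
    """Illustrative stand-in covering both the single-key and tuple-key cases."""
    seen: set[tuple[Any, ...]] = set()
    unique: list[dict[str, Any]] = []
    for item in items:
        if item is None:
            continue  # assumption: None entries are dropped
        marker = tuple(item.get(key) for key in keys)
        if marker in seen:
            continue  # assumption: the first occurrence wins
        seen.add(marker)
        unique.append(item)
    return unique


rows = [
    {"id": 1, "lang": "fr"},
    None,
    {"id": 1, "lang": "en"},
    {"id": 1, "lang": "fr"},
]
by_id = dedupe_by_keys_sketch(rows, ["id"])
by_id_and_lang = dedupe_by_keys_sketch(rows, ["id", "lang"])
```

De-duplicating on `id` alone keeps one row, while the `(id, lang)` tuple keeps the French and English rows as distinct entries.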

`lalandre_core.utils.date_utils`¶

Source: packages/lalandre_core/lalandre_core/utils/date_utils.py

Date Utility Functions. Centralized date formatting and conversion utilities.

lalandre_core.utils.date_utils.format_date(date_value)[source]¶

Format a date to ISO 8601 string format

Handles multiple input types:

datetime objects
date objects
ISO strings (pass-through)
None (returns None)

Parameters:: date_value (Any) – Date to format (datetime, date, str, or None)
Returns:: ISO format string (YYYY-MM-DD or YYYY-MM-DDTHH:MM:SS) or None

Examples

>>> format_date(datetime(2016, 4, 27))
'2016-04-27T00:00:00'
>>> format_date(date(2016, 4, 27))
'2016-04-27'
>>> format_date("2016-04-27")
'2016-04-27'
>>> format_date(None)
None
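The behavior shown in these examples is what `isoformat()` produces for `datetime` and `date` objects. A minimal sketch consistent with them follows; it is not the library's implementation, and converting other values with `str()` is an assumption.

```python
from datetime import date, datetime
from typing import Any, Optional


def format_date_sketch(date_value: Any) -> Optional[str]:
    """Illustrative stand-in matching the documented examples."""
    if date_value is None:
        return None
    if isinstance(date_value, (datetime, date)):
        # datetime -> 'YYYY-MM-DDTHH:MM:SS', date -> 'YYYY-MM-DD'
        return date_value.isoformat()
    return str(date_value)  # ISO strings pass through unchanged


formatted = [
    format_date_sketch(datetime(2016, 4, 27)),
    format_date_sketch(date(2016, 4, 27)),
    format_date_sketch("2016-04-27"),
    format_date_sketch(None),
]
```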

lalandre_core.utils.date_utils.to_timestamp(date_value)[source]¶

Convert a date to Unix timestamp (seconds since epoch)

Handles the same input types as format_date.

Parameters:: date_value (Any) – Date to convert (datetime, date, str, or None)
Returns:: Unix timestamp (int) or None
Return type:: int | None

Examples

>>> to_timestamp(datetime(2016, 4, 27, 12, 0, 0))
1461758400  # (approximate, depends on timezone)
>>> to_timestamp("2016-04-27")
1461715200
>>> to_timestamp(None)
None

lalandre_core.utils.date_utils.convert_dates_to_strings(props, date_fields)[source]¶

Convert datetime objects to strings in a dictionary for database storage

Useful for Neo4j, or other databases that require date strings. Mutates the input dictionary in place.

Parameters:

props (dict[str, Any]) – Dictionary of properties (will be modified)
date_fields (list[str]) – List of field names that contain dates

Returns:

Modified properties dict with dates as ISO format strings

Return type:

dict[str, Any]

Examples

>>> data = {'created_at': datetime(2024, 1, 1), 'name': 'Test'}
>>> convert_dates_to_strings(data, ['created_at'])
{'created_at': '2024-01-01T00:00:00', 'name': 'Test'}

`lalandre_core.utils.metrics_utils`¶

Source: packages/lalandre_core/lalandre_core/utils/metrics_utils.py

Shared Prometheus metric helpers reused across services.

lalandre_core.utils.metrics_utils.status_class(status_code)[source]¶

Collapse an HTTP status code into its class label such as 2xx.

Parameters:: status_code (int)
Return type:: str
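Collapsing status codes keeps metric label cardinality low. One way to do it is integer division by 100, sketched below; how the real helper treats codes outside the 100 to 599 range is not documented, so this sketch does not handle them specially.

```python
def status_class_sketch(status_code: int) -> str:
    """Illustrative stand-in: 204 -> '2xx', 503 -> '5xx'."""
    return f"{status_code // 100}xx"


labels = [status_class_sketch(code) for code in (200, 204, 404, 503)]
```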

lalandre_core.utils.metrics_utils.normalize_label(value)[source]¶

Normalize arbitrary metric label values into a lowercase token.

Parameters:: value (Any)
Return type:: str

lalandre_core.utils.metrics_utils.normalize_search_mode(mode)[source]¶

Normalize a search mode label to one of the supported metric values.

Parameters:: mode (str | None)
Return type:: str

lalandre_core.utils.metrics_utils.normalize_granularity(granularity)[source]¶

Normalize a granularity label to one of the metric-safe values.

Parameters:: granularity (str | None)
Return type:: str

lalandre_core.utils.metrics_utils.classify_error(exc_or_reason)[source]¶

Classify an error into (provider, error_type) for metrics labeling.

Parameters:: exc_or_reason (Any)
Return type:: tuple[str, str]

`lalandre_core.utils.mode_aliases`¶

Source: packages/lalandre_core/lalandre_core/utils/mode_aliases.py

RAG query mode aliases.

Single source of truth for legacy mode names → canonical mode mapping. Used by both api-gateway and rag-service to resolve mode aliases consistently.

lalandre_core.utils.mode_aliases.resolve_mode_alias(mode)[source]¶

Return the canonical mode name, resolving legacy aliases.

Parameters:: mode (str)
Return type:: str

`lalandre_core.utils.parse_utils`¶

Source: packages/lalandre_core/lalandre_core/utils/parse_utils.py

Generic parsing utilities.

lalandre_core.utils.parse_utils.extract_json_object(text)[source]¶

Try to extract a JSON object from text (which may contain markdown fences).

Returns the first valid dict found, or None.

Parameters:: text (str)
Return type:: Dict[str, Any] | None
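LLM responses often wrap JSON in prose or markdown fences, which is the case this helper targets. The sketch below shows one possible approach, scanning for each `{` and trying `json.JSONDecoder.raw_decode` from that position; the library may use a different strategy.

```python
import json
from typing import Any, Optional


def extract_json_object_sketch(text: str) -> Optional[dict[str, Any]]:
    """Illustrative stand-in: return the first decodable JSON object, else None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, dict):
            return value
        start = text.find("{", start + 1)  # try the next candidate brace
    return None


fence = "`" * 3  # built at runtime to keep this example's own fence intact
reply = f'Here is the plan:\n{fence}json\n{{"mode": "rag", "k": 3}}\n{fence}\nDone.'
parsed = extract_json_object_sketch(reply)
```

Because decoding starts at the brace itself, the surrounding fence markers and prose never reach the JSON parser.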

lalandre_core.utils.parse_utils.as_dict(value)[source]¶

Return value if it is a dict, else an empty dict.

Parameters:: value (Any)
Return type:: Dict[str, Any]

lalandre_core.utils.parse_utils.as_optional_dict(value)[source]¶

Return value if it is a dict, else None.

Parameters:: value (Any)
Return type:: Dict[str, Any] | None

lalandre_core.utils.parse_utils.as_str(value, *, default='')[source]¶

Coerce value to str.

Parameters:

value (Any)
default (str)

Return type:

str

lalandre_core.utils.parse_utils.as_document_list(value)[source]¶

Return only the dict items from value (must be a list).

Parameters:: value (Any)
Return type:: List[Dict[str, Any]]

lalandre_core.utils.parse_utils.to_optional_int(value)[source]¶

Convert value to int when possible, else None.

Parameters:: value (Any)
Return type:: int | None

lalandre_core.utils.parse_utils.sanitize_error_text(error, *, max_chars=220)[source]¶

Return a truncated, safe string representation of error.

Parameters:

error (Exception)
max_chars (int)

Return type:

str

lalandre_core.utils.parse_utils.coerce_bool(value, default)[source]¶

Return value if it is already a bool, else default.

Parameters:

value (Any)
default (bool)

Return type:

bool

lalandre_core.utils.parse_utils.coerce_float(value, default)[source]¶

Return value cast to float when it is numeric, else default.

Parameters:

value (Any)
default (float)

Return type:

float
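The coercion helpers fall back to a default instead of raising. The sketch below illustrates that pattern under two assumptions the documentation does not settle: that "numeric" means `int` or `float`, and that `bool` values are not treated as numeric even though `bool` subclasses `int` in Python.

```python
from typing import Any


def coerce_bool_sketch(value: Any, default: bool) -> bool:
    """Only real bools pass; strings like 'true' fall back to the default."""
    return value if isinstance(value, bool) else default


def coerce_float_sketch(value: Any, default: float) -> float:
    if isinstance(value, bool):
        return default  # assumption: True/False are not numeric settings
    if isinstance(value, (int, float)):
        return float(value)
    return default  # assumption: numeric strings are not parsed


results = (
    coerce_bool_sketch("true", False),
    coerce_bool_sketch(True, False),
    coerce_float_sketch(3, 0.5),
    coerce_float_sketch("3", 0.5),
)
```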

`lalandre_core.utils.regulatory_level`¶

Source: packages/lalandre_core/lalandre_core/utils/regulatory_level.py

Regulatory level inference from act metadata.

EU financial regulation follows a 3-level hierarchy:

1 (L1): Framework legislation (Regulations, Directives) adopted by Parliament/Council
2 (L2): Delegated/implementing acts (RTS, ITS) adopted by the Commission
3 (L3): Supervisory guidance (Guidelines, Q&A, Recommendations) by the ESAs (EBA/ESMA/EIOPA)

lalandre_core.utils.regulatory_level.infer_regulatory_level(celex, act_type, title=None, form_number=None)[source]¶

Infer regulatory level from act metadata.

Returns 1 (L1), 2 (L2), 3 (L3), or None (outside scope).

Parameters:

celex (str)
act_type (str)
title (str | None)
form_number (str | None)

Return type:

int | None

lalandre_core.utils.regulatory_level.level_to_label(level)[source]¶

Convert numeric level to display label: 1→'L1', 2→'L2', 3→'L3'.

Parameters:: level (int | None)
Return type:: str | None
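The label mapping is small enough to sketch directly. Returning `None` for any value other than 1, 2, or 3 is an assumption based on the `str | None` return type.

```python
from typing import Optional


def level_to_label_sketch(level: Optional[int]) -> Optional[str]:
    """Illustrative stand-in: map 1, 2, 3 to 'L1', 'L2', 'L3'."""
    return f"L{level}" if level in (1, 2, 3) else None


labels = [level_to_label_sketch(level) for level in (1, 2, 3, None)]
```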

`lalandre_core.utils.shared_key_pool`¶

Source: packages/lalandre_core/lalandre_core/utils/shared_key_pool.py

Helpers for dispatching calls across a shared API key pool.

lalandre_core.utils.shared_key_pool.build_clients_by_key(*, key_pool, factory)[source]¶

Build one client instance per key in the shared pool.

Parameters:

key_pool (APIKeyPool)
factory (Callable[[str], T])

Return type:

Dict[str, T]

class lalandre_core.utils.shared_key_pool.SharedKeyPoolProxy(*, key_pool, clients_by_key)[source]¶

Bases: object

Delegate each callable access to the next client selected by key_pool.

Parameters:

key_pool (APIKeyPool)
clients_by_key (Mapping[str, Any])
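One way to build such a proxy is with `__getattr__`, sketched below. Everything here is a stand-in: `CyclingPool`, `EchoClient`, and `KeyRotatingProxySketch` are hypothetical names, and choosing the key at call time (not at attribute lookup) is an assumption about the real class.

```python
import itertools
from typing import Any, Callable, Mapping


class CyclingPool:
    """Stand-in for a key pool; only next_key() is needed here."""

    def __init__(self, keys: list[str]) -> None:
        self.keys = list(keys)
        self._cycle = itertools.cycle(self.keys)

    def next_key(self) -> str:
        return next(self._cycle)


class EchoClient:
    """Hypothetical API client bound to a single key."""

    def __init__(self, key: str) -> None:
        self.key = key

    def whoami(self) -> str:
        return self.key


class KeyRotatingProxySketch:
    def __init__(self, *, key_pool: CyclingPool, clients_by_key: Mapping[str, Any]) -> None:
        self._key_pool = key_pool
        self._clients_by_key = clients_by_key

    def __getattr__(self, name: str) -> Callable[..., Any]:
        # Only reached for names not found on the proxy itself.
        def call(*args: Any, **kwargs: Any) -> Any:
            client = self._clients_by_key[self._key_pool.next_key()]
            return getattr(client, name)(*args, **kwargs)

        return call


pool = CyclingPool(["key-a", "key-b"])
clients = {key: EchoClient(key) for key in pool.keys}  # one client per key
proxy = KeyRotatingProxySketch(key_pool=pool, clients_by_key=clients)
used = [proxy.whoami() for _ in range(3)]
```

Callers use the proxy as if it were a single client, and successive calls are spread across the per-key clients.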

`lalandre_core.utils.text_utils`¶

Source: packages/lalandre_core/lalandre_core/utils/text_utils.py

Text normalization utilities.

lalandre_core.utils.text_utils.strip_accents(text)[source]¶

Remove diacritics (accents) from text via NFKD decomposition.

Parameters:: text (str)
Return type:: str

lalandre_core.utils.text_utils.normalize_text(text)[source]¶

Lowercase, strip accents, collapse whitespace, remove special chars.

Keeps: word characters, whitespace, slashes, parens, dots, colons, hyphens.

Parameters:: text (str)
Return type:: str
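Both text helpers can be approximated with the standard library. `strip_accents_sketch` uses NFKD decomposition as described; in `normalize_text_sketch`, the order of the steps and the choice to replace disallowed characters with a space are assumptions.

```python
import re
import unicodedata


def strip_accents_sketch(text: str) -> str:
    """Decompose characters, then drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize_text_sketch(text: str) -> str:
    lowered = strip_accents_sketch(text.lower())
    # Keep word characters, whitespace, slashes, parens, dots, colons, hyphens.
    kept = re.sub(r"[^\w\s/().:\-]", " ", lowered)
    return re.sub(r"\s+", " ", kept).strip()  # collapse runs of whitespace


plain = strip_accents_sketch("Légifrance")
normalized = normalize_text_sketch("  Règlement (UE)  2016/679 !")
```

Keeping slashes and parentheses means citation forms such as `(ue) 2016/679` survive normalization intact.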
