Core API

Note

This page is generated automatically from the repository’s maintained Python module inventory.

Shared runtime configuration, repositories, HTTP helpers, queue primitives, and utilities.

lalandre_core

Source: packages/lalandre_core/lalandre_core/__init__.py

Core shared utilities for Lalandre services.

lalandre_core.config

Source: packages/lalandre_core/lalandre_core/config.py

Configuration management for the project.

class lalandre_core.config.EnvSettings(_case_sensitive=None, _nested_model_default_partial_update=None, _env_prefix=None, _env_prefix_target=None, _env_file=PosixPath('.'), _env_file_encoding=None, _env_ignore_empty=None, _env_nested_delimiter=None, _env_nested_max_split=None, _env_parse_none_str=None, _env_parse_enums=None, _cli_prog_name=None, _cli_parse_args=None, _cli_settings_source=None, _cli_parse_none_str=None, _cli_hide_none_type=None, _cli_avoid_json=None, _cli_enforce_required=None, _cli_use_class_docs_for_groups=None, _cli_exit_on_error=None, _cli_prefix=None, _cli_flag_prefix_char=None, _cli_implicit_flags=None, _cli_ignore_unknown_args=None, _cli_kebab_case=None, _cli_shortcuts=None, _secrets_dir=None, _build_sources=None, *, APP_CONFIG_FILE=None, APP_CONFIG_OVERRIDE_FILE=None, GATEWAY_ALLOWED_ORIGINS=None, DB_PASSWORD=None, QDRANT_API_KEY=None, NEO4J_PASSWORD=None, LLM_API_KEY=None, SEARCH_INTENT_PARSER_API_KEY=None, MISTRAL_API_KEY=None, MISTRAL_API_KEY_2=None, MISTRAL_API_KEY_3=None, MISTRAL_API_KEY_4=None, MISTRAL_API_KEY_5=None, MISTRAL_API_KEY_6=None, MISTRAL_API_KEY_7=None, MISTRAL_API_KEY_8=None, MISTRAL_API_KEY_9=None, MISTRAL_API_KEY_10=None)[source]

Bases: BaseSettings

Environment-backed settings loaded before the YAML application config.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • _case_sensitive (bool | None)

  • _nested_model_default_partial_update (bool | None)

  • _env_prefix (str | None)

  • _env_prefix_target (EnvPrefixTarget | None)

  • _env_file (DotenvType | None)

  • _env_file_encoding (str | None)

  • _env_ignore_empty (bool | None)

  • _env_nested_delimiter (str | None)

  • _env_nested_max_split (int | None)

  • _env_parse_none_str (str | None)

  • _env_parse_enums (bool | None)

  • _cli_prog_name (str | None)

  • _cli_parse_args (bool | list[str] | tuple[str, ...] | None)

  • _cli_settings_source (CliSettingsSource[Any] | None)

  • _cli_parse_none_str (str | None)

  • _cli_hide_none_type (bool | None)

  • _cli_avoid_json (bool | None)

  • _cli_enforce_required (bool | None)

  • _cli_use_class_docs_for_groups (bool | None)

  • _cli_exit_on_error (bool | None)

  • _cli_prefix (str | None)

  • _cli_flag_prefix_char (str | None)

  • _cli_implicit_flags (bool | Literal['dual', 'toggle'] | None)

  • _cli_ignore_unknown_args (bool | None)

  • _cli_kebab_case (bool | Literal['all', 'no_enums'] | None)

  • _cli_shortcuts (Mapping[str, str | list[str]] | None)

  • _secrets_dir (PathType | None)

  • _build_sources (tuple[tuple[PydanticBaseSettingsSource, ...], dict[str, Any]] | None)

  • APP_CONFIG_FILE (str | None)

  • APP_CONFIG_OVERRIDE_FILE (str | None)

  • GATEWAY_ALLOWED_ORIGINS (str | None)

  • DB_PASSWORD (str | None)

  • QDRANT_API_KEY (str | None)

  • NEO4J_PASSWORD (str | None)

  • LLM_API_KEY (str | None)

  • SEARCH_INTENT_PARSER_API_KEY (str | None)

  • MISTRAL_API_KEY (str | None)

  • MISTRAL_API_KEY_2 (str | None)

  • MISTRAL_API_KEY_3 (str | None)

  • MISTRAL_API_KEY_4 (str | None)

  • MISTRAL_API_KEY_5 (str | None)

  • MISTRAL_API_KEY_6 (str | None)

  • MISTRAL_API_KEY_7 (str | None)

  • MISTRAL_API_KEY_8 (str | None)

  • MISTRAL_API_KEY_9 (str | None)

  • MISTRAL_API_KEY_10 (str | None)

model_config: ClassVar[SettingsConfigDict] = {'arbitrary_types_allowed': True, 'case_sensitive': False, 'cli_avoid_json': False, 'cli_enforce_required': False, 'cli_exit_on_error': True, 'cli_flag_prefix_char': '-', 'cli_hide_none_type': False, 'cli_ignore_unknown_args': False, 'cli_implicit_flags': False, 'cli_kebab_case': False, 'cli_parse_args': None, 'cli_parse_none_str': None, 'cli_prefix': '', 'cli_prog_name': None, 'cli_shortcuts': None, 'cli_use_class_docs_for_groups': False, 'enable_decoding': True, 'env_file': '.env', 'env_file_encoding': 'utf-8', 'env_ignore_empty': True, 'env_nested_delimiter': None, 'env_nested_max_split': None, 'env_parse_enums': None, 'env_parse_none_str': None, 'env_prefix': '', 'env_prefix_target': 'variable', 'extra': 'ignore', 'json_file': None, 'json_file_encoding': None, 'nested_model_default_partial_update': False, 'protected_namespaces': ('model_validate', 'model_dump', 'settings_customise_sources'), 'secrets_dir': None, 'toml_file': None, 'validate_default': True, 'yaml_config_section': None, 'yaml_file': None, 'yaml_file_encoding': None}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

class lalandre_core.config.DatabaseConfig(*, host=None, port=None, database=None, user=None, password=None)[source]

Bases: BaseModel

PostgreSQL database configuration.

Parameters:
  • host (str | None)

  • port (int | None)

  • database (str | None)

  • user (str | None)

  • password (str | None)

property connection_string: str

Return a PostgreSQL connection string for the configured database.
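
A minimal sketch of how such a property might assemble the string, using a stdlib dataclass as a stand-in for the real Pydantic model. The postgresql://user:password@host:port/database layout and the percent-encoding of credentials are assumptions; this page does not show the exact format that DatabaseConfig returns.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass
class DatabaseConfigSketch:
    """Hypothetical stand-in for lalandre_core.config.DatabaseConfig."""

    host: str
    port: int
    database: str
    user: str
    password: str

    @property
    def connection_string(self) -> str:
        # Assumed layout: postgresql://user:password@host:port/database.
        # Credentials are percent-encoded so characters like "@" stay unambiguous.
        user = quote(self.user, safe="")
        password = quote(self.password, safe="")
        return f"postgresql://{user}:{password}@{self.host}:{self.port}/{self.database}"


cfg = DatabaseConfigSketch("localhost", 5432, "lalandre", "app", "p@ss")
dsn = cfg.connection_string
```

In application code the string would more likely be read through get_postgres_connection_string() than built by hand.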

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

class lalandre_core.config.VectorConfig(*, host=None, port=None, api_key=None, collection_chunks=None, collection_acts=None, vector_size=1024, timeout=30, use_https=False)[source]

Bases: BaseModel

Qdrant configuration.

Parameters:
  • host (str | None)

  • port (int | None)

  • api_key (str | None)

  • collection_chunks (str | None)

  • collection_acts (str | None)

  • vector_size (int)

  • timeout (int)

  • use_https (bool)

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

class lalandre_core.config.GraphConfig(*, uri=None, user=None, password=None, database=None, max_connection_lifetime=3600, max_connection_pool_size=50, connection_timeout=30, strict_mode=False, acts_limit=10, relationships_limit=20, depth=2, cypher_timeout_seconds=30.0, cypher_max_rows=80, ranking_relation_weights=<factory>, ranking_default_relation_weight=0.3, community_relation_weights=<factory>, community_default_relation_weight=0.5, ranking_hop_decay=0.5, ranking_semantic_boost=0.3, ranking_relation_weight_factor=0.25, budget_semantic_share=0.6, budget_graph_share=0.3, budget_relation_share=0.1, map_reduce_threshold=24000, map_reduce_chunk_chars=5000, map_reduce_max_parallel=3, map_reduce_map_timeout=45.0, map_reduce_reduce_timeout=50.0, expansion_relation_types=<factory>, expansion_max_related_per_node=50, expansion_max_relationships_per_node=100, use_graph_in_rag=True, hybrid_enrichment_depth=2, use_communities_in_rag=True, community_central_act_title_chars=60, community_central_acts_display=3)[source]

Bases: BaseModel

Neo4j configuration.

Parameters:
  • uri (str | None)

  • user (str | None)

  • password (str | None)

  • database (str | None)

  • max_connection_lifetime (int)

  • max_connection_pool_size (int)

  • connection_timeout (int)

  • strict_mode (bool)

  • acts_limit (int)

  • relationships_limit (int)

  • depth (int)

  • cypher_timeout_seconds (float)

  • cypher_max_rows (int)

  • ranking_relation_weights (Dict[str, float])

  • ranking_default_relation_weight (float)

  • community_relation_weights (Dict[str, float])

  • community_default_relation_weight (float)

  • ranking_hop_decay (float)

  • ranking_semantic_boost (float)

  • ranking_relation_weight_factor (float)

  • budget_semantic_share (float)

  • budget_graph_share (float)

  • budget_relation_share (float)

  • map_reduce_threshold (int)

  • map_reduce_chunk_chars (int)

  • map_reduce_max_parallel (int)

  • map_reduce_map_timeout (float)

  • map_reduce_reduce_timeout (float)

  • expansion_relation_types (list[str])

  • expansion_max_related_per_node (int)

  • expansion_max_relationships_per_node (int)

  • use_graph_in_rag (bool)

  • hybrid_enrichment_depth (int)

  • use_communities_in_rag (bool)

  • community_central_act_title_chars (int)

  • community_central_acts_display (int)

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

class lalandre_core.config.TokenLimitsConfig(*, embedding_max_input_tokens=8192, chars_per_token=3.3, embedding_safety_ratio=0.9)[source]

Bases: BaseModel

Token limits for API models.

Parameters:
  • embedding_max_input_tokens (int)

  • chars_per_token (float)

  • embedding_safety_ratio (float)

property embedding_max_chars: int

Max characters for embedding based on token limit.
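
The character budget can plausibly be derived from the three fields above. The sketch below assumes the formula is token limit × characters per token × safety ratio, truncated to an integer; the real property may round or combine the fields differently.

```python
from dataclasses import dataclass


@dataclass
class TokenLimitsSketch:
    """Hypothetical stand-in mirroring the documented defaults."""

    embedding_max_input_tokens: int = 8192
    chars_per_token: float = 3.3
    embedding_safety_ratio: float = 0.9

    @property
    def embedding_max_chars(self) -> int:
        # Assumed formula: token limit * average chars per token * safety margin.
        budget = self.embedding_max_input_tokens * self.chars_per_token
        return int(budget * self.embedding_safety_ratio)


limits = TokenLimitsSketch()
max_chars = limits.embedding_max_chars
```

Under that assumption, the documented defaults give a budget of roughly 24,000 characters per embedding request.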

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

class lalandre_core.config.EmbeddingPresetConfig(*, preset_id, provider, model_name, device='cpu', label, enabled=True, indexing_enabled=True, queue_name=None, vector_size=1024)[source]

Bases: BaseModel

Named embedding runtime preset used for indexing and query-time routing.

Parameters:
  • preset_id (str)

  • provider (str)

  • model_name (str)

  • device (str)

  • label (str)

  • enabled (bool)

  • indexing_enabled (bool)

  • queue_name (str | None)

  • vector_size (int)

resolved_queue_name()[source]

Return the queue name used by the embedding worker for this preset.

Return type:

str
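
Since queue_name is optional, the method presumably falls back to a name derived from the preset when none is set. A sketch of that fallback, where the "embed_queue:" prefix is a placeholder and not the project's actual naming scheme:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EmbeddingPresetSketch:
    """Hypothetical stand-in for EmbeddingPresetConfig (two fields only)."""

    preset_id: str
    queue_name: Optional[str] = None

    def resolved_queue_name(self) -> str:
        # An explicit queue name wins; otherwise derive one from the preset ID.
        # The "embed_queue:" prefix is illustrative only.
        if self.queue_name:
            return self.queue_name
        return f"embed_queue:{self.preset_id}"


derived = EmbeddingPresetSketch("mistral").resolved_queue_name()
explicit = EmbeddingPresetSketch("mistral", queue_name="custom").resolved_queue_name()
```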

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

class lalandre_core.config.EmbeddingConfig(*, provider=None, model_name=None, batch_size=None, device=None, cache_dir=None, normalize_embeddings=True, enable_cache=True, cache_max_size=10000, redis_socket_timeout=2, cache_ttl_seconds=604800, retry_min_tokens=64, retry_fallback_threshold=96, retry_reduction_factor=0.7)[source]

Bases: BaseModel

Embedding model configuration.

Parameters:
  • provider (str | None)

  • model_name (str | None)

  • batch_size (int | None)

  • device (str | None)

  • cache_dir (str | None)

  • normalize_embeddings (bool)

  • enable_cache (bool)

  • cache_max_size (int)

  • redis_socket_timeout (int)

  • cache_ttl_seconds (int)

  • retry_min_tokens (int)

  • retry_fallback_threshold (int)

  • retry_reduction_factor (float)

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

class lalandre_core.config.ChunkingEmbeddingConfig(*, provider='mistral', model_name='mistral-embed', device='cpu')[source]

Bases: BaseModel

Embedding runtime used internally by the chunking algorithm.

Parameters:
  • provider (str)

  • model_name (str)

  • device (str)

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

class lalandre_core.config.SearchConfig(*, default_limit=10, default_mode='rag', default_search_mode='hybrid', default_granularity='all', default_embedding_preset='mistral', fulltext_language=None, bm25_normalization=32, fusion_method=None, lexical_weight=None, semantic_weight=None, candidate_multiplier=None, min_candidates=None, max_candidates=200, semantic_per_collection_oversampling=1.25, hnsw_ef=None, exact_search=False, query_expansion_enabled=True, query_expansion_max_variants=3, query_expansion_min_query_chars=24, intent_parser_enabled=False, intent_parser_provider=None, intent_parser_model=None, intent_parser_base_url=None, intent_parser_api_key=None, intent_parser_timeout_seconds=20.0, intent_parser_temperature=0.0, intent_parser_max_output_tokens=180, rerank_enabled=True, rerank_model='BAAI/bge-reranker-v2-m3', rerank_device='cpu', rerank_batch_size=4, rerank_max_candidates=5, rerank_max_chars=256, rerank_cache_dir=None, rerank_service_url=None, rerank_service_timeout_seconds=15.0, rerank_fallback_to_skip=True, rerank_circuit_failure_threshold=2, rerank_circuit_cooldown_seconds=30.0, score_threshold_default=0.15, relevance_gate_threshold=0.35, max_lexical_query_chars=200, fts_max_lexemes=12, dynamic_fusion_enabled=True, lexical_boost_factor=1.8, lexical_boost_max=0.75, result_cache_ttl_seconds=300, query_router_broad_query_min_chars=220, query_router_global_overview_min_top_k=10, query_router_citation_precision_min_top_k=7, query_router_relationship_focus_min_top_k=8, query_router_contextual_default_min_top_k=6, fusion_rrf_k=60, query_expansion_max_variants_cap=8, query_expansion_abbreviation_weight=0.96, query_expansion_keyword_focus_weight=0.92, query_expansion_reference_focus_weight=0.9, query_expansion_bilingual_weight=0.88, adaptive_score_drop_threshold=0.15, complementary_max_queries=2, complementary_top_k=5, compression_threshold_ratio=1.3, mmr_enabled=True, mmr_max_per_act=2, crag_enabled=False, crag_max_iterations=1, 
crag_skip_score_threshold=0.82, max_parallel_workers=4, query_parser_max_top_k=40, intent_parser_min_output_tokens=80, summary_min_chars=50)[source]

Bases: BaseModel

Search configuration.

Parameters:
  • default_limit (int)

  • default_mode (str)

  • default_search_mode (str)

  • default_granularity (str)

  • default_embedding_preset (str)

  • fulltext_language (str | None)

  • bm25_normalization (int)

  • fusion_method (str | None)

  • lexical_weight (float | None)

  • semantic_weight (float | None)

  • candidate_multiplier (float | None)

  • min_candidates (int | None)

  • max_candidates (int)

  • semantic_per_collection_oversampling (float)

  • hnsw_ef (int | None)

  • exact_search (bool)

  • query_expansion_enabled (bool)

  • query_expansion_max_variants (int)

  • query_expansion_min_query_chars (int)

  • intent_parser_enabled (bool)

  • intent_parser_provider (str | None)

  • intent_parser_model (str | None)

  • intent_parser_base_url (str | None)

  • intent_parser_api_key (str | None)

  • intent_parser_timeout_seconds (float)

  • intent_parser_temperature (float)

  • intent_parser_max_output_tokens (int)

  • rerank_enabled (bool)

  • rerank_model (str)

  • rerank_device (str)

  • rerank_batch_size (int)

  • rerank_max_candidates (int)

  • rerank_max_chars (int)

  • rerank_cache_dir (str | None)

  • rerank_service_url (str | None)

  • rerank_service_timeout_seconds (float)

  • rerank_fallback_to_skip (bool)

  • rerank_circuit_failure_threshold (int)

  • rerank_circuit_cooldown_seconds (float)

  • score_threshold_default (float | None)

  • relevance_gate_threshold (float | None)

  • max_lexical_query_chars (int)

  • fts_max_lexemes (int)

  • dynamic_fusion_enabled (bool)

  • lexical_boost_factor (float)

  • lexical_boost_max (float)

  • result_cache_ttl_seconds (int)

  • query_router_broad_query_min_chars (int)

  • query_router_global_overview_min_top_k (int)

  • query_router_citation_precision_min_top_k (int)

  • query_router_relationship_focus_min_top_k (int)

  • query_router_contextual_default_min_top_k (int)

  • fusion_rrf_k (int)

  • query_expansion_max_variants_cap (int)

  • query_expansion_abbreviation_weight (float)

  • query_expansion_keyword_focus_weight (float)

  • query_expansion_reference_focus_weight (float)

  • query_expansion_bilingual_weight (float)

  • adaptive_score_drop_threshold (float | None)

  • complementary_max_queries (int)

  • complementary_top_k (int)

  • compression_threshold_ratio (float)

  • mmr_enabled (bool)

  • mmr_max_per_act (int)

  • crag_enabled (bool)

  • crag_max_iterations (int)

  • crag_skip_score_threshold (float)

  • max_parallel_workers (int)

  • query_parser_max_top_k (int)

  • intent_parser_min_output_tokens (int)

  • summary_min_chars (int)

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
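
The fusion_rrf_k parameter suggests that lexical and semantic result lists can be merged with reciprocal rank fusion (RRF). The sketch below shows the standard RRF formula with k=60; whether the search service applies it exactly this way, or combines it with lexical_weight and semantic_weight, is an assumption.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of document IDs with reciprocal rank fusion."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank); a larger k flattens rank differences.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


lexical = ["a", "b", "c"]   # hypothetical full-text ranking
semantic = ["b", "c", "a"]  # hypothetical vector ranking
fused = rrf_fuse([lexical, semantic])
```

Here "b" ranks first because it sits near the top of both lists, even though it tops only one of them.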

class lalandre_core.config.GenerationConfig(*, provider='mistral', model_name=None, temperature=None, max_tokens=8000, max_context_chars=20000, summarize_max_context_chars=60000, base_url=None, mistral_base_url='https://api.mistral.ai/v1', context_window=32000, api_key=None, timeout_seconds=45.0, lightweight_model_name=None, key_pool_max=10)[source]

Bases: BaseModel

LLM generation configuration.

Parameters:
  • provider (str)

  • model_name (str | None)

  • temperature (float | None)

  • max_tokens (int)

  • max_context_chars (int)

  • summarize_max_context_chars (int)

  • base_url (str | None)

  • mistral_base_url (str)

  • context_window (int)

  • api_key (str | None)

  • timeout_seconds (float)

  • lightweight_model_name (str | None)

  • key_pool_max (int)

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

class lalandre_core.config.ChunkingConfig(*, min_chunk_size, max_chunk_size, chunk_overlap=0, subdivision_max_chars=30000, extraction_max_chunk_chars=3200, breakpoint_percentile=90.0, breakpoint_max_threshold=1.0, sentence_window_size=1, embedding_batch_size=32, article_level_chunking=True, embedding=<factory>)[source]

Bases: BaseModel

Chunking configuration.

Parameters:
  • min_chunk_size (int)

  • max_chunk_size (int)

  • chunk_overlap (int)

  • subdivision_max_chars (int)

  • extraction_max_chunk_chars (int)

  • breakpoint_percentile (float)

  • breakpoint_max_threshold (float)

  • sentence_window_size (int)

  • embedding_batch_size (int)

  • article_level_chunking (bool)

  • embedding (ChunkingEmbeddingConfig)

resolve_max_chunk_size(token_limits)[source]

Cap max_chunk_size so it never exceeds the global embedding token budget.

Each embedding model handles its own per-provider limit at embed time (split + weighted-average for oversized chunks). This guard only prevents chunks from exceeding the largest model’s hard ceiling.

Parameters:

token_limits (TokenLimitsConfig)

Return type:

int
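
The cap described above amounts to taking the smaller of two limits. A sketch, assuming both max_chunk_size and the embedding budget are expressed in characters (the budget would come from TokenLimitsConfig.embedding_max_chars):

```python
def resolve_max_chunk_size_sketch(max_chunk_size: int, embedding_max_chars: int) -> int:
    # Assumed behavior: a chunk may never exceed the global embedding character budget.
    return min(max_chunk_size, embedding_max_chars)


capped = resolve_max_chunk_size_sketch(30000, 24330)    # configured size is too large
unchanged = resolve_max_chunk_size_sketch(2000, 24330)  # configured size already fits
```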

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

class lalandre_core.config.ContextBudgetConfig(*, rag_max_sources=10, rag_min_chars_per_source=200, rag_relation_lines=8, global_reports_share=0.45, global_sources_share=0.55, global_max_reports=4, global_min_cluster_size=2, global_max_evidence_per_report=3, global_max_source_docs=7, standard_relation_budget_fraction=0.15, global_graph_budget_fraction=0.1, community_top_relation_types=5, community_central_acts=3, content_preview_chars=200, snippet_preview_chars=300, fallback_preview_chars=180, compression_min_chars=3000, compression_min_budget=500)[source]

Bases: BaseModel

Token/character budgets used to compose RAG context blocks.

Parameters:
  • rag_max_sources (int)

  • rag_min_chars_per_source (int)

  • rag_relation_lines (int)

  • global_reports_share (float)

  • global_sources_share (float)

  • global_max_reports (int)

  • global_min_cluster_size (int)

  • global_max_evidence_per_report (int)

  • global_max_source_docs (int)

  • standard_relation_budget_fraction (float)

  • global_graph_budget_fraction (float)

  • community_top_relation_types (int)

  • community_central_acts (int)

  • content_preview_chars (int)

  • snippet_preview_chars (int)

  • fallback_preview_chars (int)

  • compression_min_chars (int)

  • compression_min_budget (int)

normalized_global_shares()[source]

Return normalized (reports_share, sources_share) ratios for global mode.

Return type:

tuple[float, float]
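
Normalization here most likely means rescaling global_reports_share and global_sources_share so that they sum to 1.0. A sketch under that assumption; the clamping of negative values and the fallback to the documented defaults when both shares are zero are illustrative choices, not confirmed behavior.

```python
from typing import Tuple


def normalized_global_shares_sketch(
    reports_share: float, sources_share: float
) -> Tuple[float, float]:
    # Clamp negatives to zero, then rescale so the two shares sum to 1.0.
    reports = max(reports_share, 0.0)
    sources = max(sources_share, 0.0)
    total = reports + sources
    if total <= 0.0:
        # Assumed fallback: the documented defaults (0.45 / 0.55).
        return 0.45, 0.55
    return reports / total, sources / total


reports_ratio, sources_ratio = normalized_global_shares_sketch(0.9, 1.1)
```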

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

class lalandre_core.config.GatewayConfig(*, redis_host=None, redis_port=None, rag_service_url=None, embedding_service_url=None, rerank_service_url=None, allowed_origins=None, auto_bootstrap=False, job_ttl_seconds=None, bootstrap_lock_ttl_seconds=None, healthcheck_timeout_seconds=5.0, rag_proxy_timeout_seconds=300.0, rate_limit_query='20/minute', rate_limit_stream='15/minute', rate_limit_search='30/minute', rate_limit_jobs='10/minute', job_chunk_min_content_length=None, job_embed_batch_size=None, job_extract_min_confidence=None, job_extract_skip_existing_default=None)[source]

Bases: BaseModel

API Gateway configuration.

Parameters:
  • redis_host (str | None)

  • redis_port (int | None)

  • rag_service_url (str | None)

  • embedding_service_url (str | None)

  • rerank_service_url (str | None)

  • allowed_origins (list[str] | None)

  • auto_bootstrap (bool)

  • job_ttl_seconds (int | None)

  • bootstrap_lock_ttl_seconds (int | None)

  • healthcheck_timeout_seconds (float)

  • rag_proxy_timeout_seconds (float)

  • rate_limit_query (str)

  • rate_limit_stream (str)

  • rate_limit_search (str)

  • rate_limit_jobs (str)

  • job_chunk_min_content_length (int | None)

  • job_embed_batch_size (int | None)

  • job_extract_min_confidence (float | None)

  • job_extract_skip_existing_default (bool | None)

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

class lalandre_core.config.ExtractionConfidenceConfig(*, base=0.75, non_cites_bonus=0.03, explicit_resolution_bonus=0.1, alias_resolution_bonus=0.05, normalize_fallback_score=0.0, fuzzy_min_factor=0.75, evidence_min_chars=20, evidence_bonus=0.02, max_confidence=0.95)[source]

Bases: BaseModel

Tuning knobs for post-extraction confidence scoring.

Parameters:
  • base (float)

  • non_cites_bonus (float)

  • explicit_resolution_bonus (float)

  • alias_resolution_bonus (float)

  • normalize_fallback_score (float)

  • fuzzy_min_factor (float)

  • evidence_min_chars (int)

  • evidence_bonus (float)

  • max_confidence (float)

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

class lalandre_core.config.ExtractionConfig(*, llm_provider='mistral', llm_model='mistral-small-latest', llm_base_url='https://api.mistral.ai/v1', llm_timeout_seconds=120.0, llm_temperature=0.0, llm_max_output_tokens=1024, llm_min_output_tokens=80, llm_system_prompt='You are an EU/FR legal relation extractor. Return valid JSON only.', llm_min_evidence_chars=8, llm_min_rationale_chars=24, llm_max_parallel_chunks=2, llm_chunk_cache_size=256, validation_enabled=True, min_evidence_chars=28, min_description_chars=240, entity_linker_fuzzy_threshold=0.89, entity_linker_fuzzy_min_gap=0.03, entity_linker_fuzzy_limit=2, entity_linker_min_alias_chars=6, confidence=<factory>, max_evidence_chars=420)[source]

Bases: BaseModel

Extraction LLM behavior configuration.

Two-stage filtering:
  • llm_min_* fields apply during raw LLM output parsing (first pass).

  • min_evidence_chars applies during post-extraction validation (second pass).
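
The two passes can be pictured as successive length filters. In this sketch the Relation type and both function names are hypothetical; only the thresholds (8, 24, and 28 characters) come from the documented defaults.

```python
from dataclasses import dataclass


@dataclass
class Relation:
    """Hypothetical extracted relation; field names are illustrative."""

    evidence: str
    rationale: str


def first_pass(items, llm_min_evidence_chars=8, llm_min_rationale_chars=24):
    # Parsing-time filter: drop raw LLM items whose fields are too short to use.
    return [
        r for r in items
        if len(r.evidence.strip()) >= llm_min_evidence_chars
        and len(r.rationale.strip()) >= llm_min_rationale_chars
    ]


def second_pass(items, min_evidence_chars=28):
    # Validation-time filter: apply the stricter evidence length requirement.
    return [r for r in items if len(r.evidence.strip()) >= min_evidence_chars]


raw = [
    Relation("amends", "too short"),
    Relation("amends Art. 5", "This article explicitly amends Article 5."),
    Relation(
        "Article 3 amends Article 5 of the Regulation",
        "This article explicitly amends Article 5.",
    ),
]
parsed = first_pass(raw)
validated = second_pass(parsed)
```

The first item fails parsing, the second survives parsing but fails validation, and only the third passes both stages.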

Parameters:
  • llm_provider (str)

  • llm_model (str)

  • llm_base_url (str)

  • llm_timeout_seconds (float)

  • llm_temperature (float)

  • llm_max_output_tokens (int)

  • llm_min_output_tokens (int)

  • llm_system_prompt (str)

  • llm_min_evidence_chars (int)

  • llm_min_rationale_chars (int)

  • llm_max_parallel_chunks (int)

  • llm_chunk_cache_size (int)

  • validation_enabled (bool)

  • min_evidence_chars (int)

  • min_description_chars (int)

  • entity_linker_fuzzy_threshold (float)

  • entity_linker_fuzzy_min_gap (float)

  • entity_linker_fuzzy_limit (int)

  • entity_linker_min_alias_chars (int)

  • confidence (ExtractionConfidenceConfig)

  • max_evidence_chars (int)

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

class lalandre_core.config.WorkersConfig(*, brpop_timeout_seconds=1, chunk_db_commit_batch_size=10, embed_worker_max_batch_size=32, embed_qdrant_upsert_batch_size=1000, auto_embed_reconcile=True, auto_embed_reconcile_interval=300, auto_embed_reconcile_ttl=600, auto_chunk_reconcile=True, auto_chunk_reconcile_interval=300, auto_chunk_reconcile_ttl=600, auto_extract_reconcile=True, auto_extract_reconcile_interval=600, auto_extract_reconcile_ttl=600, extract_metrics_port=9107, embed_metrics_port=9108, chunk_metrics_port=9109, extract_stale_timeout_minutes=60, community_resolution=1.0, community_min_size=2)[source]

Bases: BaseModel

Worker runtime tuning.

Parameters:
  • brpop_timeout_seconds (int)

  • chunk_db_commit_batch_size (int)

  • embed_worker_max_batch_size (int)

  • embed_qdrant_upsert_batch_size (int)

  • auto_embed_reconcile (bool)

  • auto_embed_reconcile_interval (int)

  • auto_embed_reconcile_ttl (int)

  • auto_chunk_reconcile (bool)

  • auto_chunk_reconcile_interval (int)

  • auto_chunk_reconcile_ttl (int)

  • auto_extract_reconcile (bool)

  • auto_extract_reconcile_interval (int)

  • auto_extract_reconcile_ttl (int)

  • extract_metrics_port (int)

  • embed_metrics_port (int)

  • chunk_metrics_port (int)

  • extract_stale_timeout_minutes (int)

  • community_resolution (float)

  • community_min_size (int)

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

class lalandre_core.config.LalandreConfig(*, database=<factory>, vector=<factory>, graph=<factory>, token_limits=<factory>, embedding=<factory>, embedding_presets=<factory>, search=<factory>, generation=<factory>, chunking, context_budget=<factory>, gateway=<factory>, extraction=<factory>, workers=<factory>, models_cache_dir=None)[source]

Bases: BaseModel

Main configuration class.

enabled_embedding_presets()[source]

Return the embedding presets that are enabled for runtime use.

Return type:

list[EmbeddingPresetConfig]

indexing_enabled_embedding_presets()[source]

Return the embedding presets that are enabled for indexing workflows.

Return type:

list[EmbeddingPresetConfig]

get_embedding_preset(preset_id)[source]

Return one embedding preset by ID, or None when it is unknown.

Parameters:

preset_id (str | None)

Return type:

EmbeddingPresetConfig | None

get_default_embedding_preset()[source]

Return the default enabled embedding preset for query-time operations.

Return type:

EmbeddingPresetConfig

classmethod from_env()[source]

Load configuration from environment variables.

Return type:

LalandreConfig

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

lalandre_core.config.get_env_settings()[source]

Get or create the environment settings instance.

Return type:

EnvSettings

lalandre_core.config.get_config()[source]

Get or create the global configuration instance.

Return type:

LalandreConfig

lalandre_core.config.reset_config()[source]

Invalidate the config singleton so the next call to get_config() reloads from disk.

Return type:

None
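
get_config() and reset_config() together describe a cached singleton with explicit invalidation. The sketch below illustrates that pattern with stand-in names; the loader argument exists only to make the example self-contained, since the real get_config() takes no arguments and reads the application config itself.

```python
from typing import Callable, Dict, Optional

# Module-level cache holding the single shared instance (None until first use).
_state: Dict[str, Optional[dict]] = {"config": None}


def get_config_sketch(loader: Callable[[], dict]) -> dict:
    # Build the configuration once, then serve the cached instance.
    if _state["config"] is None:
        _state["config"] = loader()
    return _state["config"]


def reset_config_sketch() -> None:
    # Invalidate the cache so the next get_config_sketch() call reloads.
    _state["config"] = None


load_count = []


def fake_loader() -> dict:
    # Hypothetical loader standing in for reading YAML and environment values.
    load_count.append(1)
    return {"load_number": len(load_count)}


first = get_config_sketch(fake_loader)
second = get_config_sketch(fake_loader)  # served from the cache
reset_config_sketch()
third = get_config_sketch(fake_loader)  # reloaded after the reset
```

A reset of this kind is typically useful in tests, or after configuration files change on disk.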

lalandre_core.config.get_postgres_connection_string()[source]

Get PostgreSQL connection string.

Return type:

str

lalandre_core.config.get_gateway_config()[source]

Get API Gateway configuration with required values enforced.

Return type:

GatewayConfig

lalandre_core.embedding_presets

Source: packages/lalandre_core/lalandre_core/embedding_presets.py

Helpers for embedding preset resolution across services.

lalandre_core.embedding_presets.list_embedding_presets(*, enabled_only=False, indexing_only=False)[source]

Return configured embedding presets.

Parameters:
  • enabled_only (bool)

  • indexing_only (bool)

Return type:

list[EmbeddingPresetConfig]

lalandre_core.embedding_presets.get_embedding_preset(preset_id, *, enabled_only=False, indexing_only=False)[source]

Resolve a preset by ID.

Parameters:
  • preset_id (str | None)

  • enabled_only (bool)

  • indexing_only (bool)

Return type:

EmbeddingPresetConfig | None

lalandre_core.embedding_presets.get_default_embedding_preset()[source]

Return the configured default embedding preset.

Return type:

EmbeddingPresetConfig

lalandre_core.embedding_presets.get_default_embedding_preset_id()[source]

Return the ID of the default embedding preset.

Return type:

str

lalandre_core.embedding_presets.resolve_embedding_preset_or_default(preset_id)[source]

Resolve a preset, falling back to the configured default when missing/invalid.

Parameters:

preset_id (str | None)

Return type:

EmbeddingPresetConfig
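
The fallback behaviour described for resolve_embedding_preset_or_default can be sketched as a dictionary lookup with a default. The Preset dataclass, the PRESETS registry, and DEFAULT_PRESET_ID below are hypothetical stand-ins for EmbeddingPresetConfig and the loaded configuration. How the real helper treats disabled presets is not stated here, so the sketch ignores that case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Preset:
    """Hypothetical stand-in for EmbeddingPresetConfig."""

    preset_id: str


# Assumed registry and default; the real values come from configuration.
PRESETS = {"small": Preset("small"), "large": Preset("large")}
DEFAULT_PRESET_ID = "small"


def resolve_or_default(preset_id: Optional[str]) -> Preset:
    """Return the requested preset, or the default when it is missing or unknown."""
    key = (preset_id or "").strip()
    return PRESETS.get(key, PRESETS[DEFAULT_PRESET_ID])
```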

lalandre_core.embedding_presets.resolve_embed_queue_name(preset_id=None)[source]

Return the queue name for a preset, defaulting to the configured default preset.

Parameters:

preset_id (str | None)

Return type:

str

lalandre_core.embedding_presets.resolve_worker_embedding_preset(preset_id=None, *, env_var='EMBEDDING_PRESET_ID')[source]

Resolve the preset bound to the current embedding worker.

Parameters:
  • preset_id (str | None)

  • env_var (str)

Return type:

EmbeddingPresetConfig

lalandre_core.http

Source: packages/lalandre_core/lalandre_core/http/__init__.py

HTTP helpers shared across Lalandre services.

lalandre_core.http.llm_client

Source: packages/lalandre_core/lalandre_core/http/llm_client.py

Shared HTTP client for compact JSON-oriented LLM calls.

lalandre_core.http.llm_client.coerce_json_object(value)[source]

Safely coerce a runtime value to a JSON object with string keys.

Parameters:

value (Any)

Return type:

Dict[str, Any] | None
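
As an illustration of what "safely coerce" may mean here, the sketch below accepts only dictionaries whose keys are all strings and returns None otherwise. Whether the real helper rejects or stringifies non-string keys is not documented, so that choice is an assumption.

```python
from typing import Any, Dict, Optional


def coerce_json_object_sketch(value: Any) -> Optional[Dict[str, Any]]:
    """Return value as a dict with string keys, or None if it is not one."""
    if not isinstance(value, dict):
        return None
    if not all(isinstance(key, str) for key in value):
        return None  # assumption: non-string keys are rejected, not converted
    return dict(value)  # shallow copy so callers cannot mutate the input
```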

class lalandre_core.http.llm_client.JSONHTTPLLMClient(provider, model, base_url, timeout_seconds, max_output_tokens, temperature, api_key=None, system_prompt='Return valid JSON only.', error_preview_chars=240)[source]

Bases: object

Thin HTTP client for OpenAI-compatible JSON generation.

Parameters:
  • provider (str)

  • model (str)

  • base_url (str)

  • timeout_seconds (float)

  • max_output_tokens (int)

  • temperature (float)

  • api_key (str | None)

  • system_prompt (str)

  • error_preview_chars (int)

generate(prompt)[source]

Generate a JSON-formatted completion payload as a raw string.

Parameters:

prompt (str)

Return type:

str

class lalandre_core.http.llm_client.SharedKeyPoolJSONHTTPLLMClient(*, key_pool, clients_by_key)[source]

Bases: object

Dispatch JSON HTTP LLM calls through a shared API key pool.

classmethod from_key_pool(*, key_pool, provider, model, base_url, timeout_seconds, max_output_tokens, temperature, system_prompt='Return valid JSON only.', error_preview_chars=240)[source]

Build one JSON HTTP client per API key and wrap them in a shared pool.

Parameters:
  • key_pool (APIKeyPool)

  • provider (str)

  • model (str)

  • base_url (str)

  • timeout_seconds (float)

  • max_output_tokens (int)

  • temperature (float)

  • system_prompt (str)

  • error_preview_chars (int)

Return type:

SharedKeyPoolJSONHTTPLLMClient

generate(prompt)[source]

Generate one JSON response using the next key selected by the pool.

Parameters:

prompt (str)

Return type:

str

lalandre_core.http.middleware

Source: packages/lalandre_core/lalandre_core/http/middleware.py

Reusable HTTP instrumentation middleware factory for FastAPI services.

lalandre_core.http.middleware.make_http_instrumentation_middleware(observe_fn)[source]

Build a middleware that records per-request latency and status metrics.

Parameters:

observe_fn (Callable[[...], None])

Return type:

Callable[[Request, Callable[[Request], Awaitable[Response]]], Any]

lalandre_core.linking

Source: packages/lalandre_core/lalandre_core/linking/__init__.py

Entity linking: resolve legal references to canonical CELEX identifiers.

lalandre_core.linking.entity_linker

Source: packages/lalandre_core/lalandre_core/linking/entity_linker.py

Local entity linking utilities for legal acts (EU/France).

class lalandre_core.linking.entity_linker.ActAliasEntry(celex, title, aliases=(), act_id=None, eli=None, acronyms=())[source]

Bases: object

Canonical act entry and its known alias forms.

Parameters:
  • celex (str)

  • title (str)

  • aliases (tuple[str, ...])

  • act_id (int | None)

  • eli (str | None)

  • acronyms (tuple[str, ...])

eli: str | None = None

Optional European Legislation Identifier URI for interop with ELI-aware systems.

acronyms: tuple[str, ...] = ()

Short acronyms (DORA, CRR, MAR, …) that bypass min_alias_chars.

class lalandre_core.linking.entity_linker.LinkResolution(celex, score, method, matched_text, act_id=None, subdivision_id=None, article_number=None, eli=None)[source]

Bases: object

Resolved reference returned by the entity linker.

Parameters:
  • celex (str)

  • score (float)

  • method (str)

  • matched_text (str)

  • act_id (int | None)

  • subdivision_id (int | None)

  • article_number (str | None)

  • eli (str | None)

eli: str | None = None

Canonical ELI URI of the resolved act (propagated from the matching entry).

class lalandre_core.linking.entity_linker.LegalEntityLinker(entries, *, fuzzy_threshold, fuzzy_min_gap, fuzzy_limit=2, min_alias_chars, article_lookup=None)[source]

Bases: object

Resolve legal references to canonical CELEX-like identifiers.

Parameters:
  • entries (Iterable[ActAliasEntry])

  • fuzzy_threshold (float)

  • fuzzy_min_gap (float)

  • fuzzy_limit (int)

  • min_alias_chars (int)

  • article_lookup (Callable[[int, str], int | None] | None)

property alias_count: int

Return the number of normalized aliases indexed by the linker.

classmethod derive_acronyms(title)[source]

Extract short acronyms from a title’s parenthesised content.

Returns a tuple of strings like ("DORA",) for a title that contains Digital Operational Resilience Regulation (DORA). The caller passes this to ActAliasEntry.acronyms so the linker can match them without applying min_alias_chars.

Parameters:

title (str)

Return type:

tuple[str, ...]
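
One plausible way to extract parenthesised acronyms is a regular expression over the title. The pattern below (2 to 10 uppercase letters or digits, alone inside parentheses) is an assumption chosen for illustration; the rules actually applied by derive_acronyms are not documented on this page.

```python
import re

# Assumption: an acronym is 2-10 uppercase letters/digits alone in parentheses.
_ACRONYM_RE = re.compile(r"\(([A-Z][A-Z0-9]{1,9})\)")


def derive_acronyms_sketch(title: str) -> tuple[str, ...]:
    """Return parenthesised uppercase tokens in order, without duplicates."""
    seen: dict[str, None] = {}  # dict preserves first-seen order
    for match in _ACRONYM_RE.finditer(title):
        seen.setdefault(match.group(1), None)
    return tuple(seen)
```

A pattern this simple would also pick up (EU) in a title such as "Regulation (EU) 2022/2554", so a production implementation presumably filters such tokens.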

classmethod derive_aliases(title, *, eli=None, official_journal_reference=None, form_number=None)[source]

Derive stable alias candidates from act metadata fields.

Parameters:
  • title (str)

  • eli (str | None)

  • official_journal_reference (str | None)

  • form_number (str | None)

Return type:

tuple[str, ...]

resolve(reference)[source]

Resolve a free-text legal reference to a canonical CELEX identifier.

Parameters:

reference (str)

Return type:

LinkResolution | None

resolve_with_article(reference, article_number=None, *, article_lookup=None)[source]

Resolve a reference and optionally enrich with a subdivision_id for an article.

If article_number is provided and the act is known, try to resolve the corresponding subdivision id via article_lookup (or the linker’s default one). Falls back to returning the base resolution if the article can’t be resolved.

Parameters:
  • reference (str)

  • article_number (str | None)

  • article_lookup (Callable[[int, str], int | None] | None)

Return type:

LinkResolution | None
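
The enrichment-with-fallback logic described above can be sketched as follows. Resolution is a hypothetical, reduced stand-in for LinkResolution, and enrich_with_article is an illustrative helper rather than the linker's actual method.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional


@dataclass(frozen=True)
class Resolution:
    """Hypothetical stand-in for LinkResolution with only the fields used here."""

    celex: str
    act_id: Optional[int] = None
    subdivision_id: Optional[int] = None
    article_number: Optional[str] = None


def enrich_with_article(
    base: Optional[Resolution],
    article_number: Optional[str],
    article_lookup: Optional[Callable[[int, str], Optional[int]]],
) -> Optional[Resolution]:
    """Attach a subdivision id when possible; otherwise return base unchanged."""
    if base is None or not article_number:
        return base
    if base.act_id is None or article_lookup is None:
        return base
    subdivision_id = article_lookup(base.act_id, article_number)
    if subdivision_id is None:
        return base  # fall back to the plain resolution
    return replace(base, subdivision_id=subdivision_id, article_number=article_number)


# Toy lookup: act 7, article "5" maps to subdivision 42.
lookup = lambda act_id, article: {(7, "5"): 42}.get((act_id, article))
base = Resolution(celex="32022R2554", act_id=7)
enriched = enrich_with_article(base, "5", lookup)
```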

lalandre_core.linking.heuristics

Source: packages/lalandre_core/lalandre_core/linking/heuristics.py

Shared heuristics and regex patterns for legal entity linking. Centralized here to keep extraction, RAG, and validation rules in sync. Values are package-local (not config-driven) to avoid leaking concerns.

lalandre_core.linking.heuristics.is_generic_target(target)[source]

Return whether target is too generic to resolve as a concrete act.

Parameters:

target (str)

Return type:

bool

lalandre_core.linking.heuristics.looks_like_identifier(target)[source]

Return whether target resembles an explicit legal identifier.

Parameters:

target (str)

Return type:

bool

lalandre_core.linking.ner_client

Source: packages/lalandre_core/lalandre_core/linking/ner_client.py

Tiny HTTP client for the dedicated NER service.

The NER service exposes a single POST /detect endpoint that runs GLiNER in zero-shot mode. The client is deliberately minimal: it does one synchronous request per call, surfaces a typed result, and never raises on network errors — it returns an empty span list and lets the caller log/skip.

Designed to be reused outside RAG (extraction pipeline, evaluation scripts).

class lalandre_core.linking.ner_client.NerClient(base_url, *, timeout_seconds=5.0, default_threshold=0.5, default_entity_types=None)[source]

Bases: object

Thin HTTP wrapper around ner-service /detect.

Failure modes (network error, non-2xx, malformed JSON) are swallowed and logged at WARNING level; the call returns [] in that case so the caller can degrade gracefully.

Parameters:
  • base_url (str)

  • timeout_seconds (float)

  • default_threshold (float)

  • default_entity_types (Optional[Iterable[str]])

property base_url: str

Return the base URL of the configured NER service.

detect(text, *, entity_types=None, threshold=None)[source]

Call POST /detect and return the matched spans (empty on any failure).

Parameters:
  • text (str)

  • entity_types (Iterable[str] | None)

  • threshold (float | None)

Return type:

List[NerSpan]

class lalandre_core.linking.ner_client.NerSpan(text, start, end, type, score)[source]

Bases: object

A single span detected by the NER service.

Parameters:
  • text (str)

  • start (int)

  • end (int)

  • type (str)

  • score (float)

lalandre_core.llm

Source: packages/lalandre_core/lalandre_core/llm/__init__.py

Shared LLM utilities for provider normalization and ChatModel construction.

lalandre_core.llm.langchain

Source: packages/lalandre_core/lalandre_core/llm/langchain.py

Unified LangChain ChatModel factory.

class lalandre_core.llm.langchain.PooledChatModel(models)[source]

Bases: Runnable

Round-robin wrapper over multiple ChatModel instances.

Parameters:

models (List[Any])

invoke(input, config=None, **kwargs)[source]

Invoke the next model in the pool with one request.

Parameters:
  • input (Any)

  • config (Any)

  • kwargs (Any)

Return type:

Any

stream(input, config=None, **kwargs)[source]

Stream one response from the next model in the pool.

Parameters:
  • input (Any)

  • config (Any)

  • kwargs (Any)

Return type:

Iterator[Any]

batch(inputs, config=None, **kwargs)[source]

Process one batch with the next model in the pool.

Parameters:
  • inputs (List[Any])

  • config (Any)

  • kwargs (Any)

Return type:

List[Any]
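
Round-robin dispatch over a pool can be sketched with itertools.cycle. The RoundRobinPool class below is illustrative only: it uses plain callables instead of LangChain ChatModel instances, and the lock is an assumption about thread safety that this page does not confirm for PooledChatModel.

```python
import itertools
import threading
from typing import Any, Iterator, List


class RoundRobinPool:
    """Hand out pooled callables in rotation, one per call."""

    def __init__(self, models: List[Any]) -> None:
        if not models:
            raise ValueError("models must not be empty")
        self._cycle: Iterator[Any] = itertools.cycle(models)
        self._lock = threading.Lock()  # keeps the rotation consistent across threads

    def next_model(self) -> Any:
        with self._lock:
            return next(self._cycle)

    def invoke(self, prompt: str) -> Any:
        # Each call is delegated to whichever model is next in the rotation.
        return self.next_model()(prompt)


pool = RoundRobinPool([lambda p: f"A:{p}", lambda p: f"B:{p}"])
outputs = [pool.invoke("hi") for _ in range(3)]
```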

class lalandre_core.llm.langchain.SharedKeyPoolChatModel(*, key_pool, models_by_key)[source]

Bases: Runnable

Dispatch each LangChain call through a shared API key pool.

Parameters:
  • key_pool (APIKeyPool)

  • models_by_key (Mapping[str, Any])

invoke(input, config=None, **kwargs)[source]

Invoke the model selected by the shared API key pool.

Parameters:
  • input (Any)

  • config (Any)

  • kwargs (Any)

Return type:

Any

stream(input, config=None, **kwargs)[source]

Stream a response from the model selected by the shared API key pool.

Parameters:
  • input (Any)

  • config (Any)

  • kwargs (Any)

Return type:

Iterator[Any]

batch(inputs, config=None, **kwargs)[source]

Execute a batch call through the model selected by the shared API key pool.

Parameters:
  • inputs (List[Any])

  • config (Any)

  • kwargs (Any)

Return type:

List[Any]

lalandre_core.llm.langchain.build_chat_model(*, provider, model, api_key, base_url='', temperature=0.0, max_tokens=None, timeout_seconds=None)[source]

Build a LangChain ChatModel (ChatMistralAI or ChatOpenAI).

Returns the raw ChatModel instance — callers can wrap it (e.g. LangchainLLMWrapper, StrOutputParser) as needed.

Parameters:
  • provider (str)

  • model (str)

  • api_key (str)

  • base_url (str)

  • temperature (float)

  • max_tokens (int | None)

  • timeout_seconds (float | None)

Return type:

Any

lalandre_core.llm.providers

Source: packages/lalandre_core/lalandre_core/llm/providers.py

Shared LLM provider utilities: normalization, URL resolution, API key resolution.

lalandre_core.llm.providers.normalize_provider(provider)[source]

Normalize a provider name: strip whitespace, lowercase, and map openai to openai_compatible.

Parameters:

provider (str)

Return type:

str

lalandre_core.llm.providers.normalize_base_url(*, provider, base_url)[source]

Normalize a base URL: strip surrounding whitespace and remove trailing slashes.

Parameters:
  • provider (str)

  • base_url (str)

Return type:

str

lalandre_core.llm.providers.resolve_api_key(*, provider, api_key=None, mistral_api_key=None, allow_empty=False)[source]

Resolve the API key in priority order: mistral_api_key first, then api_key, otherwise an error.

Parameters:
  • provider (str)

  • api_key (str | None)

  • mistral_api_key (str | None)

  • allow_empty (bool)

Return type:

str
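
The three provider helpers can be approximated in a few lines. This is a sketch under stated assumptions: the provider argument of normalize_base_url and resolve_api_key is omitted because its role is not documented here, the ValueError type is a guess, and treating allow_empty=True as "return an empty string instead of raising" is also an assumption.

```python
from typing import Optional


def normalize_provider_sketch(provider: str) -> str:
    """Strip, lowercase, and map the alias 'openai' to 'openai_compatible'."""
    name = provider.strip().lower()
    return "openai_compatible" if name == "openai" else name


def normalize_base_url_sketch(base_url: str) -> str:
    """Strip surrounding whitespace, then drop trailing slashes."""
    return base_url.strip().rstrip("/")


def resolve_api_key_sketch(
    api_key: Optional[str] = None,
    mistral_api_key: Optional[str] = None,
    allow_empty: bool = False,
) -> str:
    """Prefer mistral_api_key, then api_key; otherwise fail or return ''."""
    for candidate in (mistral_api_key, api_key):
        if candidate and candidate.strip():
            return candidate.strip()
    if allow_empty:
        return ""  # assumption: allow_empty suppresses the error
    raise ValueError("No API key configured")  # assumption: exception type
```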

lalandre_core.llm.structured

Source: packages/lalandre_core/lalandre_core/llm/structured.py

Shared helpers for running PydanticAI structured-output agents.

Extracted from lalandre_rag.agentic.tools so that any package (extraction, RAG, summaries) can reuse the same FunctionModel bridge without depending on the RAG layer.

lalandre_core.llm.structured.json_payload_from_text(raw)[source]

Extract a JSON object from potentially noisy LLM text.

Parameters:

raw (str)

Return type:

dict[str, Any] | None
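
Extracting a JSON object from noisy model output is commonly done by scanning for an opening brace and attempting a decode from that position. The sketch below uses json.JSONDecoder.raw_decode for that purpose; it illustrates the technique and is not necessarily how json_payload_from_text is implemented.

```python
import json
from typing import Any, Optional


def json_payload_from_text_sketch(raw: str) -> Optional[dict[str, Any]]:
    """Return the first decodable JSON object found in raw, else None."""
    decoder = json.JSONDecoder()
    start = raw.find("{")
    while start != -1:
        try:
            payload, _end = decoder.raw_decode(raw, start)
        except json.JSONDecodeError:
            payload = None  # not valid JSON from this brace; try the next one
        if isinstance(payload, dict):
            return payload
        start = raw.find("{", start + 1)
    return None
```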

lalandre_core.llm.structured.build_structured_prompt(*, messages, agent_info)[source]

Build a single text prompt from PydanticAI messages + output schema.

Parameters:
  • messages (list[ModelMessage])

  • agent_info (AgentInfo)

Return type:

str

lalandre_core.llm.structured.to_text_generator(llm_or_generate)[source]

Normalize an LLM object or callable into a simple str -> str function.

Parameters:

llm_or_generate (Any)

Return type:

Callable[[str], str]

lalandre_core.llm.structured.run_structured_agent(*, agent, prompt, llm_or_generate, model_name)[source]

Run a PydanticAI agent using a FunctionModel bridge to any LLM.

Returns (output, retries) where retries is the number of output-validation retries triggered.

Parameters:
  • agent (Agent[Any, T])

  • prompt (str)

  • llm_or_generate (Any)

  • model_name (str)

Return type:

tuple[T, int]

lalandre_core.logging_setup

Source: packages/lalandre_core/lalandre_core/logging_setup.py

Shared logging configuration for Lalandre workers.

lalandre_core.logging_setup.setup_worker_logging()[source]

Configure root logging with structlog.

Reads the LOG_FORMAT environment variable: 'json' selects structured JSON output; any other value selects human-readable console output.

Return type:

None

lalandre_core.models

Source: packages/lalandre_core/lalandre_core/models/__init__.py

Data models

lalandre_core.models.act_metadata

Source: packages/lalandre_core/lalandre_core/models/act_metadata.py

Pydantic model for key-value metadata attached to one act.

class lalandre_core.models.act_metadata.ActMetadata(*, id=None, act_id, key, value, created_at=None)[source]

Bases: BaseModel

Represent one metadata entry linked to a legal act.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • id (int | None)

  • act_id (int)

  • key (str)

  • value (str)

  • created_at (datetime | None)

model_config: ClassVar[ConfigDict] = {'from_attributes': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

lalandre_core.models.act_relations

Source: packages/lalandre_core/lalandre_core/models/act_relations.py

Pydantic model for relationships extracted between legal acts.

class lalandre_core.models.act_relations.ActRelations(*, id=None, source_act_id, target_act_id=None, target_celex=None, relation_type, source_subdivision_id=None, target_subdivision_id=None, effect_date=None, description=None, evidence=None, rationale=None, resolution_method=None, resolution_score=None, target_reference=None, confidence=None, source=None, validated=False, synced_to_neo4j_at=None, is_resolved=True, created_at=None)[source]

Bases: BaseModel

Represent one typed relationship between two legal acts.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • id (int | None)

  • source_act_id (int)

  • target_act_id (int | None)

  • target_celex (str | None)

  • relation_type (RelationType)

  • source_subdivision_id (int | None)

  • target_subdivision_id (int | None)

  • effect_date (datetime | None)

  • description (str | None)

  • evidence (str | None)

  • rationale (str | None)

  • resolution_method (str | None)

  • resolution_score (float | None)

  • target_reference (str | None)

  • confidence (float | None)

  • source (str | None)

  • validated (bool | None)

  • synced_to_neo4j_at (datetime | None)

  • is_resolved (bool | None)

  • created_at (datetime | None)

model_config: ClassVar[ConfigDict] = {'from_attributes': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

lalandre_core.models.act_subjects

Source: packages/lalandre_core/lalandre_core/models/act_subjects.py

Pydantic model for the act-to-subject association table.

class lalandre_core.models.act_subjects.ActSubjects(*, act_id, subject_id)[source]

Bases: BaseModel

Represent one link between an act and a subject matter.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • act_id (int)

  • subject_id (int)

model_config: ClassVar[ConfigDict] = {'from_attributes': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

lalandre_core.models.acts

Source: packages/lalandre_core/lalandre_core/models/acts.py

Pydantic model for top-level legal act records.

class lalandre_core.models.acts.Acts(*, id=None, celex, eli=None, act_type, title, language, adoption_date=None, force_date=None, end_date=None, official_journal_reference=None, sector=None, level=None, form_number=None, url_eurlex=None, created_at=None, updated_at=None, last_synced_at=None, content_hash=None, sync_status='pending', extracted_at=None, extraction_status='pending')[source]

Bases: BaseModel

Represent one legal act stored in the core domain model.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • id (int | None)

  • celex (str)

  • eli (str | None)

  • act_type (ActType)

  • title (str)

  • language (LanguageCode)

  • adoption_date (datetime | None)

  • force_date (datetime | None)

  • end_date (datetime | None)

  • official_journal_reference (str | None)

  • sector (int | None)

  • level (int | None)

  • form_number (str | None)

  • url_eurlex (str | None)

  • created_at (datetime | None)

  • updated_at (datetime | None)

  • last_synced_at (datetime | None)

  • content_hash (str | None)

  • sync_status (str | None)

  • extracted_at (datetime | None)

  • extraction_status (str | None)

model_config: ClassVar[ConfigDict] = {'from_attributes': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

lalandre_core.models.chunks

Source: packages/lalandre_core/lalandre_core/models/chunks.py

Pydantic model for chunk records derived from subdivisions.

class lalandre_core.models.chunks.Chunks(*, id=None, subdivision_id, chunk_index, content, char_start, char_end, token_count=None, chunk_metadata=None, created_at=None)[source]

Bases: BaseModel

Represent one persisted chunk of subdivision content.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • id (int | None)

  • subdivision_id (int)

  • chunk_index (int)

  • content (str)

  • char_start (int)

  • char_end (int)

  • token_count (int | None)

  • chunk_metadata (dict[str, Any] | None)

  • created_at (datetime | None)

model_config: ClassVar[ConfigDict] = {'from_attributes': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

lalandre_core.models.embedding_state

Source: packages/lalandre_core/lalandre_core/models/embedding_state.py

Pydantic model tracking the embedding status of stored objects.

class lalandre_core.models.embedding_state.EmbeddingState(*, id=None, object_type, object_id, provider, model_name, vector_size, content_hash, embedded_at=None)[source]

Bases: BaseModel

Represent one embedding state snapshot for an act, chunk, or subdivision.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • id (int | None)

  • object_type (Literal['subdivision', 'chunk', 'act'])

  • object_id (int)

  • provider (str)

  • model_name (str)

  • vector_size (int)

  • content_hash (str)

  • embedded_at (datetime | None)

model_config: ClassVar[ConfigDict] = {'from_attributes': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

lalandre_core.models.subdivisions

Source: packages/lalandre_core/lalandre_core/models/subdivisions.py

Pydantic model for hierarchical subdivisions inside one act.

class lalandre_core.models.subdivisions.Subdivisions(*, id=None, act_id, version_id=None, parent_id=None, subdivision_type, number=None, title=None, content, content_hash=None, sequence_order, hierarchy_path, depth=0, created_at=None)[source]

Bases: BaseModel

Represent one structured subdivision extracted from an act.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • id (int | None)

  • act_id (int)

  • version_id (int | None)

  • parent_id (int | None)

  • subdivision_type (SubdivisionType)

  • number (str | None)

  • title (str | None)

  • content (str)

  • content_hash (str | None)

  • sequence_order (int)

  • hierarchy_path (str)

  • depth (int)

  • created_at (datetime | None)

model_config: ClassVar[ConfigDict] = {'from_attributes': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

lalandre_core.models.subject_matters

Source: packages/lalandre_core/lalandre_core/models/subject_matters.py

Pydantic model for EuroVoc subject matter records.

class lalandre_core.models.subject_matters.SubjectMatters(*, id=None, eurovoc_code, label_en, label_fr=None, parent_code=None)[source]

Bases: BaseModel

Represent one subject matter entry used to classify acts.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • id (int | None)

  • eurovoc_code (str)

  • label_en (str)

  • label_fr (str | None)

  • parent_code (str | None)

model_config: ClassVar[ConfigDict] = {'from_attributes': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

lalandre_core.models.types

Source: packages/lalandre_core/lalandre_core/models/types/__init__.py

Type enumerations

lalandre_core.models.types.act_type

Source: packages/lalandre_core/lalandre_core/models/types/act_type.py

Enumeration of legal act categories handled by the platform.

class lalandre_core.models.types.act_type.ActType(*values)[source]

Bases: Enum

Enumerate the supported categories of legal acts.

lalandre_core.models.types.language_code

Source: packages/lalandre_core/lalandre_core/models/types/language_code.py

Enumeration of language codes supported by the core models.

class lalandre_core.models.types.language_code.LanguageCode(*values)[source]

Bases: Enum

Enumerate the language codes handled by the platform.

lalandre_core.models.types.relation_type

Source: packages/lalandre_core/lalandre_core/models/types/relation_type.py

Enumeration of supported relationship types between legal acts.

class lalandre_core.models.types.relation_type.RelationType(*values)[source]

Bases: Enum

Types of relationships between legal acts.

lalandre_core.models.types.subdivision_type

Source: packages/lalandre_core/lalandre_core/models/types/subdivision_type.py

Enumeration of structured subdivision kinds extracted from acts.

class lalandre_core.models.types.subdivision_type.SubdivisionType(*values)[source]

Bases: Enum

Enumerate the subdivision types recognized by the ingestion pipeline.

lalandre_core.models.types.version_type

Source: packages/lalandre_core/lalandre_core/models/types/version_type.py

Enumeration of act version categories stored by the platform.

class lalandre_core.models.types.version_type.VersionType(*values)[source]

Bases: Enum

Enumerate the supported categories of act versions.

lalandre_core.models.versions

Source: packages/lalandre_core/lalandre_core/models/versions.py

Pydantic model for version records associated with one act.

class lalandre_core.models.versions.Versions(*, id=None, act_id, version_number, version_type, version_date, source_url=None, is_current=False, created_at=None)[source]

Bases: BaseModel

Represent one dated version of a legal act.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • id (int | None)

  • act_id (int)

  • version_number (int)

  • version_type (VersionType)

  • version_date (datetime)

  • source_url (str | None)

  • is_current (bool)

  • created_at (datetime | None)

model_config: ClassVar[ConfigDict] = {'from_attributes': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

lalandre_core.queue

Source: packages/lalandre_core/lalandre_core/queue/__init__.py

Shared Redis queue helpers for chunking, embedding, and extraction workers.

lalandre_core.queue.dispatch_all

Source: packages/lalandre_core/lalandre_core/queue/dispatch_all.py

Generic dispatch-all-acts helper shared by workers.

lalandre_core.queue.dispatch_all.dispatch_all_act_jobs(*, runtime, job_id, queue_name, job_type, label, acts, build_params, skip_filter=None, error_label='Processing')[source]

Iterate over acts and enqueue one per-act job with progress tracking.

Parameters:
  • runtime (QueueRuntime) – Queue runtime containing the Redis client and TTL policy.

  • job_id (str) – Parent job identifier used for status updates.

  • queue_name (str) – Redis queue that receives the child jobs.

  • job_type (str) – Job type string for the child jobs, such as "chunk_act".

  • label (str) – Human-readable label used in log and status messages.

  • acts (list[Any]) – Iterable of act objects exposing a celex attribute.

  • build_params (Callable[[Any], dict[str, Any]]) – Callable receiving one CELEX value and returning job params.

  • skip_filter (Callable[[Any], bool] | None) – Optional predicate returning True when one act should be skipped.

  • error_label (str) – Verb phrase used in failure messages, such as "Chunking".

Return type:

None

lalandre_core.queue.job_queue

Source: packages/lalandre_core/lalandre_core/queue/job_queue.py

Shared Redis queue helpers for worker services.

Centralizes job enqueue/dedup/status operations used by chunking, embedding, and extraction workers.

class lalandre_core.queue.job_queue.QueueRuntime(redis_client, job_ttl_seconds)[source]

Bases: object

Base runtime data required by queue helper functions.

Parameters:
  • redis_client (Any)

  • job_ttl_seconds (int)

class lalandre_core.queue.job_queue.JobPayload[source]

Bases: TypedDict

Typed representation of one serialized Redis job payload.

lalandre_core.queue.job_queue.update_job_status(runtime, job_id, status, message=None, progress=None, ttl=None)[source]

Update job status metadata in Redis.

Parameters:
  • runtime (QueueRuntime)

  • job_id (str)

  • status (str)

  • message (str | None)

  • progress (int | float | None)

  • ttl (int | None)

Return type:

None

lalandre_core.queue.job_queue.job_already_queued(runtime, *, queue_name, job_type, celex=None)[source]

Check whether a matching job is already queued.

If celex is provided, deduplication is scoped to that CELEX value.

Parameters:
  • runtime (QueueRuntime)

  • queue_name (str)

  • job_type (str)

  • celex (str | None)

Return type:

bool

lalandre_core.queue.job_queue.enqueue_job(runtime, *, queue_name, job_type, params, dedupe_celex=None)[source]

Push a job onto the queue and initialize its status hash, with optional deduplication.

Parameters:
  • runtime (QueueRuntime)

  • queue_name (str)

  • job_type (str)

  • params (dict[str, Any])

  • dedupe_celex (str | None)

Return type:

str | None

lalandre_core.queue.reconcile

Source: packages/lalandre_core/lalandre_core/queue/reconcile.py

Shared Redis lock pattern for worker auto-reconcile.

Each worker calls with_reconcile_lock to acquire a distributed Redis lock, run its domain-specific reconcile check, and release the lock.

lalandre_core.queue.reconcile.with_reconcile_lock(redis_client, lock_key, lock_ttl, action)[source]

Acquire a Redis NX lock, execute action, then release.

  • If the lock is already held, returns silently.

  • If Redis is unreachable, logs a warning and returns.

  • The lock is always released in the finally block.

Parameters:
  • redis_client (Any)

  • lock_key (str)

  • lock_ttl (int)

  • action (Callable[[], None])

Return type:

None
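
The three behaviours listed above (skip when the lock is held, warn when Redis is unreachable, always release) can be sketched as follows. FakeRedis is an in-memory stand-in that mimics only the set(..., nx=True, ex=...) and delete calls of a redis-py client so the example runs without a server. The function name, log message, and lock value are illustrative assumptions.

```python
import logging
from typing import Any, Callable, Optional

logger = logging.getLogger(__name__)


class FakeRedis:
    """In-memory stand-in exposing the two client calls the sketch needs."""

    def __init__(self) -> None:
        self.store: dict[str, str] = {}

    def set(self, key: str, value: str, nx: bool = False, ex: Optional[int] = None):
        if nx and key in self.store:
            return None  # mirrors redis-py: None when NX prevents the write
        self.store[key] = value
        return True

    def delete(self, key: str) -> int:
        return 1 if self.store.pop(key, None) is not None else 0


def with_lock_sketch(
    redis_client: Any, lock_key: str, lock_ttl: int, action: Callable[[], None]
) -> None:
    try:
        acquired = redis_client.set(lock_key, "1", nx=True, ex=lock_ttl)
    except Exception as exc:  # for example, Redis unreachable
        logger.warning("Reconcile lock unavailable: %s", exc)
        return
    if not acquired:
        return  # another worker holds the lock
    try:
        action()
    finally:
        redis_client.delete(lock_key)  # always release


client = FakeRedis()
calls: list[str] = []
with_lock_sketch(client, "reconcile:embed", 60, lambda: calls.append("ran"))
client.set("reconcile:embed", "1")  # simulate a lock held by another worker
with_lock_sketch(client, "reconcile:embed", 60, lambda: calls.append("second"))
```

The TTL acts as a safety net: if a worker dies before releasing the lock, the key expires and reconciliation can resume.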

lalandre_core.queue.worker_config

Source: packages/lalandre_core/lalandre_core/queue/worker_config.py

Shared config accessor helpers for Redis-based workers.

lalandre_core.queue.worker_config.require_gateway_config(field)[source]

Read a mandatory gateway config field; raise if None.

Parameters:

field (str)

Return type:

Any

lalandre_core.queue.worker_config.get_reconcile_params(prefix)[source]

Return (enabled, ttl, interval) for a worker reconcile loop.

prefix is the worker kind, e.g. "embed", "chunk", "extract".

Parameters:

prefix (str)

Return type:

tuple[bool, int, int]

lalandre_core.queue.worker_loop

Source: packages/lalandre_core/lalandre_core/queue/worker_loop.py

Worker-loop utilities for Redis-based job workers.

Provides reusable building blocks (functions, not a class hierarchy).

class lalandre_core.queue.worker_loop.BaseRuntimeParams(redis_client, job_ttl_seconds, brpop_timeout_seconds)[source]

Bases: object

Common parameters resolved identically by every worker.

Parameters:
  • redis_client (Any)

  • job_ttl_seconds (int)

  • brpop_timeout_seconds (int)

lalandre_core.queue.worker_loop.resolve_base_runtime_params(*, redis_host, redis_port, job_ttl_seconds, brpop_timeout_seconds)[source]

Resolve Redis client + base tunables shared by all workers.

Parameters:
  • redis_host (str | None)

  • redis_port (int | None)

  • job_ttl_seconds (int | None)

  • brpop_timeout_seconds (int)

Return type:

BaseRuntimeParams

lalandre_core.queue.worker_loop.parse_job_payload(job_data)[source]

Extract (job_id, job_type, params) from a raw job dict.

Returns empty strings when required fields are missing or have the wrong type so that callers can validate cheaply.

Parameters:

job_data (dict[str, Any])

Return type:

tuple[str, str, dict[str, Any]]
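
The fallback-to-empty behaviour described above might look like the sketch below. The payload key names "job_id", "type", and "params" are assumptions made for illustration, as is the choice to return an empty dict when params is missing; the real field names are not documented on this page.

```python
from typing import Any


def parse_job_payload_sketch(job_data: dict[str, Any]) -> tuple[str, str, dict[str, Any]]:
    # Assumed key names; the actual payload layout may differ.
    job_id = job_data.get("job_id")
    job_type = job_data.get("type")
    params = job_data.get("params")
    return (
        job_id if isinstance(job_id, str) else "",
        job_type if isinstance(job_type, str) else "",
        params if isinstance(params, dict) else {},
    )


good = parse_job_payload_sketch(
    {"job_id": "j-1", "type": "embed", "params": {"celex": "32016R0679"}}
)
bad = parse_job_payload_sketch({"job_id": 42, "params": "oops"})
```

A caller can then reject a job with a plain truthiness check, for example when the returned job_id or job_type is empty.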

lalandre_core.queue.worker_loop.instrumented_process_job(*, runtime, job_data, dispatch, observe_execution, observe_error)[source]

Parse, dispatch, and instrument a single job.

Parameters:
  • runtime (Any) – Worker runtime instance passed to the dispatched handlers.

  • job_data (dict[str, Any]) – Raw deserialized job payload fetched from Redis.

  • dispatch (dict[str, Callable[[Any, str, dict[str, Any]], None]]) – Mapping of job_type to handler(runtime, job_id, params).

  • observe_execution (Callable[[...], None]) – Observer called in finally with execution metrics.

  • observe_error (Callable[[...], None]) – Observer called when the handler raises an exception.

Return type:

None

lalandre_core.queue.worker_loop.run_worker_loop(*, queue_name, worker_name, redis_client, brpop_timeout_seconds, process_job, reconcile_callback=None, reconcile_interval_seconds=0, reconcile_hour_start=22, reconcile_hour_end=24)[source]

Generic BRPOP loop shared by all workers.

Parameters:
  • queue_name (str) – Redis list to BRPOP from.

  • worker_name (str) – Human-readable worker name used in log messages.

  • redis_client (Any) – Synchronous Redis client backing the worker loop.

  • brpop_timeout_seconds (int) – Polling timeout passed to BRPOP.

  • process_job (Callable[[dict[str, Any]], None]) – Callback invoked with each deserialized payload.

  • reconcile_callback (Callable[[], None] | None) – Optional reconciliation callback executed periodically.

  • reconcile_interval_seconds (int) – Seconds between two reconciliation runs.

  • reconcile_hour_start (int) – UTC hour at which the reconciliation window opens.

  • reconcile_hour_end (int) – UTC hour at which the reconciliation window closes.

Return type:

None

lalandre_core.queue.worker_metrics

Source: packages/lalandre_core/lalandre_core/queue/worker_metrics.py

Reusable Prometheus metrics factory for Redis-based workers.

class lalandre_core.queue.worker_metrics.WorkerMetrics(observe_execution, observe_error)[source]

Bases: object

Pre-built Prometheus instruments + observe helpers for a worker.

Parameters:
  • observe_execution (Callable[[...], None])

  • observe_error (Callable[[...], None])

lalandre_core.queue.worker_metrics.build_worker_metrics(worker_name, valid_job_types)[source]

Create Prometheus counters/histograms for a worker and return observe helpers.

Parameters:
  • worker_name (str) – Short name used in metric names, such as "embedding".

  • valid_job_types (set[str]) – Whitelist of known job type labels for normalization.

Return type:

WorkerMetrics

lalandre_core.redis_client

Source: packages/lalandre_core/lalandre_core/redis_client.py

Redis client factory helpers for services.

lalandre_core.redis_client.create_sync_redis_client(*, host, port, decode_responses=True)[source]

Build a typed synchronous Redis client used by workers.

Parameters:
  • host (str)

  • port (int)

  • decode_responses (bool)

Return type:

Redis

lalandre_core.repositories

Source: packages/lalandre_core/lalandre_core/repositories/__init__.py

Repository Base Abstractions

lalandre_core.repositories.base

Source: packages/lalandre_core/lalandre_core/repositories/base/__init__.py

Base repository abstractions

lalandre_core.repositories.base.exceptions

Source: packages/lalandre_core/lalandre_core/repositories/base/exceptions.py

Repository exceptions

exception lalandre_core.repositories.base.exceptions.RepositoryError[source]

Bases: Exception

Base exception for repository errors

exception lalandre_core.repositories.base.exceptions.DatabaseConnectionError[source]

Bases: RepositoryError

Raised when connection to database fails

exception lalandre_core.repositories.base.exceptions.DatabaseOperationError[source]

Bases: RepositoryError

Raised when a database operation fails

lalandre_core.repositories.base.repository

Source: packages/lalandre_core/lalandre_core/repositories/base/repository.py

Base repository abstraction

class lalandre_core.repositories.base.repository.BaseRepository[source]

Bases: ABC

Abstract base class for all repositories

abstractmethod close()[source]

Close database connection and cleanup resources

abstractmethod health_check()[source]

Verify database connectivity and readiness

Return type:

bool

lalandre_core.repositories.common

Source: packages/lalandre_core/lalandre_core/repositories/common/__init__.py

Common repository helpers.

lalandre_core.repositories.common.payload_builder

Source: packages/lalandre_core/lalandre_core/repositories/common/payload_builder.py

Build Qdrant payloads from JSON schemas.

class lalandre_core.repositories.common.payload_builder.PayloadBuilder(loader=None)[source]

Bases: object

Schema-driven payload builder.

Parameters:

loader (PayloadSchemaLoader | None)

build_subdivision_payload(subdivision_data, act_data, version_data=None, metadata=None)[source]

Build payload for subdivision embeddings.

Parameters:
  • subdivision_data (Dict[str, Any])

  • act_data (Dict[str, Any])

  • version_data (Dict[str, Any] | None)

  • metadata (Dict[str, str] | None)

Return type:

Dict[str, Any]

build_chunk_payload(chunk_data, subdivision_data, act_data)[source]

Build payload for chunk embeddings.

Parameters:
  • chunk_data (Dict[str, Any])

  • subdivision_data (Dict[str, Any])

  • act_data (Dict[str, Any])

Return type:

Dict[str, Any]

build_act_payload(act_data, full_text, subjects=None, metadata=None)[source]

Build payload for whole-act embeddings (one vector per act).

Parameters:
  • act_data (Dict[str, Any])

  • full_text (str)

  • subjects (list[dict[str, Any]] | None)

  • metadata (Dict[str, str] | None)

Return type:

Dict[str, Any]

lalandre_core.repositories.common.schema_loader

Source: packages/lalandre_core/lalandre_core/repositories/common/schema_loader.py

Load JSON payload schemas and render payloads.

class lalandre_core.repositories.common.schema_loader.PayloadSchemaLoader(schema_file=None)[source]

Bases: object

Loads and applies payload schemas.

Initialize loader (defaults to payload_schemas.json).

Parameters:

schema_file (str | PathLike[str] | None)

get_schema(schema_name)[source]

Fetch a schema by name.

Parameters:

schema_name (str)

Return type:

dict[str, object]

build_payload_from_schema(schema_name, context, transformers=None)[source]

Render a payload from schema + context.

Parameters:
  • schema_name (str)

  • context (dict[str, Any])

  • transformers (dict[str, Callable[[Any], Any]] | None)

Return type:

dict[str, Any]

lalandre_core.runtime_values

Source: packages/lalandre_core/lalandre_core/runtime_values.py

Small coercion helpers shared across service entrypoints.

lalandre_core.runtime_values.require_int(value, setting_name)[source]

Return an int or raise a clear configuration error.

Parameters:
  • value (int | None)

  • setting_name (str)

Return type:

int

lalandre_core.runtime_values.require_float(value, setting_name)[source]

Return a float or raise a clear configuration error.

Parameters:
  • value (float | None)

  • setting_name (str)

Return type:

float

lalandre_core.runtime_values.require_bool(value, setting_name)[source]

Return a bool or raise a clear configuration error.

Parameters:
  • value (bool | None)

  • setting_name (str)

Return type:

bool

lalandre_core.utils

Source: packages/lalandre_core/lalandre_core/utils/__init__.py

Utility functions: common helpers used across the project.

lalandre_core.utils.api_key_pool

Source: packages/lalandre_core/lalandre_core/utils/api_key_pool.py

API key pool manager. Distributes API calls across multiple keys using a round-robin strategy.

class lalandre_core.utils.api_key_pool.APIKeyPool(keys)[source]

Bases: object

Thread-safe container for API keys with round-robin distribution.

Keys are loaded from environment variables following the pattern:

BASE_VAR, BASE_VAR_2, BASE_VAR_3, …, BASE_VAR_{max_keys}

Parameters:

keys (List[str])

classmethod from_env(base_env_var='MISTRAL_API_KEY', max_keys=10, start_index=1)[source]

Load keys from environment variables.

Looks for:

  • {base_env_var} (index 1, main key)

  • {base_env_var}_2, …, {base_env_var}_{max_keys}

start_index and max_keys control the range: indices [start_index .. max_keys] are loaded. This allows splitting keys between services (e.g. 1-5 for RAG, 6-10 for workers).

Parameters:
  • base_env_var (str)

  • max_keys (int)

  • start_index (int)

Return type:

APIKeyPool

next_key()[source]

Return the next key in round-robin order (thread-safe).

Return type:

str
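
The naming pattern and round-robin rotation described for APIKeyPool can be sketched with the standard library alone. KeyPoolSketch and the DEMO_API_KEY variables are hypothetical; skipping unset variables and raising on an empty pool are assumptions, not documented behaviour.

```python
import itertools
import os
import threading
from typing import List


class KeyPoolSketch:
    """Illustrative round-robin pool; not the library class."""

    def __init__(self, keys: List[str]) -> None:
        if not keys:
            raise ValueError("at least one key is required")  # assumption
        self._keys = list(keys)
        self._cycle = itertools.cycle(self._keys)
        self._lock = threading.Lock()

    @classmethod
    def from_env(cls, base_env_var="MISTRAL_API_KEY", max_keys=10, start_index=1):
        keys = []
        for index in range(start_index, max_keys + 1):
            # Index 1 is the bare variable; later indices carry a numeric suffix.
            name = base_env_var if index == 1 else f"{base_env_var}_{index}"
            value = os.environ.get(name)
            if value:  # assumption: unset or empty variables are skipped
                keys.append(value)
        return cls(keys)

    def next_key(self) -> str:
        with self._lock:  # serialise access so concurrent callers rotate cleanly
            return next(self._cycle)


os.environ.update({"DEMO_API_KEY": "k1", "DEMO_API_KEY_2": "k2", "DEMO_API_KEY_3": "k3"})

rag_pool = KeyPoolSketch.from_env("DEMO_API_KEY", max_keys=3)
rotation = [rag_pool.next_key() for _ in range(4)]

worker_pool = KeyPoolSketch.from_env("DEMO_API_KEY", max_keys=3, start_index=2)
worker_first = worker_pool.next_key()
```

Here rotation wraps around after the third key, and worker_pool starts at the second variable, which mirrors the split between services that start_index is meant to allow.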

lalandre_core.utils.celex_utils

Source: packages/lalandre_core/lalandre_core/utils/celex_utils.py

CELEX Utility Functions

lalandre_core.utils.celex_utils.normalize_celex(celex)[source]

Normalize an identifier to the standard CELEX format. Accepts various input formats, removes spaces, and handles EUR-Lex format conversions.

Examples

>>> normalize_celex('32016R0679')
'32016R0679'
>>> normalize_celex(' 32016 R 0679 ')
'32016R0679'
>>> normalize_celex('(UE) 2016/679')
'32016R0679'
>>> normalize_celex('(CE) n° 1219/2011')
'32011R1219'
>>> normalize_celex('Directive 2003/41/CE')
'32003L0041'
>>> normalize_celex('AMF-RG-L1-20250331')
'AMF-RG-L1-20250331'
>>> normalize_celex('AMF-SANCTION-SanctionAMF2026-01-20260112')
'AMF-SAN-2026-01'

Parameters:

celex (str)

Return type:

str

lalandre_core.utils.celex_utils.is_eurlex_celex(celex)[source]

Return True iff celex follows the EUR-Lex standard format.

EUR-Lex CELEXes start with a sector digit followed by the 4-digit year and a document-type letter (e.g. 32016R0679). All other sources (AMF-, EBA-, EIOPA-, ESMA-, LEGITEXT…) use alphabetical prefixes.

Parameters:

celex (str)

Return type:

bool

lalandre_core.utils.celex_utils.is_legifrance_celex(celex)[source]

Return True iff celex identifies a Légifrance document.

Légifrance CELEXes start with LEGITEXT (e.g. LEGITEXT000006072026 or LEGITEXT000006072026:LEGISCTA000006154980).

Parameters:

celex (str)

Return type:

bool
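
The two prefix rules described for is_eurlex_celex and is_legifrance_celex can be approximated as below. The regular expression is an assumption derived only from the prose (sector digit, 4-digit year, document-type letter); the actual checks may be stricter.

```python
import re

# Sector digit, four-digit year, then an uppercase document-type letter.
_EURLEX_PATTERN = re.compile(r"\d\d{4}[A-Z]")


def is_eurlex_celex_sketch(celex: str) -> bool:
    return _EURLEX_PATTERN.match(celex) is not None


def is_legifrance_celex_sketch(celex: str) -> bool:
    return celex.startswith("LEGITEXT")


samples = ["32016R0679", "AMF-RG-L1-20250331", "LEGITEXT000006072026"]
eurlex_flags = [is_eurlex_celex_sketch(s) for s in samples]
legifrance_flags = [is_legifrance_celex_sketch(s) for s in samples]
```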

lalandre_core.utils.celex_utils.is_valid_celex(celex)[source]

Return True if celex looks like a recognisable CELEX identifier.

Parameters:

celex (str)

Return type:

bool

lalandre_core.utils.collection_utils

Source: packages/lalandre_core/lalandre_core/utils/collection_utils.py

Collection utilities. Helpers for de-duplicating lists of dictionaries.

lalandre_core.utils.collection_utils.deduplicate_dicts_by_id(items, id_key='id')[source]

Return items with duplicates removed based on a single key.

Parameters:
  • items (Iterable[Dict[str, Any] | None])

  • id_key (str)

Return type:

List[Dict[str, Any]]

lalandre_core.utils.collection_utils.deduplicate_dicts_by_tuple_key(items, keys)[source]

Return items with duplicates removed based on a tuple of keys.

Parameters:
  • items (Iterable[Dict[str, Any] | None])

  • keys (Sequence[str])

Return type:

List[Dict[str, Any]]
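
A minimal sketch of key-based de-duplication follows. It assumes that None entries are skipped, that the first occurrence wins, that input order is preserved, and that key values are hashable; none of these details are stated on this page. The single-key variant is the same idea with a one-element key sequence.

```python
from typing import Any, Dict, Iterable, List, Optional, Sequence


def dedupe_by_keys_sketch(
    items: Iterable[Optional[Dict[str, Any]]], keys: Sequence[str]
) -> List[Dict[str, Any]]:
    seen = set()
    result = []
    for item in items:
        if item is None:
            continue  # assumption: None entries are dropped
        marker = tuple(item.get(key) for key in keys)
        if marker in seen:
            continue  # assumption: the first occurrence is kept
        seen.add(marker)
        result.append(item)
    return result


rows = [
    {"id": 1, "lang": "fr", "text": "a"},
    None,
    {"id": 1, "lang": "en", "text": "b"},
    {"id": 1, "lang": "fr", "text": "c"},
]
by_id = dedupe_by_keys_sketch(rows, ["id"])
by_id_and_lang = dedupe_by_keys_sketch(rows, ["id", "lang"])
```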

lalandre_core.utils.date_utils

Source: packages/lalandre_core/lalandre_core/utils/date_utils.py

Date utility functions: centralized date formatting and conversion utilities.

lalandre_core.utils.date_utils.format_date(date_value)[source]

Format a date to ISO 8601 string format

Handles multiple input types:

  • datetime objects

  • date objects

  • ISO strings (pass-through)

  • None (returns None)

Parameters:

date_value (Any) – Date to format (datetime, date, str, or None)

Returns:

ISO format string (YYYY-MM-DD or YYYY-MM-DDTHH:MM:SS) or None

Return type:

str | None

Examples

>>> format_date(datetime(2016, 4, 27))
'2016-04-27T00:00:00'
>>> format_date(date(2016, 4, 27))
'2016-04-27'
>>> format_date("2016-04-27")
'2016-04-27'
>>> format_date(None)
None

lalandre_core.utils.date_utils.to_timestamp(date_value)[source]

Convert a date to Unix timestamp (seconds since epoch)

Handles the same input types as format_date.

Parameters:

date_value (Any) – Date to convert (datetime, date, str, or None)

Returns:

Unix timestamp (int) or None

Return type:

int | None

Examples

>>> to_timestamp(datetime(2016, 4, 27, 12, 0, 0))
1461758400  # (approximate, depends on timezone)
>>> to_timestamp("2016-04-27")
1461715200
>>> to_timestamp(None)
None

lalandre_core.utils.date_utils.convert_dates_to_strings(props, date_fields)[source]

Convert datetime objects to strings in a dictionary for database storage

Useful for Neo4j, or other databases that require date strings. Mutates the input dictionary in place.

Parameters:
  • props (dict[str, Any]) – Dictionary of properties (will be modified)

  • date_fields (list[str]) – List of field names that contain dates

Returns:

Modified properties dict with dates as ISO format strings

Return type:

dict[str, Any]

Examples

>>> data = {'created_at': datetime(2024, 1, 1), 'name': 'Test'}
>>> convert_dates_to_strings(data, ['created_at'])
{'created_at': '2024-01-01T00:00:00', 'name': 'Test'}

lalandre_core.utils.metrics_utils

Source: packages/lalandre_core/lalandre_core/utils/metrics_utils.py

Shared Prometheus metric helpers reused across services.

lalandre_core.utils.metrics_utils.status_class(status_code)[source]

Collapse an HTTP status code into its class label such as 2xx.

Parameters:

status_code (int)

Return type:

str
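
Collapsing a status code into its class label can be done with integer division. The sketch below assumes the label is simply the leading digit followed by "xx"; how the real helper treats out-of-range codes is not documented here.

```python
def status_class_sketch(status_code: int) -> str:
    # 200..299 -> "2xx", 400..499 -> "4xx", and so on.
    return f"{status_code // 100}xx"


labels = [status_class_sketch(code) for code in (200, 204, 404, 503)]
```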

lalandre_core.utils.metrics_utils.normalize_label(value)[source]

Normalize arbitrary metric label values into a lowercase token.

Parameters:

value (Any)

Return type:

str

lalandre_core.utils.metrics_utils.normalize_search_mode(mode)[source]

Normalize a search mode label to one of the supported metric values.

Parameters:

mode (str | None)

Return type:

str

lalandre_core.utils.metrics_utils.normalize_granularity(granularity)[source]

Normalize a granularity label to one of the metric-safe values.

Parameters:

granularity (str | None)

Return type:

str

lalandre_core.utils.metrics_utils.classify_error(exc_or_reason)[source]

Classify an error into (provider, error_type) for metrics labeling.

Parameters:

exc_or_reason (Any)

Return type:

tuple[str, str]

lalandre_core.utils.mode_aliases

Source: packages/lalandre_core/lalandre_core/utils/mode_aliases.py

RAG query mode aliases.

Single source of truth for legacy mode names → canonical mode mapping. Used by both api-gateway and rag-service to resolve mode aliases consistently.

lalandre_core.utils.mode_aliases.resolve_mode_alias(mode)[source]

Return the canonical mode name, resolving legacy aliases.

Parameters:

mode (str)

Return type:

str

lalandre_core.utils.parse_utils

Source: packages/lalandre_core/lalandre_core/utils/parse_utils.py

Generic parsing utilities.

lalandre_core.utils.parse_utils.extract_json_object(text)[source]

Try to extract a JSON object from text (which may contain markdown fences).

Returns the first valid dict found, or None.

Parameters:

text (str)

Return type:

Dict[str, Any] | None
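
One way to find the first valid JSON object inside free text, including text wrapped in markdown fences, is to try json.JSONDecoder.raw_decode at each opening brace. This is a sketch of the general technique under that assumption, not necessarily the strategy the module uses.

```python
import json
from typing import Any, Dict, Optional


def extract_json_object_sketch(text: str) -> Optional[Dict[str, Any]]:
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at the given index.
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, dict):
            return value
        start = text.find("{", start + 1)
    return None


fence = "`" * 3  # built at runtime to avoid a literal fence inside this example
reply = f"Here is the plan:\n{fence}json\n{{\"mode\": \"rag\", \"top_k\": 5}}\n{fence}\nDone."
parsed = extract_json_object_sketch(reply)
recovered = extract_json_object_sketch('{broken {"a": 1}')
missing = extract_json_object_sketch("no json here")
```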

lalandre_core.utils.parse_utils.as_dict(value)[source]

Return value if it is a dict, else an empty dict.

Parameters:

value (Any)

Return type:

Dict[str, Any]

lalandre_core.utils.parse_utils.as_optional_dict(value)[source]

Return value if it is a dict, else None.

Parameters:

value (Any)

Return type:

Dict[str, Any] | None

lalandre_core.utils.parse_utils.as_str(value, *, default='')[source]

Coerce value to str.

Parameters:
  • value (Any)

  • default (str)

Return type:

str

lalandre_core.utils.parse_utils.as_document_list(value)[source]

Return only the dict items from value (must be a list).

Parameters:

value (Any)

Return type:

List[Dict[str, Any]]

lalandre_core.utils.parse_utils.to_optional_int(value)[source]

Convert value to int when possible, else None.

Parameters:

value (Any)

Return type:

int | None

lalandre_core.utils.parse_utils.sanitize_error_text(error, *, max_chars=220)[source]

Return a truncated, safe string representation of error.

Parameters:
  • error (Exception)

  • max_chars (int)

Return type:

str

lalandre_core.utils.parse_utils.coerce_bool(value, default)[source]

Return value if it is already a bool, else default.

Parameters:
  • value (Any)

  • default (bool)

Return type:

bool

lalandre_core.utils.parse_utils.coerce_float(value, default)[source]

Return value cast to float when it is numeric, else default.

Parameters:
  • value (Any)

  • default (float)

Return type:

float
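
The two coercion helpers above can be sketched as follows. Treating "numeric" as int or float while excluding bool is an assumption; the documented wording does not say how booleans or numeric strings are handled.

```python
from typing import Any


def coerce_bool_sketch(value: Any, default: bool) -> bool:
    # Only a genuine bool passes through; strings such as "true" do not.
    return value if isinstance(value, bool) else default


def coerce_float_sketch(value: Any, default: float) -> float:
    # Assumption: bool is excluded even though it subclasses int.
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return float(value)
    return default


results = (
    coerce_bool_sketch(True, False),
    coerce_bool_sketch("true", False),
    coerce_float_sketch(3, 0.5),
    coerce_float_sketch("3", 0.5),
)
```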

lalandre_core.utils.regulatory_level

Source: packages/lalandre_core/lalandre_core/utils/regulatory_level.py

Regulatory level inference from act metadata.

EU financial regulation follows a 3-level hierarchy:

  • 1 (L1): Framework legislation (Regulations, Directives) adopted by Parliament/Council

  • 2 (L2): Delegated/implementing acts (RTS, ITS) adopted by the Commission

  • 3 (L3): Supervisory guidance (Guidelines, Q&A, Recommendations) by the ESAs (EBA/ESMA/EIOPA)

lalandre_core.utils.regulatory_level.infer_regulatory_level(celex, act_type, title=None, form_number=None)[source]

Infer regulatory level from act metadata.

Returns 1 (L1), 2 (L2), 3 (L3), or None (outside scope).

Parameters:
  • celex (str)

  • act_type (str)

  • title (str | None)

  • form_number (str | None)

Return type:

int | None

lalandre_core.utils.regulatory_level.level_to_label(level)[source]

Convert numeric level to display label: 1→'L1', 2→'L2', 3→'L3'.

Parameters:

level (int | None)

Return type:

str | None

lalandre_core.utils.shared_key_pool

Source: packages/lalandre_core/lalandre_core/utils/shared_key_pool.py

Helpers for dispatching calls across a shared API key pool.

lalandre_core.utils.shared_key_pool.build_clients_by_key(*, key_pool, factory)[source]

Build one client instance per key in the shared pool.

Parameters:
  • key_pool (APIKeyPool)

  • factory (Callable[[str], T])

Return type:

Dict[str, T]

class lalandre_core.utils.shared_key_pool.SharedKeyPoolProxy(*, key_pool, clients_by_key)[source]

Bases: object

Delegate each callable access to the next client selected by key_pool.

Parameters:
  • key_pool (APIKeyPool)

  • clients_by_key (Mapping[str, Any])
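
The delegation idea can be sketched with __getattr__: each method call is routed to the client that belongs to the next key in the rotation. Every name below (RoundRobinKeys, EchoClient, ProxySketch, and the keys argument of build_clients_by_key_sketch) is hypothetical; in particular, how the real build_clients_by_key enumerates the keys of an APIKeyPool is not shown on this page.

```python
import itertools
from typing import Any, Callable, Dict, List, Mapping


class RoundRobinKeys:
    """Hypothetical stand-in for a key pool exposing next_key()."""

    def __init__(self, keys: List[str]) -> None:
        self.keys = list(keys)
        self._cycle = itertools.cycle(self.keys)

    def next_key(self) -> str:
        return next(self._cycle)


def build_clients_by_key_sketch(
    *, keys: List[str], factory: Callable[[str], Any]
) -> Dict[str, Any]:
    # One client instance per key.
    return {key: factory(key) for key in keys}


class ProxySketch:
    def __init__(self, *, key_pool: Any, clients_by_key: Mapping[str, Any]) -> None:
        self._key_pool = key_pool
        self._clients = clients_by_key

    def __getattr__(self, name: str) -> Callable[..., Any]:
        # Only reached for attributes the proxy itself does not define.
        def call(*args: Any, **kwargs: Any) -> Any:
            client = self._clients[self._key_pool.next_key()]
            return getattr(client, name)(*args, **kwargs)

        return call


class EchoClient:
    def __init__(self, api_key: str) -> None:
        self.api_key = api_key

    def complete(self, prompt: str) -> str:
        return f"{self.api_key}:{prompt}"


pool = RoundRobinKeys(["k1", "k2"])
clients = build_clients_by_key_sketch(keys=pool.keys, factory=EchoClient)
proxy = ProxySketch(key_pool=pool, clients_by_key=clients)
answers = [proxy.complete("a"), proxy.complete("b"), proxy.complete("c")]
```

Selecting the key at call time rather than at attribute lookup means a method reference held by the caller still rotates across clients in this sketch.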

lalandre_core.utils.text_utils

Source: packages/lalandre_core/lalandre_core/utils/text_utils.py

Text normalization utilities.

lalandre_core.utils.text_utils.strip_accents(text)[source]

Remove diacritics (accents) from text via NFKD decomposition.

Parameters:

text (str)

Return type:

str

lalandre_core.utils.text_utils.normalize_text(text)[source]

Lowercase, strip accents, collapse whitespace, remove special chars.

Keeps: word characters, whitespace, slashes, parens, dots, colons, hyphens.

Parameters:

text (str)

Return type:

str
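
Both text helpers can be approximated with unicodedata and re. The sketch assumes that removed characters are replaced by a space before whitespace is collapsed and that the result is trimmed; the character class is taken from the "Keeps" list above.

```python
import re
import unicodedata


def strip_accents_sketch(text: str) -> str:
    # NFKD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize_text_sketch(text: str) -> str:
    lowered = strip_accents_sketch(text).lower()
    # Keep word characters, whitespace, slashes, parens, dots, colons, hyphens.
    cleaned = re.sub(r"[^\w\s/().:\-]", " ", lowered)
    return re.sub(r"\s+", " ", cleaned).strip()


plain = strip_accents_sketch("Règlement délégué (UE) 2016/679")
token = normalize_text_sketch("  Règlement (UE)   2016/679 !  ")
```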