Embedding API¶
Note
This page is generated automatically from the repository’s maintained Python module inventory.
Embedding providers, payload preparation, and the embedding service layer.
lalandre_embedding¶
Source: packages/lalandre_embedding/lalandre_embedding/__init__.py
Embedding package public API.
lalandre_embedding.base¶
Source: packages/lalandre_embedding/lalandre_embedding/base.py
Base class for embedding providers
- class lalandre_embedding.base.EmbeddingProvider[source]¶
Bases: ABC
Abstract base class for embedding providers
- abstractmethod embed_text(text)[source]¶
Generate embedding for a single text
- Parameters:
text (str)
- Return type:
List[float]
- abstractmethod embed_batch(texts)[source]¶
Generate embeddings for multiple texts
- Parameters:
texts (List[str])
- Return type:
List[List[float]]
- abstractmethod get_vector_size()[source]¶
Get the dimension of the embedding vectors
- Return type:
int
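A custom provider only has to implement these three abstract methods. The sketch below mirrors the documented signatures in a self-contained form (in real code you would import `EmbeddingProvider` from `lalandre_embedding.base` instead of redefining it); the `ToyProvider` vectors are purely illustrative, not a real embedding model:

```python
from abc import ABC, abstractmethod
from typing import List


class EmbeddingProvider(ABC):
    """Stand-in mirroring the documented interface, for illustration only."""

    @abstractmethod
    def embed_text(self, text: str) -> List[float]: ...

    @abstractmethod
    def embed_batch(self, texts: List[str]) -> List[List[float]]: ...

    @abstractmethod
    def get_vector_size(self) -> int: ...


class ToyProvider(EmbeddingProvider):
    """Deterministic pseudo-embeddings derived from character codes."""

    def get_vector_size(self) -> int:
        return 4

    def embed_text(self, text: str) -> List[float]:
        # Fold character codes into a fixed-size vector.
        size = self.get_vector_size()
        vec = [0.0] * size
        for i, ch in enumerate(text):
            vec[i % size] += ord(ch) / 1000.0
        return vec

    def embed_batch(self, texts: List[str]) -> List[List[float]]:
        # A trivial batch implementation; real providers batch for throughput.
        return [self.embed_text(t) for t in texts]
```

A batch call returns one vector per input, each of length `get_vector_size()`.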
lalandre_embedding.pipeline¶
Source: packages/lalandre_embedding/lalandre_embedding/pipeline.py
Reusable business logic for vector embedding.
Contains payload construction, batched embedding, mean-pooling, and incremental state tracking. The worker imports this module and delegates the heavy lifting here.
- class lalandre_embedding.pipeline.ChunkEmbeddingService(*args, **kwargs)[source]¶
Bases: Protocol
Embedding interface required by the chunk embedding pipeline.
- lalandre_embedding.pipeline.compute_payload_hash(payload)[source]¶
Stable hash of payload to detect changes across runs.
- Parameters:
payload (dict[str, Any])
- Return type:
str
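A stable payload hash must not depend on dict insertion order. One common way to achieve this, shown here as a sketch of what `compute_payload_hash` might do rather than its actual implementation, is canonical JSON plus SHA-256:

```python
import hashlib
import json
from typing import Any


def compute_payload_hash(payload: dict[str, Any]) -> str:
    # Sorted keys and compact separators make the serialization
    # independent of how the dict was built.
    canonical = json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Two payloads with the same keys and values hash identically regardless of key order, so unchanged payloads can be skipped across runs.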
- lalandre_embedding.pipeline.mean_pool_vectors(vectors)[source]¶
Return the element-wise mean of vectors, or None if empty.
- Parameters:
vectors (list[list[float]])
- Return type:
list[float] | None
- lalandre_embedding.pipeline.truncate_subdivision_content(content, max_chars)[source]¶
Truncate content for embedding if it exceeds max_chars.
- Parameters:
content (str)
max_chars (int)
- Return type:
str
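In its simplest form this is a character-budget cut; the actual implementation may be smarter (for example, cutting at a word boundary), so treat this as a sketch of the contract only:

```python
def truncate_subdivision_content(content: str, max_chars: int) -> str:
    """Return content unchanged unless it exceeds the character budget."""
    if len(content) <= max_chars:
        return content
    return content[:max_chars]
```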
- lalandre_embedding.pipeline.resolved_subdivision_embed_max_chars(config)[source]¶
Compute the max embedding chars from a LalandreConfig.
- Parameters:
config (Any)
- Return type:
int
- lalandre_embedding.pipeline.make_state_record(*, object_type, object_id, provider, model_name, vector_size, content_hash)[source]¶
Build one embedding-state upsert record.
- Parameters:
object_type (str)
object_id (int)
provider (str)
model_name (str)
vector_size (int)
content_hash (str)
- Return type:
dict[str, Any]
- lalandre_embedding.pipeline.extract_metadata(act)[source]¶
Extract metadata entries from an ORM act, or None.
- Parameters:
act (Any)
- Return type:
dict[str, str] | None
- lalandre_embedding.pipeline.build_subdivision_payload(payload_builder, subdivision, act, version, metadata, content_override=None)[source]¶
Assemble the Qdrant payload dict from ORM models for a subdivision.
- Parameters:
payload_builder (PayloadBuilder)
subdivision (Any)
act (Any)
version (Any | None)
metadata (dict[str, str] | None)
content_override (str | None)
- Return type:
dict[str, Any]
- lalandre_embedding.pipeline.build_chunk_payload(payload_builder, chunk, subdivision, act)[source]¶
Assemble the Qdrant payload dict from ORM models for a chunk.
- Parameters:
payload_builder (PayloadBuilder)
chunk (Any)
subdivision (Any)
act (Any)
- Return type:
dict[str, Any]
- lalandre_embedding.pipeline.build_act_document_payload(payload_builder, act, subdivisions, *, vector_method, chunk_count)[source]¶
Assemble the Qdrant payload dict from ORM models for a whole-act vector.
- Parameters:
payload_builder (PayloadBuilder)
act (Any)
subdivisions (list[Any])
vector_method (str)
chunk_count (int)
- Return type:
dict[str, Any]
- class lalandre_embedding.pipeline.EmbedBatchResult(points=<factory>, state_records=<factory>, delete_filters=<factory>, embedded_count=0, skipped_count=0)[source]¶
Bases: object
Result of preparing one embedding batch, ready to be persisted by the caller.
- Parameters:
points (list[VectorPoint])
state_records (list[dict[str, Any]])
delete_filters (list[dict[str, Any]])
embedded_count (int)
skipped_count (int)
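The documented defaults (`<factory>` for the lists, `0` for the counters) correspond to a dataclass with list factories, so every result starts with independent empty collections. An illustrative mirror of the container (the real `points` hold `VectorPoint` objects):

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class EmbedBatchResult:
    """Illustrative mirror of the documented result container."""
    points: list[Any] = field(default_factory=list)  # VectorPoint in the real code
    state_records: list[dict[str, Any]] = field(default_factory=list)
    delete_filters: list[dict[str, Any]] = field(default_factory=list)
    embedded_count: int = 0
    skipped_count: int = 0
```

Using `field(default_factory=list)` rather than a bare `[]` ensures two results never share the same underlying list.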
- lalandre_embedding.pipeline.prepare_subdivision_embeddings(*, act, subdivisions, version, metadata, embedding_service, payload_builder, state_map, batch_size, max_batch_size, max_chars, provider, model_name, vector_size, force=False)[source]¶
Prepare embedding batches for subdivisions. Pure logic — no DB access.
- Parameters:
act (Any) – The ORM act object (for payload building).
subdivisions (list[Any]) – ORM subdivision objects.
version (Any | None) – Current version for the act (or None).
metadata (dict[str, str] | None) – Act metadata dict (or None).
embedding_service (ChunkEmbeddingService) – Service to produce vectors.
payload_builder (PayloadBuilder) – Builds Qdrant payloads.
state_map (dict[int, str]) – Pre-fetched {subdivision_id: content_hash} from embedding_state.
batch_size (int)
max_batch_size (int) – Embedding batch sizing.
max_chars (int) – Truncation limit for subdivision content.
provider (str)
model_name (str)
vector_size (int) – Embedding identity.
force (bool) – If True, embed everything regardless of state_map.
- Returns:
One EmbedBatchResult per batch, ready to be persisted.
- Return type:
list[EmbedBatchResult]
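The incremental behaviour implied by `state_map` and `force` can be sketched as a per-subdivision skip decision (the helper name `should_embed` is hypothetical; the hash function is assumed to match whatever the pipeline stores in embedding_state):

```python
import hashlib


def should_embed(subdivision_id: int, content: str,
                 state_map: dict[int, str], force: bool) -> bool:
    """Embed only when forced, unseen, or when the content hash changed."""
    if force:
        return True
    content_hash = hashlib.sha256(content.encode("utf-8")).hexdigest()
    # An absent entry or a different stored hash both mean "re-embed".
    return state_map.get(subdivision_id) != content_hash
```

This is what makes repeated runs cheap: unchanged subdivisions are counted as skipped instead of being re-sent to the embedding provider.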
- lalandre_embedding.pipeline.prepare_chunk_embeddings(*, chunks, act, embedding_service, payload_builder, state_map, batch_size, retrieval_segment_chars, retrieval_segment_overlap, provider, model_name, vector_size, force)[source]¶
Prepare embedding batches for chunks. Pure logic — no DB access.
- Parameters:
chunks (list[Any]) – List of (chunk, subdivision) tuples.
act (Any) – The ORM act object (for payload building).
embedding_service (ChunkEmbeddingService) – Service to produce vectors.
payload_builder (PayloadBuilder) – Builds Qdrant payloads.
state_map (dict[int, str]) – Pre-fetched {chunk_id: content_hash} from embedding_state.
batch_size (int) – Embedding batch sizing.
retrieval_segment_chars (int)
retrieval_segment_overlap (int) – Target segmentation budget for long article retrieval vectors.
provider (str)
model_name (str)
vector_size (int) – Embedding identity.
force (bool) – If True, embed everything regardless of state_map.
- Returns:
One EmbedBatchResult per batch, ready to be persisted.
- Return type:
list[EmbedBatchResult]
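The `retrieval_segment_chars` / `retrieval_segment_overlap` pair suggests a sliding-window split of long chunk text before embedding. A minimal sketch of such a split, assuming `overlap < segment_chars` (the function name and exact windowing rule are assumptions, not the pipeline's actual code):

```python
def segment_text(text: str, segment_chars: int, overlap: int) -> list[str]:
    """Split long text into overlapping character windows for retrieval vectors."""
    if len(text) <= segment_chars:
        return [text]
    step = segment_chars - overlap  # assumes overlap < segment_chars
    segments = []
    for start in range(0, len(text), step):
        segments.append(text[start:start + segment_chars])
        if start + segment_chars >= len(text):
            break  # the last window already reaches the end of the text
    return segments
```

Overlapping windows keep sentence fragments at window boundaries represented in at least one retrieval vector.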
- lalandre_embedding.pipeline.prepare_act_document_embedding(*, act, subdivisions, chunks, chunk_vectors, payload_builder, state_map, provider, model_name, vector_size, force=False)[source]¶
Prepare whole-act embedding via mean pooling. Pure logic — no DB access.
- Parameters:
act (Any) – The ORM act object.
subdivisions (list[Any]) – ORM subdivision objects for the act.
chunks (list[Any]) – ORM (chunk, subdivision) tuples for the act.
chunk_vectors (dict[int, list[float]]) – Pre-fetched {chunk_id: vector} from Qdrant.
payload_builder (PayloadBuilder) – Builds Qdrant payloads.
state_map (dict[int, str]) – Pre-fetched {act_id: content_hash} from embedding_state.
provider (str)
model_name (str)
vector_size (int) – Embedding identity.
force (bool) – If True, embed regardless of state_map.
- Returns:
EmbedBatchResult with a single point, or None if skipped.
- Return type:
EmbedBatchResult | None
lalandre_embedding.providers¶
Source: packages/lalandre_embedding/lalandre_embedding/providers/__init__.py
Embedding provider implementations.
lalandre_embedding.providers.local¶
Source: packages/lalandre_embedding/lalandre_embedding/providers/local.py
Local embedding provider
- class lalandre_embedding.providers.local.LocalEmbedding(model_name=None, device=None, cache_dir=None, normalize_embeddings=True, enable_cache=None, cache_max_size=None)[source]¶
Bases: EmbeddingProvider, SupportsCacheSize
Local embedding provider using sentence-transformers with an in-memory LRU cache
- Parameters:
model_name (str | None)
device (str | None)
cache_dir (str | None)
normalize_embeddings (bool)
enable_cache (bool | None)
cache_max_size (int | None)
- estimate_tokens(text)[source]¶
Estimate token usage with the underlying sentence-transformers tokenizer.
- Parameters:
text (str)
- Return type:
int | None
- get_max_input_tokens()[source]¶
Return the maximum sequence length supported by the local model.
- Return type:
int | None
- embed_text(text)[source]¶
Generate embedding with cache support
- Parameters:
text (str)
- Return type:
List[float]
- embed_batch(texts)[source]¶
Generate batch embeddings with cache support
- Parameters:
texts (List[str])
- Return type:
List[List[float]]
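An in-memory LRU cache for embeddings can be sketched with an `OrderedDict`, keyed by the input text and bounded by `cache_max_size`; this is an illustrative stand-in, not `LocalEmbedding`'s actual cache code:

```python
from collections import OrderedDict
from typing import List, Optional


class EmbeddingLRUCache:
    """Bounded text-to-vector cache with least-recently-used eviction."""

    def __init__(self, max_size: int = 1024) -> None:
        self._store: "OrderedDict[str, List[float]]" = OrderedDict()
        self._max_size = max_size

    def get(self, text: str) -> Optional[List[float]]:
        vec = self._store.get(text)
        if vec is not None:
            self._store.move_to_end(text)  # mark as most recently used
        return vec

    def put(self, text: str, vector: List[float]) -> None:
        self._store[text] = vector
        self._store.move_to_end(text)
        if len(self._store) > self._max_size:
            self._store.popitem(last=False)  # evict the least recently used entry
```

On a cache hit the provider can skip the model forward pass entirely, which matters when the same texts are re-embedded across runs.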
lalandre_embedding.providers.mistral¶
Source: packages/lalandre_embedding/lalandre_embedding/providers/mistral.py
Mistral embedding provider with multi-key round-robin support and Redis cache.
Token counting uses mistral-common with the v1 SentencePiece tokenizer
(the tokenizer family used by mistral-embed).
- class lalandre_embedding.providers.mistral.MistralEmbedding(model_name=None, redis_client=None, cache_ttl=None, key_pool=None)[source]¶
Bases: EmbeddingProvider, SupportsNumKeys
Mistral embedding provider with multi-key round-robin support and Redis cache
- Parameters:
model_name (str | None)
redis_client (Any | None)
cache_ttl (int | None)
key_pool (APIKeyPool | None)
- get_max_input_tokens()[source]¶
Return the documented maximum input length for mistral-embed.
- Return type:
int | None
- estimate_tokens(text)[source]¶
Count tokens using the Mistral v1 SentencePiece tokenizer.
- Parameters:
text (str)
- Return type:
int | None
- embed_text(text)[source]¶
Generate embedding using next available client (round-robin) with cache
- Parameters:
text (str)
- Return type:
List[float]
- embed_batch(texts)[source]¶
Generate batch embeddings using next available client (round-robin) with cache
- Parameters:
texts (List[str])
- Return type:
List[List[float]]
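Round-robin key rotation simply hands out keys in a fixed cycle so request load is spread evenly across them. A minimal sketch of what an `APIKeyPool` might look like (the class and method names here are illustrative, matched only to the parameter name in the constructor above):

```python
import itertools


class APIKeyPool:
    """Rotate through multiple API keys in round-robin order."""

    def __init__(self, keys: list[str]) -> None:
        if not keys:
            raise ValueError("at least one API key is required")
        self._cycle = itertools.cycle(keys)

    def next_key(self) -> str:
        # Each call returns the next key, wrapping around at the end.
        return next(self._cycle)
```

With two keys, four consecutive requests use key 1, key 2, key 1, key 2, which helps stay under per-key rate limits.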
lalandre_embedding.service¶
Source: packages/lalandre_embedding/lalandre_embedding/service.py
Embedding service.
Supports multiple providers (Mistral API, local sentence-transformers) with automatic token-limit guard, adaptive text splitting, and weighted-average aggregation for long documents.
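The weighted-average aggregation for long documents can be sketched as follows: when a text is split into parts, each part vector contributes in proportion to its length, so short tail fragments do not skew the combined vector. This is an illustrative sketch of the idea, not the service's actual code:

```python
def weighted_average(vectors: list[list[float]], weights: list[int]) -> list[float]:
    """Combine part vectors, weighting each part (e.g. by its character length)."""
    total = sum(weights)
    dim = len(vectors[0])
    return [
        sum(vec[i] * w for vec, w in zip(vectors, weights)) / total
        for i in range(dim)
    ]
```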
- class lalandre_embedding.service.EmbeddingService(provider=None, model_name=None, api_key=None, device=None, cache_dir=None, normalize_embeddings=None, enable_cache=None, cache_max_size=None, key_pool=None)[source]¶
Bases: EmbeddingProvider
Supports multiple providers: Mistral and local models.
The service itself exposes the EmbeddingProvider interface so callers can benefit from token guards and adaptive splitting without reaching into the raw provider implementation.
Initialize embedding service
- Parameters:
provider (str | None)
model_name (str | None)
api_key (str | None)
device (str | None)
cache_dir (str | None)
normalize_embeddings (bool | None)
enable_cache (bool | None)
cache_max_size (int | None)
key_pool (Any | None)
- embed_text(text)[source]¶
Generate embedding for a single text
- Parameters:
text (str)
- Returns:
List of floats representing the embedding vector
- Return type:
List[float]
- embed_batch(texts, batch_size=None)[source]¶
Generate embeddings for multiple texts efficiently
- Parameters:
texts (List[str]) – List of input texts
batch_size (int | None) – Number of texts to process at once (None = use config default)
- Returns:
List of embedding vectors
- Return type:
List[List[float]]
- estimate_tokens(text)[source]¶
Delegate to the internal token estimation chain (provider → char-ratio).
- Parameters:
text (str)
- Return type:
int | None