Models run inside custom extractors. The model registry handles downloading, caching, and serving; you just declare which model to use, and the infrastructure shares it across all workers via Ray’s object store.
Three Ways to Load Models
| Approach | When to Use |
|---|---|
| Built-in Models | Common tasks — embeddings, transcription, reranking. No code needed, just reference the feature URI. |
| HuggingFace Models | Any public HF model. Cached cluster-wide on first download. |
| Custom Models (Enterprise) | Your own fine-tuned weights uploaded as .tar.gz. Stored in S3, deployed to Ray. |
HuggingFace Models (Recommended)
Use LazyModelMixin in your extractor’s pipeline. Models load on the first batch, not at actor creation, and are shared zero-copy across all workers.
LazyModelMixin Attributes
| Attribute | Type | Default | Description |
|---|---|---|---|
| model_id | str | "" | HuggingFace model ID or namespace model ID |
| model_class | str | "AutoModel" | Transformers model class name |
| tokenizer_class | str \| None | "AutoTokenizer" | Tokenizer class, or None to skip |
| torch_dtype | str | "float32" | "float16", "float32", or "bfloat16" |
| model_source | str | "huggingface" | "huggingface" or "namespace" |
Call self.get_model() to get a (model, tokenizer) tuple. Override _instantiate_model(cached_data) for non-standard architectures.
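The load-on-first-batch behaviour can be illustrated with a small stand-in. This is a sketch only: `LazyModelMixinSketch`, `_fetch_cached_data`, and `SentenceEmbedder` are hypothetical names written for this example, the model ID is just an illustrative value, and the placeholder strings stand in for real model and tokenizer objects. It is not the Mixpeek implementation and does not touch Ray or HuggingFace.

```python
from typing import Any, Optional, Tuple


class LazyModelMixinSketch:
    """Hypothetical stand-in that mimics the documented attributes."""

    # Class attributes mirroring the attribute table, with its defaults.
    model_id: str = ""
    model_class: str = "AutoModel"
    tokenizer_class: Optional[str] = "AutoTokenizer"
    torch_dtype: str = "float32"
    model_source: str = "huggingface"

    _loaded: Optional[Tuple[Any, Any]] = None

    def _fetch_cached_data(self) -> dict:
        # Assumption: the real registry would download and cache weights here.
        return {"model_id": self.model_id, "dtype": self.torch_dtype}

    def _instantiate_model(self, cached_data: dict) -> Tuple[Any, Any]:
        # Placeholder strings stand in for a real model and tokenizer.
        model = f"{self.model_class}({cached_data['model_id']}, {cached_data['dtype']})"
        tokenizer = None
        if self.tokenizer_class is not None:
            tokenizer = f"{self.tokenizer_class}({cached_data['model_id']})"
        return model, tokenizer

    def get_model(self) -> Tuple[Any, Any]:
        # Load on the first call only; later calls reuse the same tuple.
        if self._loaded is None:
            self._loaded = self._instantiate_model(self._fetch_cached_data())
        return self._loaded


class SentenceEmbedder(LazyModelMixinSketch):
    # Declaring the model is just a matter of setting class attributes.
    model_id = "sentence-transformers/all-MiniLM-L6-v2"
    torch_dtype = "float16"

    def __call__(self, batch):
        model, tokenizer = self.get_model()  # first batch triggers the load
        return [(text, model) for text in batch]
```

Constructing `SentenceEmbedder()` loads nothing; the first call on a batch triggers `get_model()`, and every later batch reuses the same tuple. A subclass with a non-standard architecture would override `_instantiate_model(cached_data)` in the same way.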
Custom Models (Enterprise)
Custom models require an Enterprise subscription. Contact sales to enable.
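Since custom weights are uploaded as a .tar.gz archive, packaging a local model directory might look like the sketch below. It uses only the Python standard library. The helper name `package_model` is hypothetical, and placing files at the archive root is an assumption; the layout the upload endpoint expects is described in the Model API Reference.

```python
import tarfile
from pathlib import Path


def package_model(model_dir: str, archive_path: str) -> Path:
    """Pack every file under model_dir into a gzip-compressed tar archive."""
    src = Path(model_dir)
    if not src.is_dir():
        raise NotADirectoryError(f"{src} is not a directory")

    out = Path(archive_path)
    with tarfile.open(out, "w:gz") as tar:
        for item in sorted(src.rglob("*")):
            # Skip directories, and skip the archive itself if it lives in src.
            if item.is_file() and item.resolve() != out.resolve():
                # Assumption: paths are stored relative to the model directory.
                tar.add(item, arcname=item.relative_to(src).as_posix())
    return out
```

Calling `package_model("./my_model", "my_model.tar.gz")` would produce an archive whose entries are paths relative to `./my_model`.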
Model Versioning
Models are versioned independently. Deploy a new version alongside the existing one, test in staging, then shift traffic. The feature URI carries the version (mixpeek://my_extractor@2.0.0/my_embedding), so both versions can coexist.
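To see why two versions can coexist, it helps to split a feature URI into its parts. The sketch below assumes the shape `mixpeek://<extractor>@<version>/<feature>`, which is inferred from the single example URI above and not from a published grammar. `FeatureURI` and `parse_feature_uri` are hypothetical helpers, not part of the Mixpeek SDK.

```python
from typing import NamedTuple


class FeatureURI(NamedTuple):
    extractor: str
    version: str
    feature: str


def parse_feature_uri(uri: str) -> FeatureURI:
    """Split a URI of the assumed form mixpeek://<extractor>@<version>/<feature>."""
    prefix = "mixpeek://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a mixpeek URI: {uri!r}")

    head, slash, feature = uri[len(prefix):].partition("/")
    extractor, at, version = head.partition("@")
    if not (slash and at and extractor and version and feature):
        raise ValueError(f"expected mixpeek://<extractor>@<version>/<feature>, got {uri!r}")
    return FeatureURI(extractor, version, feature)


old = parse_feature_uri("mixpeek://my_extractor@1.0.0/my_embedding")
new = parse_feature_uri("mixpeek://my_extractor@2.0.0/my_embedding")
# Same extractor and feature, different version: the two URIs never collide.
```

Because the version is part of the identifier, pointing a pipeline at `new` instead of `old` is a change of reference and does not overwrite anything.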
Python SDK
Limits
| Limit | Value |
|---|---|
| Max models per namespace | 50 |
| Max archive size | 10 GB |
| Supported formats | pytorch, safetensors, onnx, huggingface |
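A local pre-upload check against these limits could be sketched as follows. `check_upload` is a hypothetical helper; the server remains the authority on what it accepts. The sketch assumes that "10 GB" means 10 × 1024³ bytes, which the table does not specify.

```python
from typing import List

SUPPORTED_FORMATS = {"pytorch", "safetensors", "onnx", "huggingface"}
MAX_ARCHIVE_BYTES = 10 * 1024**3  # assumption: 1 GB = 1024**3 bytes
MAX_MODELS_PER_NAMESPACE = 50


def check_upload(size_bytes: int, model_format: str, existing_models: int) -> List[str]:
    """Return a list of limit violations; an empty list means none were found."""
    problems = []
    if size_bytes > MAX_ARCHIVE_BYTES:
        problems.append(f"archive is {size_bytes} bytes, limit is {MAX_ARCHIVE_BYTES}")
    if model_format not in SUPPORTED_FORMATS:
        problems.append(f"format {model_format!r} is not one of {sorted(SUPPORTED_FORMATS)}")
    if existing_models >= MAX_MODELS_PER_NAMESPACE:
        problems.append(f"namespace already holds {existing_models} models")
    return problems
```

Running this before an upload catches an oversized archive or an unsupported format without a network round trip.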
Related
Custom Extractors
Package and deploy extractors that use these models.
Extractor Quickstart
Build a working extractor with model loading end-to-end.
Model API Reference
Upload, deploy, list, and delete model archives.
Self-Improving CV Pipeline
Full tutorial: deploy YOLO, annotate, fine-tune, redeploy.

