OminiX
Full-Stack Pure-Rust Local AI Platform
Run multimodal AI entirely on Apple Silicon. LLMs, image generation, voice cloning, and speech recognition — all in pure Rust with zero Python dependencies.
Explore on GitHub
Browse repositories, contribute, and build with OminiX.
github.com/OminiX-ai45 tok/s
LLM Inference (Qwen3-4B)
18x
Real-time ASR Speed
4x
Real-time TTS Synthesis
Benchmarked on Apple M3 Max (128GB)
The OminiX Stack
OminiX-MLX
Inference Engine
Safe Rust bindings to Apple MLX with 14 model crates. GPU-accelerated via Metal with unified memory and zero-copy CPU-GPU data transfer. Supports LLMs, ASR, TTS, and image generation.
OminiX-API
OpenAI-Compatible API Server
Drop-in replacement for OpenAI API running locally on your Mac. Dynamic model loading, WebSocket TTS streaming, and automatic model management — no restart required.
Why OminiX
Pure Rust, Zero Python
The entire stack is written in Rust. No Python runtime, no dependency hell. Single binary via cargo build --release.
Metal GPU Acceleration
Unified memory with zero-copy CPU-GPU data transfer. Lazy evaluation enables kernel fusion for maximum throughput on Apple Silicon.
Multimodal On-Device
LLMs, image generation (FLUX, Z-Image), voice cloning (GPT-SoVITS), and speech recognition (Paraformer) — all running locally.
Full Memory Safety
Rust's ownership model extends to GPU operations. Memory safety from inference kernels to HTTP server, eliminating entire classes of bugs.
OpenAI-Compatible
Drop-in replacement for OpenAI endpoints. Any app using the OpenAI API works with OminiX — just point it to localhost.
Open Source
Dual-licensed under MIT and Apache 2.0. Part of the Moxin open-source AI ecosystem. Fully transparent, community-driven.
Quick Start
Requires macOS 14.0+ (Sonoma) • Apple Silicon (M1/M2/M3/M4) • Rust 1.82+ • Xcode Command Line Tools
# Clone and build the API server
git clone https://github.com/OminiX-ai/OminiX-API.git
cd OminiX-API && cargo build --release
# Run with a language model
LLM_MODEL=mlx-community/Qwen3-4B-bf16 cargo run --release
# Run with all capabilities
PORT=8080 LLM_MODEL=mlx-community/Qwen3-4B-bf16 \
ASR_MODEL_DIR=./models/paraformer \
TTS_REF_AUDIO=./audio/reference.wav \
IMAGE_MODEL=zimage cargo run --release