OminiX
Full-Stack Pure-Rust Local AI Platform

Run multimodal AI entirely on Apple Silicon. LLMs, image generation, voice cloning, and speech recognition — all in pure Rust with zero Python dependencies.

Explore on GitHub

Browse repositories, contribute, and build with OminiX.

github.com/OminiX-ai

45 tok/s · LLM Inference (Qwen3-4B)

18x · Real-time ASR Speed

4x · Real-time TTS Synthesis

Benchmarked on Apple M3 Max (128GB)

The OminiX Stack

OminiX-MLX

Inference Engine

Safe Rust bindings to Apple MLX with 14 model crates. GPU-accelerated via Metal with unified memory and zero-copy CPU-GPU data transfer. Supports LLMs, ASR, TTS, and image generation.

Qwen2/3 GLM-4 Mixtral Mistral Paraformer GPT-SoVITS FLUX.2-klein Z-Image
View Repository →

OminiX-API

OpenAI-Compatible API Server

Drop-in replacement for the OpenAI API, running locally on your Mac. Dynamic model loading, WebSocket TTS streaming, and automatic model management — no restart required.

/v1/chat/completions /v1/audio/transcriptions /v1/images/generations /ws/v1/tts
View Repository →

OminiX Studio

Native Desktop Application

All-in-one desktop app built with Makepad. Chat with LLMs, generate images, clone voices, and transcribe speech from a single native interface. Connects to local or cloud backends.

Multi-provider Chat Image Generation Voice I/O MCP Tools
View Repository →

Why OminiX

Pure Rust, Zero Python

The entire stack is written in Rust. No Python runtime, no dependency hell. Single binary via cargo build --release.

Metal GPU Acceleration

Unified memory with zero-copy CPU-GPU data transfer. Lazy evaluation enables kernel fusion for maximum throughput on Apple Silicon.

Multimodal On-Device

LLMs, image generation (FLUX, Z-Image), voice cloning (GPT-SoVITS), and speech recognition (Paraformer) — all running locally.

Full Memory Safety

Rust's ownership model extends to GPU operations. Memory safety from inference kernels to HTTP server, eliminating entire classes of bugs.

OpenAI-Compatible

Drop-in replacement for OpenAI endpoints. Any app using the OpenAI API works with OminiX — just point it to localhost.
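As a sketch of what "point it to localhost" means in practice: any OpenAI-style client or plain curl request should work against the local server. The port and request shape below are assumptions (port 8080 matches the quick-start example; the body follows OpenAI's chat-completions schema), not a confirmed OminiX contract:

```shell
# Hypothetical request; assumes OminiX-API is listening on localhost:8080
# and accepts the standard OpenAI chat-completions JSON body.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mlx-community/Qwen3-4B-bf16",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

Existing OpenAI SDKs can typically be redirected the same way by overriding their base URL to http://localhost:8080/v1.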

Open Source

Dual-licensed under MIT and Apache 2.0. Part of the Moxin open-source AI ecosystem. Fully transparent, community-driven.

Quick Start

Requires macOS 14.0+ (Sonoma) • Apple Silicon (M1/M2/M3/M4) • Rust 1.82+ • Xcode Command Line Tools

# Clone and build the API server
git clone https://github.com/OminiX-ai/OminiX-API.git
cd OminiX-API && cargo build --release

# Run with a language model
LLM_MODEL=mlx-community/Qwen3-4B-bf16 cargo run --release

# Run with all capabilities
PORT=8080 LLM_MODEL=mlx-community/Qwen3-4B-bf16 \
  ASR_MODEL_DIR=./models/paraformer \
  TTS_REF_AUDIO=./audio/reference.wav \
  IMAGE_MODEL=zimage cargo run --release
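Once the server is up, the other endpoints follow the same OpenAI conventions. A hedged sketch of a transcription request: the multipart field names assume OpenAI's /v1/audio/transcriptions schema, and the file path and model name are placeholders, not values confirmed by the OminiX docs:

```shell
# Hypothetical transcription request against the local server (port 8080 as above).
# ./audio/sample.wav and the "paraformer" model name are illustrative only.
curl http://localhost:8080/v1/audio/transcriptions \
  -F file=@./audio/sample.wav \
  -F model=paraformer
```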