
Usage & Documentation
Get started with Moxin-LLM. Find guides for running inference, optimizing for deployment, and fine-tuning the model for your own applications.
Quick Start
Run Moxin-LLM in Minutes
Get up and running quickly using the Hugging Face `transformers` library. This example uses the `Moxin-7B-Instruct` model.
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "moxin-org/Moxin-7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Format the prompt with the model's own chat template instead of hardcoding special tokens
messages = [
    {"role": "user", "content": "Can you explain the concept of regularization in machine learning?"}
]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")

outputs = model.generate(inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
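By default, the weights load in full precision on the CPU. If you have a GPU, a common variant is to let `transformers` pick the checkpoint's native precision and place the weights automatically (this sketch assumes the `accelerate` package is installed):

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # use the precision stored in the checkpoint
    device_map="auto",   # place weights on available devices (requires accelerate)
)
inputs = inputs.to(model.device)  # keep the prompt on the same device as the model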
Model Guides
Prompting for Best Results
Using Moxin-7B-Instruct
The instruct model is fine-tuned for dialogue and instruction following. For best results, structure your prompts as a conversation using the model's chat template; the Quick Start example above shows the standard single-turn format.
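For multi-turn dialogue, pass the whole conversation history to the chat template. A minimal sketch, reusing the tokenizer and model from the Quick Start:

messages = [
    {"role": "user", "content": "What is L2 regularization?"},
    {"role": "assistant", "content": "L2 regularization adds a penalty on the squared weights to the loss."},
    {"role": "user", "content": "How does it differ from L1?"},
]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))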
Using Moxin-7B-Reasoning
This model excels at chain-of-thought (CoT) tasks such as math and logic, and was further trained with Group Relative Policy Optimization (GRPO). To get the best results, ask it to "think step-by-step" or "show its work."
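The same `transformers` flow works here. In this sketch the repository name is illustrative; check the moxin-org page on Hugging Face for the exact ID:

reasoning_id = "moxin-org/Moxin-7B-Reasoning"  # assumed repo name; verify on the Hub
tokenizer = AutoTokenizer.from_pretrained(reasoning_id)
model = AutoModelForCausalLM.from_pretrained(reasoning_id)

messages = [{"role": "user", "content": "A train covers 120 km in 1.5 hours. What is its average speed? Think step-by-step."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=512)  # leave room for the chain of thought
print(tokenizer.decode(outputs[0], skip_special_tokens=True))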
Deployment & Optimization
High Performance on the Edge
Optimized for On-Device AI
Moxin-LLM is designed to run efficiently on edge devices such as PCs and mobile phones, where privacy and low latency are critical.
The OminiX Engine
For best performance, we recommend our in-house OminiX inference and fine-tuning engine, which is optimized for a range of edge hardware, including domestic NPUs.
Proven Efficiency
Our optimization techniques are powerful enough to deploy a 235B-parameter model on a single laptop, achieving around 14 tokens per second.
Fine-Tuning Moxin-LLM
Leverage Moxin's complete openness to create your own specialized models. Our transparency with training data and scripts makes the fine-tuning process more efficient and effective.
Step 1: Start with Moxin-7B-Base
The `Moxin-7B-Base` model is the ideal starting point for any custom fine-tuning project.
Step 2: Prepare Your Custom Dataset
Collect and format your data for a specific task, such as robotics commands, professional translation terms, or any other domain-specific knowledge.
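One simple format is a JSONL file with one conversation per line. The file name (my_dataset.jsonl) and schema below are illustrative, not a required layout, but many open-source training scripts, including the trl sketch in Step 3, accept this "messages" structure:

{"messages": [{"role": "user", "content": "Move the arm to the home position."}, {"role": "assistant", "content": "Command: HOME. Returning all joints to zero."}]}
{"messages": [{"role": "user", "content": "Translate 'load-bearing wall' into French."}, {"role": "assistant", "content": "mur porteur"}]}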
Step 3: Run the Fine-Tuning Process
Use standard open-source training scripts to fine-tune the model on your dataset. Our open approach ensures you have full control and visibility.
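As one concrete sketch, the open-source trl library can run supervised fine-tuning in a few lines. The base-model ID and dataset file are assumptions (see Steps 1 and 2), and a recent trl version (0.12 or later) is assumed:

from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Hypothetical JSONL file from Step 2, one "messages" conversation per line
dataset = load_dataset("json", data_files="my_dataset.jsonl", split="train")

training_args = SFTConfig(
    output_dir="moxin-7b-custom",
    num_train_epochs=3,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,  # effective batch size of 8 on one GPU
    learning_rate=2e-5,
)

trainer = SFTTrainer(
    model="moxin-org/moxin-llm-7b",  # assumed Hugging Face ID for Moxin-7B-Base
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
trainer.save_model()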
Step 4: Deploy Your Custom Model
Once trained, your specialized model is ready to be deployed, bringing powerful, customized AI to your specific application.
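The saved checkpoint loads like any other `transformers` model. A quick smoke test, assuming the output directory from the Step 3 sketch:

from transformers import pipeline

pipe = pipeline("text-generation", model="moxin-7b-custom")  # directory saved in Step 3
print(pipe("Move the arm to the home position.", max_new_tokens=64)[0]["generated_text"])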