Welcome to our comprehensive guide on using the Exl2 quantized version of the MN-12B-Celeste-V1.9 model. In this article, we’ll walk you through everything from setting up to troubleshooting potential issues, all while making it as user-friendly as possible.
Understanding the MN-12B-Celeste Model
The MN-12B-Celeste-V1.9 model has been quantized for better performance and efficiency. Quantization stores each weight with fewer bits, much like condensing a large, complex book into a concise summary that still conveys the main ideas: the model becomes smaller and cheaper to run without losing too much quality.
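To get a rough sense of the savings, here is a back-of-envelope estimate of the weight footprint at 6 bits per weight, the format this guide uses. It is an approximation only; the KV cache, activations, and loader overhead all add to the real VRAM requirement:

```sh
# ~12B parameters at 6 bits per weight, converted to GiB
# (weights only; KV cache and runtime overhead come on top of this)
python -c "print(f'{12e9 * 6 / 8 / 1024**3:.1f} GiB')"
```

That works out to roughly 8.4 GiB, versus about 22 GiB for the same weights at full 16-bit precision.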
Getting Started: Downloading the Model
To get started, download the Exl2 quantized version of the model, either with async-hf-downloader or directly from the HuggingFace Hub. Here’s how:
Method 1: Using async-hf-downloader
First, ensure you have the async-hf-downloader installed. Then, run the following command:
async-hf-downloader royallab/MN-12B-Celeste-V1.9-exl2 -r 6bpw -p MN-12B-Celeste-V1.9-exl2-6bpw
Method 2: Using HuggingFace Hub
Alternatively, open your terminal and run:
huggingface-cli download royallab/MN-12B-Celeste-V1.9-exl2 --revision 6bpw --local-dir MN-12B-Celeste-V1.9-exl2-6bpw
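Whichever method you use, it is worth confirming the files landed where TabbyAPI expects them. The folder name below is the one created by the commands above:

```sh
# List the downloaded folder; you should see .safetensors shard(s)
# alongside config.json and the tokenizer files
ls MN-12B-Celeste-V1.9-exl2-6bpw
```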
Setting Up TabbyAPI
Next, you’ll want to run the MN-12B-Celeste model using TabbyAPI. This is a FastAPI server tailored for ExllamaV2. Here’s how to set it up:
- Navigate to your TabbyAPI config.yml file.
- Set model_name to MN-12B-Celeste-V1.9-exl2-6bpw (see the sketch below).
- Alternatively, launch TabbyAPI with the argument --model_name MN-12B-Celeste-V1.9-exl2-6bpw when starting it up.
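As a reference, a minimal config.yml sketch might look like the following. The exact keys and layout can vary between TabbyAPI versions, so treat this as illustrative rather than definitive:

```yaml
# Illustrative excerpt of TabbyAPI's config.yml (layout may differ by version)
model:
  model_dir: models                           # folder TabbyAPI scans for models
  model_name: MN-12B-Celeste-V1.9-exl2-6bpw   # must match the downloaded folder name
```

With this layout, the downloaded folder should sit at models/MN-12B-Celeste-V1.9-exl2-6bpw.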
Launch TabbyAPI inside your Python environment by executing start.bat (for Windows) or ./start.sh (for Unix-based systems).
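Once the server is up, a quick request confirms the model loaded and is responding. This sketch assumes TabbyAPI’s default bind of 127.0.0.1:5000 and uses a placeholder API key; substitute the key from your own TabbyAPI setup and adjust the port if you changed it:

```sh
# Minimal completion request against a running TabbyAPI instance
# YOUR_API_KEY is a placeholder; use the key TabbyAPI generated for you
curl http://127.0.0.1:5000/v1/completions \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{"prompt": "Hello, Celeste!", "max_tokens": 32}'
```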
Troubleshooting
If you encounter issues while attempting to run this setup, consider the following troubleshooting tips:
- Model Not Found: Double-check the model name entered in config.yml and ensure it matches the downloaded folder name exactly.
- Download Errors: Verify your internet connection and try re-downloading the model using the provided commands.
- Runtime Issues: Ensure that your Python environment has all the necessary dependencies installed. Running pip install -r requirements.txt in your TabbyAPI directory can help (a quick sanity check follows this list).
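For runtime issues in particular, a quick import check can tell you whether the core pieces are installed and whether your GPU is visible:

```sh
# Sanity checks for a TabbyAPI environment
python -c "import torch; print('CUDA available:', torch.cuda.is_available())"
python -c "import exllamav2; print('exllamav2 import OK')"
```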
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Final Thoughts
Using the Exl2 quantized version of the MN-12B-Celeste-V1.9 model can significantly enhance your AI projects without demanding too much VRAM. The 6 bits per weight (6bpw) format is recommended for the best quality; quants above 6bpw don’t yield noticeable improvements, so 6bpw is the sweet spot between quality and memory use.
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.