How to Use the MN-12B-Starcannon-v2 Exl2 Quantized Model

Aug 5, 2024 | Educational

Welcome to an exciting exploration of the MN-12B-Starcannon-v2 Exl2 quantized model. In this article, we will guide you through the process of downloading and running this model efficiently, while also addressing common issues you might encounter along the way.

Understanding the Quantized Model

Think of the MN-12B-Starcannon-v2 model as a well-designed vehicle: the engine is impressive, but how smoothly it runs depends on the load it carries. The Exl2 quantization works the same way, shrinking how much “weight” (data) each parameter carries so the model fits your hardware. The available branches offer different bits-per-weight (bpw) options, letting you pick the one that matches your VRAM limits.

  • main: Contains measurement files
  • 4bpw: 4 bits per weight
  • 5bpw: 5 bits per weight
  • 6bpw: 6 bits per weight (Recommended for quality)
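
As a rough back-of-envelope check, the VRAM needed just to hold the weights is parameters × bits-per-weight ÷ 8 bytes. The sketch below applies that formula to this 12B-parameter model; note it ignores KV cache and activation overhead, which add more on top in practice.

```python
# Rough VRAM estimate for the quantized weights only:
# parameters * bits_per_weight / 8 bits per byte.
# KV cache and activations are NOT included.
def weight_vram_gb(params: float, bpw: float) -> float:
    return params * bpw / 8 / 1e9

for bpw in (4, 5, 6):
    print(f"{bpw}bpw: ~{weight_vram_gb(12e9, bpw):.1f} GB")  # ~6.0, ~7.5, ~9.0 GB
```

At 6bpw the weights alone take roughly 9 GB, which is why that branch is the recommended sweet spot for quality on consumer GPUs.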

Downloading the Model

To get started, you’ll need to download the model using one of the two methods below:

Method 1: Using async-hf-downloader

This is a lightweight and asynchronous downloader specifically created for HuggingFace models.

./async-hf-downloader royallab/MN-12B-Starcannon-v2-exl2 -r 6bpw -p MN-12B-Starcannon-v2-exl2-6bpw

Method 2: Using HuggingFace Hub

For this method, make sure you have installed the huggingface_hub package using pip install huggingface_hub.

huggingface-cli download royallab/MN-12B-Starcannon-v2-exl2 --revision 6bpw --local-dir MN-12B-Starcannon-v2-exl2-6bpw
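
If you prefer scripting the download, the same functionality is available in Python through huggingface_hub’s snapshot_download. The helper below is a sketch: the function name download_model is our own, and huggingface_hub must already be installed for the call to succeed.

```python
def download_model(repo_id: str, revision: str, local_dir: str) -> str:
    """Download one quantization branch of a HuggingFace repo into local_dir."""
    # Imported inside the function so the helper can be defined
    # even before huggingface_hub is installed.
    from huggingface_hub import snapshot_download
    return snapshot_download(repo_id=repo_id, revision=revision, local_dir=local_dir)

if __name__ == "__main__":
    path = download_model(
        "royallab/MN-12B-Starcannon-v2-exl2",
        "6bpw",
        "MN-12B-Starcannon-v2-exl2-6bpw",
    )
    print("Model downloaded to", path)
```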

Running the Model in TabbyAPI

Once the model is downloaded, you’ll want to run it using TabbyAPI, an efficient FastAPI-based server built for serving ExLlamaV2 (Exl2) models.

Setting Up TabbyAPI

  1. Open config.yml inside TabbyAPI.
  2. Set model_name to MN-12B-Starcannon-v2-exl2-6bpw.
  3. Alternatively, you can start TabbyAPI with an argument:
    • --model_name MN-12B-Starcannon-v2-exl2-6bpw
    • Or use the /v1/model/load endpoint.
  4. Launch TabbyAPI by running the startup script:
    • start.bat (Windows) or ./start.sh (Linux/macOS)
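
For reference, the relevant part of config.yml typically looks like the excerpt below. Key names can vary between TabbyAPI versions, so treat this as a sketch and compare it against the sample config that ships with your TabbyAPI checkout.

```yaml
# Excerpt only -- key names may differ across TabbyAPI versions;
# check the sample config bundled with TabbyAPI.
model:
  # Directory containing downloaded model folders
  model_dir: models
  # Folder name of the quant you downloaded
  model_name: MN-12B-Starcannon-v2-exl2-6bpw
```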

Troubleshooting

If you encounter issues while setting up or running the model, here are some common troubleshooting steps:

  • Problem: Unable to download the model.
    Solution: Check that your internet connection is stable and that the downloader you chose (async-hf-downloader or huggingface_hub) is correctly installed.
  • Problem: Model not starting in TabbyAPI.
    Solution: Ensure that the model name in the configuration file exactly matches the downloaded folder name and that all required dependencies are installed.
  • Problem: Running out of VRAM.
    Solution: If you are running a quant above 6 bits per weight, switch to 6bpw; higher quants consume more VRAM while providing little additional quality.

For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.

Conclusion

At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.
