In this guide, we will walk you through working with the WizardLM-2-8x22B model, focusing on its EXL2 quantization and perplexity scoring. Whether you are a beginner or an experienced developer, you should find the walkthrough approachable and practical.
What is WizardLM-2-8x22B?
WizardLM-2-8x22B is a large language model from Microsoft, built on the Mixtral 8x22B mixture-of-experts architecture and tuned for complex instruction-following and text generation tasks. The EXL2 builds quantize its weights to lower bit precision for the exllamav2 runtime, cutting VRAM requirements substantially at the cost of a small accuracy loss (quantified in the perplexity table below).
Getting Started with EXL2 Version
The quants were built in the EXL2 format with exllamav2 version 0.0.18. Older versions of the library may fail to load them, so if you hit compatibility issues, update your exllamav2 installation and, if you use it, your Text Generation WebUI to the latest version.
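If you have a pip-based install, a quick way to check and update is sketched below (the version pin simply mirrors the 0.0.18 requirement above):

# Show the currently installed exllamav2 version
pip show exllamav2 | grep Version

# Upgrade to a release that can load these quants
pip install --upgrade "exllamav2>=0.0.18"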
Perplexity Scoring Explained
Perplexity is a standard metric for evaluating language models: it is the exponential of the model's average negative log-likelihood per token on a test set, so it measures how well the model predicts the sample, and lower scores are better. Below is a table of the perplexity scores measured for the different quant levels of the EXL2 builds; a short worked example follows the table.
| Quant Level (bpw) | Perplexity Score |
| ----------------- | ---------------- |
| 7.0  | 4.5859 |
| 6.0  | 4.6252 |
| 5.5  | 4.6493 |
| 5.0  | 4.6937 |
| 4.5  | 4.8029 |
| 4.0  | 4.9372 |
| 3.5  | 5.1336 |
| 3.25 | 5.3636 |
| 3.0  | 5.5468 |
| 2.75 | 5.8255 |
| 2.5  | 6.3362 |
| 2.25 | 7.7763 |
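To make the metric concrete, here is a minimal sketch (Python invoked from bash; the log-probabilities are made-up illustrative numbers, not real model output) showing that perplexity is simply the exponential of the average negative log-likelihood per token:

#!/bin/bash
# Illustrative only: perplexity from a handful of hypothetical token log-probs
python3 - <<'EOF'
import math

# Hypothetical per-token log-probabilities (natural log), not model output
log_probs = [-1.2, -0.8, -2.1, -1.5, -1.9]

avg_nll = -sum(log_probs) / len(log_probs)  # average negative log-likelihood
perplexity = math.exp(avg_nll)              # lower is better

print(f"avg NLL = {avg_nll:.4f}, perplexity = {perplexity:.4f}")
EOF

This prints a perplexity of about 4.48: a model that assigns higher probability to the actual next tokens has a lower average negative log-likelihood, and therefore a lower perplexity.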
Running the Perplexity Test
To evaluate the model's perplexity at each quant level, you can use the following bash script. It downloads each quant from Hugging Face if it is not already present, runs exllamav2's test_inference.py evaluation against the wikitext-2 data set, and prints a markdown table row for each result.
#!/bin/bash
# Activate the conda environment
source ~/miniconda3/etc/profile.d/conda.sh
conda activate exllamav2

# Evaluation data set (wikitext-2)
DATA_SET=/root/wikitext/wikitext-2-v1.parquet

# Model name and the quant levels (bits per weight) to test
MODEL_NAME=WizardLM-2-8x22B
BIT_PRECISIONS=(6.0 5.5 5.0 4.5 4.0 3.5 3.25 3.0 2.75 2.5 2.25)

# Print the markdown table header
echo "| Quant Level (bpw) | Perplexity Score |"
echo "| ----------------- | ---------------- |"

for BIT_PRECISION in "${BIT_PRECISIONS[@]}"
do
  LOCAL_FOLDER=/root/models/$MODEL_NAME/exl2_${BIT_PRECISION}bpw
  REMOTE_FOLDER=Dracones/${MODEL_NAME}_exl2_${BIT_PRECISION}bpw

  # Download the quant from Hugging Face if it is not already on disk
  if [ ! -d "$LOCAL_FOLDER" ]; then
    huggingface-cli download --local-dir-use-symlinks=False --local-dir "$LOCAL_FOLDER" "$REMOTE_FOLDER" > /root/download.log 2>&1
  fi

  # Evaluate perplexity; -gs splits the model across four GPUs (~40 GB each)
  output=$(python test_inference.py -m "$LOCAL_FOLDER" -gs 40,40,40,40 -ed "$DATA_SET")

  # Extract the numeric score from the evaluation output
  score=$(echo "$output" | grep -oP "Evaluation perplexity: \K[\d.]+")
  echo "| $BIT_PRECISION | $score |"
done
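To run it, save the script, make it executable, and execute it (the filename here is just an example):

# Save the script above as perplexity_test.sh (any name works), then:
chmod +x perplexity_test.sh
./perplexity_test.sh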
Understanding the Perplexity Script
Imagine you’re a chef who prepares a different dish every day. The perplexity script is like your recipe book: it lists the ingredients (the data set and model name) and the steps for each dish (downloading, evaluating, and reporting the score). Each BIT_PRECISION is akin to adjusting a recipe; by changing the value, you trade richness (model quality) against cost (VRAM), and the perplexity score tells you how the dish turned out.
Quantization Process
To create the EXL2 quants of the WizardLM-2-8x22B model yourself, you can use a bash script like the one below. It first runs convert.py in measurement mode to produce a calibration file, then reuses that file to build a quant for each entry in BIT_PRECISIONS. Here’s how you can do that:
#!/bin/bash
# Activate the conda environment
source ~/miniconda3/etc/profile.d/conda.sh
conda activate exllamav2

# Set the model name
MODEL_NAME=WizardLM-2-8x22B

# Define variables
MODEL_DIR=/mnt/storage/models/$MODEL_NAME   # unquantized source model
OUTPUT_DIR=exl2_$MODEL_NAME                 # scratch directory for convert.py
MEASUREMENT_FILE=measurements/$MODEL_NAME.json

# Create the measurement (calibration) file if needed
if [ ! -f "$MEASUREMENT_FILE" ]; then
    echo "Creating $MEASUREMENT_FILE"
    # Start from a clean scratch directory
    if [ -d "$OUTPUT_DIR" ]; then
        rm -r "$OUTPUT_DIR"
    fi
    mkdir "$OUTPUT_DIR"
    python convert.py -i "$MODEL_DIR" -o "$OUTPUT_DIR" -nr -om "$MEASUREMENT_FILE"
fi

# Choose one of the below: either a single quant for testing or a batch of them.
# BIT_PRECISIONS=(4.0)
BIT_PRECISIONS=(5.0 4.5 4.0 3.5 3.0 2.75 2.5 2.25)

for BIT_PRECISION in "${BIT_PRECISIONS[@]}"
do
    CONVERTED_FOLDER=models/$MODEL_NAME/exl2_${BIT_PRECISION}bpw

    # If it doesn't already exist, make the quant
    if [ ! -d "$CONVERTED_FOLDER" ]; then
        echo "Creating $CONVERTED_FOLDER"
        # Start from a clean scratch directory
        if [ -d "$OUTPUT_DIR" ]; then
            rm -r "$OUTPUT_DIR"
        fi
        mkdir "$OUTPUT_DIR"
        mkdir -p "$CONVERTED_FOLDER"
        # Reuse the measurement file so each quant skips the calibration pass
        python convert.py -i "$MODEL_DIR" -o "$OUTPUT_DIR" -nr -m "$MEASUREMENT_FILE" -b "$BIT_PRECISION" -cf "$CONVERTED_FOLDER"
    fi
done
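Once a quant has been created, a quick sanity check is to generate a short completion from it with exllamav2's test_inference.py. A minimal sketch (the prompt is arbitrary, and the path assumes the 4.0 bpw quant produced by the script above):

# Generate a short completion from the finished quant
python test_inference.py -m models/WizardLM-2-8x22B/exl2_4.0bpw -p "The capital of France is"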
Common Troubleshooting Tips
If you encounter issues while using the WizardLM-2-8x22B model, consider the following troubleshooting steps:
- Ensure that your exllamav2 package is up-to-date.
- Double-check the paths used in your scripts to ensure they point to the correct directories.
- Confirm that the model files have been downloaded properly without any interruptions (a quick check is sketched after this list).
- Review any error messages carefully, as they often provide clues to what went wrong.
- For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
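For the download check above, a couple of quick commands (a minimal sketch; the quant path is an example matching the perplexity script's layout):

# List the quant's weight files; missing or undersized .safetensors shards
# usually indicate an interrupted download
ls -lh /root/models/WizardLM-2-8x22B/exl2_5.0bpw

# Total size on disk, for comparison against the size listed on Hugging Face
du -sh /root/models/WizardLM-2-8x22B/exl2_5.0bpw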
Conclusion
By following this guide, you should now be equipped to work effectively with the WizardLM-2-8x22B model and understand its perplexity scoring system. At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.
