Rerankers are cross-encoder models that take a query and a document together and output a relevance score directly. Unlike embedding models, which encode each text into a vector independently and leave relevance to a separate distance computation, a reranker scores the query-document pair as a whole, so the score can be read directly as a measure of how relevant the document is to the query. In this guide, we'll explore how to use the reranker models in FlagEmbedding effectively.
Getting Started with FlagEmbedding
To get started with FlagEmbedding, first make sure it is installed in your environment:
pip install -U FlagEmbedding
Understanding Rerankers
Rerankers can be likened to expert judges in a literary contest. Imagine you have a panel of judges (rerankers) who read the same submissions (documents) and answer a question (query) about which pieces are most relevant. Each judge scores the submissions based on their understanding of the query. Similarly, a reranker outputs a score indicating how relevant a text is to the input query.
Using the Reranker
After installation, you can compute relevance scores with just a few lines of code. The snippets below walk through the main operations the rerankers support.
Standard Reranker Usage
This section demonstrates how to use a standard reranker, taking `bge-reranker-v2-m3` as an example:
from FlagEmbedding import FlagReranker

# Enable fp16 for faster computation at a small precision cost
reranker = FlagReranker("BAAI/bge-reranker-v2-m3", use_fp16=True)

query = "What is a panda?"
passage = "The giant panda is a bear species endemic to China."

# Returns a raw relevance logit: higher means more relevant
score = reranker.compute_score([query, passage])
print(score)
Mapping Scores
If you want scores normalized between 0 and 1, set the `normalize` parameter to `True`, which applies a sigmoid to the raw score:
score = reranker.compute_score([query, passage], normalize=True)
print(score)
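Under the hood, this normalization is simply a sigmoid over the raw logit. A minimal sketch of the equivalent manual computation, assuming `compute_score` returns a single float for one pair as in the snippet above:

import math

raw_score = reranker.compute_score([query, passage])  # raw relevance logit
normalized = 1 / (1 + math.exp(-raw_score))           # sigmoid maps it into (0, 1)
print(normalized)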
Batch Processing
Rerankers can also handle multiple pairs at once. Here’s how to do that:
scores = reranker.compute_score([[query1, passage1], [query2, passage2]], normalize=True)
print(scores)
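A common pattern is to score one query against many candidate passages and sort by the result. A short sketch, reusing the `reranker` created earlier (the query and passages are illustrative):

query = "What is a panda?"
passages = [
    "The giant panda is a bear species endemic to China.",
    "Paris is the capital and largest city of France.",
]

# Score every (query, passage) pair in one batch
pairs = [[query, p] for p in passages]
scores = reranker.compute_score(pairs, normalize=True)

# Sort candidates by descending relevance
ranked = sorted(zip(scores, passages), key=lambda x: x[0], reverse=True)
for score, p in ranked:
    print(f"{score:.4f}  {p}")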
Fine-tuning the Reranker
Fine-tuning adapts the reranker to your own domain data. Training data should be a JSONL file, where each line is a JSON object structured as follows:
{
  "query": "your query",
  "pos": ["relevant text 1", "relevant text 2"],
  "neg": ["irrelevant text 1", "irrelevant text 2"],
  "prompt": "explain the relationship"
}
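If your examples live in Python objects, a short sketch for serializing them into this JSONL layout (the file name matches the command below; the example content is illustrative):

import json

examples = [
    {
        "query": "what is a panda?",
        "pos": ["The giant panda is a bear species endemic to China."],
        "neg": ["Paris is the capital and largest city of France."],
        "prompt": "explain the relationship",
    },
]

# One JSON object per line (JSONL), as the fine-tuning script expects
with open("toy_finetune_data.jsonl", "w", encoding="utf-8") as f:
    for ex in examples:
        f.write(json.dumps(ex, ensure_ascii=False) + "\n")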
To fine-tune the model, execute the following command in your shell:
torchrun --nproc_per_node number_of_gpus \
  -m FlagEmbedding.llm_reranker.finetune_for_instruction.run \
  --output_dir path_to_save_model \
  --model_name_or_path google/gemma-2b \
  --train_data toy_finetune_data.jsonl \
  --learning_rate 2e-4 \
  --num_train_epochs 1 \
  --per_device_train_batch_size 1
Troubleshooting
If you encounter issues during installation or execution, here are some troubleshooting tips:
- Ensure Dependencies: Make sure all dependencies are correctly installed by checking the package versions (a quick environment check is sketched after this list).
- Python Version: Verify that you are using a compatible version of Python as some libraries may require specific versions.
- Resources: Ensure your system has sufficient memory and processing power to run the models, particularly for LLM-based rerankers.
- Error Handling: If you encounter errors regarding model loading, ensure the model name is correctly specified and is available on Hugging Face’s Model Hub.
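As a quick sanity check for the first three tips, this sketch prints the versions and GPU availability in your environment (it assumes torch is installed, which FlagEmbedding depends on):

import sys
from importlib.metadata import version

import torch

print("Python:        ", sys.version.split()[0])
print("FlagEmbedding: ", version("FlagEmbedding"))
print("torch:         ", torch.__version__)
print("CUDA available:", torch.cuda.is_available())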
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Conclusion
Utilizing rerankers effectively can vastly improve search relevance in applications ranging from search engines to chatbots. By following the instructions in this guide, you should be well-equipped to implement and fine-tune rerankers using FlagEmbedding.
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.

