LLM2Vec is a simple recipe for turning decoder-only large language models into powerful text encoders. This guide walks you through installing it, using it, and troubleshooting common issues along the way.
What You Need to Get Started
- Python: a working Python 3 installation (see the quick check after this list).
- pip: the Python package manager, used to install the library.
- GPU access (optional but recommended): an 8B-parameter model runs slowly on CPU, so a GPU speeds up both loading and encoding considerably.
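Before moving on, you can sanity-check your interpreter with a few lines like these. This is only a sketch: the 3.8 minimum used here is an assumption, so verify the exact requirement on the LLM2Vec repository.

```python
import sys

# Assumed minimum version -- verify against the LLM2Vec project page
MIN_PYTHON = (3, 8)

if sys.version_info < MIN_PYTHON:
    raise RuntimeError(
        f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ recommended, "
        f"found {sys.version.split()[0]}"
    )
print(f"Python {sys.version.split()[0]} looks good.")
```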
Installation Steps
LLM2Vec is distributed on PyPI, so installation is a single pip command:
```bash
pip install llm2vec
```
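Once the install finishes, a quick import check confirms that the package and its core dependency resolve correctly (a minimal sanity check, nothing LLM2Vec-specific beyond the import itself):

```python
# Confirm that llm2vec and torch import cleanly after installation
import torch
import llm2vec

print("llm2vec imported successfully; torch", torch.__version__)
```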
Usage Walkthrough
Now that everything is set up, let’s dive into using LLM2Vec. Think of the workflow as making a sandwich: you layer the components in the right order to get the final product.
1. Import the Necessary Libraries
First, you need to bring in the required libraries:
```python
from llm2vec import LLM2Vec
import torch
from transformers import AutoTokenizer, AutoModel, AutoConfig
from peft import PeftModel
```
2. Load the Model
Next, imagine you’re assembling your sandwich layers:
- Start with the tokenizer as the base.
- Layer on the configuration for the model.
- Add the model itself.
- Finally, introduce additional components like LoRA weights.
```python
# Load the base Llama-3 model, along with the custom code that enables
# bidirectional attention in decoder-only LLMs
tokenizer = AutoTokenizer.from_pretrained("McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp")
config = AutoConfig.from_pretrained(
    "McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp", trust_remote_code=True
)
model = AutoModel.from_pretrained(
    "McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp",
    trust_remote_code=True,
    config=config,
    torch_dtype=torch.bfloat16,
    device_map="cuda" if torch.cuda.is_available() else "cpu",
)

# Load the MNTP LoRA weights and merge them into the base model
model = PeftModel.from_pretrained(model, "McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp")
model = model.merge_and_unload()
```
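If you want embeddings tuned for retrieval rather than the raw MNTP model, the LLM2Vec authors also publish supervised LoRA weights that load the same way. A sketch, assuming the companion `mntp-supervised` checkpoint from the same model family:

```python
# Optional: layer supervised (contrastive-trained) LoRA weights on top of the
# merged MNTP model. Checkpoint name assumed from the LLM2Vec model family.
model = PeftModel.from_pretrained(
    model, "McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp-supervised"
)
```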
3. Encode Queries and Documents
With the model loaded, you can start encoding text. Use the LLM2Vec wrapper, which handles tokenization and pooling for both queries and documents.
```python
# Wrapper for encoding and pooling operations
l2v = LLM2Vec(model, tokenizer, pooling_mode="mean", max_length=512)

# Encoding queries: each query is paired with a task instruction
instruction = "Given a web search query, retrieve relevant passages that answer the query:"
queries = [
    [instruction, "how much protein should a female eat"],
    [instruction, "summit define"],
]
q_reps = l2v.encode(queries)

# Encoding documents; instructions are not needed on the document side
documents = [
    "As a general guideline, the CDC's average requirement of protein for women ages 19 to 70 is 46 grams per day. But, as you can see from this chart, you'll need to increase that if you're expecting or training for a marathon.",
    "Definition of summit for English Language Learners: 1 the highest point of a mountain : the top of a mountain. : 2 the highest level. : 3 a meeting or series of meetings between the leaders of two or more governments."
]
d_reps = l2v.encode(documents)
```
4. Compute Cosine Similarity
The final step is to compute the cosine similarity between your encoded queries and documents, which tells you how closely each query matches each document:
```python
# Compute cosine similarity between every query and every document
q_reps_norm = torch.nn.functional.normalize(q_reps, p=2, dim=1)
d_reps_norm = torch.nn.functional.normalize(d_reps, p=2, dim=1)
cos_sim = torch.mm(q_reps_norm, d_reps_norm.transpose(0, 1))
print(cos_sim)
```
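Each row of `cos_sim` corresponds to a query and each column to a document, so ranking documents for a query is just a matter of sorting that row. A small illustrative follow-up:

```python
# For each query, pick the document with the highest cosine similarity
best = cos_sim.argmax(dim=1)
for i, doc_idx in enumerate(best.tolist()):
    print(f"Query {i} best matches document {doc_idx} "
          f"(score: {cos_sim[i, doc_idx].item():.3f})")
```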
Troubleshooting
Encountering issues? Here are some potential fixes:
- Ensure that all library imports are correct and installed properly.
- Check if your GPU is recognized by PyTorch by running:
```python
print(torch.cuda.is_available())
```
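For a slightly fuller picture of your setup, something like this prints the versions and device PyTorch sees (a minimal sketch using standard PyTorch calls):

```python
import torch

print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    # Name of the first visible GPU
    print("GPU:", torch.cuda.get_device_name(0))
```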