How to Use GTE-Qwen2-7B-Instruct for Text Embedding

Jun 30, 2024 | Educational

GTE-Qwen2-7B-Instruct is a text embedding model, part of a family of models designed for tasks such as retrieval, classification, and clustering. This guide shows how to use it in your projects.

Model Overview

GTE-Qwen2-7B-Instruct improves on its predecessor through bidirectional attention, which enriches contextual understanding, and through extensive training on diverse multilingual text datasets.

Getting Started

Requirements

To use this model, ensure you have the following libraries installed (the Sentence Transformers example below also needs the sentence-transformers package):

transformers>=4.39.2
flash_attn>=2.5.6
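
Before loading the model, it can help to confirm that the installed versions meet these minimums. The sketch below is one way to do that with the standard library only. The helper names (`parse_version`, `meets_minimum`, `check_requirements`) are illustrative and not part of any library, the simple parser assumes plain dotted numeric versions such as 4.39.2, and the package names in `REQUIREMENTS` are assumed to match how the packages are installed in your environment.

```python
from importlib import metadata

# Minimum versions from the list above. The distribution names are assumptions
# about how the packages are registered in your environment.
REQUIREMENTS = {"transformers": "4.39.2", "flash_attn": "2.5.6"}


def parse_version(text: str) -> tuple:
    """Turn '4.39.2' into (4, 39, 2), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True when the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def check_requirements(requirements: dict) -> dict:
    """Map each package name to 'ok', 'too old', or 'missing'."""
    report = {}
    for name, minimum in requirements.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if meets_minimum(installed, minimum) else "too old"
    return report


if __name__ == "__main__":
    for name, status in check_requirements(REQUIREMENTS).items():
        print(f"{name}: {status}")
```

A package reported as "missing" or "too old" is a likely cause of import or loading errors later in this guide.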

Setting Up Sentence Transformers

To start generating embeddings, load the model as follows:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Alibaba-NLP/gte-Qwen2-7B-instruct", trust_remote_code=True)
model.max_seq_length = 8192

queries = [
    "how much protein should a female eat",
    "summit define",
]

documents = [
    "As a general guideline, the CDC's average requirement of protein for women ages 19 to 70 is 46 grams per day...",
    "Definition of summit for English Language Learners."
]

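# Queries are encoded with the "query" prompt; documents are encoded without a prompt.
query_embeddings = model.encode(queries, prompt_name="query")
document_embeddings = model.encode(documents)
# Dot product of every query with every document, scaled by 100.
scores = (query_embeddings @ document_embeddings.T) * 100
print(scores.tolist())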

Understanding the Code

Imagine you are a librarian (the model) trying to find the best books (documents) for some questions (queries). You first need to read the questions, understand what they mean in the context of your library, and then evaluate how well each book responds to those questions. The code above reflects this process:

Load the model: Loading the model is like inviting a librarian into your library.
Define queries and documents: These are the readers' questions and the collection of books available.
Embedding creation: The model transforms queries and documents into numerical vectors (embeddings) that can be compared directly. Queries are encoded with the "query" prompt, while documents are encoded as-is.
Scoring: Finally, the dot product of each query embedding with each document embedding, scaled by 100, indicates how relevant each book is to each question.
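
To make the scoring step concrete, here is a small dependency-free sketch of the same arithmetic on made-up vectors. It assumes the embeddings are L2-normalized before the dot product, so each score is a cosine similarity scaled by 100. The function names and the toy vectors are illustrative only and are not produced by the model.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def score_matrix(query_vectors, document_vectors, scale=100.0):
    """Return scores[i][j] = scale * dot(normalized query i, normalized document j)."""
    queries = [l2_normalize(q) for q in query_vectors]
    documents = [l2_normalize(d) for d in document_vectors]
    return [
        [scale * sum(a * b for a, b in zip(q, d)) for d in documents]
        for q in queries
    ]


# Made-up 3-dimensional "embeddings"; real embeddings have far more dimensions.
toy_queries = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
toy_documents = [[2.0, 0.0, 0.0], [0.0, 3.0, 4.0]]

scores = score_matrix(toy_queries, toy_documents)
for row in scores:
    # Index of the highest-scoring document for this query.
    best = max(range(len(row)), key=row.__getitem__)
    print([round(s, 1) for s in row], "-> best document:", best)
```

Each row of the result belongs to one query and each column to one document, which is the same layout as `scores.tolist()` in the example above.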

Advanced Usage with Transformers

If you want finer control, such as writing the task instruction that is prepended to each query, you can use the Transformers library directly:

import torch
import torch.nn.functional as F
from torch import Tensor
from transformers import AutoTokenizer, AutoModel

def last_token_pool(last_hidden_states: Tensor, attention_mask: Tensor) -> Tensor:
    ...
    
def get_detailed_instruct(task_description: str, query: str) -> str:
    return f'Instruct: {task_description}\nQuery: {query}'

task = 'Given a web search query, retrieve relevant passages that answer the query'
queries = [
    get_detailed_instruct(task, 'how much protein should a female eat'),
    get_detailed_instruct(task, 'summit define')
]

documents = [ 
    "As a general guideline...",
    "Definition of summit for English Language Learners."
]

input_texts = queries + documents
tokenizer = AutoTokenizer.from_pretrained('Alibaba-NLP/gte-Qwen2-7B-instruct', trust_remote_code=True)
model = AutoModel.from_pretrained('Alibaba-NLP/gte-Qwen2-7B-instruct', trust_remote_code=True)
max_length = 8192

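# Tokenize queries and documents together, truncating to max_length tokens.
batch_dict = tokenizer(input_texts, max_length=max_length, padding=True, truncation=True, return_tensors='pt')
# Inference only, so skip gradient tracking to save memory.
with torch.no_grad():
    outputs = model(**batch_dict)
embeddings = last_token_pool(outputs.last_hidden_state, batch_dict['attention_mask'])
# L2-normalize so the dot product below is a cosine similarity.
embeddings = F.normalize(embeddings, p=2, dim=1)
# The first two rows are the queries; the remaining rows are the documents.
scores = (embeddings[:2] @ embeddings[2:].T) * 100
print(scores.tolist())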
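
The body of `last_token_pool` is elided in the example above. As a rough illustration of what last-token pooling generally does, the sketch below selects, for each sequence, the hidden state of the last non-padding token according to the attention mask. It deliberately uses plain Python lists instead of tensors so it runs without any dependencies. The name `last_token_pool_lists` is hypothetical, and this is not the helper from the model's own example.

```python
def last_token_pool_lists(last_hidden_states, attention_mask):
    """Pick the hidden state of the last real (non-padding) token per sequence.

    last_hidden_states: nested list shaped [batch][seq_len][hidden_dim]
    attention_mask:     nested list shaped [batch][seq_len], 1 = token, 0 = padding

    Assumes every sequence contains at least one real token.
    """
    pooled = []
    for states, mask in zip(last_hidden_states, attention_mask):
        # Position of the last 1 in the mask; works for left and right padding.
        last_index = max(i for i, flag in enumerate(mask) if flag == 1)
        pooled.append(states[last_index])
    return pooled


# Toy batch: two sequences of length 3 with 2-dimensional hidden states.
hidden = [
    [[0.1, 0.2], [0.3, 0.4], [0.0, 0.0]],  # right-padded, two real tokens
    [[0.0, 0.0], [0.5, 0.6], [0.7, 0.8]],  # left-padded, two real tokens
]
mask = [[1, 1, 0], [0, 1, 1]]

print(last_token_pool_lists(hidden, mask))  # prints [[0.3, 0.4], [0.7, 0.8]]
```

In the Transformers example, the same selection would be done with tensor indexing on `outputs.last_hidden_state` and `batch_dict['attention_mask']` rather than Python loops.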

Troubleshooting

Should you encounter issues during this process, here are some common troubleshooting tips:

Make sure the required libraries are installed and up to date.
If you receive errors related to resource limits, such as running out of memory, consider reducing max_seq_length (or max_length in the Transformers example).
For questions about usage or improvements, refer to the model’s documentation.
Check that the model name in the code matches the model's repository ID, and update your installed packages so the latest fixes are in place.

Too many queries at once? Encode them in smaller batches, and if you call the model through a hosted service, follow that service's rate limits.
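
If memory is the bottleneck, one option is to split the inputs into smaller batches and encode them one batch at a time. The sketch below shows the idea with a stand-in encoder so that it runs without downloading the model. `chunked`, `encode_in_batches`, and `fake_encode` are hypothetical helpers; with the real model you would pass something like `model.encode` as `encode_fn`. Note that `SentenceTransformer.encode` also accepts a `batch_size` argument, which may be all you need.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def encode_in_batches(texts, encode_fn, size=8):
    """Call `encode_fn` on each batch and collect the per-text results in order."""
    results = []
    for batch in chunked(texts, size):
        results.extend(encode_fn(batch))
    return results


def fake_encode(batch):
    """Stand-in encoder: returns a one-element 'embedding' per text."""
    return [[float(len(text))] for text in batch]


texts = [f"document {i}" for i in range(5)]
print(encode_in_batches(texts, fake_encode, size=2))
```

Smaller batches lower peak memory use at the cost of more calls, so the batch size is worth tuning for your hardware.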

Conclusion

The GTE-Qwen2-7B-Instruct model is a powerful option for text embeddings across retrieval, classification, and clustering tasks. With the setup and examples above, you can use it to enhance your AI projects.
