Text embedding models keep getting better at representing and retrieving text, and gte-large-en-v1.5 is a recent example. Developed by the Institute for Intelligent Computing at Alibaba Group, this model supports long context lengths and reports state-of-the-art performance on benchmark evaluations.
Introduction
gte-large-en-v1.5 is an upgrade to the earlier gte embedding models. It can process contexts of up to 8192 tokens, which makes it well suited to text representation and retrieval tasks that involve long documents.
How to Get Started with the Model
To use the model, follow the code snippets below tailored for different programming environments:
Python with Transformers
# Requires transformers>=4.36.0
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer
input_texts = [
    "what is the capital of China?",
    "how to implement quick sort in python?",
    "Beijing",
    "sorting algorithms"
]
model_path = 'Alibaba-NLP/gte-large-en-v1.5'
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModel.from_pretrained(model_path, trust_remote_code=True)
# Tokenize the input texts
batch_dict = tokenizer(input_texts, max_length=8192, padding=True, truncation=True, return_tensors='pt')
outputs = model(**batch_dict)
# Use the hidden state of the first ([CLS]) token as the sentence embedding
embeddings = outputs.last_hidden_state[:, 0]
# (Optionally) normalize embeddings
embeddings = F.normalize(embeddings, p=2, dim=1)
scores = (embeddings[:1] @ embeddings[1:].T) * 100
print(scores.tolist())

Adjustments for Enhanced Performance
For better efficiency, it is recommended to install xformers and enable unpadding for acceleration. For detailed instructions, refer to the enable-unpadding-and-xformers guide.
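As a rough sketch of what that setup might look like, the helper below loads the model with unpadding and memory-efficient attention switched on and runs inference under mixed precision. The keyword arguments `unpad_inputs` and `use_memory_efficient_attention` are assumptions based on the model's remote-code implementation and should be checked against the guide; the function names `load_accelerated_model` and `embed` are hypothetical, and a CUDA GPU with xformers installed is assumed.

```python
def load_accelerated_model(model_path="Alibaba-NLP/gte-large-en-v1.5"):
    # Imports live inside the function so that defining it does not
    # require torch or transformers to be installed.
    import torch
    from transformers import AutoModel, AutoTokenizer

    device = torch.device("cuda")  # assumption: a CUDA GPU is available
    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModel.from_pretrained(
        model_path,
        trust_remote_code=True,
        unpad_inputs=True,                    # assumed flag name, see the guide
        use_memory_efficient_attention=True,  # assumed flag name, needs xformers
    ).to(device)
    return tokenizer, model, device


def embed(texts, tokenizer, model, device):
    import torch
    import torch.nn.functional as F

    batch = tokenizer(
        texts, max_length=8192, padding=True, truncation=True, return_tensors="pt"
    ).to(device)
    # Half precision plus inference mode keeps memory use and latency down.
    with torch.autocast(device_type=device.type, dtype=torch.float16):
        with torch.inference_mode():
            outputs = model(**batch)
    # Same pooling as the basic example: [CLS] hidden state, L2-normalized.
    return F.normalize(outputs.last_hidden_state[:, 0], p=2, dim=1)
```

Calling `tokenizer, model, device = load_accelerated_model()` followed by `embed(input_texts, tokenizer, model, device)` would then stand in for the plain loading and forward pass shown earlier.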
Using the Model with Sentence-Transformers
# Requires sentence_transformers>=2.7.0
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim
sentences = ['That is a happy person', 'That is a very happy person']
model = SentenceTransformer('Alibaba-NLP/gte-large-en-v1.5', trust_remote_code=True)
embeddings = model.encode(sentences)
print(cos_sim(embeddings[0], embeddings[1]))

Using the Model with Transformers.js
// npm i @xenova/transformers
import { pipeline, dot } from '@xenova/transformers';
// Create feature extraction pipeline
const extractor = await pipeline('feature-extraction', 'Alibaba-NLP/gte-large-en-v1.5', {
    quantized: false, // Comment out this line to use the quantized version
});
// Generate sentence embeddings
const sentences = [
    "what is the capital of China?",
    "how to implement quick sort in python?",
    "Beijing",
    "sorting algorithms"
];
const output = await extractor(sentences, { normalize: true, pooling: 'cls' });
// Compute similarity scores
const [source_embeddings, ...document_embeddings] = output.tolist();
const similarities = document_embeddings.map(x => 100 * dot(source_embeddings, x));
console.log(similarities); // [41.86, 77.07, 37.03]

Understanding the Code: An Analogy
Think of using the gte-large-en-v1.5 model as running a grand library filled with books (the data). Each text is first split into tokens (tokenization), and the model then turns it into an embedding, much like a catalogue entry that records what a book is about. When a question like “What is the capital of China?” is asked, the librarian (the similarity search) compares the question's own entry against the catalogue and promptly presents the closest matches without having to go through every single book in the library.
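To make the lookup step concrete, here is a minimal sketch in plain Python of what the scoring lines in the snippets do: L2-normalize each vector, take the dot product (which is then the cosine similarity), and scale by 100. The three-dimensional vectors are made-up stand-ins for real embeddings, and the helper names are illustrative only.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def scaled_cosine_scores(query, documents, scale=100.0):
    """Cosine similarity between the query and each document, times `scale`."""
    q = l2_normalize(query)
    scores = []
    for doc in documents:
        d = l2_normalize(doc)
        scores.append(scale * sum(a * b for a, b in zip(q, d)))
    return scores


def rank_documents(query, documents):
    """Document indices ordered from most to least similar to the query."""
    scores = scaled_cosine_scores(query, documents)
    return sorted(range(len(documents)), key=lambda i: scores[i], reverse=True)


# Toy vectors standing in for real embeddings (made-up numbers).
query_vec = [1.0, 0.0, 0.0]
doc_vecs = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 0.0],  # same direction, different length
    [1.0, 1.0, 0.0],  # 45 degrees away from the query
]

scores = scaled_cosine_scores(query_vec, doc_vecs)
ranking = rank_documents(query_vec, doc_vecs)
print([round(s, 2) for s in scores])  # [0.0, 100.0, 70.71]
print(ranking)                        # [1, 2, 0]
```

Because every vector is normalized first, the length of a document vector does not affect its score; only its direction relative to the query does.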
Troubleshooting
- If you encounter tokenization issues, make sure every input is a plain text string and that a batch is passed as a list of strings.
- For installation problems with transformers, verify that the intended Python environment is active and that the installed versions meet the minimums noted in the snippets (transformers>=4.36.0, sentence_transformers>=2.7.0).
- If you notice performance issues, consider enabling unpadding and using xformers for acceleration.
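A quick way to check the installed versions is a small standard-library script like the sketch below. The minimum versions are the ones stated in the snippet comments; the helper names are hypothetical, and the version parsing is deliberately simple (it ignores suffixes such as `rc1` or `.dev0`).

```python
from importlib import metadata

# Minimums taken from the "Requires ..." comments in the snippets.
MINIMUM_VERSIONS = {
    "transformers": "4.36.0",
    "sentence-transformers": "2.7.0",
}


def version_tuple(version):
    """Turn '4.36.0' into (4, 36, 0), keeping only leading digits of each part."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_versions(minimums):
    """Map each package name to a short status string."""
    report = {}
    for package, minimum in minimums.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            report[package] = "not installed"
            continue
        if version_tuple(installed) >= version_tuple(minimum):
            report[package] = installed + " (ok)"
        else:
            report[package] = installed + " (needs >= " + minimum + ")"
    return report


for package, status in check_versions(MINIMUM_VERSIONS).items():
    print(package + ": " + status)
```

Run it with the same interpreter you use for the model code, so that it inspects the environment that is actually in use.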
Conclusion
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.

