How to Use the UAE-Large-V1 Sentence Embedding Model

Jul 31, 2024 | Educational

Welcome to the world of AnglE, where we provide powerful sentence embeddings using the UAE-Large-V1 model. In this blog, we’ll guide you through how to effectively utilize this model for various tasks. Let’s dive in!

1. Installing the AnglE Package

Before you start, ensure you have the AnglE package (angle-emb) installed. You can do this easily by executing the following command:

python -m pip install -U angle-emb

2. Using UAE-Large-V1 for Different Tasks

There are two main types of tasks you can perform with UAE-Large-V1: Non-Retrieval Tasks and Retrieval Tasks. Let’s break them down:

A. Non-Retrieval Tasks

For this type of task, there’s no need for any prompts. Follow these steps:

  • Import the necessary modules.
  • Load the model with the specified pooling strategy.
  • Encode your sentences.
  • Compute the cosine similarity between the resulting vectors.

Here’s the code to help you set this up:


from angle_emb import AnglE
from angle_emb.utils import cosine_similarity

# Load the model with CLS pooling and move it to the GPU.
angle = AnglE.from_pretrained('WhereIsAI/UAE-Large-V1', pooling_strategy='cls').cuda()
doc_vecs = angle.encode([
    'The weather is great!',
    'The weather is very good!',
    'I am going to bed'
], normalize_embedding=True)

for i, dv1 in enumerate(doc_vecs):
    for dv2 in doc_vecs[i + 1:]:
        print(cosine_similarity(dv1, dv2))
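Under the hood, cosine similarity simply measures the angle between two vectors: the dot product divided by the product of their norms. A minimal NumPy sketch of the computation (independent of angle_emb, for illustration only):

```python
import numpy as np

def cosine(u, v):
    # Dot product of the two vectors divided by the product of their norms.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

a = np.array([1.0, 0.0])
b = np.array([1.0, 1.0])
print(cosine(a, a))            # identical vectors -> 1.0
print(round(cosine(a, b), 4))  # 45-degree angle -> 0.7071
```

Because the embeddings above are normalized, their cosine similarity reduces to a plain dot product.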

B. Retrieval Tasks

For retrieval tasks, you need to wrap each query in the retrieval prompt (Prompts.C); the documents themselves are encoded without a prompt. Here’s how:

  • Import the necessary modules.
  • Prepare your query with the appropriate prompt.
  • Encode your documents as shown below.

Here’s an example:


from angle_emb import AnglE, Prompts
from angle_emb.utils import cosine_similarity

angle = AnglE.from_pretrained('WhereIsAI/UAE-Large-V1', pooling_strategy='cls').cuda()
qv = angle.encode(Prompts.C.format(text='What is the weather?'))
doc_vecs = angle.encode([
    'The weather is great!',
    'It is rainy today.',
    'I am going to bed'
])

for dv in doc_vecs:
    print(cosine_similarity(qv[0], dv))
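In practice, you usually want the documents ranked by their similarity to the query rather than just printed. A short sketch of that step, using illustrative made-up scores in place of real model output:

```python
# Hypothetical similarity scores standing in for the cosine_similarity output above.
docs = ['The weather is great!', 'It is rainy today.', 'I am going to bed']
scores = [0.85, 0.62, 0.10]  # illustrative values, not real model output

# Pair each document with its score and sort best-first.
ranked = sorted(zip(docs, scores), key=lambda pair: pair[1], reverse=True)
for doc, score in ranked:
    print(f'{score:.2f}  {doc}')
```

The top-ranked document is then the best candidate answer for the query.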

3. Using Sentence Transformer’s Implementation

In addition to the angle_emb API, you can encode with the sentence-transformers library, computing similarity via SciPy’s cosine distance. Here’s how:


from scipy import spatial
from angle_emb import Prompts
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('WhereIsAI/UAE-Large-V1', device='cuda')
qv = model.encode(Prompts.C.format(text='What is the weather?'))
doc_vecs = model.encode([
    'The weather is great!',
    'It is rainy today.',
    'I am going to bed'
])

for dv in doc_vecs:
    print(1 - spatial.distance.cosine(qv, dv))

Troubleshooting Steps

If you encounter issues while using the UAE-Large-V1 model, here are some steps you can take:

  • Ensure that all required libraries are installed and updated.
  • Double-check your prompts for retrieval tasks to make sure they conform to the expected format.
  • If your model fails to load, verify that the name specified in the from_pretrained function is correct.
  • Check if your GPU is correctly configured and accessible.
  • Look for errors or warnings in your console, as they often provide hints about what went wrong.
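For the GPU check in particular, a quick PyTorch snippet (PyTorch is already a dependency of angle_emb) can confirm whether CUDA is visible:

```python
import torch

# True means the .cuda() calls in the examples above will work;
# False means you should drop .cuda() and run on CPU instead.
print(torch.cuda.is_available())
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```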

For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.

Final Thoughts

At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.
