Stella-PL Retrieval

Oct 28, 2024 | Educational

Welcome to the world of sentence similarity and text retrieval! In this blog, we will guide you through how to use the Stella-PL retrieval model, fine-tuned specifically for Polish information retrieval tasks. The model uses a transformer encoder to turn text into dense vectors, making it a practical tool for Polish search and similarity projects.

Understanding the Model

Imagine you’re exploring a vast library filled with books. In this analogy, the Stella-PL model serves as your expert librarian, efficiently locating books that best answer your inquiries. Working with a multilingual knowledge distillation method and a sizeable corpus, this model has been specially developed to understand Polish text and retrieve relevant passages effectively.

  • First Step: The model was adapted for Polish using a multilingual knowledge distillation method with a diverse corpus of 20 million Polish-English text pairs.
  • Second Step: Fine-tuning with a contrastive loss on a dataset of 1.4 million queries teaches the model to pull relevant query-passage pairs together and push irrelevant ones apart.
  • Dimensionality: The encoder maps each text to a 1024-dimensional vector, enabling efficient similarity search.

This is particularly useful for tasks such as semantic similarity and clustering, where understanding the nuances of language is crucial.
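
Since the embedding size is fixed at 1024, you can verify it directly; sentence-transformers exposes it through its API. Here is a minimal sketch, assuming the Hugging Face model identifier used later in this post (the example sentence is illustrative):

from sentence_transformers import SentenceTransformer

# trust_remote_code=True is required because the model ships custom encoder code
model = SentenceTransformer('sdadas/stella-pl-retrieval', trust_remote_code=True)

# Each input text is encoded into a single dense vector
embedding = model.encode("Przykładowe zdanie.")  # "An example sentence."
print(embedding.shape)                           # (1024,)
print(model.get_sentence_embedding_dimension())  # 1024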

Getting Started with Usage

To get the best results from the model, prefix your inputs with task-specific instruction strings. Here's how:

  • For retrieval, start your queries with: Instruct: Given a web search query, retrieve relevant passages that answer the query.\nQuery: 
  • For symmetric tasks such as semantic similarity, prefix both texts with: Instruct: Retrieve semantically similar text.\nQuery: 

Here’s a practical snippet to help you get started:

from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

model = SentenceTransformer(
    'sdadas/stella-pl-retrieval',
    trust_remote_code=True,
    device='cuda',
    model_kwargs={'attn_implementation': 'flash_attention_2', 'trust_remote_code': True}
)

# Cast the model to bfloat16 (recommended when using Flash Attention 2)
model.bfloat16()

query_prefix = "Instruct: Given a web search query, retrieve relevant passages that answer the query.\nQuery: "
queries = [query_prefix + "Jak dożyć 100 lat?"]  # "How to live to be 100?"
answers = [
    "Trzeba zdrowo się odżywiać i uprawiać sport.",  # "You need to eat healthily and do sports."
    "Trzeba pić alkohol, imprezować i jeździć szybkimi autami.",  # "You need to drink alcohol, party, and drive fast cars."
    "Gdy trwała kampania politycy zapewniali, że rozprawią się z zakazem niedzielnego handlu."  # "During the campaign, politicians promised to tackle the Sunday trading ban."
]

queries_emb = model.encode(queries, convert_to_tensor=True, show_progress_bar=False)
answers_emb = model.encode(answers, convert_to_tensor=True, show_progress_bar=False)
# Select the answer with the highest cosine similarity to the query
best_answer = cos_sim(queries_emb, answers_emb).argmax().item()
print(answers[best_answer])
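
The snippet above covers retrieval. For symmetric semantic-similarity tasks, both texts receive the same prefix. A short sketch reusing the model loaded above (the second sentence is an illustrative paraphrase added for this example):

sim_prefix = "Instruct: Retrieve semantically similar text.\nQuery: "
texts = [
    sim_prefix + "Trzeba zdrowo się odżywiać i uprawiać sport.",  # "You need to eat healthily and do sports."
    sim_prefix + "Zdrowa dieta i sport pomagają zachować formę."  # "A healthy diet and sport help you stay in shape." (illustrative)
]
emb = model.encode(texts, convert_to_tensor=True, show_progress_bar=False)
print(cos_sim(emb[0], emb[1]).item())  # higher values mean more similar texts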

Troubleshooting Tips

If you encounter any issues while using the model, here are some troubleshooting steps to consider:

  • Ensure you have set trust_remote_code=True when loading the model.
  • For improved performance, enable Flash Attention 2 by setting attn_implementation='flash_attention_2' in model_kwargs (this requires the flash-attn package and a compatible GPU).
  • If you face memory issues, reduce the batch size passed to encode or run on a device with more memory, as in the sketch below.
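
For the memory tip above, a rough sketch: batch_size is a standard parameter of encode, and 8 is just an example value to tune for your hardware.

# Smaller batches lower peak GPU memory at some cost in throughput
answers_emb = model.encode(
    answers,
    batch_size=8,  # example value; adjust for your GPU
    convert_to_tensor=True,
    show_progress_bar=False
)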

For more insights, updates, or to collaborate on AI development projects, stay connected with **fxis.ai**.

Evaluation Results

On the Polish Information Retrieval Benchmark (PIRB), the model achieves an NDCG@10 of 62.32. For further reading, check the PIRB Leaderboard for detailed results.
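
If NDCG@10 is unfamiliar: it rewards rankings that place highly relevant passages near the top of the first ten results. A minimal sketch of one common formulation (linear gain; benchmark implementations may differ in details):

import math

def dcg(rels):
    # Discounted cumulative gain: relevance discounted by log2 of the rank
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(rels))

def ndcg_at_10(ranked_rels, all_rels):
    # ranked_rels: relevance grades of the system's results, in ranked order
    # all_rels: relevance grades of all judged documents for the query
    ideal = dcg(sorted(all_rels, reverse=True)[:10])
    return dcg(ranked_rels[:10]) / ideal if ideal > 0 else 0.0

print(ndcg_at_10([1, 0, 1], [1, 1, 0]))  # imperfect ranking -> value below 1.0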

Conclusion

At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.
