Welcome to the world of sentence similarity and text retrieval! In this blog, we will guide you through how to use the Stella-PL retrieval model, specifically fine-tuned for Polish information retrieval tasks. This powerful model utilizes encoders to process and understand text, making it a vital tool for your next project.
Understanding the Model
Imagine you’re exploring a vast library filled with books. In this analogy, the Stella-PL model serves as your expert librarian, efficiently locating books that best answer your inquiries. Working with a multilingual knowledge distillation method and a sizeable corpus, this model has been specially developed to understand Polish text and retrieve relevant passages effectively.
- First Step: The model was adapted for Polish using a multilingual knowledge distillation method with a diverse corpus of 20 million Polish-English text pairs.
- Second Step: Fine-tuning with a contrastive loss on a dataset of 1.4 million queries teaches the model which queries and passages belong together.
- Dimensionality: The encoder transforms texts into 1024-dimensional vectors, enabling efficient similarity search.
This is particularly useful for tasks such as semantic similarity and clustering, where understanding the nuances of language is crucial.
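Under the hood, retrieval boils down to comparing these vectors, most commonly with cosine similarity. Here is a minimal, model-free sketch of that comparison, using toy 3-dimensional vectors as stand-ins for the real 1024-dimensional embeddings:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: closer to 1.0 means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for a query embedding and two passage embeddings
query = [0.9, 0.1, 0.0]
passages = {
    "relevant": [0.8, 0.2, 0.1],
    "unrelated": [0.0, 0.1, 0.9],
}

for name, vec in passages.items():
    print(name, round(cosine_similarity(query, vec), 3))  # relevant scores higher
```

The retrieval model does exactly this, just in 1024 dimensions and over many passages at once.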
Getting Started with Usage
To get the best results, format your inputs with the instruction prefixes the model was trained with. Here's how to go about it:
- For retrieval, start your queries with: `Instruct: Given a web search query, retrieve relevant passages that answer the query.\nQuery: ` (note the `\n`, a literal newline).
- For symmetric tasks such as semantic similarity, prefix both texts with: `Instruct: Retrieve semantically similar text.\nQuery: `
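The `\n` inside these prefixes is a real newline character, not the two letters "n" and a missing backslash. A quick sketch of building a prefixed input (the Polish query string is just an example):

```python
RETRIEVAL_PREFIX = (
    "Instruct: Given a web search query, retrieve relevant passages "
    "that answer the query.\nQuery: "
)
SYMMETRIC_PREFIX = "Instruct: Retrieve semantically similar text.\nQuery: "

def prefix_query(text, prefix=RETRIEVAL_PREFIX):
    """Prepend the instruction prefix expected by the model."""
    return prefix + text

example = prefix_query("Jak dożyć 100 lat?")  # "How to live to 100?"
print(example.splitlines())  # the newline splits the input into two lines
```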
Here’s a practical snippet to help you get started:
```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

# Load the model; trust_remote_code is required for its custom model code
model = SentenceTransformer(
    'sdadas/stella-pl-retrieval',
    trust_remote_code=True,
    device='cuda',
    model_kwargs={'attn_implementation': 'flash_attention_2', 'trust_remote_code': True}
)
model.bfloat16()

# Queries carry the retrieval instruction prefix; passages are encoded as-is
query_prefix = "Instruct: Given a web search query, retrieve relevant passages that answer the query.\nQuery: "
queries = [query_prefix + "Jak dożyć 100 lat?"]  # "How to live to 100?"
answers = [
    "Trzeba zdrowo się odżywiać i uprawiać sport.",
    "Trzeba pić alkohol, imprezować i jeździć szybkimi autami.",
    "Gdy trwała kampania politycy zapewniali, że rozprawią się z zakazem niedzielnego handlu."
]

queries_emb = model.encode(queries, convert_to_tensor=True, show_progress_bar=False)
answers_emb = model.encode(answers, convert_to_tensor=True, show_progress_bar=False)

# Pick the answer with the highest cosine similarity to the query
best_answer = cos_sim(queries_emb, answers_emb).argmax().item()
print(answers[best_answer])
```
Troubleshooting Tips
If you encounter any issues while using the model, here are some troubleshooting steps to consider:
- Ensure you have set `trust_remote_code=True` when loading the model.
- For improved performance, enable Flash Attention 2 by setting `attn_implementation` to `flash_attention_2` in `model_kwargs`.
- If you face memory issues, try reducing the batch size or running on a different device.
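On the batch-size point: `model.encode` in sentence-transformers accepts a `batch_size` argument (e.g. `model.encode(texts, batch_size=16)`), which is usually all you need. If you want explicit control, a small chunking helper might look like this; `encode_fn` is a stand-in for any encoder, not part of the library:

```python
def encode_in_chunks(texts, encode_fn, chunk_size=8):
    """Encode texts in small batches to cap peak memory usage."""
    embeddings = []
    for i in range(0, len(texts), chunk_size):
        embeddings.extend(encode_fn(texts[i:i + chunk_size]))
    return embeddings

# Demo with a dummy encoder that just records text length as a 1-d "embedding"
dummy_encode = lambda batch: [[float(len(t))] for t in batch]
vecs = encode_in_chunks(["a", "bb", "ccc"], dummy_encode, chunk_size=2)
print(vecs)  # [[1.0], [2.0], [3.0]]
```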
For more insights, updates, or to collaborate on AI development projects, stay connected with **fxis.ai**.
Evaluation Results
The model achieves an NDCG@10 of 62.32 on the Polish Information Retrieval Benchmark (PIRB). For further reading, check the PIRB Leaderboard for detailed results.
Conclusion
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.