
Introduction
Dmeta-embedding is a cross-domain, out-of-the-box embedding model designed for applications such as search engines, question answering (QA), intelligent customer service, and more. It excels at embedding tasks and, at the time of writing, ranks second on the MTEB Chinese leaderboard.
How to Use Dmeta-embedding
The Dmeta-embedding model can be easily integrated into your projects using popular frameworks like Sentence-Transformers, Langchain, and Hugging Face Transformers. Here’s a step-by-step guide on how to implement it with these frameworks:
1. Using Sentence-Transformers
To load and perform inference using Dmeta-embedding via Sentence-Transformers, follow these steps:
pip install -U sentence-transformers
from sentence_transformers import SentenceTransformer

texts1 = ["example text 1", "example text 2"]
texts2 = ["another example text 1", "another example text 2", "another example text 3"]

# Load the model; normalized embeddings make the dot product below a cosine similarity
model = SentenceTransformer('DMetaSoul/dmeta-embedding')
embs1 = model.encode(texts1, normalize_embeddings=True)
embs2 = model.encode(texts2, normalize_embeddings=True)

# Pairwise similarity matrix between the two batches
similarity = embs1 @ embs2.T
print(similarity)

# For each text in texts1, rank the texts in texts2 by similarity
for i in range(len(texts1)):
    scores = []
    for j in range(len(texts2)):
        scores.append([texts2[j], similarity[i][j]])
    scores = sorted(scores, key=lambda x: x[1], reverse=True)
    print(f"{texts1[i]}: {scores}")
2. Using Langchain
To integrate with Langchain, install langchain along with sentence-transformers, which Langchain's HuggingFaceEmbeddings wrapper uses under the hood:
pip install -U langchain sentence-transformers
import torch
import numpy as np
from langchain.embeddings import HuggingFaceEmbeddings

model_name = 'DMetaSoul/dmeta-embedding'
model_kwargs = {"device": "cuda" if torch.cuda.is_available() else "cpu"}
encode_kwargs = {"normalize_embeddings": True}  # normalized vectors make the dot product a cosine similarity

model = HuggingFaceEmbeddings(model_name=model_name, model_kwargs=model_kwargs, encode_kwargs=encode_kwargs)

texts1 = ["example text 1", "example text 2"]
texts2 = ["another example text 1", "another example text 2", "another example text 3"]

# embed_documents returns plain Python lists; convert to NumPy arrays for matrix math
embs1 = model.embed_documents(texts1)
embs2 = model.embed_documents(texts2)
embs1, embs2 = np.array(embs1), np.array(embs2)

similarity = embs1 @ embs2.T
print(similarity)

# Rank texts2 by similarity for each entry in texts1
for i in range(len(texts1)):
    scores = []
    for j in range(len(texts2)):
        scores.append([texts2[j], similarity[i][j]])
    scores = sorted(scores, key=lambda x: x[1], reverse=True)
    print(f"{texts1[i]}: {scores}")
3. Using Hugging Face Transformers
This method gives you more fine-grained control over tokenization, pooling, and normalization:
pip install -U transformers torch
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

def mean_pooling(model_output, attention_mask):
    # Average the token embeddings, ignoring padding tokens via the attention mask
    token_embeddings = model_output[0]
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)

texts1 = ["example text 1", "example text 2"]
texts2 = ["another example text 1", "another example text 2", "another example text 3"]

tokenizer = AutoTokenizer.from_pretrained('DMetaSoul/dmeta-embedding')
model = AutoModel.from_pretrained('DMetaSoul/dmeta-embedding')
model.eval()

with torch.no_grad():
    inputs1 = tokenizer(texts1, padding=True, truncation=True, return_tensors='pt')
    inputs2 = tokenizer(texts2, padding=True, truncation=True, return_tensors='pt')
    model_output1 = model(**inputs1)
    model_output2 = model(**inputs2)
    embs1 = mean_pooling(model_output1, inputs1['attention_mask'])
    embs2 = mean_pooling(model_output2, inputs2['attention_mask'])
    # L2-normalize so the dot product below matches the cosine similarity of the previous examples
    embs1 = F.normalize(embs1, p=2, dim=1)
    embs2 = F.normalize(embs2, p=2, dim=1)

similarity = embs1 @ embs2.T
print(similarity)

# Rank texts2 by similarity for each entry in texts1
for i in range(len(texts1)):
    scores = []
    for j in range(len(texts2)):
        scores.append([texts2[j], similarity[i][j].item()])
    scores = sorted(scores, key=lambda x: x[1], reverse=True)
    print(f"{texts1[i]}: {scores}")
Understanding Dmeta-embedding
Dmeta-embedding is like a Swiss army knife for language tasks: one model that serves use cases ranging from search engines to QA systems. Think of preparing a buffet where each dish represents a different type of data. Rather than reaching for a separate tool for every dish, a Swiss army knife lets you switch between tools quickly and effectively. In the same way, Dmeta-embedding removes the complexity of juggling separate models for different data types and formats by offering one unified solution that adapts to your needs.
Troubleshooting
If you encounter issues while using Dmeta-embedding, consider the following troubleshooting tips:
- Ensure that your environment is properly set up with the necessary libraries installed (a quick version check is sketched after this list).
- Check for compatibility with the framework versions you are using (e.g., Sentence-Transformers, Langchain).
- Verify the inputs you’re feeding into the model—incorrect formats can lead to errors.
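As a starting point for the first two checks, the short snippet below (an illustrative sketch, not part of the official documentation) prints the installed versions of the libraries used in this guide so you can compare them against your target setup:

import importlib.metadata

# Print installed versions of the libraries used in the examples above
for package in ["sentence-transformers", "langchain", "transformers", "torch"]:
    try:
        print(f"{package}: {importlib.metadata.version(package)}")
    except importlib.metadata.PackageNotFoundError:
        print(f"{package}: not installed")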
If problems persist, feel free to reach out via the discussion forum or email support. For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.

