Snowflake Arctic-embed-m-v1.5


News

07/18/2024: Release of snowflake-arctic-embed-m-v1.5, capable of producing highly compressible embedding vectors that preserve quality even when compressed to as little as 128 bytes per vector. Details about the development of this model are available in the launch post on the Snowflake engineering blog.

05/10/2024: Release of the technical report on Arctic Embed (View Report).

04/16/2024: Original release of the snowflake-arctic-embed family of text embedding models.

This Model

This model is an updated version of snowflake-arctic-embed-m (View Model), designed to improve embedding vector compressibility. It achieves slightly higher overall retrieval performance without compression and retains quality even down to 128-byte embedding vectors through a combination of Matryoshka Representation Learning (MRL) and uniform scalar quantization.

Model Performance:

  • Snowflake Arctic-embed-m-v1.5: 55.14 NDCG @ 10
  • Snowflake Arctic-embed-m: 54.91 NDCG @ 10

Compared to several other models trained with MRL, snowflake-arctic-embed-m-v1.5 retains more quality under truncation and delivers better scores on the MTEB Retrieval benchmark.

Usage

Using Sentence Transformers

Using the sentence-transformers package, follow the example below:

import torch
from sentence_transformers import SentenceTransformer
from torch.nn.functional import normalize

# Model constants
MODEL_ID = "Snowflake/snowflake-arctic-embed-m-v1.5"

# Your queries and docs
queries = ["what is snowflake?", "Where can I get the best tacos?"]
documents = ["The Data Cloud!", "Mexico City of Course!"]

# Load the model
model = SentenceTransformer(MODEL_ID, model_kwargs=dict(add_pooling_layer=False))

# Generate text embeddings
query_embeddings = model.encode(queries, prompt_name="query")
document_embeddings = model.encode(documents)

# Scores via dot product
scores = query_embeddings @ document_embeddings.T

# Pretty-print the results
for query, query_scores in zip(queries, scores):
    doc_score_pairs = list(zip(documents, query_scores))
    doc_score_pairs = sorted(doc_score_pairs, key=lambda x: x[1], reverse=True)
    print(f"Query: {query}")
    for document, score in doc_score_pairs:
        print(f"Score: {score:.4f} | Document: {document}")
    print()

Using Transformers.js

To use the Transformers.js library, install it via:

npm install @xenova/transformers

Then generate sentence embeddings with:

import { pipeline, dot } from '@xenova/transformers';

const extractor = await pipeline('feature-extraction', "Snowflake/snowflake-arctic-embed-m-v1.5", { quantized: false });
// More code follows ...

Compressing to 128 bytes

This model is designed to generate embeddings that compress well down to 128 bytes via a two-part compression scheme:

  1. Truncation and renormalization to 256 dimensions (MRL).
  2. 4-bit uniform scalar quantization of all values.
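The two steps above can be sketched in NumPy. This is a minimal illustration, not Snowflake's reference implementation: the clipping range used for quantization here is a hypothetical placeholder, and the real pipeline may calibrate it differently.

```python
import numpy as np

def compress_to_128_bytes(embedding: np.ndarray) -> np.ndarray:
    """Sketch of the two-part compression scheme.

    Assumes a full-size (e.g. 768-dim) unit-normalized embedding.
    The [-0.18, 0.18] clipping range is an assumption for illustration.
    """
    # 1. MRL: truncate to the first 256 dimensions and renormalize.
    truncated = embedding[:256]
    truncated = truncated / np.linalg.norm(truncated)

    # 2. 4-bit uniform scalar quantization: map each value to one of 16 levels.
    lo, hi = -0.18, 0.18  # hypothetical clipping range
    clipped = np.clip(truncated, lo, hi)
    codes = np.round((clipped - lo) / (hi - lo) * 15).astype(np.uint8)  # 0..15

    # Pack two 4-bit codes per byte: 256 dims x 4 bits = 1024 bits = 128 bytes.
    packed = (codes[0::2] << 4) | codes[1::2]
    return packed  # shape (128,), dtype uint8
```

Because 256 dimensions at 4 bits each total exactly 1024 bits, the packed representation lands at the advertised 128 bytes per vector.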

FAQ

TBD

Contact

Feel free to open an issue or pull request if you have any questions or suggestions about this project. You can also email Daniel Campos (daniel.campos@snowflake.com).

License

Arctic is licensed under Apache-2.0. The released models can be used for commercial purposes free of charge.

Acknowledgement

Thanks to the open-source community for providing great building blocks for our models. Special thanks to the modeling engineers and researchers who worked tirelessly to improve the performance of our models.
