BEIR – Benchmarking IR

Nov 9, 2020 | Data Science


What is it?

**BEIR** is a **heterogeneous benchmark** containing diverse Information Retrieval (IR) tasks. It also provides a **common and easy framework** for evaluating your NLP-based retrieval models within the benchmark. For **an overview**, check out our **[new wiki](https://github.com/beir-cellar/beir/wiki)** page. For **models and datasets**, see our **[Hugging Face (HF)](https://huggingface.co/BeIR)** page. For more information, check out our publications:

BEIR: A Heterogeneous Benchmark for Zero-shot Evaluation of Information Retrieval Models (NeurIPS 2021, Datasets and Benchmarks Track)
Resources for Brewing BEIR: Reproducible Reference Models and an Official Leaderboard (arXiv 2023)

Installation

To get started, install BEIR with pip:

python -m pip install beir

If you want to build from the source, follow these steps:

bash
$ git clone https://github.com/beir-cellar/beir.git
$ cd beir
$ pip install -e .

BEIR has been tested with Python versions 3.6 and 3.7.

Features

Preprocess your own IR dataset or use one of the already-preprocessed 17 benchmark datasets.
Covers a wide range of settings, with diverse benchmarks useful for both academia and industry.
Includes well-known retrieval architectures (lexical, dense, sparse, and reranking-based).
Add and evaluate your own model in an easy framework using different state-of-the-art evaluation metrics.
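To make "preprocess your own IR dataset" more concrete, here is a minimal sketch of writing a toy dataset to disk. It assumes the file layout described in the BEIR wiki: a `corpus.jsonl` file, a `queries.jsonl` file, and a tab-separated `qrels/<split>.tsv` file. The helper name `write_beir_dataset` and the toy documents are hypothetical, introduced only for illustration, and the sketch uses only the standard library.

```python
import csv
import json
import tempfile
from pathlib import Path


def write_beir_dataset(root, corpus, queries, qrels, split="test"):
    """Hypothetical helper: write corpus.jsonl, queries.jsonl and qrels/<split>.tsv."""
    root = Path(root)
    (root / "qrels").mkdir(parents=True, exist_ok=True)

    # One JSON object per line: document id, title, and body text.
    with open(root / "corpus.jsonl", "w", encoding="utf-8") as f:
        for doc_id, doc in corpus.items():
            record = {"_id": doc_id, "title": doc.get("title", ""), "text": doc["text"]}
            f.write(json.dumps(record) + "\n")

    # One JSON object per line: query id and query text.
    with open(root / "queries.jsonl", "w", encoding="utf-8") as f:
        for query_id, text in queries.items():
            f.write(json.dumps({"_id": query_id, "text": text}) + "\n")

    # Tab-separated relevance judgements with a header row.
    with open(root / "qrels" / f"{split}.tsv", "w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerow(["query-id", "corpus-id", "score"])
        for query_id, judged in qrels.items():
            for doc_id, score in judged.items():
                writer.writerow([query_id, doc_id, score])
    return root


# Made-up toy data, for illustration only.
toy_corpus = {
    "doc1": {"title": "Pizza dough", "text": "Dough needs flour, water, yeast and salt."},
    "doc2": {"title": "Tomato sauce", "text": "Simmer tomatoes with garlic and basil."},
}
toy_queries = {"q1": "what goes into pizza dough"}
toy_qrels = {"q1": {"doc1": 1}}

dataset_dir = write_beir_dataset(tempfile.mkdtemp(), toy_corpus, toy_queries, toy_qrels)
print(sorted(p.relative_to(dataset_dir).as_posix() for p in dataset_dir.rglob("*") if p.is_file()))
```

If the layout assumption holds, a folder written this way should be loadable with the same `GenericDataLoader(data_folder=...).load(split="test")` call used in the Quick Example.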

Quick Example

To understand how to use BEIR, here’s a straightforward example. Think of your retrieval model as a restaurant: each customer order (a query) must be matched with the right ingredients from the pantry (the relevant documents in the corpus). The snippet below downloads the SciFact dataset, retrieves documents with a dense model, and evaluates the results:

python
from beir import util, LoggingHandler
from beir.retrieval import models
from beir.datasets.data_loader import GenericDataLoader
from beir.retrieval.evaluation import EvaluateRetrieval
from beir.retrieval.search.dense import DenseRetrievalExactSearch as DRES
import logging
import pathlib, os

# Just some code to print debug information to stdout
logging.basicConfig(format='%(asctime)s - %(message)s',
                    datefmt='%Y-%m-%d %H:%M:%S',
                    level=logging.INFO,
                    handlers=[LoggingHandler()])

# Download scifact.zip dataset and unzip the dataset
dataset = "scifact"
url = f"https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/{dataset}.zip"
out_dir = os.path.join(pathlib.Path(__file__).parent.absolute(), "datasets")
data_path = util.download_and_unzip(url, out_dir)

# Load the dataset
corpus, queries, qrels = GenericDataLoader(data_folder=data_path).load(split='test')

# Load the SBERT model and retrieve using dot-product similarity
model = DRES(models.SentenceBERT("msmarco-distilbert-base-tas-b"), batch_size=16)
retriever = EvaluateRetrieval(model, score_function="dot")  # Use 'cos_sim' for cosine similarity
results = retriever.retrieve(corpus, queries)

# Evaluate your results with various metrics
ndcg, _map, recall, precision = retriever.evaluate(qrels, results, retriever.k_values)

As you can see, this is similar to a chef receiving various orders (queries) and checking the pantry (corpus) to prepare meals (the retrieved documents), with a final quality check on each dish (the evaluation metrics). Together, these steps form a complete retrieve-and-evaluate pipeline.
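To show what the evaluation step is measuring, here is a simplified, hand-rolled sketch of recall@k. It is not BEIR's own implementation. It assumes that `qrels` maps each query id to a dictionary of document ids and relevance labels, and that `results` maps each query id to a dictionary of document ids and retrieval scores. The function name `recall_at_k` and the toy data are hypothetical.

```python
def recall_at_k(qrels, results, k):
    """Mean fraction of each query's relevant documents found in its top-k results."""
    per_query = []
    for query_id, judged in qrels.items():
        # Documents with a positive label count as relevant.
        relevant = {doc_id for doc_id, label in judged.items() if label > 0}
        if not relevant:
            continue  # skip queries with no relevant documents
        # Rank the retrieved documents by descending score and keep the top k.
        ranked = sorted(results.get(query_id, {}).items(), key=lambda item: item[1], reverse=True)
        top_k = {doc_id for doc_id, _ in ranked[:k]}
        per_query.append(len(relevant & top_k) / len(relevant))
    return sum(per_query) / len(per_query) if per_query else 0.0


# Made-up judgements and scores, for illustration only.
toy_eval_qrels = {"q1": {"d1": 1, "d3": 1}, "q2": {"d2": 1}}
toy_eval_results = {
    "q1": {"d1": 0.9, "d2": 0.8, "d3": 0.1},
    "q2": {"d1": 0.7, "d2": 0.6},
}

for k in (1, 2, 3):
    print(f"Recall@{k}: {recall_at_k(toy_eval_qrels, toy_eval_results, k):.2f}")
```

The other metrics returned by `retriever.evaluate` (NDCG, MAP, precision) are computed from the same two inputs, but also take the rank position or the retrieved set size into account.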

Available Datasets

You can view all available datasets on the BEIR wiki (https://github.com/beir-cellar/beir/wiki) or on Hugging Face (https://huggingface.co/BeIR).

Troubleshooting

In case you encounter any issues during installation or execution, here are a few troubleshooting steps:

Ensure that you are using one of the supported Python versions (3.6 or 3.7).
Check your internet connection for downloading datasets.
Verify that the packages you’re using are up-to-date.
Consult the Wiki for additional resources and community support.
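The first and third checks above can be scripted. The sketch below prints the running Python version and the installed versions of a few packages. The package list is an assumption about what a typical BEIR setup uses, and `environment_report` is a hypothetical helper name. On Python 3.6 or 3.7 it relies on the `importlib_metadata` backport being installed.

```python
import sys

try:
    from importlib import metadata  # standard library on Python 3.8+
except ImportError:
    import importlib_metadata as metadata  # backport, assumed installed on 3.6/3.7


def environment_report(packages=("beir", "sentence-transformers", "torch")):
    """Return the Python version and the installed version (or None) of each package."""
    report = {"python": f"{sys.version_info.major}.{sys.version_info.minor}"}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None  # not installed in this environment
    return report


for name, version in environment_report().items():
    print(f"{name}: {version or 'not installed'}")
```

Including this output in a bug report makes it easier for others to reproduce the problem.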


Additional Information

We also provide a variety of additional information on our Wiki page.
