How to Deploy a BERT Model for Question Answering Using FastAPI and Docker

Jun 1, 2021 | Educational

In a world where the ability to extract information quickly and accurately is paramount, deploying a Question Answering (QA) system using a pre-trained BERT model is an exhilarating venture. In this guide, we will walk you through deploying the BERT-DRCD-QuestionAnswering model using FastAPI and Docker. We’ll also dive into the specifics of how to perform question answering, offering you a comprehensive understanding of this process.

Understanding the BERT Model

The BERT model we are using is a fine-tuned variant of the bert-base-chinese, tuned specifically on the DRCD dataset for the Mandarin language. This model achieves impressive scores with an F1 of 86 and an EM of 83, making it a reliable choice for question answering tasks.
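Here, EM (Exact Match) is the percentage of predictions that match a reference answer exactly, and F1 measures token-level overlap between prediction and reference. As a quick illustration, here is a simplified version of those two metrics (real DRCD evaluation also normalizes punctuation and whitespace; for Chinese, individual characters act as tokens):

```python
from collections import Counter

def exact_match(pred: str, gold: str) -> int:
    # 1 if the prediction matches the reference exactly, else 0
    return int(pred.strip() == gold.strip())

def f1_score(pred: str, gold: str) -> float:
    # Token-level F1: harmonic mean of precision and recall over tokens
    pred_toks, gold_toks = list(pred), list(gold)
    common = Counter(pred_toks) & Counter(gold_toks)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_toks)
    recall = overlap / len(gold_toks)
    return 2 * precision * recall / (precision + recall)
```

A prediction like 新北市土城區 against the reference 土城區 scores 0 on EM but gets partial credit on F1, which is why the two numbers differ.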

Training Arguments

Before we delve into model usage, let’s take a closer look at the arguments used to fine-tune the model:

  • Max Sequence Length: 384
  • Stride: 128
  • Learning Rate: 3e-5
  • Batch Size: 10
  • Epochs: 3
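The length and stride settings work together: contexts longer than 384 tokens are split into overlapping windows, and the 128-token overlap means an answer that falls near a window boundary still appears intact in some window. A simplified sketch of the idea (in practice the splitting is handled by the tokenizer’s `return_overflowing_tokens` and `stride` options, not hand-rolled like this):

```python
def sliding_windows(tokens, max_length=384, stride=128):
    """Split a token sequence into overlapping windows of at most
    max_length tokens, with consecutive windows sharing `stride` tokens."""
    windows = []
    step = max_length - stride  # each window advances by 256 tokens
    for start in range(0, len(tokens), step):
        windows.append(tokens[start:start + max_length])
        if start + max_length >= len(tokens):
            break
    return windows

# A 600-token context becomes two overlapping windows
windows = sliding_windows(list(range(600)))
print([len(w) for w in windows])
```

With these settings, the last 128 tokens of each window are repeated at the start of the next one, so no answer span of reasonable length is ever cut in half in every window.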

Setting Up the Environment

We’ll be utilizing FastAPI to create an API for our model, allowing for seamless interaction. Additionally, Docker will help us containerize the application for easy distribution and deployment. Let’s dive into the steps of the implementation!

Implementation Steps

1. Install Necessary Packages


pip install fastapi uvicorn torch transformers

2. Create the FastAPI Application

Here’s where we create the backbone of our application.


from fastapi import FastAPI
from transformers import BertTokenizerFast, BertForQuestionAnswering
import torch

app = FastAPI()

# Load the model once at startup and move it to GPU if one is available
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
tokenizer = BertTokenizerFast.from_pretrained('nyust-eb210/braslab-bert-drcd-384')
model = BertForQuestionAnswering.from_pretrained('nyust-eb210/braslab-bert-drcd-384').to(device)

@app.get("/answer")
async def get_answer(text: str, query: str):
    encoded_input = tokenizer(text, query, return_tensors='pt').to(device)
    with torch.no_grad():
        qa_outputs = model(**encoded_input)

    # Most likely start and end token positions of the answer span
    start = torch.argmax(qa_outputs.start_logits).item()
    end = torch.argmax(qa_outputs.end_logits).item()

    # Softmax the logits into probabilities for a simple confidence score
    start_prob = torch.softmax(qa_outputs.start_logits, dim=-1)[0][start].item()
    end_prob = torch.softmax(qa_outputs.end_logits, dim=-1)[0][end].item()

    answer = encoded_input.input_ids[0][start:end + 1]
    answer_decoded = tokenizer.decode(answer.tolist())

    return {"answer": answer_decoded, "confidence": (start_prob + end_prob) / 2}

3. Running the Application in Docker

Create a Dockerfile with the following content:


FROM python:3.8
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "80"]
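The Dockerfile above installs dependencies from a requirements.txt, which we haven’t created yet. A minimal one matching the packages from step 1 could look like this (versions are left unpinned here; in production, pin them to the versions you have tested):

```
fastapi
uvicorn
torch
transformers
```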

Then build and run the Docker container:


docker build -t bert-drcd-qa .
docker run -p 80:80 bert-drcd-qa

Usage Example

Now that the application is running, you can send GET requests with the context (text) and the question (query) as URL parameters. Here’s an example context and question:


text = "鴻海科技集團是由臺灣企業家郭台銘創辦的跨國企業,總部位於臺灣新北市土城區。"
query = "鴻海集團總部位於哪裡?"
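Because the endpoint takes these as URL query parameters, the non-ASCII text must be percent-encoded. A small sketch of building and sending the request (localhost on port 80 assumes the Docker port mapping from the previous step):

```python
from urllib.parse import urlencode

text = "鴻海科技集團是由臺灣企業家郭台銘創辦的跨國企業,總部位於臺灣新北市土城區。"
query = "鴻海集團總部位於哪裡?"

# Percent-encode the parameters and build the request URL
url = "http://localhost:80/answer?" + urlencode({"text": text, "query": query})
print(url)

# With the server running, fetch the answer (requires the requests package):
# import requests
# print(requests.get(url).json())
```

The response is the JSON object returned by the endpoint, containing the extracted answer span and its confidence score.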

Troubleshooting Common Issues

  • Model Loading Errors: Ensure that you have specified the model identifiers correctly. Double-check the path to the pre-trained models and confirm that all necessary packages are installed.
  • Runtime Errors: If you encounter device-related issues, make sure your GPU is set up correctly or use CPU fallback.
  • Docker Issues: Verify that Docker is installed and running on your machine. Check the Docker daemon logs for any errors.

For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.

Conclusion

By following this guide, you should now have a functional deployment of the BERT-DRCD model for answering questions. This deployment not only enhances your understanding of machine learning but also equips you with practical tools to build upon.

At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.
