In a world where the ability to extract information quickly and accurately is paramount, deploying a Question Answering (QA) system using a pre-trained BERT model is an exhilarating venture. In this guide, we will walk you through deploying the BERT-DRCD-QuestionAnswering model using FastAPI and Docker. We’ll also dive into the specifics of how to perform question answering, offering you a comprehensive understanding of this process.
Understanding the BERT Model
The BERT model we are using is a fine-tuned variant of bert-base-chinese, trained on the DRCD (Delta Reading Comprehension Dataset), a Traditional Chinese machine reading comprehension dataset. The model achieves an F1 score of 86 and an exact match (EM) score of 83, making it a reliable choice for question answering tasks.
Training Arguments
Before we dive into using the model, let’s look at the hyperparameters it was fine-tuned with (a sketch of the sliding-window preprocessing these settings imply follows the list):
- Max sequence length: 384
- Stride: 128
- Learning Rate: 3e-5
- Batch Size: 10
- Epochs: 3
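For context, the length and stride values reflect the standard sliding-window preprocessing for extractive QA: passages longer than the maximum sequence length are split into overlapping chunks so no part of the context is lost. Here is a minimal sketch of that preprocessing with a Hugging Face fast tokenizer; it illustrates the idea and is not the original training script:

from transformers import BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")

question = "鴻海集團總部位於哪裡?"
context = "鴻海科技集團是由臺灣企業家郭台銘創辦的跨國企業,總部位於臺灣新北市土城區。" * 30  # an artificially long passage

# Passages longer than 384 tokens are split into windows,
# each overlapping the previous one by 128 tokens.
features = tokenizer(
    question,
    context,
    max_length=384,
    stride=128,
    truncation="only_second",        # truncate the context, never the question
    return_overflowing_tokens=True,  # produce one feature per window
)
print(len(features["input_ids"]))    # number of windows created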
Setting Up the Environment
We’ll be utilizing FastAPI to create an API for our model, allowing for seamless interaction. Additionally, Docker will help us containerize the application for easy distribution and deployment. Let’s dive into the steps of the implementation!
Implementation Steps
1. Install Necessary Packages
pip install fastapi uvicorn torch transformers
2. Create the FastAPI Application
Here’s where we create the backbone of our application.
from fastapi import FastAPI
from transformers import BertTokenizerFast, BertForQuestionAnswering
import torch

app = FastAPI()

# Load the model once at startup and move it to GPU if one is available
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
tokenizer = BertTokenizerFast.from_pretrained("nyust-eb210/braslab-bert-drcd-384")
model = BertForQuestionAnswering.from_pretrained("nyust-eb210/braslab-bert-drcd-384").to(device)
model.eval()

@app.get("/answer")
async def get_answer(text: str, query: str):
    # Encode the context passage and the question as a single sequence pair
    encoded_input = tokenizer(text, query, return_tensors="pt").to(device)
    with torch.no_grad():
        qa_outputs = model(**encoded_input)
    # The answer span runs from the most likely start token to the most likely end token
    start = torch.argmax(qa_outputs.start_logits).item()
    end = torch.argmax(qa_outputs.end_logits).item()
    answer_ids = encoded_input.input_ids[0][start:end + 1]
    answer_decoded = tokenizer.decode(answer_ids.tolist(), skip_special_tokens=True)
    # Confidence: average softmax probability of the chosen start and end positions
    start_prob = torch.softmax(qa_outputs.start_logits, dim=-1)[0, start].item()
    end_prob = torch.softmax(qa_outputs.end_logits, dim=-1)[0, end].item()
    return {"answer": answer_decoded, "confidence": (start_prob + end_prob) / 2}
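Before containerizing, you can smoke-test the API locally (this assumes the code above is saved as main.py, which the Dockerfile below also expects):

uvicorn main:app --reload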
3. Running the Application in Docker
Create a Dockerfile with the following content:
FROM python:3.8
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "80"]
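The image installs dependencies from requirements.txt, so create that file next to main.py. A minimal version mirroring the packages from step 1:

fastapi
uvicorn
torch
transformers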
Then build and run the Docker container:
docker build -t bert-drcd-qa .
docker run -p 80:80 bert-drcd-qa
Usage Example
Now that the application is running, you can send GET requests to the /answer endpoint to retrieve answers. Here’s an example context and question:
text = "鴻海科技集團是由臺灣企業家郭台銘創辦的跨國企業,總部位於臺灣新北市土城區。"
query = "鴻海集團總部位於哪裡?"
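A minimal client sketch using the requests library, assuming the container is running locally with the port mapping shown above (the answer in the comment is illustrative):

import requests

text = "鴻海科技集團是由臺灣企業家郭台銘創辦的跨國企業,總部位於臺灣新北市土城區。"
query = "鴻海集團總部位於哪裡?"

# requests URL-encodes the Chinese text in the query string automatically
resp = requests.get("http://localhost:80/answer", params={"text": text, "query": query})
print(resp.json())  # e.g. {"answer": "臺灣新北市土城區", "confidence": 0.98}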
Troubleshooting Common Issues
- Model Loading Errors: Ensure that the model identifier passed to from_pretrained is spelled exactly, and confirm that all necessary packages are installed.
- Runtime Errors: If you encounter device-related issues, make sure your GPU is set up correctly or fall back to CPU (see the snippet after this list).
- Docker Issues: Verify that Docker is installed and running on your machine. Check the Docker daemon logs for any errors.
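For the CPU fallback mentioned above, a quick sanity check (the device line mirrors the one in the FastAPI app):

import torch

print(torch.cuda.is_available())  # False means the app will run on CPU
# To force CPU even when a GPU is present, replace the device line with:
device = torch.device("cpu")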
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Conclusion
By following this guide, you should now have a functional deployment of the BERT-DRCD model for answering questions. This deployment not only enhances your understanding of machine learning but also equips you with practical tools to build upon.
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.

