Creating a Turkish Multitask Model Using mT5-small: A How-To Guide

Jun 25, 2021 | Educational

In this article, we will explore how to set up and use a multitask model based on Google’s mT5-small architecture, fine-tuned for Turkish question answering and question generation. The model performs three key tasks: answer extraction, question generation, and question answering. Are you ready to become a multitasking pro in NLP? Let’s dive in!

Understanding the mT5-small Model

The mT5-small model is a fascinating piece of technology with 300 million parameters and an approximate size of 1.2GB. Think of it as a well-trained chef who can cook various cuisines but requires some preparation before serving delicious dishes to guests. Just like our chef needs the right ingredients and techniques, the mT5 model requires fine-tuning with appropriate hyperparameters to excel in specific NLP tasks.
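The 1.2GB figure lines up with the parameter count if the weights are stored as 32-bit floats. That storage format is an assumption here, not something stated above, so treat the following as a back-of-the-envelope check rather than an exact measurement:

```python
# Rough size estimate, assuming 32-bit (4-byte) float weights.
PARAMS = 300_000_000      # parameter count quoted above
BYTES_PER_PARAM = 4       # assumption: float32 storage

def estimated_size_gb(params: int, bytes_per_param: int) -> float:
    """Return the approximate weight size in gigabytes (10**9 bytes)."""
    return params * bytes_per_param / 1e9

size_gb = estimated_size_gb(PARAMS, BYTES_PER_PARAM)
print(f"Estimated size: {size_gb:.1f} GB")
```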

Requirements

Before we start, make sure you have the following requirements installed:

transformers==4.4.2
sentencepiece==0.1.95
Git (to clone the repository)

Installation Instructions

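Follow the steps below to set up your environment. The commands use notebook syntax (Jupyter or Colab); in a regular shell, drop the leading ! and replace %cd with cd: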

!pip install transformers==4.4.2
!pip install sentencepiece==0.1.95
!git clone https://github.com/ozcangundes/multitask-question-generation.git
%cd multitask-question-generation
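After installing, it can help to confirm that the pinned versions are the ones actually active in your environment. The sketch below uses only the standard library; the helper name `check_versions` is hypothetical and not part of the cloned repository:

```python
from importlib.metadata import PackageNotFoundError, version

# Versions pinned in the requirements above.
REQUIRED = {"transformers": "4.4.2", "sentencepiece": "0.1.95"}

def check_versions(required: dict) -> dict:
    """Map each package name to 'ok', 'missing', or 'mismatch (found X)'."""
    report = {}
    for name, wanted in required.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if found == wanted else f"mismatch (found {found})"
    return report

for package, status in check_versions(REQUIRED).items():
    print(f"{package}: {status}")
```

A "mismatch" line does not necessarily mean the model will fail to load, but it is the first thing to rule out if you hit errors later.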

Usage

Now that you have everything set up, let’s load the necessary libraries and instantiate the model:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("ozcangundes/m-t5-multitask-qa-qg-turkish")
model = AutoModelForSeq2SeqLM.from_pretrained("ozcangundes/m-t5-multitask-qa-qg-turkish")

from pipelines import pipeline  # pipelines.py script in the cloned repo

multimodel = pipeline("multitask-qa-qg", tokenizer=tokenizer, model=model)

Input Data

Now it’s time to provide a sample text for the model to work on. Here’s an example:

text = "Özcan Gündeş, 1993 yılı Tarsus doğumludur. Orta Doğu Teknik Üniversitesi Endüstri Mühendisliği bölümünde 2011-2016 yılları arasında lisans eğitimi görmüştür. Yüksek lisansını ise 2020 Aralık ayında, 4.00 genel not ortalaması ile Boğaziçi Üniversitesi, Yönetim Bilişim Sistemleri bölümünde tamamlamıştır. Futbolla yakından ilgilenmekle birlikte, Galatasaray kulübü taraftarıdır."

Generating Questions and Answers

Let’s see how to generate questions and answers from the text.

multimodel(text)

This command will produce output resembling:

[{'answer': 'Tarsus', 'question': 'Özcan Gündeş nerede doğmuştur?'},
 {'answer': '1993', 'question': 'Özcan Gündeş kaç yılında doğmuştur?'},
 {'answer': '2011-2016', 'question': 'Özcan Gündeş lisans eğitimini hangi yıllar arasında tamamlamıştır?'},
 {'answer': 'Boğaziçi Üniversitesi, Yönetim Bilişim Sistemleri', 'question': 'Özcan Gündeş yüksek lisansını hangi bölümde tamamlamıştır?'},
 {'answer': 'Galatasaray kulübü', 'question': 'Özcan Gündeş futbolla yakından ilgilenmekle birlikte hangi kulübü taraftarıdır?'}]
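To make the pairs easier to read, you can post-process the result. The sketch below assumes the pipeline returns a list of dictionaries with `answer` and `question` keys, as the output above suggests; `format_pairs` is a hypothetical helper, and the sample data stands in for a real call to `multimodel(text)`:

```python
# Sample shaped like the output above; the dict layout is an assumption.
sample_output = [
    {"answer": "Tarsus", "question": "Özcan Gündeş nerede doğmuştur?"},
    {"answer": "1993", "question": "Özcan Gündeş kaç yılında doğmuştur?"},
]

def format_pairs(pairs: list) -> list:
    """Turn question/answer dicts into numbered 'Q: ... / A: ...' lines."""
    lines = []
    for i, pair in enumerate(pairs, start=1):
        lines.append(f"{i}. Q: {pair['question']} / A: {pair['answer']}")
    return lines

for line in format_pairs(sample_output):
    print(line)
```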

Question Answering Example

For question answering, you can also provide a specific question related to the context:

multimodel(context=text, question="Özcan hangi takımı tutmaktadır?")

This should return a concise answer like “Galatasaray.” You can continue to ask various questions related to the text for different answers.
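If you want to ask several questions about the same context, a small loop keeps things tidy. The sketch below assumes the pipeline accepts `context` and `question` keyword arguments (as in the call above) and returns the answer as a string; `ask_all` and `fake_qa` are hypothetical names, and `fake_qa` only stands in for `multimodel` so the example runs without downloading the model:

```python
from typing import Callable, Dict, List

def ask_all(qa_fn: Callable[..., str], context: str, questions: List[str]) -> Dict[str, str]:
    """Call qa_fn(context=..., question=...) once per question and collect answers."""
    return {q: qa_fn(context=context, question=q) for q in questions}

# Stand-in for the real pipeline so this sketch runs without the model.
def fake_qa(context: str, question: str) -> str:
    return "Galatasaray" if "takım" in question else "(no answer)"

context = "Özcan Gündeş, Galatasaray kulübü taraftarıdır."
answers = ask_all(fake_qa, context, ["Özcan hangi takımı tutmaktadır?"])
print(answers)
```

With the real pipeline, you would pass `multimodel` in place of `fake_qa` and `text` as the context.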

Troubleshooting

If you encounter any issues while setting up or running the model, here are some common troubleshooting tips:

Ensure that all libraries are correctly installed and match the specified versions.
Check your internet connection if cloning the repository or downloading the model fails.
If you face memory issues, consider using a machine with more RAM or passing shorter input texts to the model.
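For the memory tip, one thing to try is feeding the pipeline shorter passages instead of one long text. The sketch below splits text on sentence boundaries and groups sentences into chunks; `chunk_text` is a hypothetical helper, the sentence splitting is deliberately naive, and whether chunking actually reduces memory use in your setup is an assumption worth testing:

```python
import re

def chunk_text(text: str, max_chars: int = 300) -> list:
    """Group sentences into chunks of at most max_chars characters where possible."""
    # Naive split: a period, question mark, or exclamation mark followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the limit, so close the chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks

sample = "Birinci cümle. İkinci cümle. Üçüncü cümle."
print(chunk_text(sample, max_chars=35))
```

Each chunk could then be passed to the pipeline separately. A single sentence longer than `max_chars` is kept whole rather than cut mid-sentence.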

Final Thoughts

At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.
