Are you fascinated by the richness of Vietnamese literature and the beauty of poetry? Do you want to blend modern AI technology with traditional poetic forms? If so, you’re in for a treat! In this article, we will look at how to fine-tune the GPT-2 model to generate Vietnamese Six Eight (lục bát) poems, starting from a pretrained model and a dataset of roughly 10,000 lines of poetry.
Model Description
We are using a specialized version of GPT-2 adapted for Vietnamese poetry. The model starts from a GPT-2 checkpoint pretrained on Vietnamese Wikipedia and is then fine-tuned on a dataset filled with the lyrical essence of Six Eight poems. If you want to check out the pretrained model, visit this link: Pretrained Model.
Purpose
This Vietnamese GPT-2 Six Eight Poet Model was developed primarily for fun and experimental studies. Whether you are an AI enthusiast or a poetry lover, this project offers a delightful way to explore the intersection of technology and the arts.
Dataset
The dataset comprises approximately 10,000 lines compiled from traditional Vietnamese Six Eight poems, chosen to capture the genre's distinctive rhythm and structure.
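A Six Eight couplet alternates a six-syllable line with an eight-syllable one. Since written Vietnamese puts one syllable per space-separated token, the basic structure can be checked with a simple helper. This is a simplified sketch (a hypothetical utility, not part of the project's code) that ignores the genre's tonal and rhyme rules:

```python
def is_six_eight(lines):
    """Roughly check Six Eight (luc bat) structure: lines alternate
    6 and 8 syllables. Vietnamese writes one syllable per
    space-separated token, so split() gives the syllable count.
    Tonal and rhyme constraints are not checked."""
    expected = (6, 8)
    return all(len(line.split()) == expected[i % 2]
               for i, line in enumerate(lines))

# The famous opening couplet of Truyen Kieu follows the pattern:
couplet = ["trăm năm trong cõi người ta",
           "chữ tài chữ mệnh khéo là ghét nhau"]
print(is_six_eight(couplet))  # → True
```

A checker like this is also handy for filtering noisy lines out of a scraped training corpus before fine-tuning.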
Results
- Train Loss: 2.7
- Validation Loss: 4.5
How to Use the Model
Using the model to generate your own Six Eight poems is simple! Below is a step-by-step guide:
Step 1: Environment Setup
Ensure you have Python and the required libraries installed. You will need PyTorch and the Transformers library to get started.
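For example, both libraries can be installed with pip (these are the standard PyPI package names; pin versions as your project requires):

```shell
pip install torch transformers
```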
Step 2: Code to Generate Poetry
Here’s a code snippet that illustrates how to use this model:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Use the GPU if one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load the fine-tuned tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("tuanleGPT2_Poet")
model = AutoModelForCausalLM.from_pretrained("tuanleGPT2_Poet").to(device)

# Encode the opening words of the poem
text = "hỏi rằng nàng"
input_ids = tokenizer.encode(text, return_tensors="pt").to(device)

# Generate three candidate continuations with beam-search sampling
min_length = 60
max_length = 100
sample_outputs = model.generate(
    input_ids,
    pad_token_id=tokenizer.eos_token_id,
    do_sample=True,
    max_length=max_length,
    min_length=min_length,
    top_p=0.8,
    num_beams=10,
    no_repeat_ngram_size=2,  # never repeat the same 2-gram
    num_return_sequences=3,
)

for i, sample_output in enumerate(sample_outputs):
    print("Generated text {}:\n{}\n---".format(
        i + 1, tokenizer.decode(sample_output.tolist(), skip_special_tokens=True)))
Understanding the Code: An Analogy
Think of the code as a master chef preparing a delightful dish (the poem) using a well-crafted recipe (the model) and fresh ingredients (the input text). The chef first checks what kitchen tools (libraries) are available, selects the right recipes (load the tokenizer and model), and prepares the workspace (move to the right device).
When the chef decides what to cook, they gather their ingredients (input text) and follow the steps meticulously to ensure the dish turns out excellent (the generated poem). Just like cooking, generating poetry with AI is a delightful process of exploration, experimentation, and creativity!
Sample Input and Output
Here’s an example of how to use the model:
- Input: hỏi rằng nàng
- Output: hỏi rằng nàng đã nói ra cớ sao nàng lại hỏi han sự tình vân tiên nói lại những lời thưa rằng ở chốn am mây một mình từ đây mới biết rõ ràng ở đây cũng gặp một người ở đây hai người gặp lại gặp nhau thấy lời nàng mới hỏi tra việc này nguyệt nga hỏi việc bấy lâu khen rằng đạo sĩ ở đầu cửa thiền
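Notice that the model emits the poem as one flat stream of syllables. Because Six Eight verse alternates six- and eight-syllable lines, a small helper (a hypothetical post-processing sketch, not part of the original pipeline) can re-wrap the output into couplets for display:

```python
def wrap_six_eight(text):
    """Re-wrap a flat stream of space-separated syllables into
    alternating 6- and 8-syllable lines; any leftover syllables
    form a final short line."""
    words = text.split()
    lines, i, take = [], 0, 6
    while i < len(words):
        lines.append(" ".join(words[i:i + take]))
        i += take
        take = 14 - take  # alternate 6 <-> 8
    return "\n".join(lines)

print(wrap_six_eight("hỏi rằng nàng đã nói ra cớ sao nàng lại hỏi han sự tình"))
# hỏi rằng nàng đã nói ra
# cớ sao nàng lại hỏi han sự tình
```

This is purely cosmetic: the syllable boundaries come from the generated text itself, so a malformed generation will simply produce uneven lines.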
Troubleshooting Ideas
If you experience any issues while using the model, try the following troubleshooting steps:
- Ensure that all libraries are correctly installed and updated to their latest versions.
- Double-check that you have typed the pretrained model identifier correctly.
- Be mindful of your device’s GPU/CPU settings; you may need to adjust according to availability.
- Make sure your input text is structured properly to prompt the model effectively.
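The first two checks above can be automated with a small standard-library helper (a hypothetical utility, not part of the model's code):

```python
import importlib.util

def check_deps(names=("torch", "transformers")):
    """Return a dict mapping each library name to whether it is importable
    in the current environment."""
    return {name: importlib.util.find_spec(name) is not None for name in names}
```

Any entry that maps to False signals a missing dependency, i.e. a `pip install` is needed before the generation script will run.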
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.