How to Generate Chinese Poetry with a Leading Character Model

Feb 25, 2022 | Educational

Welcome to the world of poetic creation using innovative models! In this guide, we’ll explore how to use a model that generates Chinese poetry with leading characters and incorporates a specific mood and key themes. Whether you’re a poetry enthusiast or a programming newbie, we’ve got you covered!

Objectives of the Poetry Model

This poetry model serves two main purposes:

  • To create acrostic poetry (藏头诗).
  • To infuse the essence of specific keywords into the poetic creation.
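In an acrostic poem (藏头诗), the first character of each line, read top to bottom, spells out the hidden lead phrase. The helper below is a minimal sketch (not part of the model) that verifies this property on a generated poem string; the function name is our own.

```python
def check_acrostic(poem: str, lead: str) -> bool:
    """Return True if each poem line starts with the matching lead character."""
    # Skip the 《title》 line and any blank lines.
    lines = [l for l in poem.splitlines() if l and not l.startswith("《")]
    if len(lines) < len(lead):
        return False
    return all(line[0] == ch for line, ch in zip(lines, lead))
```

For example, a poem whose lines begin with 上 and 海 passes `check_acrostic(poem, "上海")`.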

How the Model Works

Imagine you’re a chef preparing a gourmet dish. You have a variety of ingredients at your disposal, and your goal is to blend them creatively into a delightful meal. The poetry model works in a similar way: it is a causal language model in the spirit of the GPT-2 paper, which showed that many language tasks can be framed as next-token prediction over a single sequence, letting one model handle varied inputs and outputs. The model essentially cooks up poetic lines by conditioning on the leading characters and the selected keywords, serving up a beautiful dish of poetry!
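Concretely, everything the model needs is packed into one text sequence: the keywords joined with '-', followed by the lead phrase wrapped in 《》 title marks. This mirrors how the inference code later in this guide builds its prompt:

```python
# Assemble the prompt the way the inference() function does:
# keywords joined by '-', then the lead phrase wrapped in 《》.
keywords = ["高楼", "虹光"]
lead = "上海"
text = "-".join(keywords) + f"《{lead}》"
print(text)  # 高楼-虹光《上海》
```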

Generating Poetry: The Code

Now, let’s dive into the code that powers our poetry generation:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("raynardj/keywords-cangtou-chinese-poetry")
model = AutoModelForCausalLM.from_pretrained("raynardj/keywords-cangtou-chinese-poetry")

def inference(lead, keywords=[]):
    """
    lead: the acrostic (藏头) phrase, e.g. a person's name; 2, 3, or 4 characters
    keywords: a list of keywords; 0 to 12 keywords work best
    """
    leading = f"《{lead}》"
    text = '-'.join(keywords) + leading
    input_ids = tokenizer(text, return_tensors='pt').input_ids[:, :-1]
    lead_tok = tokenizer(lead, return_tensors='pt').input_ids[0, 1:-1]
    with torch.no_grad():
        pred = model.generate(
            input_ids,
            max_length=256,
            num_beams=5,
            do_sample=True,
            repetition_penalty=2.1,
            top_p=.6,
            bos_token_id=tokenizer.sep_token_id,
            pad_token_id=tokenizer.pad_token_id,
            eos_token_id=tokenizer.sep_token_id,
        )[0, 1:]
        
        # The model emits [CLS] (token id 101) at the start of each line;
        # replace those placeholders with the lead characters' token ids.
        mask = (pred == 101)
        # If the model produced fewer lines than lead characters, truncate the lead.
        if mask.sum() < len(lead_tok):
            lead_tok = lead_tok[:mask.sum()]
        # If it produced more lines, extend the lead by mirroring it.
        while mask.sum() > len(lead_tok):
            reversed_lead_tok = lead_tok.flip(0)
            lead_tok = torch.cat([
                lead_tok,
                reversed_lead_tok[:mask.sum() - len(lead_tok)]
            ])
        pred[mask] = lead_tok

    generate = tokenizer.decode(pred, skip_special_tokens=True)
    generate = generate.replace('》', '》\n').replace('。', '。\n').replace(' ', '')
    return generate
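The trickiest step above is the [CLS]-replacement: the model emits token id 101 as a placeholder at each line start, and the code overwrites those placeholders with the lead characters. Here is a toy illustration of that step using hypothetical token ids (8001, 8002 stand in for the real ids of a two-character lead):

```python
import torch

# Hypothetical generated sequence: 101 ([CLS]) marks each line start.
pred = torch.tensor([101, 11, 12, 101, 21, 22])
# Hypothetical token ids for a 2-character lead phrase.
lead_tok = torch.tensor([8001, 8002])

mask = (pred == 101)       # True at each placeholder position
pred[mask] = lead_tok      # overwrite placeholders, in order
print(pred.tolist())       # [8001, 11, 12, 8002, 21, 22]
```

Boolean mask assignment fills the masked positions in order, so the first line starts with the first lead character, the second line with the second, and so on.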

Using the Model

To generate poetry, pass a leading phrase and a list of keywords to inference(). For example:

inference("上海", ["高楼", "虹光", "灯红酒绿", "华厦"])
inference("刘先生", ["妆容", "思", "落花", "空镜"])

Troubleshooting

While using the model, you might encounter some issues. Here are some common troubleshooting tips:

  • Ensure that you have installed the required libraries and dependencies.
  • If the output is not what you expected, try checking the length and relevance of your keywords.
  • Make sure that the leading phrase you use is appropriate and aligns with the mood of the poetry.
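To catch the second and third issues early, you can sanity-check your inputs before calling inference(). The sketch below is our own helper (not part of the model), based on the limits stated in the function's docstring, i.e. a 2-4 character lead and at most 12 keywords:

```python
def validate_inputs(lead: str, keywords: list) -> list:
    """Collect likely problems before calling inference(); returns a list of warnings."""
    problems = []
    if not 2 <= len(lead) <= 4:
        problems.append("lead phrase works best at 2-4 characters")
    if len(keywords) > 12:
        problems.append("more than 12 keywords may dilute the theme")
    if any(not k.strip() for k in keywords):
        problems.append("empty keywords should be removed")
    return problems
```

An empty list means the inputs look reasonable; otherwise the warnings point at what to adjust.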

For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.

Conclusion

At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.
