Are you fascinated by the world of Japanese storytelling? Do you wish to generate engaging narratives using state-of-the-art AI? Look no further! This guide will walk you through the steps to utilize the Genji-JP 6B model—a fine-tuned version of EleutherAI’s GPT-J 6B—designed for generating Japanese web novels.
Model Description
Genji-JP 6B boasts impressive specifications:
- Parameters: 6,053,381,344
- Layers: 28
- Model Dimension: 4,096
- Feedforward Dimension: 16,384
- Heads: 16
- Context Length: 2,048
- Vocabulary: 50,400
- Position Encoding: Rotary position encodings (RoPE)
To give you a clearer picture, think of this model as a grand library with 28 floors (layers), housing over 6 billion individual stories (parameters). Each floor is divided into different sections (heads), making it uniquely versatile for various genres of storytelling.
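If you are curious how those specifications add up to roughly 6 billion parameters, you can sketch the arithmetic yourself. The following is a back-of-the-envelope estimate for a GPT-J-style architecture (no biases on the attention projections); the exact bookkeeping differs slightly from the official 6,053,381,344 figure, so treat it as an approximation rather than the model's true accounting:

```python
# Back-of-the-envelope parameter count from the spec list above.
n_layers, d_model, d_ff, vocab = 28, 4096, 16384, 50400

per_layer = (
    4 * d_model * d_model                   # Q, K, V and output projections
    + 2 * d_model * d_ff + d_ff + d_model   # feed-forward weights + biases
    + 2 * d_model                           # layer-norm scale + shift
)
embeddings = vocab * d_model                # token embedding matrix
lm_head = d_model * vocab + vocab           # output projection + bias

total = n_layers * per_layer + embeddings + lm_head
print(f"{total:,} parameters (~{total / 1e9:.2f}B)")  # lands within 0.1% of 6.05B
```

The transformer blocks dominate: the 28 layers account for over 5.6 billion of the parameters, with the token embedding and output head contributing the remainder.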
Training Data
Genji-JP 6B inherits its weights from GPT-J 6B, which EleutherAI pre-trained on the Pile, a large curated text dataset, to build broad language ability. That base model was then fine-tuned on a dedicated Japanese storytelling (web novel) dataset to specialize it for narrative generation.
How to Use the Genji-JP 6B Model
Follow these steps to effectively utilize the Genji-JP 6B model for generating Japanese text:
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# The tokenizer is shared with the base GPT-J 6B model.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")

# Load the fine-tuned weights in float16 and move them to the GPU.
model = AutoModelForCausalLM.from_pretrained("NovelAI/genji-jp", torch_dtype=torch.float16, low_cpu_mem_usage=True).eval().cuda()

text = "Your prompt here"
tokens = tokenizer(text, return_tensors="pt").input_ids

# Sample up to 400 new tokens with nucleus sampling.
generated_tokens = model.generate(tokens.long().cuda(), use_cache=True,
                                  do_sample=True, temperature=1, top_p=0.9,
                                  repetition_penalty=1.125, min_length=1,
                                  max_length=len(tokens[0]) + 400,
                                  pad_token_id=tokenizer.eos_token_id)

last_tokens = generated_tokens[0]
# Strip any malformed byte sequences (U+FFFD) left over from decoding.
generated_text = tokenizer.decode(last_tokens).replace("\ufffd", "")
print(generated_text)
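Since the snippet above loads the model onto a GPU in float16, it helps to sanity-check whether your card has enough memory first. A rough estimate, counting the weights only (activations and the attention key/value cache need additional headroom on top):

```python
# Rough VRAM estimate for holding Genji-JP 6B in float16.
NUM_PARAMS = 6_053_381_344
BYTES_PER_PARAM = 2  # float16 stores each parameter in 2 bytes

weights_gib = NUM_PARAMS * BYTES_PER_PARAM / 1024**3
print(f"~{weights_gib:.1f} GiB for the weights alone")  # ~11.3 GiB
```

In practice this means a GPU with well over 11 GiB of VRAM; if yours is smaller, consider loading on CPU (dropping the `.cuda()` calls) at the cost of much slower generation.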

