Welcome to a comprehensive guide on using Fietje 2, an open and efficient language model designed for Dutch text generation. This blog will walk you through the intended uses, training procedure, and some best practices to get you started with Fietje 2.
What is Fietje 2?
Fietje 2 is an adapted version of microsoft/phi-2, tailored specifically for Dutch text generation. With 2.7 billion parameters trained on 28 billion tokens of Dutch text, it strikes a balance between size and performance, making it a strong choice for users who need Dutch language capabilities.
Intended Uses and Limitations
Fietje 2 can be used for various applications involving Dutch text, such as:
- Text generation for creative writing
- Completing sentences in Dutch dialogue
- Assisting in language learning
- Generating content for websites or blogs
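As a quick sketch of the first use case, the model can be driven through the Hugging Face transformers library. Note that the model identifier `BramVanroy/fietje-2` used below is an assumption for illustration; verify the exact ID on the official model card before use.

```python
# Minimal sketch: Dutch text generation with Fietje 2 via Hugging Face transformers.
# NOTE: the model ID "BramVanroy/fietje-2" is an assumption for illustration;
# check the official model card for the correct identifier.

def generate_dutch(prompt: str, max_new_tokens: int = 60) -> str:
    """Complete a Dutch prompt with Fietje 2 using a text-generation pipeline."""
    from transformers import pipeline  # heavyweight import kept inside the function

    generator = pipeline(
        "text-generation",
        model="BramVanroy/fietje-2",  # assumed model ID
    )
    result = generator(prompt, max_new_tokens=max_new_tokens, do_sample=True)
    return result[0]["generated_text"]
```

Calling `generate_dutch("Het mooiste aan fietsen in Nederland is")` will download the model weights on first use, so a GPU is recommended for reasonable latency.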
However, like any large language model (LLM), Fietje has its limitations:
- LLMs can hallucinate, making up facts
- They are prone to errors
- Use at your own risk; verify important output
Training Data
Fietje 2 was created through continued pretraining on a large Dutch dataset that includes:
- Full Dutch component of Wikipedia (around 15% of the dataset)
- Tokens from CulturaX to enhance contextual understanding
You can find a newer version of this dataset here.
Training Procedure
The creation of Fietje 2 involved substantial computational power, generously provided by the Flemish Supercomputer Center (VSC). The training process took approximately two weeks and relied on the following setup:
- Hardware: 4 nodes, each with 4x NVIDIA A100 80GB GPUs (16 GPUs in total)
- Frameworks: DeepSpeed and the alignment-handbook
Complete training recipes and the SLURM script can be accessed in the GitHub repository.
Training Hyperparameters
The following hyperparameters were used during training:
- learning_rate: 9e-05
- train_batch_size: 40
- eval_batch_size: 40
- seed: 42
- distributed_type: multi-GPU
- num_devices: 16
- gradient_accumulation_steps: 3
- total_train_batch_size: 1920
- total_eval_batch_size: 640
- optimizer: Adam with betas=(0.9, 0.98) and epsilon=1e-07
- lr_scheduler_type: linear
- num_epochs: 1.0
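As a sanity check, the effective batch sizes above follow directly from the per-device settings: total_train_batch_size = train_batch_size × num_devices × gradient_accumulation_steps, and total_eval_batch_size = eval_batch_size × num_devices. A few lines of Python confirm the arithmetic:

```python
# Reconstruct the effective batch sizes from the per-device hyperparameters above.
train_batch_size = 40           # per-device train batch size
eval_batch_size = 40            # per-device eval batch size
num_devices = 16                # 4 nodes x 4 A100 GPUs
gradient_accumulation_steps = 3

total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices

print(total_train_batch_size)  # 1920, matching the reported value
print(total_eval_batch_size)   # 640, matching the reported value
```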
Training Results
The training yielded a detailed log of performance, showcasing the reduction in training loss over time. Here’s an analogy to help understand this concept:
Imagine training Fietje like teaching a student to ride a bike. At first, the student wobbles and struggles to stay upright (high training loss). But after consistent practice and adjustments (iterations), the student gradually learns to balance better (low training loss). By the end of the training, the student can confidently ride without falling (optimal performance).
Troubleshooting
As you explore Fietje 2, you might run into some issues or have questions. Here are some troubleshooting tips:
- If you encounter errors during installation, ensure that you are using compatible versions of the frameworks listed above (e.g., PyTorch 2.1.2).
- If Fietje 2's output seems nonsensical, simplify or rephrase your prompt; shorter, clearer prompts often yield better outputs.
- For performance concerns, consider adjusting hyperparameters according to your specific use case.
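When output quality is the concern, the generation settings are often the first knob to turn. The sketch below shows one plausible starting configuration using standard transformers generation parameters; the exact values are illustrative assumptions to tune for your use case, not recommendations from the Fietje authors.

```python
# A hypothetical starting point for generation settings when output seems
# repetitive or incoherent. These values are illustrative assumptions;
# tune them for your own prompts and hardware.
generation_config = {
    "max_new_tokens": 128,      # cap the length of the completion
    "do_sample": True,          # sample instead of greedy decoding
    "temperature": 0.7,         # lower -> more conservative output
    "top_p": 0.9,               # nucleus sampling cutoff
    "repetition_penalty": 1.1,  # discourage verbatim loops
}

# These keys map onto the standard Hugging Face generate() arguments, e.g.:
# model.generate(**inputs, **generation_config)
print(sorted(generation_config))
```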
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Conclusion
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.