In this article, we’ll explore merging the base Llama 2 70b model with itself, a technique I call “frankenmerging”, with the help of mergekit. The result is a quirkier, more creative model that has the potential to elevate your writing and storytelling. Let’s dive into the steps, observations, benchmarks, and troubleshooting tips!
What is Frankenmerging?
Frankenmerging is the practice of combining a model with itself to enhance its abilities. Think of it like splicing together overlapping copies of the same model: no new training takes place, yet the resulting “hybrid” retains the strengths of its origin while exhibiting behaviors and capabilities of its own.
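To make the idea concrete: in mergekit, a self-merge of this kind is usually expressed as overlapping ranges of the same model's layers stacked into a deeper network. Below is a minimal sketch of how such ranges might be computed. Llama 2 70b has 80 decoder layers; the window and stride values and the function name `overlapping_slices` are hypothetical, chosen only for illustration, and are not the actual Dicephal recipe.

```python
def overlapping_slices(num_layers, window, stride):
    """Return (start, end) layer ranges of length `window`, advancing by `stride`."""
    if window <= 0 or stride <= 0 or window > num_layers:
        raise ValueError("window and stride must be positive, and window <= num_layers")
    slices = []
    start = 0
    while start + window <= num_layers:
        slices.append((start, start + window))
        start += stride
    # If the stride does not divide evenly, add one last range covering the final layers.
    if slices[-1][1] != num_layers:
        slices.append((num_layers - window, num_layers))
    return slices


# Hypothetical settings: 20-layer windows that advance 10 layers at a time.
slices = overlapping_slices(num_layers=80, window=20, stride=10)
total_layers = sum(end - start for start, end in slices)

print(slices)
print(f"{total_layers} layers in the merged stack vs. 80 in the base model")
```

Because consecutive ranges overlap, most layers appear twice in the merged stack, which is where the extra parameters of a frankenmerge come from.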
Getting Started with Mergekit
To begin, you’ll need to have mergekit installed and ready to use. Follow these steps:
- Clone the mergekit repository from GitHub.
- Follow the installation instructions provided in the README file.
- Prepare your Llama 2 70b model weights for merging.
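Once mergekit is installed, a merge is described in a YAML config file. The sketch below renders a passthrough (layer-stacking) config as text. The helper `render_passthrough_config`, the layer ranges, the `float16` dtype, and the `meta-llama/Llama-2-70b-hf` model id are assumptions for illustration, not the configuration used to build Dicephal.

```python
def render_passthrough_config(model, layer_ranges, dtype="float16"):
    """Render a mergekit passthrough config that stacks slices of a single model."""
    lines = ["slices:"]
    for start, end in layer_ranges:
        lines.append("  - sources:")
        lines.append(f"      - model: {model}")
        lines.append(f"        layer_range: [{start}, {end}]")
    lines.append("merge_method: passthrough")
    lines.append(f"dtype: {dtype}")
    return "\n".join(lines) + "\n"


# Hypothetical ranges: 20-layer slices starting every 10 layers across 80 layers.
layer_ranges = [(start, start + 20) for start in range(0, 61, 10)]

config_text = render_passthrough_config(
    model="meta-llama/Llama-2-70b-hf",  # assumed model id; a local path also works
    layer_ranges=layer_ranges,
)
print(config_text)
```

Saving the printed text to a file such as `config.yml` and running `mergekit-yaml config.yml ./merged-model` would then perform the merge. Expect it to need a lot of disk space and memory for a model of this size.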
Observations from Using Dicephal
After the merging process, the new model, which I’ve affectionately named Dicephal, has surprised me with its enhancements:
- It is noticeably more creative, showing more humor and wit than the base model.
- Like Goliath, it occasionally invents new words, although the results don’t always make sense.
- Like the original model, it requires clever prompting to produce accurate responses.
- It is particularly good at story writing and shows significant improvement in stylized writing and poetry.
- It seems able to recall and learn from its previous errors, which often gives its responses a human-like quality.
Benchmark Results
Let’s take a look at how Dicephal compares with the base Llama model on two benchmarks:
NeoEvalPlusN Benchmark
Test Name   Base Llama   Dicephal
----------  ----------   ----------
B           0            0
C           2            0
D           0.5          1
S           1.25         2.25
P           0            2.25
Total       3.75         5.5
(+75% improvement in size, +47% in meme benchmark performance!)
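The totals and percentages quoted above can be reproduced from the per-test scores. This small check assumes nominal parameter counts of 123B and 70B, taken from the model names rather than measured.

```python
# Per-test scores from the NeoEvalPlusN table: (base Llama, Dicephal).
scores = {
    "B": (0, 0),
    "C": (2, 0),
    "D": (0.5, 1),
    "S": (1.25, 2.25),
    "P": (0, 2.25),
}

base_total = sum(base for base, _ in scores.values())
dicephal_total = sum(dicephal for _, dicephal in scores.values())

# Relative change in total score, in percent.
score_gain_pct = (dicephal_total - base_total) / base_total * 100

# Relative change in size, using nominal parameter counts in billions (assumed).
size_gain_pct = (123 - 70) / 70 * 100

print(f"Totals: base {base_total}, Dicephal {dicephal_total}")
print(f"Score gain: {score_gain_pct:.1f}%")
print(f"Size gain:  {size_gain_pct:.1f}%")
```

The score gain comes out at roughly 46.7% and the size gain at roughly 75.7%, in line with the rounded figures quoted with the table.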
Politiscales Test
Name                         Whacky        Left-Right
---------------------------  -----------   ------------
ChuckMcSneed Dicephal-123B   1.742262578   -0.131433424
Meta-Llama Llama-2-70b-hf    1.930293804   0.178771095
Troubleshooting Tips
Even the best models can sometimes exhibit quirks or errors. Here are some troubleshooting ideas:
- If you notice that Dicephal is producing nonsensical output, try tweaking your prompts or providing more context.
- If the model seems stubborn or unresponsive, or won’t retrieve the information you need, remember that, like the base model, it requires clever prompting.
- For ongoing improvements or questions about technical challenges, engage with the community or visit fxis.ai for collaboration opportunities.
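The prompting advice above can be sketched in code. Because Dicephal is built from a base model rather than an instruction-tuned one, it often helps to frame the request as a document for the model to continue, with a few examples in the same format. The helper `build_few_shot_prompt`, the header text, and the sample poems are hypothetical and only illustrate the pattern.

```python
def build_few_shot_prompt(examples, request, header="A collection of short poems."):
    """Frame a request as a document that a base model can simply continue."""
    parts = [header + "\n"]
    for title, text in examples:
        parts.append(f"Title: {title}\nPoem:\n{text}\n")
    # End with an open slot in the same format so the model fills it in.
    parts.append(f"Title: {request}\nPoem:\n")
    return "\n".join(parts)


# Illustrative examples only; swap in samples of the style you actually want.
examples = [
    ("Morning Frost", "Glass on the grass,\nthe sun arrives late."),
    ("Old Bridge", "Stone remembers\nevery crossing."),
]

prompt = build_few_shot_prompt(examples, "Ode to a Merged Model")
print(prompt)
```

If the output is still nonsensical, adding more examples or a more specific header gives the model more context to work from.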
Conclusion
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.

