This blog post aims to guide you through using Evolution Strategies to play Flappy Bird, a game that is simple yet challenging and serves as the perfect playground for artificial intelligence experiments.
Getting Started with Evolution Strategies
After diving deep into the fascinating world of Evolution Strategies as a Scalable Alternative to Reinforcement Learning, I was inspired to build a model to play my all-time favorite game—Flappy Bird. The beauty of this approach is that the model can learn to play quite well after just 3000 epochs. It may not be flawless, but it excels at navigating the tight spaces between walls.
Quick Overview of Evolution Strategies
Think of Evolution Strategies as a gardener who nurtures plants (the models). Instead of watering every plant (backpropagation) and keeping a detailed record of which plants grew best (recording actions), the gardener focuses on the overall garden health and simply pulls out the less healthy plants (underperforming models). This approach is memory efficient and swift, allowing your AI to develop its skills without cumbersome processes.
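The garden metaphor maps onto a surprisingly small update rule: sample a population of noisy copies of the parameters, score each copy, and nudge the parameters toward the directions that scored well. Here is a minimal sketch of one ES step; the `fitness` function below is a toy stand-in for a real game score, and all names are illustrative rather than taken from the project:

```python
import numpy as np

def es_step(params, fitness, npop=50, sigma=0.1, lr=0.01):
    """One Evolution Strategies update: sample a population of noisy
    parameter copies, score each, and move toward the better scorers."""
    noise = np.random.randn(npop, params.size)            # one perturbation per "plant"
    rewards = np.array([fitness(params + sigma * n) for n in noise])
    advantage = (rewards - rewards.mean()) / (rewards.std() + 1e-8)  # normalize scores
    grad_estimate = noise.T @ advantage / (npop * sigma)  # no backpropagation required
    return params + lr * grad_estimate

# Toy stand-in for a game score: parameters closer to `target` score higher.
target = np.array([0.5, -0.3, 0.8])
fitness = lambda p: -np.sum((p - target) ** 2)

np.random.seed(0)
params = np.zeros(3)
for _ in range(300):
    params = es_step(params, fitness)
```

Note that the loop never computes a gradient through the model itself and never stores per-step actions, which is exactly why this approach is so memory-efficient.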
Demo of the Learning Process
After 3000 epochs, here’s how the model performs:
- Before Training: 
- After Training: 
As you can see, the evolution of the model over time is fascinating to witness!
Running the Code
To get this project up and running, you will need:
- Python version 3.5
- Pip for installing dependencies
Follow these steps:
- First, install the required dependencies (you may want to create a virtual environment first): `pip install -r requirements`
- The pretrained parameters are stored in a file named `load.npy`, which is automatically used by `train.py` and `demo.py`.
- `train.py` will train the model, saving parameters in `saves/TIMESTAMP_save-ITERATION`.
- `demo.py` will display the game in a GTK window, allowing you to see how the AI plays.
- If you want to play the game yourself, just run `play.py` (space to jump, enter to restart after losing).
Pro tip: Reach a score of 100 and you will become THUG FOR LIFE!
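The checkpoint files mentioned above (`load.npy` and the files under `saves/`) are plain NumPy arrays, so the save/restore mechanism is just a round-trip through `np.save` and `np.load`. A small sketch with an illustrative filename:

```python
import numpy as np

params = np.random.randn(10)       # flattened model parameters
np.save("save-0.npy", params)      # train.py-style checkpoint (illustrative name)
restored = np.load("save-0.npy")   # demo.py-style restore
assert np.array_equal(params, restored)
```

This is also why swapping in a different parameter file is as easy as renaming it to `load.npy`.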
Troubleshooting and Additional Notes
Sometimes, prolonged training can actually degrade performance. This could be because the model has reached a local maximum of accumulated reward, where further updates are too large and lead to oscillation. A helpful strategy is to implement learning rate decay.
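One simple way to implement such a decay (a sketch, not the repository's actual code; the function name and constants are illustrative) is to shrink the learning rate exponentially with the iteration count, so that late updates stay small and oscillation is damped:

```python
def decayed_lr(initial_lr, iteration, decay_rate=0.999):
    """Exponential learning-rate decay: each iteration multiplies
    the effective rate by decay_rate, so late updates shrink."""
    return initial_lr * decay_rate ** iteration

# Early training uses (nearly) the full rate, while by iteration 3000
# the step size has shrunk to a small fraction of the original.
early = decayed_lr(0.01, 0)
late = decayed_lr(0.01, 3000)
```

Plugging `decayed_lr(lr, iteration)` in wherever the fixed learning rate is used would be the minimal change.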
If you wish to explore further, you can replace load.npy with long.npy (back up the old files first) and run demo.py. You will observe the bird struggling despite the additional training epochs, illustrating the degradation described above.
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Conclusion
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.
With the right tools and a little creativity, the possibilities for AI applications like Flappy Bird are both exciting and innovative! Happy coding!

