If you’re looking to explore the voice synthesis capabilities of the MockingBird project, you’ve come to the right place! This blog will guide you through the setup and usage of the MockingBird voice synthesis tool, with supportive tips and troubleshooting ideas to ensure your journey is smooth and enjoyable.
Features of MockingBird
- Support for Mandarin Chinese and multiple datasets like aidatatang_200zh, magicdata, aishell3, and others.
- Built on PyTorch, tested on version 1.9.0 with various compatible GPUs.
- Compatible with Windows, Linux, and M1 macOS systems.
- Webserver ready to serve results with remote calling capabilities.
Quick Start: Installation
Let’s dive into the installation process step-by-step to get you started with MockingBird.
1. Install Requirements
1.1 General Setup
Follow the original repository instructions to ensure your environment is fully prepared.
- Python: Ensure you have Python 3.7 or higher.
- Install PyTorch. If you face any issues related to torch version, switch your Python version to 3.9.
- Install FFmpeg.
- Run `pip install -r requirements.txt` to install the necessary packages.
- If you encounter any issues with requirements.txt, create a virtual environment with `conda env create -n env_name -f env.yml` and activate it using `conda activate env_name`.
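If it helps to see the whole environment setup in one place, here is a minimal sketch assuming a conda-based workflow; the environment name mockingbird and the Python 3.9 pin are illustrative choices rather than project requirements:

```bash
# Create and activate an isolated environment (Python 3.9 sidesteps torch==1.9.0 resolution issues)
conda create -n mockingbird python=3.9
conda activate mockingbird

# Install PyTorch 1.9.0 (pick the build matching your CUDA version from pytorch.org)
pip install torch==1.9.0

# FFmpeg is required for audio processing
conda install -c conda-forge ffmpeg

# Install the project dependencies from the repository root
pip install -r requirements.txt
```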
1.2 Setup on M1 Mac
M1 Mac users can follow this special setup:
- Create a Rosetta Terminal and use system Python to create a virtual environment.
- Install PyQt5 through pip in the Rosetta Terminal.
- Workarounds for other packages like pyworld and ctc-segmentation have specific installation steps described in the original documentation.
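As a rough sketch of the first two steps above (assuming Rosetta 2 is installed and that /usr/bin/python3 is the x86_64 system Python on your machine):

```bash
# Run these inside a Rosetta (x86_64) Terminal session

# Create a virtual environment with the system Python
/usr/bin/python3 -m venv mockingbird-env
source mockingbird-env/bin/activate

# PyQt5 installs via pip under Rosetta
pip install PyQt5
```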
2. Prepare Your Models
Your journey isn’t complete without the actual models. Here’s how to prepare them:
2.1 Train Encoder
- Preprocess your audio and mel spectrograms using `python encoder_preprocess.py datasets_root`.
- Run the encoder training with `python encoder_train.py my_run datasets_root/SV2TTS/encoder`.
2.2 Train Synthesizer
- Download and unzip your dataset.
- Run `python pre.py datasets_root` to preprocess it.
- Train the synthesizer with `python train.py --type=synth mandarin datasets_root/SV2TTS/synthesizer`.
2.3 Train Vocoder
- Preprocess data for the vocoder with `python vocoder_preprocess.py datasets_root -m synthesizer_model_path`.
- Train either the wavernn vocoder or the hifigan vocoder using their respective commands, as sketched below.
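A sketch of the two training calls, assuming the `vocoder_train.py` entry point and the mandarin run name used elsewhere in this guide; check the repository README for the exact arguments before training:

```bash
# Train the wavernn vocoder
python vocoder_train.py mandarin datasets_root

# Train the hifigan vocoder (same entry point, with hifigan as the final argument)
python vocoder_train.py mandarin datasets_root hifigan
```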
3. Launch The Application
Finally, let’s run MockingBird:
- To use the web server, execute `python web.py` and open your browser at http://localhost:8080.
- To run the toolbox, execute `python demo_toolbox.py -d datasets_root`.
- To generate voice from a text file, use `python gen_voice.py text_file.txt your_wav_file.wav` (see the example below).
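As a quick end-to-end example of that last command, where input.txt and reference.wav are placeholders for your own text and reference recording:

```bash
# Write the sentence you want synthesized
echo "你好，世界" > input.txt

# Generate audio in the voice of the reference recording
python gen_voice.py input.txt reference.wav
```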
Troubleshooting
Here are some common issues you may encounter and how to solve them:
- If you receive the error “Could not find a version that satisfies the requirement torch==1.9.0+cu102,” ensure you are using Python 3.9.
- For low VRAM issues during training, consider lowering the batch_size in the appropriate configuration files (a quick VRAM check is sketched after this list).
- Should you encounter a RuntimeError related to a size mismatch when loading state dictionaries, refer to the related issue on GitHub for potential resolutions.
- If you run into virtual memory errors, you might need to increase the size of your page file.
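If you are unsure how much GPU memory you actually have to work with before tuning batch_size, a quick check (assuming an NVIDIA GPU with drivers installed) is:

```bash
# Show GPU name, total memory, and memory currently in use
nvidia-smi --query-gpu=name,memory.total,memory.used --format=csv
```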
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
For more resources and community help, consult the original documentation and GitHub repository. Feel free to reach out if you have any specific questions!
Conclusion
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.