Exploring Automatic Music Transcription with Basic Pitch

Jan 15, 2021 | Data Science

Have you ever wished you could convert your favorite music into MIDI files without hours of manual work? Basic Pitch, a Python library developed by Spotify’s Audio Intelligence Lab, performs Automatic Music Transcription (AMT): it takes an audio recording and produces the notes it hears as MIDI. The library is lightweight, simple to install, and aims for note accuracy that competes with much larger AMT systems, which makes it useful for music enthusiasts and professionals alike. In this blog, we will walk you through how to set up and use Basic Pitch, along with troubleshooting tips.

Getting Started with Basic Pitch

Basic Pitch is a small but mighty library that you can easily install via pip. Here’s how to get started:

Installation:
- Open your terminal or command prompt.
- Run the command: pip install basic-pitch.
- To update Basic Pitch to the latest version, add --upgrade to the above command: pip install basic-pitch --upgrade

System Requirements:
- Compatible with macOS, Windows, and Linux.
- Supports Python versions 3.7 to 3.11. On Mac M1 hardware, only Python 3.10 is supported.
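
The requirements above can be checked before installing. The following is a minimal sketch, not part of Basic Pitch: the function name check_environment is hypothetical, and it assumes an Apple Silicon Mac reports itself as Darwin/arm64 (a Python running under Rosetta would report x86_64 instead).

```python
import platform
import sys

# Version limits taken from the requirements listed above.
MIN_PYTHON = (3, 7)
MAX_PYTHON = (3, 11)
APPLE_SILICON_PYTHON = (3, 10)


def check_environment(version, system, machine):
    """Return a list of problems; an empty list means nothing was flagged.

    Hypothetical helper for illustration, not a Basic Pitch function.
    """
    problems = []
    major_minor = tuple(version[:2])
    if not (MIN_PYTHON <= major_minor <= MAX_PYTHON):
        problems.append(
            f"Python {major_minor[0]}.{major_minor[1]} is outside the 3.7 to 3.11 range"
        )
    # Assumption: Apple Silicon shows up as system "Darwin", machine "arm64".
    if system == "Darwin" and machine == "arm64" and major_minor != APPLE_SILICON_PYTHON:
        problems.append("Mac M1 hardware is supported only on Python 3.10")
    return problems


issues = check_environment(sys.version_info, platform.system(), platform.machine())
print("No problems found" if not issues else "\n".join(issues))
```

Passing the version and platform in as arguments, rather than reading them inside the function, makes it easy to try other combinations by hand.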

Using Basic Pitch

Once installed, using Basic Pitch is a straightforward process. Let’s say you have an audio file of your favorite song and you want it transcribed to a MIDI file.

Model Prediction

Before it can transcribe anything, Basic Pitch has to load a model:

- It tries the available model formats in a fixed order: TensorFlow, CoreML, TensorFlowLite, and ONNX.
- The module variable ICASSP_2022_MODEL_PATH defaults to the first model in that order that is available.
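
To see which backend that order would probably pick on your machine, you can probe for the corresponding packages. This is a rough sketch rather than the library's own loading code: the import names paired with each backend are assumptions, and first_available_backend is a hypothetical helper.

```python
import importlib.util

# Backends in the load order described above. The import name paired with
# each backend is an assumption for illustration, not taken from the post.
BACKEND_MODULES = [
    ("TensorFlow", "tensorflow"),
    ("CoreML", "coremltools"),
    ("TensorFlowLite", "tflite_runtime"),
    ("ONNX", "onnxruntime"),
]


def is_installed(module_name):
    """Return True if the top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


def first_available_backend(is_available=is_installed):
    """Return the first backend whose module is available, or None if none are."""
    for backend, module_name in BACKEND_MODULES:
        if is_available(module_name):
            return backend
    return None


print("First available backend:", first_available_backend())
```

If this prints None, none of the four packages was found, which may explain a failure to load a model.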

Command Line Tool

For quick use, Basic Pitch provides a command line tool. You can transcribe an audio file using the following command:

basic-pitch <output-directory> <input-audio-path>

For instance:

basic-pitch ./output my_audio_file.wav

You can also pass several input files in a single call:

basic-pitch ./output my_audio_file.wav another_audio_file.wav

Understanding the Output

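The command writes a MIDI file to the specified output directory. Optional flags save additional outputs: --sonify-midi adds a .wav rendering of the MIDI file, --save-model-outputs adds the raw model outputs as an NPZ file, and --save-note-events adds the predicted note events as a CSV file. Run basic-pitch --help to see the options available in your installed version.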
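
If you want to drive the command line tool from a script, the call can be assembled in Python. The sketch below is one possible wrapper under stated assumptions: build_command and transcribe are hypothetical names, the flag names follow the project's README at the time of writing and should be checked against basic-pitch --help, and creating the output directory first is a precaution rather than a documented requirement.

```python
import shutil
import subprocess
from pathlib import Path


def build_command(output_dir, audio_paths, sonify_midi=False,
                  save_model_outputs=False, save_note_events=False):
    """Build the argument list: output directory first, then the input files."""
    if not audio_paths:
        raise ValueError("at least one input audio path is required")
    command = ["basic-pitch", str(output_dir), *map(str, audio_paths)]
    # Flag names are assumptions based on the README; verify with --help.
    if sonify_midi:
        command.append("--sonify-midi")
    if save_model_outputs:
        command.append("--save-model-outputs")
    if save_note_events:
        command.append("--save-note-events")
    return command


def transcribe(output_dir, audio_paths, **options):
    """Run the CLI if it is on PATH. Returns False when it is not installed."""
    if shutil.which("basic-pitch") is None:
        return False
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_command(output_dir, audio_paths, **options), check=True)
    return True


# Only prints the command; call transcribe(...) to actually run it.
print(build_command("./output", ["my_audio_file.wav", "another_audio_file.wav"],
                    save_note_events=True))
```

Keeping command construction separate from execution lets you inspect or log the exact call before any audio is processed.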

Example Code in Your Project

If you want to use Basic Pitch programmatically, you can leverage the Python API. Here’s a simplified way to get started:

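from basic_pitch.inference import predict

# Replace "input-audio-path" with the path to your audio file.
# predict returns the raw model output, the transcribed MIDI data,
# and a list of detected note events.
model_output, midi_data, note_events = predict("input-audio-path")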
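
Once you have note_events, a common next step is to print them in a readable form. The sketch below assumes each event starts with (start time in seconds, end time in seconds, MIDI pitch number) and ignores any further fields; that layout is an assumption you should confirm against your installed version. The sample events are made up so the code runs without the library.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def midi_to_name(pitch):
    """Convert a MIDI pitch number to scientific pitch notation (60 is C4)."""
    octave = pitch // 12 - 1
    return f"{NOTE_NAMES[pitch % 12]}{octave}"


def summarize(note_events):
    """Return (start, duration, note name) rows sorted by start time.

    Assumes each event begins with (start_s, end_s, midi_pitch);
    any remaining fields are ignored.
    """
    rows = []
    for start, end, pitch, *_ in note_events:
        rows.append((start, round(end - start, 3), midi_to_name(pitch)))
    return sorted(rows)


# Hypothetical events standing in for the note_events returned by predict().
sample_events = [
    (0.50, 1.00, 64, 0.8),
    (0.00, 0.50, 60, 0.9),
    (1.00, 2.00, 69, 0.7),
]

for start, duration, name in summarize(sample_events):
    print(f"{start:5.2f}s  {duration:5.2f}s  {name}")
```

With real output you would pass note_events in place of sample_events. To the best of our knowledge midi_data is a pretty_midi PrettyMIDI object, in which case midi_data.write("output.mid") should save the transcription to disk.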

Imagine Basic Pitch as a skilled musician who listens to your song (input audio), works out which notes are being played, and writes them down as a score (MIDI file). This musician can pick out individual notes even when several sound at once, as in a chord, and is not tied to any particular instrument, although the results tend to be best when the recording contains one instrument at a time.

Troubleshooting Common Issues

Here are some troubleshooting ideas that might help you resolve common issues:

- If you encounter installation errors, make sure you are using a supported version of Python and that pip is up to date.
- For issues with audio processing, confirm that your audio file is in a supported format (.mp3, .wav, etc.).
- If processing is slow, consider breaking larger files into smaller segments.
- Still stuck? For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
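
The format and file-size tips above can be turned into a small pre-flight check. This is an illustrative sketch only: the helper names are hypothetical, the extension list is an assumption drawn from the project's README rather than this post, and the segment planner only computes time windows; it does not cut any audio.

```python
from pathlib import Path

# Assumed set of supported extensions (from the project's README, not this post).
SUPPORTED_EXTENSIONS = {".mp3", ".ogg", ".wav", ".flac", ".m4a"}


def has_supported_extension(path):
    """Check the file extension only; the file contents are not inspected."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def plan_segments(total_seconds, segment_seconds):
    """Split the range 0..total_seconds into consecutive (start, end) windows."""
    if total_seconds < 0 or segment_seconds <= 0:
        raise ValueError("total_seconds must be >= 0 and segment_seconds > 0")
    segments = []
    start = 0.0
    while start < total_seconds:
        end = min(start + segment_seconds, total_seconds)
        segments.append((start, end))
        start = end
    return segments


print(has_supported_extension("my_audio_file.wav"))
print(plan_segments(150, 60))
```

Hard cuts at segment boundaries can split a note that is still sounding, so you may want to cut at silences or overlap the windows slightly and merge the results afterwards.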

Conclusion

Basic Pitch is a powerful yet simple tool for music transcription. At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.
