Welcome to the world of seamless audio and video transcription! Today, we’re diving into generating transcripts for your media content with Whisper AI. With automatic translations powered by LibreTranslate, you can easily make your content more accessible. Let’s walk through installation, GPU setup, and some troubleshooting tips to ensure smooth sailing!
Installation Steps
Before diving into the code, let’s prepare your environment:
- First, we need to install Whisper AI, which lays the groundwork for all our transcription needs. Follow the official Whisper installation instructions.
- Once Whisper is up and running, you can start your web server. Ensure you have Node.js version 14 or above.
- If you haven’t installed Node 14, you can do so using nvm:
# Install nvm
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.2/install.sh | bash
# Setup nvm
export NVM_DIR="$HOME/.nvm"
[ -s "$NVM_DIR/nvm.sh" ] && \. "$NVM_DIR/nvm.sh" # This loads nvm
[ -s "$NVM_DIR/bash_completion" ] && \. "$NVM_DIR/bash_completion" # This loads nvm bash_completion
nvm install 14
nvm use 14
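To confirm that the active Node.js meets the "14 or above" requirement, a small check along these lines may help. This is a minimal sketch: `node_version_ok` is a hypothetical helper name, and it assumes `node -v` prints a string of the form `v14.21.3`.

```shell
# Return success when a version string such as "v14.21.3" has a major
# version greater than or equal to the required minimum.
# $1 = version string, $2 = minimum major version
node_version_ok() {
  version="${1#v}"          # strip the leading "v"
  major="${version%%.*}"    # keep everything before the first dot
  [ "$major" -ge "$2" ]
}

if command -v node >/dev/null 2>&1; then
  if node_version_ok "$(node -v)" 14; then
    echo "Node.js $(node -v) is new enough"
  else
    echo "Node.js $(node -v) is older than 14; run: nvm install 14"
  fi
else
  echo "node not found on PATH"
fi
```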
Now that Node.js is installed, install yt-dlp:
sudo curl -L https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp -o /usr/local/bin/yt-dlp # Download yt-dlp
sudo chmod a+rx /usr/local/bin/yt-dlp # Make executable
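Before moving on, it can be worth confirming that every prerequisite is reachable from your shell. The sketch below uses a hypothetical helper, `missing_tools`, and assumes that Whisper was installed in a way that provides a `whisper` command (as the standard pip install does); adjust the list if your setup differs.

```shell
# Print each command name that cannot be found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

missing="$(missing_tools whisper node npm yt-dlp git)"
if [ -n "$missing" ]; then
  echo "Not found on PATH:"
  echo "$missing"
else
  echo "All prerequisites found"
fi
```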
Finally, let’s clone the project and start the server:
git clone https://github.com/mayeaux/generate-subtitles
cd generate-subtitles
npm install
npm start
Your server should now be active at localhost:3000. Just open a browser and head to that URL to start using the app!
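If you would rather check from the terminal than from a browser, a polling loop like the one below is one option. It is a sketch under assumptions: `wait_for_url` is a hypothetical helper, and it assumes the app answers a plain GET on its root path with a success status.

```shell
# Poll a URL until it answers successfully or the attempts run out.
# $1 = URL, $2 = number of attempts (about one second apart)
wait_for_url() {
  attempt=1
  while [ "$attempt" -le "$2" ]; do
    # --silent hides progress output; --fail turns HTTP errors (>= 400)
    # into a non-zero exit status; --max-time caps each request at 5 seconds
    if curl --silent --fail --output /dev/null --max-time 5 "$1"; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep 1
  done
  return 1
}

# Usage, once `npm start` is running in another terminal:
#   wait_for_url http://localhost:3000 30 && echo "server is up"
```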
Using a GPU Cloud Provider
If you want to leverage a GPU for faster transcriptions, consider renting a GPU server from a cloud provider like VastAI. If you’re unfamiliar, think of it as renting a high-performance sports car for a quick trip instead of driving your economy car. This will significantly speed up the transcription process. You can use this referral link to save 2.5% on your purchase.
To set things up on Vast, you can utilize the following script:
Vast server setup script. Keep in mind, this script is a work in progress but contains the key components you need.
Configuration Instructions for Vast Server
When setting up the Vast server, make sure to open the necessary ports:
-p 8081:8081 -p 8080:8080 -p 80:80 -p 443:443 -p 3000:3000 -p 5000:5000
After hitting select and saving your settings, the instance should have the appropriate ports open for access to the web app.
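The options above follow the Docker-style `-p host:container` pattern, with each port mapped to itself. If you change the port list often, a tiny helper can generate the string for you. This is only a convenience sketch; `port_flags` is a hypothetical name and not part of the project.

```shell
# Build Docker-style "-p host:container" options that map each port to itself.
port_flags() {
  flags=""
  for port in "$@"; do
    flags="$flags -p $port:$port"
  done
  # Drop the leading space before printing
  echo "${flags# }"
}

port_flags 8081 8080 80 443 3000 5000
```

Run as written, this prints the same option string shown above, ready to paste into the instance configuration.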
Understanding the Code: A Simple Analogy
Imagine you’re organizing a big event, and you need to set the stage (installation) before the guests (code) can come in and enjoy (run the application). You have a blueprint (instructions) to build your stage, and once you gather all materials (install the tools like Node.js, Whisper, and yt-dlp), your venue (server) is ready.
In this analogy, Whisper AI is your sound system, making sure your performers (audio/video) are transformed into clear announcements (transcriptions). On the other hand, LibreTranslate acts as your interpreter, converting those announcements into different languages so everyone at your event can understand. Just ensure the setup is done well to accommodate your guests efficiently!
Troubleshooting Tips
If you run into issues during setup, here are some troubleshooting ideas:
- Verify that Node.js is properly installed by running node -v in your terminal. You should see the version number.
- If the app does not start, double-check your steps to ensure all required packages are installed correctly.
- For issues related to GPU performance, ensure you are using the correct drivers and that your server is configured to utilize GPU resources.
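For the GPU item, a quick first check is whether the NVIDIA driver is visible at all. The sketch below assumes an NVIDIA card and relies on the `nvidia-smi` utility that ships with the driver; `gpu_status` is a hypothetical helper name.

```shell
# Report whether an NVIDIA driver is visible on this machine.
gpu_status() {
  if ! command -v nvidia-smi >/dev/null 2>&1; then
    echo "nvidia-smi not found: NVIDIA driver missing or not on PATH"
    return 1
  fi
  # List each GPU's name and total memory, one GPU per line
  nvidia-smi --query-gpu=name,memory.total --format=csv,noheader
}

gpu_status || echo "Transcription will likely fall back to the CPU"
```

If the driver is visible but transcription is still slow, the next thing to check is whether the PyTorch build that Whisper uses was installed with CUDA support.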
Conclusion
Now that you have a comprehensive understanding of generating subtitles using Whisper AI and LibreTranslate, you can enhance the accessibility of your media content! Remember to keep experimenting and refining your process.

