Say hello to your new personal assistant – GPT Automator! This revolutionary voice-controlled tool lets you perform various tasks on your Mac, all with the simple sound of your voice. Whether you want to open applications, find nearby restaurants, or even synthesize information, GPT Automator is here to help!
What You Need to Get Started
Before you dive in, ensure you have the following requirements installed on your system:
- FFmpeg (the multimedia framework necessary for audio processing)
- Python and its dependencies listed in requirements.txt or pyproject.toml
Installation Guide
Follow these steps to install GPT Automator:
- Install FFmpeg according to your operating system:
  - On Ubuntu or Debian: sudo apt update and sudo apt install ffmpeg
  - On Arch Linux: sudo pacman -S ffmpeg
  - On macOS using Homebrew: brew install ffmpeg
  - On Windows using Chocolatey: choco install ffmpeg
  - On Windows using Scoop: scoop install ffmpeg
- Create a .env file from the .env.example template and fill in your OpenAI API key.
- Run python gui.py to launch the graphical user interface (GUI), where you’ll click “Record” to dictate your prompt. Alternatively, for command-line interface (CLI) users, you can use python main.py [prompt].
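To confirm that the .env step worked before launching the app, a small check like the one below can help. It is only a sketch: the variable name OPENAI_API_KEY is an assumption (check .env.example for the exact name the project expects), and the helper names parse_env and has_api_key are made up for illustration. The project itself may load the file differently.

```python
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_api_key(env_path: str = ".env", name: str = "OPENAI_API_KEY") -> bool:
    """Return True if env_path exists and defines a non-empty value for name.

    The default name is an assumption, not taken from the project.
    """
    path = Path(env_path)
    if not path.is_file():
        return False
    return bool(parse_env(path.read_text(encoding="utf-8")).get(name))


if __name__ == "__main__":
    print("API key found" if has_api_key() else "API key missing from .env")
```

Run it from the project folder, next to the .env file. It only checks that a value is present, not that the key is valid.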
How GPT Automator Works
Think of GPT Automator as a smart butler ready to execute your requests with finesse. Here’s how it processes your voice commands, explained with an analogy:
Imagine you ask your butler to fetch a specific book from a large library filled with thousands of titles. The butler listens to your request (GPT Automator converts your audio input to text using OpenAI’s Whisper), understands the task, and then dashes to the right section of the library (using a LangChain Agent to identify actions necessary for your request). Once at the shelf, the butler selects the book you asked for and hands it to you (it generates and executes appropriate AppleScript or JavaScript commands based on your request). Easy, right?
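The three stages in the analogy (listen, decide, act) can be sketched in Python as follows. This is not the project's actual code. The transcribe and plan callables are hypothetical stand-ins for the Whisper and LangChain steps, and every function name here is invented for illustration. The one real external piece is osascript, the macOS command-line tool that runs AppleScript or JavaScript for Automation.

```python
import subprocess
from typing import Callable


def osascript_command(script: str, language: str = "AppleScript") -> list[str]:
    """Build the argument list for osascript; -l selects the script language."""
    if language not in ("AppleScript", "JavaScript"):
        raise ValueError(f"unsupported language: {language}")
    return ["osascript", "-l", language, "-e", script]


def run_script(script: str, language: str = "AppleScript") -> str:
    """Run the script with osascript (macOS only) and return its output."""
    result = subprocess.run(
        osascript_command(script, language),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def handle_request(
    audio_path: str,
    transcribe: Callable[[str], str],        # hypothetical: audio file -> text
    plan: Callable[[str], tuple[str, str]],  # hypothetical: text -> (script, language)
    execute: Callable[[str, str], str] = run_script,
) -> str:
    """Listen, decide, act: the three stages from the butler analogy."""
    prompt = transcribe(audio_path)
    script, language = plan(prompt)
    return execute(script, language)
```

Passing the stages in as callables keeps the sketch runnable without an API key, and it makes the risky part easy to see: whatever script the planning step returns is executed as-is.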
Example Prompts to Try
Here are a few commands you can experiment with:
- Math Queries: “What is 2 + 2?” – This will open your calculator and input the result.
- Finding Food: “Find restaurants near me” – The assistant will perform a Google search and read out the best-rated places.
- Playing Games: “Play a game of chess” – This will open Chess.com and set up a game for you.
Troubleshooting Tips
If you run into any issues while using GPT Automator, try the following troubleshooting ideas:
- Ensure you have installed all dependencies correctly.
- Check that your OpenAI API key is valid and correctly placed in the .env file.
- Verify that Python and FFmpeg are up to date.
- For persistent issues, consult the official documentation linked in the Learn More section below.
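Some of these checks can be automated. The sketch below reports whether FFmpeg is reachable on your PATH and whether the Python interpreter meets a minimum version. The (3, 8) default is an assumption rather than a documented requirement of the project, and check_environment is a name made up for this example.

```python
import shutil
import sys


def check_environment(min_python: tuple[int, int] = (3, 8)) -> list[str]:
    """Return a list of human-readable problems; an empty list means all good.

    min_python is an assumed threshold, not a documented project requirement.
    """
    problems: list[str] = []
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg was not found on PATH")
    if sys.version_info[:2] < min_python:
        found = ".".join(str(part) for part in sys.version_info[:2])
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {found} is older than {wanted}")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("Problem:", problem)
```

It prints nothing when both checks pass.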
Learn More
Dive deeper into the workings of GPT Automator by checking out these blog posts:
Important Considerations
A word of caution – this project generates code from your natural language commands and might be susceptible to prompt injection attacks. This implementation is primarily a proof-of-concept and is not intended for production use.