How to Create Your Own Local Voice Chatbot with June

Aug 3, 2024 | Educational

Are you interested in combining language models, speech recognition, and text-to-speech capabilities to create a powerful voice chatbot? Look no further! In this blog, we will explore how to set up and use **June**, a local voice chatbot built on Ollama, Hugging Face Transformers, and the Coqui TTS Toolkit. Let’s take the plunge and begin our journey into voice-assisted interactions that prioritize your privacy!

Overview

**June** combines Ollama for language generation, Hugging Face Transformers for speech recognition, and the Coqui TTS Toolkit for speech synthesis. The prime advantage of June is that it runs entirely locally, meaning your data stays private and is never transmitted to external servers.

Demo of text and voice interaction with June

Interaction Modes

  • Text Input/Output: Input text and get text responses.
  • Voice Input/Text Output: Speak your request and receive text-based responses.
  • Text Input/Audio Output: Provide text and get synthesized audio responses.
  • Voice Input/Audio Output (Default): Speak your inputs and get responses in both text and audio form.
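
Which mode you get is controlled by the configuration file described under Customization below: disabling a component switches off the corresponding channel. As a minimal sketch, assuming the tts section can be set to null the same way the Customization section later sets stt to null, voice input with text-only output would look like this:

{
    "tts": null
}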

Installation

Before embarking on this installation journey, ensure you meet the necessary prerequisites:

Prerequisites

  • Ollama
  • Python 3.10 or greater (with pip)
  • Python development package (for GNU/Linux only)
  • PortAudio development package (for GNU/Linux only)
  • PortAudio (for macOS only)
  • Microsoft Visual C++ 14.0 or greater (for Windows only)
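
Package names vary by platform. As an example, assuming a Debian/Ubuntu system and Homebrew on macOS (adjust for your own distribution and package manager), the development packages can be installed like this:

# Debian/Ubuntu (GNU/Linux)
sudo apt install python3-dev portaudio19-dev

# macOS
brew install portaudio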

From Source

  • Method 1: Direct Installation
    pip install git+https://github.com/mezbaul-h/june.git@master
  • Method 2: Clone and Install
    git clone https://github.com/mezbaul-h/june.git
    cd june
    pip install .
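
Either method installs June into the active Python environment. To keep its dependencies isolated, you can create a virtual environment first; this is standard Python practice rather than anything June-specific:

python3 -m venv .venv
source .venv/bin/activate
pip install git+https://github.com/mezbaul-h/june.git@master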

Usage

Once the installation is complete, it’s time to pull the language model (default: llama3.1:8b-instruct-q4_0) using Ollama:

ollama pull llama3.1:8b-instruct-q4_0
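
The quantized 8B model is a multi-gigabyte download, so this step may take a while. You can confirm the model is available locally afterwards with:

ollama list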

Next, run the program with the default configuration:

june-va

This will use llama3.1:8b-instruct-q4_0 as the LLM, openai/whisper-small.en for speech recognition, and tts_models/en/ljspeech/glow-tts for audio synthesis.

Customization is also possible by using a JSON configuration file:

june-va --config path/to/config.json

Customization

You can tailor the application to your liking by employing a configuration file. This file needs to be in JSON format. Here’s a glimpse of the default configuration:

{
    "llm": {
        "disable_chat_history": false,
        "model": "llama3.1:8b-instruct-q4_0"
    },
    "stt": {
        "device": "torch device identifier (cuda if available; otherwise cpu)",
        "generation_args": {
            "batch_size": 8
        },
        "model": "openai/whisper-small.en"
    },
    "tts": {
        "device": "torch device identifier (cuda if available; otherwise cpu)",
        "model": "tts_models/en/ljspeech/glow-tts"
    }
}
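
Note that the device values shown above are descriptions of the default behavior rather than literal strings: June selects cuda when a GPU is available and falls back to cpu otherwise. As a minimal sketch, assuming partial overrides merge with the defaults as described below, pinning both components to the CPU would look like this:

{
    "stt": {
        "device": "cpu"
    },
    "tts": {
        "device": "cpu"
    }
}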

The configuration can be modified partially; any field you omit keeps its default value. For instance, if you want to run the assistant without speech recognition (text input only), you could use the following configuration:

{
    "stt": null
}
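
The same pattern works for swapping models. For instance, to trade speed for transcription accuracy, you could point stt at the larger openai/whisper-medium.en checkpoint from Hugging Face (assuming your hardware has the memory for it):

{
    "stt": {
        "model": "openai/whisper-medium.en"
    }
}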

Frequently Asked Questions

Q: How does the voice input work?

When you see the message “[system] Listening for sound…”, simply speak into your microphone without any wake command. The tool will automatically detect your voice input. Maintain silence for 3 seconds after speaking to allow processing.

Q: Can I clone a voice?

Yes, many models supported by the Coqui TTS Toolkit allow voice cloning. You’ll need to provide a short audio clip (about a minute) of the target voice to serve as the speaker profile.

Q: Can I use a remote Ollama instance with June?

Certainly! Set the OLLAMA_HOST environment variable to the URL of the Ollama instance you want to use and run the program as usual:

OLLAMA_HOST=http://localhost:11434 june-va
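
The address above happens to point at a local instance; for a genuinely remote server, substitute its address (the host below is only a placeholder):

OLLAMA_HOST=http://192.168.50.10:11434 june-va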

Troubleshooting

If you encounter challenges during installation or usage, here are some troubleshooting tips to consider:

  • Ensure all prerequisites, especially Python and the necessary libraries, are properly installed.
  • Check for correct configuration in your JSON file, ensuring that all necessary fields are included.
  • Consult the logs for any specific error messages to guide your troubleshooting process.

For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.

