Voice Chat AI

Voice Chat AI 🎙️

Voice Chat AI is a project that allows you to interact with different AI characters using speech. You can choose between various characters, each with unique personalities and voices. Have a serious conversation with Albert Einstein or role play with the OS from the movie HER.

You can run everything locally, use OpenAI for both chat and voice, or mix the two. For example, you can use ElevenLabs voices with Ollama models, all controlled from a Web UI. Supported chat providers include Anthropic, xAI, Ollama, and OpenAI.

With OpenAI's WebRTC Realtime API, you can have a real-time conversation, interrupt the AI, and get instant responses. You can also use OpenAI's new TTS model, gpt-4o-mini-tts, to make the AI sound more human-like, with emotions and expressive voices.

Features

  • Supports OpenAI, xAI, Anthropic, or Ollama language models: Choose the model that best fits your needs.
  • Provides text-to-speech synthesis using XTTS, OpenAI TTS, ElevenLabs, or Kokoro TTS: Enjoy natural and expressive voices.
  • Provides speech-to-speech using the OpenAI Realtime API: Have a real-time conversation with AI characters, interrupt the AI, and get instant responses.
  • OpenAI Enhanced Mode TTS model: Uses emotions and prompts to make the AI sound more human-like.
  • Flexible transcription options: Uses OpenAI transcription by default, with the option to use local Faster Whisper.
  • Analyzes user mood and adjusts AI responses accordingly: Get personalized responses based on your mood, as detected by sentiment analysis.
  • Web UI or terminal usage: Run it with your preferred method. The Web UI is recommended, since you can change characters, model providers, speech providers, voices, etc. on the fly.
  • HUGE selection of built-in characters: Talk with the funniest and most insane AI characters! Play escape room games, follow story lines, and more.
  • Interactive Games & Stories: Enjoy 15+ different game types (word puzzles, trivia, escape rooms) and interactive storytelling adventures.
  • Docker support: Use the prebuilt image from Docker Hub or build your own image, with or without NVIDIA CUDA. It can also run on CPU only.
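
To make the mood-aware feature a little more concrete, here is a minimal sketch of how a mood score could be turned into an extra hint for the character's system prompt. This is not the project's actual implementation: the word lists, the 0.3 thresholds, and the names `mood_score`, `mood_hint`, and `build_system_prompt` are all assumptions made up for illustration, and a real setup would more likely use a proper sentiment analysis library than a hand-written word list.

```python
# Hypothetical sketch only: the word lists, thresholds, and function names
# are illustrative assumptions, not taken from the Voice Chat AI code base.
import re

POSITIVE = {"great", "love", "happy", "awesome", "thanks", "fun"}
NEGATIVE = {"sad", "angry", "hate", "terrible", "tired", "annoyed"}


def mood_score(text: str) -> float:
    """Return a score in [-1, 1]; below 0 leans negative, above 0 positive."""
    words = re.findall(r"[a-z']+", text.lower())
    # Each positive word counts +1, each negative word counts -1.
    hits = [1 for w in words if w in POSITIVE] + [-1 for w in words if w in NEGATIVE]
    return sum(hits) / len(hits) if hits else 0.0


def mood_hint(score: float) -> str:
    """Map a mood score to a short instruction for the language model."""
    if score > 0.3:
        return "The user seems upbeat; match their energy."
    if score < -0.3:
        return "The user seems down; respond with extra empathy."
    return "The user seems neutral; respond normally."


def build_system_prompt(character_prompt: str, user_text: str) -> str:
    """Append a mood hint to the character's base prompt."""
    return f"{character_prompt}\n\n{mood_hint(mood_score(user_text))}"


if __name__ == "__main__":
    print(build_system_prompt("You are Albert Einstein.", "I am so tired and annoyed today"))
```

The idea is simply that the character prompt stays the same while a short, mood-dependent instruction is appended before each reply, so the same character can respond differently to a cheerful or a frustrated user.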

Installation

Requirements

  • Python 3.10
  • ffmpeg
  • Ollama models, OpenAI, xAI, or Anthropic for chat
  • Local XTTS, OpenAI API, ElevenLabs API, or Kokoro TTS for speech
  • Microsoft C++ Build Tools on Windows
  • Microphone
  • A sense of humor
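
Before installing, it can help to confirm the two requirements that are easy to check from a script: the Python version and ffmpeg. The following is a small sketch, not part of the project; the function name `check_requirements` and its parameters are assumptions for illustration. It only warns when the interpreter is not 3.10, since the list above names that version and does not say whether others work.

```python
# Hypothetical preflight check; not shipped with Voice Chat AI.
import shutil
import sys


def check_requirements(python_version=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means all good.

    python_version and which are injectable so the check can be tested
    without changing the real interpreter or PATH.
    """
    version = tuple(python_version or sys.version_info[:2])
    problems = []
    if version != (3, 10):
        problems.append(
            f"Python 3.10 is the listed requirement, found {version[0]}.{version[1]}"
        )
    # shutil.which returns None when the executable is not on PATH.
    if which("ffmpeg") is None:
        problems.append("ffmpeg was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print("WARNING:", problem)
```

Running the script prints nothing when both checks pass. API keys, a microphone, and the Windows build tools still need to be verified by hand.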

I built this AI speech app so that you can use it either in the terminal or in a Web UI. You can talk with any character you create, and it also comes with some characters that I have made. It has stories and games you can follow, too. Check out the GitHub repo for all the details.

More details here: voice-chat-ai-github

Watch the Demos

Watch the video

Click on the thumbnail to open the video☝️

GPU - 100% local - Ollama llama3, XTTS-v2

Watch the video

Click on the thumbnail to open the video☝️


CPU-only mode, CLI

Alien conversation using OpenAI gpt-4o for chat and OpenAI speech for TTS.

Watch the video

Click on the thumbnail to open the video☝️


This post is licensed under CC BY 4.0 by the author.