VoiceMode
Install via: `uv tool install voice-mode` | getvoicemode.com
Natural voice conversations for AI assistants. VoiceMode brings human-like voice interactions to Claude Code and other AI code editors through the Model Context Protocol (MCP).
🖥️ Compatibility
Runs on: Linux • macOS • Windows (WSL) • NixOS | Python: 3.10+
✨ Features
- 🎙️ Natural Voice Conversations with Claude Code - ask questions and hear responses
- 🗣️ Supports local voice models - works with any OpenAI API-compatible STT/TTS service
- ⚡ Real-time - low-latency voice interactions with automatic transport selection
- 🔧 MCP Integration - seamless with Claude Code (and other MCP clients)
- 🎯 Silence detection - automatically stops recording when you stop speaking (no more waiting!)
- 🔄 Multiple transports - local microphone or LiveKit room-based communication
🎯 Simple Requirements
All you need to get started:
- 🎤 Computer with microphone and speakers
- 🔑 OpenAI API Key (optional) - VoiceMode can install free, open-source transcription and text-to-speech services locally
Optional for enhanced performance:
- 🍎 Xcode (macOS only) - Required for Core ML acceleration of Whisper models (2-3x faster inference). Install from the Mac App Store, then run `sudo xcode-select -s /Applications/Xcode.app/Contents/Developer`
Quick Start
Automatic Installation (Recommended)
Install Claude Code with VoiceMode configured and ready to run on Linux, macOS, and Windows WSL:
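The exact one-liner is in the Getting Started Guide; as a sketch, assuming the installer script is served from getvoicemode.com:

```bash
# Assumed installer location - verify the exact URL in the Getting Started Guide
curl -O https://getvoicemode.com/install.sh && bash install.sh
```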
This installer will:
- Install all system dependencies (Node.js, audio libraries, etc.)
- Install Claude Code if not already installed
- Configure VoiceMode as an MCP server
- Set up your system for voice conversations
- Offer to install free local STT/TTS services if no API key is provided
Manual Installation
For manual setup steps, see the Getting Started Guide.
🎬 Demo
Watch VoiceMode in action with Claude Code:
The `converse` function makes voice interactions natural - it automatically waits for your response by default, creating a real conversation flow.
Installation
Prerequisites
- Python >= 3.10
- Astral UV - Package manager (install with `curl -LsSf https://astral.sh/uv/install.sh | sh`)
- OpenAI API Key (or compatible service)
System Dependencies
Note for WSL2 users: WSL2 requires additional audio packages (pulseaudio, libasound2-plugins) for microphone access. Follow the Ubuntu/Debian instructions within WSL; a sketch follows.
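As a sketch of a typical Debian/Ubuntu package set (pulseaudio and libasound2-plugins are named above; the other packages are assumptions commonly needed for recording and playback):

```bash
# pulseaudio and libasound2-plugins are required for WSL2 microphone access;
# portaudio19-dev and ffmpeg are assumed extras for audio capture and playback
sudo apt update
sudo apt install -y pulseaudio libasound2-plugins portaudio19-dev ffmpeg
```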
VoiceMode includes a flake.nix with all required dependencies. You can either:
- Use the development shell (temporary) - see the sketch after this list
- Install system-wide (see Installation section below)
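A minimal sketch, assuming you are in a checkout of the repository containing flake.nix:

```bash
# Enter a temporary shell with all dependencies from flake.nix
nix develop
```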
Quick Install
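Using uv (the same command shown at the top of this README):

```bash
uv tool install voice-mode
```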
Configuration for AI Coding Assistants
📖 Looking for detailed setup instructions? Check our comprehensive Getting Started Guide for step-by-step instructions!
Below are quick configuration snippets. For full installation and setup instructions, see the Getting Started Guide above.
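As a sketch for Claude Code, assuming the server is launched with uvx (adjust the command for other MCP clients):

```bash
# Register VoiceMode as an MCP server in Claude Code
claude mcp add voice-mode -- uvx voice-mode
```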
Or with environment variables:
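A sketch using Claude Code's `-e` flag to set the key for the server process (the value is a placeholder):

```bash
# Pass the OpenAI API key as a server environment variable
claude mcp add voice-mode -e OPENAI_API_KEY=your-openai-key -- uvx voice-mode
```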
Alternative Installation Options
1. Install with nix profile (user-wide):
2. Add to NixOS configuration (system-wide):
3. Add to home-manager:
4. Run without installing:
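The original snippets for these options are not reproduced here; as a sketch for options 1 and 4, assuming the flake lives at github.com/mbailey/voicemode (options 2 and 3 instead add the flake to your NixOS or home-manager configuration):

```bash
# 1. Install user-wide with nix profile (flake reference is an assumption)
nix profile install github:mbailey/voicemode

# 4. Run without installing
nix run github:mbailey/voicemode
```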
Configuration
- 📖 Getting Started - Step-by-step setup guide
- 🔧 Configuration Reference - All environment variables
Quick Setup
The only required configuration is your OpenAI API key:
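For example, set it in your shell profile (the value is a placeholder):

```bash
# Required unless you use local STT/TTS services
export OPENAI_API_KEY="your-openai-key"
```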
Local STT/TTS Services
For privacy-focused or offline usage, VoiceMode supports local speech services:
- Whisper.cpp - Local speech-to-text with OpenAI-compatible API
- Kokoro - Local text-to-speech with multiple voice options
These services provide the same API interface as OpenAI, allowing seamless switching between cloud and local processing.
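A sketch of pointing VoiceMode at local endpoints; the variable names and ports shown here are assumptions, so check the Configuration Reference for the exact settings:

```bash
# Assumed variable names and default ports - verify in the Configuration Reference
export VOICEMODE_STT_BASE_URLS="http://127.0.0.1:2022/v1"   # whisper.cpp
export VOICEMODE_TTS_BASE_URLS="http://127.0.0.1:8880/v1"   # Kokoro
```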
Troubleshooting
Common Issues
- No microphone access: Check system permissions for terminal/application
- WSL2 Users: Additional audio packages (pulseaudio, libasound2-plugins) required for microphone access
- UV not found: Install with `curl -LsSf https://astral.sh/uv/install.sh | sh`
- OpenAI API error: Verify your `OPENAI_API_KEY` is set correctly
- No audio output: Check system audio settings and available devices
Audio Saving
To save all audio files (both TTS output and STT input):
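A sketch, assuming a single enable flag (the exact variable name may differ; see the Configuration Reference):

```bash
# Assumed setting - persist both TTS output and STT input recordings
export VOICEMODE_SAVE_AUDIO=true
```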
Audio files are saved to `~/.voicemode/audio/YYYY/MM/` with timestamps in the filename.
Documentation
📚 Read the full documentation at voice-mode.readthedocs.io
Getting Started
- Getting Started - Step-by-step setup for all supported tools
- Configuration Guide - Complete environment variable reference
Development
- Development Setup - Local development guide
Service Guides
- Whisper.cpp Setup - Local speech-to-text configuration
- Kokoro Setup - Local text-to-speech configuration
- LiveKit Integration - Real-time voice communication
Links
- Website: getvoicemode.com
- Documentation: voice-mode.readthedocs.io
- GitHub: github.com/mbailey/voicemode
- PyPI: pypi.org/project/voice-mode
Community
- Twitter/X: @getvoicemode
- YouTube: @getvoicemode
See Also
- 🚀 Getting Started - Setup instructions for all supported tools
- 🔧 Configuration Reference - Environment variables and options
- 🎤 Local Services Setup - Run TTS/STT locally for privacy
License
MIT - A Failmode Project
mcp-name: com.failmode/voicemode