Kokoro TTS MCP Server
A Model Context Protocol (MCP) server that provides text-to-speech capabilities using the Kokoro TTS engine. This server exposes TTS functionality through MCP tools, making it easy to integrate speech synthesis into your applications.
Prerequisites
- Python 3.10 or higher
- uv package manager
Installation
- First, install the uv package manager:
- Clone this repository and install dependencies:
Features
- Text-to-speech synthesis with customizable voices
- Adjustable speech speed
- Support for saving audio to files or direct playback
- Cross-platform audio playback support (Windows, macOS, Linux)
Usage
The server provides a single MCP tool, generate_speech, with the following parameters:
- text (required): The text to convert to speech
- voice (optional): Voice to use for synthesis (default: "af_heart")
- speed (optional): Speech speed multiplier (default: 1.0)
- save_path (optional): Directory to save audio files
- play_audio (optional): Whether to play the audio immediately (default: False)
Example Usage
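A minimal sketch of what a generate_speech call could look like on the wire, assuming the client sends a standard MCP JSON-RPC tools/call request. The helper name build_generate_speech_request, the sample text, and the positive-speed check are illustrative assumptions, not part of this server. The defaults mirror the documented parameters.

```python
import json
from typing import Any, Optional


def build_generate_speech_request(
    text: str,
    voice: str = "af_heart",
    speed: float = 1.0,
    save_path: Optional[str] = None,
    play_audio: bool = False,
    request_id: int = 1,
) -> dict[str, Any]:
    """Build an MCP tools/call request for the generate_speech tool.

    Hypothetical helper: defaults follow the documented parameters.
    """
    if not text:
        raise ValueError("text is required")
    if speed <= 0:
        # Assumed constraint; the server may accept a different range.
        raise ValueError("speed must be positive")

    arguments: dict[str, Any] = {
        "text": text,
        "voice": voice,
        "speed": speed,
        "play_audio": play_audio,
    }
    # save_path is optional, so it is only sent when the caller sets it.
    if save_path is not None:
        arguments["save_path"] = save_path

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "generate_speech", "arguments": arguments},
    }


if __name__ == "__main__":
    request = build_generate_speech_request(
        "Hello from Kokoro", speed=1.2, save_path="./audio"
    )
    print(json.dumps(request, indent=2))
```

Most MCP clients build this request for you. The sketch is mainly useful for seeing which arguments are sent and which fall back to defaults.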
Dependencies
- kokoro >= 0.8.4
- mcp[cli] >= 1.3.0
- soundfile >= 0.13.1
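To check an existing environment against these minimums, a small script along the following lines may help. It is a sketch using only the standard library. The helper names are hypothetical, and the version comparison is deliberately naive (it looks only at the first three numeric components), so a real project would more likely rely on uv or packaging.version.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Minimum versions from the list above; mcp[cli] is distributed as "mcp".
MINIMUMS = {"kokoro": "0.8.4", "mcp": "1.3.0", "soundfile": "0.13.1"}


def parse_version(value: str) -> tuple[int, ...]:
    """Naive parse: keep the first three numeric components only."""
    return tuple(int(part) for part in re.findall(r"\d+", value)[:3])


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare versions as integer tuples, e.g. (0, 13, 1) >= (0, 9, 0)."""
    return parse_version(installed) >= parse_version(minimum)


def check_dependencies(minimums: dict[str, str]) -> dict[str, str]:
    """Return a status per package: 'ok', 'too old (<version>)' or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        if meets_minimum(installed, minimum):
            report[name] = "ok"
        else:
            report[name] = f"too old ({installed})"
    return report


if __name__ == "__main__":
    for package, status in check_dependencies(MINIMUMS).items():
        print(f"{package}: {status}")
```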
Platform Support
Audio playback is supported on:
- Windows (using start)
- macOS (using afplay)
- Linux (using aplay)
MCP Configuration
Add the following configuration to your MCP settings file:
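As a rough guide to the shape such an entry usually takes, the sketch below generates a configuration in the mcpServers format used by clients such as Claude Desktop. Everything specific in it is a placeholder assumption: the entry name kokoro-tts, the repository path, and the script name server.py are not taken from this repository, so substitute the real values.

```python
import json
from typing import Any

# Placeholders, not real values from this repository.
REPO_DIR = "/path/to/kokoro-tts-mcp"
SERVER_SCRIPT = "server.py"  # hypothetical entry-point name


def build_mcp_config(
    repo_dir: str, script: str, name: str = "kokoro-tts"
) -> dict[str, Any]:
    """Build an mcpServers entry that launches the server through uv."""
    return {
        "mcpServers": {
            name: {
                "command": "uv",
                # uv --directory <repo> run <script> runs the script
                # inside the project's environment.
                "args": ["--directory", repo_dir, "run", script],
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_mcp_config(REPO_DIR, SERVER_SCRIPT), indent=2))
```

The printed JSON can be pasted into the settings file once the placeholders are replaced.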
License
[Add your license information here]