Kokoro TTS MCP Server

A Model Context Protocol (MCP) server that provides text-to-speech capabilities using the Kokoro TTS engine. This server exposes TTS functionality through MCP tools, making it easy to integrate speech synthesis into your applications.

Prerequisites

Python 3.10 or higher
uv package manager

Installation

First, install the uv package manager:

curl -LsSf https://astral.sh/uv/install.sh | sh

Clone this repository, then create a virtual environment and install the dependencies from the repository root:

uv venv
source .venv/bin/activate  # On Windows, use: .venv\Scripts\activate
uv pip install .

Features

Text-to-speech synthesis with customizable voices
Adjustable speech speed
Support for saving audio to files or direct playback
Cross-platform audio playback support (Windows, macOS, Linux)

Usage

The server provides a single MCP tool, generate_speech, with the following parameters:

text (required): The text to convert to speech
voice (optional): Voice to use for synthesis (default: "af_heart")
speed (optional): Speech speed multiplier (default: 1.0)
save_path (optional): Directory in which to save the generated audio files
play_audio (optional): Whether to play the audio immediately (default: False)
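
The parameters and defaults listed above can be collected into an arguments dictionary before the tool is called. The helper below is a hypothetical sketch and not part of the server: the name build_speech_args and the validation rules (non-empty text, positive speed) are assumptions, and the server itself may accept or reject other values.

```python
from typing import Any, Dict, Optional


def build_speech_args(
    text: str,
    voice: str = "af_heart",
    speed: float = 1.0,
    save_path: Optional[str] = None,
    play_audio: bool = False,
) -> Dict[str, Any]:
    """Build the arguments dict for generate_speech (hypothetical helper)."""
    # Assumed validation; the server may enforce different rules.
    if not text.strip():
        raise ValueError("text must not be empty")
    if speed <= 0:
        raise ValueError("speed must be positive")

    args: Dict[str, Any] = {
        "text": text,
        "voice": voice,
        "speed": speed,
        "play_audio": play_audio,
    }
    # save_path is optional, so include it only when it was given.
    if save_path is not None:
        args["save_path"] = save_path
    return args
```

The returned dictionary can be passed as the second argument of a call_tool request for generate_speech.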

Example Usage

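import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Launch the server over stdio; run this script from the repository root.
server = StdioServerParameters(command="uv", args=["run", "tts-mcp.py"])


async def main():
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # Generate and play speech
            result = await session.call_tool(
                "generate_speech",
                {
                    "text": "Hello, world!",
                    "voice": "af_heart",
                    "speed": 1.0,
                    "play_audio": True,
                },
            )
            print(result)


asyncio.run(main())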

Dependencies

kokoro >= 0.8.4
mcp[cli] >= 1.3.0
soundfile >= 0.13.1

Platform Support

Audio playback is supported on:

Windows (using start)
macOS (using afplay)
Linux (using aplay)
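
The platform-to-player mapping above could be expressed roughly as follows. This is a minimal sketch and not the server's actual code: the function name playback_command is hypothetical, and running start through cmd /c is an assumption based on start being a cmd.exe builtin rather than a standalone executable.

```python
import platform
from typing import List, Optional


def playback_command(audio_path: str, system: Optional[str] = None) -> List[str]:
    """Return an argv list that plays audio_path on the given platform."""
    # Default to the platform this process is running on.
    system = system or platform.system()
    if system == "Windows":
        # start is a cmd.exe builtin; the empty string is the window title
        # argument that start takes before the file path.
        return ["cmd", "/c", "start", "", audio_path]
    if system == "Darwin":
        return ["afplay", audio_path]
    if system == "Linux":
        return ["aplay", audio_path]
    raise RuntimeError(f"Unsupported platform: {system}")
```

The resulting list can be handed to subprocess.run, for example subprocess.run(playback_command("speech.wav"), check=True), where speech.wav stands in for a generated audio file.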

MCP Configuration

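Add the following configuration to your MCP settings file, replacing the uv path in "command" and the path after "--directory" with the locations of uv and this repository on your machine: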

{
  "mcpServers": {
    "kokoro-tts": {
      "command": "/Users/giannisan/pinokio/bin/miniconda/bin/uv",
      "args": [
        "--directory",
        "/Users/giannisan/Documents/Cline/MCP/kokoro-tts-mcp",
        "run",
        "tts-mcp.py"
      ]
    }
  }
}
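
Because the paths in this configuration are machine-specific, it may be convenient to generate the entry rather than edit it by hand. The sketch below is one way to do that; build_mcp_config is a hypothetical helper, and /path/to/kokoro-tts-mcp is a placeholder for wherever the repository was cloned.

```python
import json
import shutil
from typing import Optional


def build_mcp_config(repo_dir: str, uv_path: Optional[str] = None) -> dict:
    """Build the kokoro-tts entry for an MCP settings file."""
    # Use the given uv path, else look uv up on PATH, else fall back to
    # the bare command name.
    command = uv_path or shutil.which("uv") or "uv"
    return {
        "mcpServers": {
            "kokoro-tts": {
                "command": command,
                "args": ["--directory", repo_dir, "run", "tts-mcp.py"],
            }
        }
    }


if __name__ == "__main__":
    # Placeholder path; replace with the location of your clone.
    print(json.dumps(build_mcp_config("/path/to/kokoro-tts-mcp"), indent=2))
```

The printed JSON can then be merged into the existing settings file alongside any other configured servers.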

License

[Add your license information here]
