Deepgram MCP Server

A Model Context Protocol (MCP) server that provides access to Deepgram's speech recognition and text-to-speech capabilities.

Features

  • Audio Transcription: Convert audio to text with high accuracy

  • Text-to-Speech: Generate natural-sounding speech from text with automatic compression

  • Audio Analysis: Extract insights like sentiment, topics, intents, and entities

  • Speaker Diarization: Identify different speakers in audio

  • Language Detection: Automatically detect the language of audio

  • Multiple Models: Support for various Deepgram models optimized for different use cases

  • Smart Audio Compression: Automatically compresses generated audio files for efficient transfer

Installation

  1. Clone this repository

  2. Install dependencies:

    npm install
  3. Copy the environment file and add your Deepgram API key:

    cp env.example .env
    # Edit .env and add your DEEPGRAM_API_KEY, OPENAI_API_KEY or GROQ_API_KEY (whichever you want to use)
  4. Build the project:

    npm run build

Usage

HTTP Transport (Recommended for Production)

npm start
# or
node dist/index.js

The server will start on port 8080 by default. You can specify a different port:

node dist/index.js --port 8081

STDIO Transport (For Development)

npm run start:stdio
# or
node dist/index.js --stdio --port 8081
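To make the two flags above concrete, here is a minimal sketch of how `--port` and `--stdio` could be parsed. The `ServerOptions` type and the `parseArgs` function are hypothetical names used for illustration; the server's actual argument handling may differ. Only the default port (8080) and the flag names come from this README.

```typescript
// Hypothetical options object; not taken from the server's source.
interface ServerOptions {
  transport: "http" | "stdio";
  port: number;
}

// Parse the flags documented above, starting from the documented defaults.
function parseArgs(argv: string[]): ServerOptions {
  const options: ServerOptions = { transport: "http", port: 8080 };
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === "--stdio") {
      options.transport = "stdio";
    } else if (arg === "--port") {
      const value = Number(argv[i + 1]);
      // Reject missing, non-numeric, or out-of-range ports.
      if (!Number.isInteger(value) || value < 1 || value > 65535) {
        throw new Error(`Invalid --port value: ${argv[i + 1]}`);
      }
      options.port = value;
      i++; // skip the value that was just consumed
    }
  }
  return options;
}

const httpOptions = parseArgs(["--port", "8081"]);
const stdioOptions = parseArgs(["--stdio", "--port", "8081"]);
console.log(httpOptions, stdioOptions);
```

In a real entry point the argument list would typically come from `process.argv.slice(2)`.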

Available Tools

1. transcribe_audio

Transcribe audio to text with various options for customization.

Parameters:

  • audioUrl or audioData: Audio source (URL or base64)

  • model: Deepgram model to use (default: "nova-2-general")

  • language: Language code (default: "en")

  • punctuate: Add punctuation (default: true)

  • diarize: Speaker identification (default: false)

  • sentiment: Sentiment analysis (default: false)

  • And many more options...
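As an illustration of the parameters above, the following sketch builds an MCP `tools/call` request (JSON-RPC 2.0) for `transcribe_audio`. The `buildTranscribeRequest` helper, the rule that exactly one audio source must be supplied, and the example URL are assumptions made for this sketch. Filling in defaults on the client side is also optional, since the server presumably applies them itself.

```typescript
// Arguments for transcribe_audio, limited to the parameters listed above.
interface TranscribeArgs {
  audioUrl?: string;  // URL of the audio file
  audioData?: string; // base64-encoded audio (alternative to audioUrl)
  model?: string;
  language?: string;
  punctuate?: boolean;
  diarize?: boolean;
  sentiment?: boolean;
}

// Shape of an MCP "tools/call" request over JSON-RPC 2.0.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: TranscribeArgs };
}

// Hypothetical helper: applies the documented defaults and checks the audio source.
function buildTranscribeRequest(id: number, args: TranscribeArgs): ToolCallRequest {
  const hasUrl = args.audioUrl !== undefined;
  const hasData = args.audioData !== undefined;
  // Assumption: the tool expects exactly one of audioUrl or audioData.
  if (hasUrl === hasData) {
    throw new Error("Provide exactly one of audioUrl or audioData");
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: {
      name: "transcribe_audio",
      arguments: {
        model: "nova-2-general",
        language: "en",
        punctuate: true,
        diarize: false,
        sentiment: false,
        ...args, // caller-supplied values override the defaults
      },
    },
  };
}

const request = buildTranscribeRequest(1, {
  audioUrl: "https://example.com/meeting.wav", // placeholder URL
  diarize: true,
});
console.log(JSON.stringify(request, null, 2));
```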

2. text_to_speech

Convert text to speech using Deepgram's TTS models with automatic compression.

Parameters:

  • text: Text to convert to speech (required)

  • model: TTS model to use (default: "aura-asteria-en")

  • voice: Voice selection

  • format: Output format (default: "mp3")

  • speed: Speech speed (default: 1.0)

Output:

  • Original audio file saved to generated_audio/ folder

  • Compressed audio data saved to compressed_audio/ folder

  • Response includes file paths and compression metadata

3. analyze_audio

Perform advanced audio analysis including sentiment, topics, intents, and entities.

Parameters:

  • audioUrl or audioData: Audio source

  • features: Analysis features to enable

  • model: Model for analysis

4. get_models

Get information about available Deepgram models.

Parameters:

  • model_type: Filter by model type ("transcription", "tts", or "all")
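The `model_type` filter can be pictured roughly as follows. This is a sketch only: the `ModelInfo` shape, the `filterModels` function, and the choice of `"all"` as the default are assumptions, and the list contains just the two model names mentioned in this README rather than a full catalog.

```typescript
type ModelKind = "transcription" | "tts";
type ModelTypeFilter = ModelKind | "all";

// Hypothetical record shape for one model entry.
interface ModelInfo {
  name: string;
  kind: ModelKind;
}

// Only the two model names this README mentions; not a complete catalog.
const models: ModelInfo[] = [
  { name: "nova-2-general", kind: "transcription" },
  { name: "aura-asteria-en", kind: "tts" },
];

// Return every model for "all", otherwise only models of the requested kind.
function filterModels(all: ModelInfo[], modelType: ModelTypeFilter = "all"): ModelInfo[] {
  return modelType === "all" ? all : all.filter((m) => m.kind === modelType);
}

console.log(filterModels(models, "tts").map((m) => m.name));
```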

Client Configuration

For MCP clients, use this configuration:

{
  "mcpServers": {
    "deepgram": {
      "url": "http://localhost:8080/mcp"
    }
  }
}

Development

# Watch mode for development
npm run watch

# Development with STDIO
npm run dev:stdio

# Development with HTTP
npm run dev

API Key

Get your Deepgram API key from the Deepgram Console.

Audio Compression System

The TTS functionality includes an intelligent compression system that:

  • Automatically compresses generated audio files using gzip compression

  • Saves compressed data to separate files to avoid large agent responses

  • Provides decompression tools for easy audio file extraction

  • Maintains quality while reducing file sizes by 2-4x
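The compress and extract steps described above can be sketched with Node's built-in `zlib` module. The `CompressedAudio` record layout and the function names are hypothetical; the actual JSON schema written to compressed_audio/ is not documented here and may differ. The sample payload is repetitive filler rather than real MP3 data, so its compression ratio says nothing about real audio.

```typescript
import { gzipSync, gunzipSync } from "node:zlib";

// Hypothetical record layout; the real compressed_audio/*.json schema may differ.
interface CompressedAudio {
  encoding: "gzip+base64";
  originalBytes: number;
  compressedBytes: number;
  data: string; // gzip output, base64-encoded so it fits in JSON
}

// Gzip the audio bytes and wrap them in a JSON-friendly record.
function compressAudio(audio: Buffer): CompressedAudio {
  const zipped = gzipSync(audio);
  return {
    encoding: "gzip+base64",
    originalBytes: audio.length,
    compressedBytes: zipped.length,
    data: zipped.toString("base64"),
  };
}

// Reverse the process: base64-decode, then gunzip.
function decompressAudio(record: CompressedAudio): Buffer {
  return gunzipSync(Buffer.from(record.data, "base64"));
}

// Stand-in payload: repetitive bytes, not real MP3 data.
const sample = Buffer.alloc(4096, "deepgram");
const record = compressAudio(sample);
const restored = decompressAudio(record);

console.log(`original: ${record.originalBytes} bytes, compressed: ${record.compressedBytes} bytes`);
console.log(`round trip intact: ${restored.equals(sample)}`);
```

Because gzip is lossless, the restored bytes are identical to the input, which is why decompression does not affect audio quality.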

File Structure

generated_audio/            # Original audio files
├── tts_2025-01-16T...mp3
compressed_audio/           # Compressed audio data
├── compressed_audio_2025-01-16T...json
decompressed_audio/         # Decompressed audio files (after extraction)
├── decompressed_2025-01-16T...mp3

Decompression Tools

Python Script (Recommended):

python decompress_audio.py <response_file_or_compressed_file>

Node.js Script:

npm run decompress <compressed_data_file>

Agno Integration

This MCP server also includes integration with Agno, a high-performance runtime for multi-agent systems.

Agno Tests

# Text-to-Speech test (saves audio to generated_audio/ and compressed_audio/)
npm run test:agno:tts

# Speech-to-Text test (transcribes sample audio)
npm run test:agno:stt

The TTS test will:

  1. Generate audio with automatic compression

  2. Save the response to tts_response.json

  3. Decompress the audio file to generated_audio/

License

MIT

Developer

  • Dheeraj Mudireddy (meetdheerajreddy@gmail.com)
