
Voice Recorder MCP Server

An MCP server for recording audio and transcribing it using OpenAI's Whisper model. Designed to work as a Goose custom extension or standalone MCP server.

Features

  • Records audio from the default microphone

  • Transcribes recordings using Whisper

  • Integrates with the Goose AI agent as a custom extension

  • Includes prompts for common recording scenarios


Installation

```shell
# Install from source
git clone https://github.com/DefiBax/voice-recorder-mcp.git
cd voice-recorder-mcp
pip install -e .
```

Usage

As a Standalone MCP Server

```shell
# Run with default settings (base.en model)
voice-recorder-mcp

# Use a specific Whisper model
voice-recorder-mcp --model medium.en

# Adjust the sample rate
voice-recorder-mcp --sample-rate 44100
```

Testing with MCP Inspector

The MCP Inspector provides an interactive interface to test your server:

```shell
# Install the MCP Inspector
npm install -g @modelcontextprotocol/inspector

# Run your server with the inspector
npx @modelcontextprotocol/inspector voice-recorder-mcp
```

With Goose AI Agent

  1. Open Goose and go to Settings > Extensions > Add > Command Line Extension

  2. Set the name to voice-recorder

  3. In the Command field, enter the full path to the voice-recorder-mcp executable:

    /full/path/to/voice-recorder-mcp

    Or for a specific model:

    /full/path/to/voice-recorder-mcp --model medium.en

    To find the path, run:

    which voice-recorder-mcp
  4. No environment variables are needed for basic functionality

  5. Start a conversation with Goose and introduce the recorder with: "I want you to take action from transcriptions returned by voice-recorder. For example, if I dictate a calculation like 1+1, please return the result."

Available Tools

  • start_recording: Start recording audio from the default microphone

  • stop_and_transcribe: Stop recording and transcribe the audio to text

  • record_and_transcribe: Record audio for a specified duration and transcribe it
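Under the hood, MCP clients invoke these tools with the standard JSON-RPC 2.0 `tools/call` method over the server's stdio transport. As a rough sketch of what such a request looks like (the `duration` argument name is an assumption, since this README does not document the tools' parameter schemas):

```python
import json

# Sketch of the JSON-RPC 2.0 request an MCP client sends to invoke a tool.
# The "duration" argument name is hypothetical; the README does not list
# the tools' parameters.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "record_and_transcribe",
        "arguments": {"duration": 10},
    },
}

# MCP messages travel as JSON over the server's stdin/stdout.
print(json.dumps(request))
```

When using Goose or the MCP Inspector, this plumbing is handled for you; the sketch is only to show what the client and server exchange.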

Whisper Models

This extension supports various Whisper model sizes:

| Model     | Speed   | Accuracy | Memory Usage | Use Case                      |
|-----------|---------|----------|--------------|-------------------------------|
| tiny.en   | Fastest | Lowest   | Minimal      | Testing, quick transcriptions |
| base.en   | Fast    | Good     | Low          | Everyday use (default)        |
| small.en  | Medium  | Better   | Moderate     | Good balance                  |
| medium.en | Slow    | High     | High         | Important recordings          |
| large     | Slowest | Highest  | Very High    | Critical transcriptions       |

The .en suffix indicates models specialized for English, which are faster and more accurate for English content.

Requirements

  • Python 3.12+

  • An audio input device (microphone)

Configuration

You can configure the server using environment variables:

```shell
# Set the Whisper model
export WHISPER_MODEL=small.en

# Set the audio sample rate
export SAMPLE_RATE=44100

# Set the maximum recording duration (seconds)
export MAX_DURATION=120

# Then run the server
voice-recorder-mcp
```
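If you are wiring these variables into your own tooling, the lookup logic can be sketched as below. The defaults for WHISPER_MODEL (base.en) and SAMPLE_RATE (16000) are stated elsewhere in this README; the MAX_DURATION fallback of 120 is an assumption, and `load_config` itself is an illustrative helper, not the server's actual code.

```python
import os

# Illustrative sketch of environment-based configuration. The base.en and
# 16000 defaults come from this README; the 120-second MAX_DURATION
# fallback is an assumption.
def load_config(env=None):
    env = os.environ if env is None else env
    return {
        "model": env.get("WHISPER_MODEL", "base.en"),
        "sample_rate": int(env.get("SAMPLE_RATE", "16000")),
        "max_duration": int(env.get("MAX_DURATION", "120")),
    }

config = load_config({"WHISPER_MODEL": "small.en", "SAMPLE_RATE": "44100"})
print(config["model"], config["sample_rate"], config["max_duration"])
```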

Troubleshooting

Common Issues

  • No audio being recorded: Check your microphone permissions and settings

  • Model download errors: Ensure you have a stable internet connection for the initial model download

  • Integration with Goose: Make sure the command path is correct

  • Audio quality issues: Try adjusting the sample rate (default: 16000)
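For the "no audio being recorded" case, one quick check is whether the captured buffer is anywhere near the size a real recording would have. A minimal sketch, assuming 16-bit mono PCM samples (the typical input format for Whisper); the helper is illustrative, not part of the server:

```python
# Rough sanity check for "no audio being recorded": compare the captured
# buffer size against the size expected for the sample rate and duration.
# Assumes 16-bit (2-byte) mono PCM samples, which is typical Whisper input.
def expected_bytes(sample_rate_hz: int, seconds: float, bytes_per_sample: int = 2) -> int:
    return int(sample_rate_hz * seconds * bytes_per_sample)

# A 5-second capture at the default 16000 Hz is ~160 KB of raw PCM; a
# buffer far smaller than this usually means the microphone was not read.
print(expected_bytes(16000, 5))
```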

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository

  2. Create your feature branch (git checkout -b feature/amazing-feature)

  3. Commit your changes (git commit -m 'Add some amazing feature')

  4. Push to the branch (git push origin feature/amazing-feature)

  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.
