CallCenter.js MCP + CLI
An MCP Server, CLI tool, and API that makes phone calls on your behalf using VoIP.
Just tell Claude what you want to accomplish, and it will call and handle the conversation for you. This is essentially an MCP Server that bridges between OpenAI's Real-Time Voice API and your VoIP connection to call people on your behalf.
⚠️ Vibe-coded side project! Please do not use this in any kind of professional context. This is a side project coded in a weekend. There are no guard rails. Your MCP client can call any number with this, even if you don't ask it to. In fact, it has done so during testing - it called a random number during the night "for testing" and played back scary low-pitched noises - then claimed it called MY number. So YMMV, no warranties. See disclaimer below.
📞 Example: Order Pizza with Claude
You: "Can you call Tony's Pizza and order a large pepperoni pizza for delivery to 123 Main St? My name is John and my number is 555-0123."
Claude automatically calls the restaurant:
Pizza ordered successfully! 🍕
📚 Quick Context for the Uninitiated
VoIP (Voice over IP) is how you make phone calls over the internet instead of traditional phone lines. SIP (Session Initiation Protocol) is the language these systems speak to connect calls. Think of it as HTTP but for phone calls.
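To make the HTTP comparison concrete, here is a minimal sketch of what a SIP INVITE (the request that starts a call) looks like as text. The header set follows RFC 3261, but the addresses, tag, branch, and Call-ID values are made-up placeholders, and buildInvite is a hypothetical helper for illustration, not part of this project's API. A real INVITE would normally also carry an SDP body describing the offered audio codecs.

```typescript
// Hypothetical helper for illustration only; not part of CallCenter.js.
interface InviteParams {
  fromUser: string;
  toUser: string;
  domain: string;    // SIP server domain
  localHost: string; // address the caller is reachable at
  callId: string;    // unique per call
}

function buildInvite(p: InviteParams): string {
  const lines = [
    // Request line: method, target URI, protocol version (compare "GET /path HTTP/1.1").
    `INVITE sip:${p.toUser}@${p.domain} SIP/2.0`,
    `Via: SIP/2.0/UDP ${p.localHost};branch=z9hG4bK-example`,
    "Max-Forwards: 70",
    `From: <sip:${p.fromUser}@${p.domain}>;tag=example-tag`,
    `To: <sip:${p.toUser}@${p.domain}>`,
    `Call-ID: ${p.callId}`,
    "CSeq: 1 INVITE",
    `Contact: <sip:${p.fromUser}@${p.localHost}>`,
    "Content-Length: 0",
  ];
  // Like HTTP, headers are CRLF-separated and end with an empty line.
  return lines.join("\r\n") + "\r\n\r\n";
}
```

The server answers with HTTP-style status lines too, which is why a rejected login shows up as "401 Unauthorized".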
Fritz!Box is a popular German router/modem that happens to have a built-in phone system (PBX). If you have one, you already have everything you need to make VoIP calls - this tool just connects to it. Outside Germany, you might know similar devices from other brands, or use PBX software like Asterisk or 3CX, or a cloud VoIP provider.
MCP (Model Context Protocol) is Anthropic's standard for connecting AI assistants like Claude to external tools and services. It's what lets MCP clients actually do things instead of just talking about them.
🚀 What This Enables
- 🔌 MCP Server - Use directly in Claude Code or any MCP client (most popular usage)
- 🖥️ CLI Tool - Command-line interface for direct phone calls
- 📚 TypeScript API - Programmatic library for building voice applications
Built as a bridge between OpenAI's Real-Time Voice API and VoIP networks, with multiple codec support (G.722, G.711), and expanded SIP protocol support for broad VoIP compatibility. Compatible with the latest gpt-realtime model released August 28, 2025.
🏗️ System Architecture
⚠️ Vibe-coded project! Developed and tested on Fritz!Box (a German router with built-in VoIP) only. Other provider configs are research-based but untested. YMMV, no warranties. See disclaimer below.
🔌 MCP Client Integration (Most Popular!)
Perfect for when your coding agent needs to call library authors to complain about their documentation! 😄
Quick Setup
Then just ask your MCP Client to make calls:
"Can you call the pizza place and order a large pepperoni? My number is 555-0123."
Your MCP Client will automatically handle the entire conversation using the AI Voice Agent! 🤖📞
✨ Key Features
- 🎙️ Multiple Codec Support: G.722 wideband (16kHz) + G.711 fallback for broad compatibility
- 🤖 AI-Powered Conversations: Uses OpenAI's Real-Time Voice API with the latest gpt-realtime model (released August 28, 2025) for actual calls, with o3-mini model for instruction generation
- 🌍 Automatic Language Detection: Intelligently detects conversation language from call briefs and configures transcription accordingly
- 🎭 Auto Voice Selection: New 'auto' mode where o3-mini selects optimal voice based on call context (formality, industry, goals)
- 🔊 Voice Characteristics: Full support for all 10 OpenAI Realtime API voices with gender and personality awareness
- 🌐 Expanded SIP Support: Configurations for common SIP providers (Fritz!Box tested, others experimental)
- 🔧 Smart Configuration: Auto-detects provider requirements and optimizes settings
- 📞 Enterprise-Ready: Supports advanced SIP features (STUN/TURN, session timers, transport fallback)
- 🔄 Robust Connection Management: Automatic reconnection with intelligent error handling
- ✅ Built-in Validation: Comprehensive configuration validation with network testing
- 🎯 Provider Profiles: Pre-configured settings for popular SIP systems
- 🔌 MCP Server: Integrate with MCP clients like Claude Code
- 📚 TypeScript API: Programmatic library for building voice applications
- 📝 Call Brief Processing: Natural language call instructions using o3-mini model with structured JSON output
- 🎵 Optional Call Recording: Stereo WAV recording with caller/AI separation
🚀 Quick Start
Option 1: Run Instantly with npx (No Installation) ⚡
Fastest way to try it out:
Or using a .env file:
Note: First run may show build warnings if you don't have C++ build tools, but will work fine with G.711 codec fallback (standard phone quality). For much better audio quality, install build tools first to enable G.722 wideband codec.
Option 2: Local Installation
Prerequisites
- Node.js 20+
- Python 3.x + Build tools (for G.722 wideband audio - much better call quality)
  - macOS: Xcode Command Line Tools (xcode-select --install)
  - Windows: Visual Studio Build Tools
  - Linux: build-essential package
- OpenAI API key
Note: Without build tools, the system automatically falls back to G.711 (standard phone quality). G.722 samples audio at 16kHz instead of 8kHz, roughly doubling the audio bandwidth for clearer, more natural conversations.
Installation
Configuration
Edit config.json with your settings:
🎯 Usage Options
1. MCP Server (Claude Code Integration) ⭐
Most popular usage - integrates with Claude Code for seamless AI-powered calling. Perfect for when your coding agent needs to call library authors to complain about their documentation! 😄
Quick Setup with npx (Recommended)
Option 1: Using MCP Client CLI (Easiest)
⚠️ Important: Replace the placeholder values with your actual SIP credentials and OpenAI API key, or the server will fail to connect.
Option 2: Manual Configuration
Configure in Claude Code's MCP settings to automatically pull from GitHub:
Alternative: Local Installation
For local development or if you prefer local installation:
Or configure Claude Code with local installation:
Available MCP tools:
- simple_call - Make calls with automatic instruction generation
- advanced_call - Make calls with granular parameter control
Example usage in MCP Client:
More examples:
The MCP Client automatically handles the entire conversation using the AI Voice Agent!
2. Command Line Interface
Perfect for when you need to curl -X POST your way out of social obligations, or finally implement that O(n log n) ai-human-sort algorithm - because nothing says "efficient sorting" like crowdsourcing comparisons to random strangers via VoIP! 😄
💡 Use --brief instead of --instructions for better results!
The --brief option uses OpenAI's o3-mini model to generate sophisticated instructions from your simple description, while --instructions sends your text directly to the Real-Time Voice API. Since the Real-Time Voice API is optimized for speed (not sophistication), --brief typically produces much better call outcomes.
CLI Options
3. Programmatic API
📚 API Reference
makeCall(options: CallOptions): Promise<CallResult>
Make a phone call with the AI agent.
CallOptions
CallResult
createAgent(config, options?): Promise<VoiceAgent>
Create a VoiceAgent instance for advanced use cases.
Configuration Structure
Environment Variables
All configuration options can be set via environment variables (useful for npx usage):
Required Variables:
Optional Variables:
Priority order: CLI flags > Config file > Environment variables
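As a rough illustration of that priority order, the sketch below merges three settings sources so that later ones win. The Settings type, the function names, and the keys used in the usage note (sipServer, voice) are hypothetical and chosen only for the example; the project's actual loader and key names may differ.

```typescript
// Hypothetical flat settings map; real config is likely nested.
type Settings = Record<string, string | undefined>;

// Merge sources left to right. A later source overrides an earlier one,
// but an undefined value never overwrites a value that is already set.
function mergeSettings(...sources: Settings[]): Settings {
  const merged: Settings = {};
  for (const source of sources) {
    for (const [key, value] of Object.entries(source)) {
      if (value !== undefined) merged[key] = value;
    }
  }
  return merged;
}

// Lowest priority first: environment variables, then config file, then CLI flags.
function resolveSettings(env: Settings, file: Settings, cli: Settings): Settings {
  return mergeSettings(env, file, cli);
}
```

For example, resolveSettings({ sipServer: "from-env" }, { sipServer: "from-file" }, {}) yields sipServer "from-file", because the config file outranks the environment and no CLI flag was given.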
✅ Quick Success Check
Before making real calls, validate your setup with these safe tests:
1. Configuration Validation
2. Test Call to Yourself (Fritz!Box users)
3. What to Expect
- ✅ Working setup: Clear audio, proper AI responses, clean call termination
- ⚠️ Network issues: "Connection failed" errors → check firewall/STUN settings
- ⚠️ Auth problems: "401 Unauthorized" → verify SIP credentials
- ⚠️ Codec issues: Poor audio quality → G.722 compilation may have failed
Pro tip: Start with --duration 30 for test calls to avoid long waits if something goes wrong.
📋 Configuration Validation
The built-in validation system provides comprehensive analysis:
The validator will check:
- ✅ Configuration syntax and required fields
- ✅ Provider-specific requirements
- ✅ Network connectivity to SIP server
- ✅ STUN server reachability
- ✅ Codec availability (G.722/G.711)
- ✅ Provider compatibility score
🌐 SIP Provider Compatibility
✅ Actually Tested
- AVM Fritz!Box - German router brand with built-in VoIP/SIP phone system ✅ WORKS (only one actually tested)
🤷 Vibe-coded Configs (Educated Guesses)
- Asterisk PBX - Open source PBX (FreePBX, Elastix, etc.) 🤷 UNTESTED
- Cisco CUCM - Enterprise Unified Communications 🤷 UNTESTED
- 3CX Phone System - Popular business PBX 🤷 UNTESTED
- Generic SIP Providers - Standards-compliant SIP trunks 🤷 UNTESTED
🔧 Provider-Specific Features
The provider profiles are based on research and documentation, not actual testing:
Provider | Transport | NAT Traversal | Session Timers | PRACK | Keepalive |
---|---|---|---|---|---|
Fritz!Box | UDP | Not needed | Optional | Disabled | Re-register |
Asterisk | UDP/TCP | STUN | Supported | Optional | OPTIONS ping |
Cisco CUCM | TCP preferred | STUN required | Required | Required | OPTIONS ping |
3CX | TCP/UDP | STUN | Supported | Optional | Re-register |
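One way to think about these profiles is as plain typed data that the validator can check a config against. The sketch below encodes the table above; the type names, field names, and the needsStunServer helper are assumptions made for illustration and are not taken from the project's source. The values are the same research-based guesses as in the table.

```typescript
// Hypothetical types; the project's real profile schema may look different.
type Requirement = "disabled" | "optional" | "supported" | "required";
type ProviderName = "fritzbox" | "asterisk" | "cisco" | "3cx";

interface ProviderProfile {
  transports: ReadonlyArray<"udp" | "tcp">; // in order of preference
  stun: "not-needed" | "used" | "required";
  sessionTimers: Requirement;
  prack: Requirement;
  keepalive: "re-register" | "options-ping";
}

const PROVIDER_PROFILES: Record<ProviderName, ProviderProfile> = {
  fritzbox: { transports: ["udp"], stun: "not-needed", sessionTimers: "optional", prack: "disabled", keepalive: "re-register" },
  asterisk: { transports: ["udp", "tcp"], stun: "used", sessionTimers: "supported", prack: "optional", keepalive: "options-ping" },
  // The table only says "TCP preferred" for CUCM, so only TCP is listed here.
  cisco: { transports: ["tcp"], stun: "required", sessionTimers: "required", prack: "required", keepalive: "options-ping" },
  "3cx": { transports: ["tcp", "udp"], stun: "used", sessionTimers: "supported", prack: "optional", keepalive: "re-register" },
};

// True when the profile expects a STUN server to be configured.
function needsStunServer(provider: ProviderName): boolean {
  return PROVIDER_PROFILES[provider].stun !== "not-needed";
}
```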
Configuration Decision Tree
📝 Configuration Examples
The project includes ready-to-use configurations for all major providers:
- config.example.json - AVM Fritz!Box (home/SMB default)
- config.asterisk.example.json - Asterisk PBX with advanced features
- config.cisco.example.json - Cisco CUCM enterprise setup
- config.3cx.example.json - 3CX Phone System configuration
- config.generic.example.json - Generic SIP provider template
🎵 Audio Quality & Codecs
Codec Priority & Negotiation
- G.722 (Preferred) - 16kHz wideband, superior voice quality
- G.711 μ-law (Fallback) - 8kHz narrowband, universal compatibility
- G.711 A-law (Fallback) - 8kHz narrowband, European standard
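The priority list above amounts to "take the first preferred codec the other side also offers". Here is a minimal sketch of that selection, assuming the remote offer has already been parsed into a list of RTP payload type numbers. The payload types (9, 0, 8) are the static assignments from RFC 3551; pickCodec and its g722Available flag are hypothetical names, with the flag standing in for whether the native G.722 addon compiled.

```typescript
// Preference order from the list above, with static RTP payload types (RFC 3551).
const PREFERRED_CODECS = [
  { name: "G722", payloadType: 9 },
  { name: "PCMU", payloadType: 0 }, // G.711 mu-law
  { name: "PCMA", payloadType: 8 }, // G.711 A-law
] as const;

type CodecName = (typeof PREFERRED_CODECS)[number]["name"];

// Returns the first preferred codec that the remote side also offers,
// skipping G.722 when the native addon is unavailable.
function pickCodec(remotePayloadTypes: number[], g722Available: boolean): CodecName | undefined {
  for (const codec of PREFERRED_CODECS) {
    if (codec.name === "G722" && !g722Available) continue;
    if (remotePayloadTypes.includes(codec.payloadType)) return codec.name;
  }
  return undefined; // no codec in common
}
```

With an offer of [0, 8, 9], this picks G722 when the addon is available and PCMU when it is not.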
G.722 Implementation
- Native C++ addon for optimal performance
- Based on reference implementations from CMU and Sippy Software
- Automatic fallback to G.711 if compilation fails
- Real-time encoding/decoding with low latency
Optional Call Recording
- Stereo WAV format with caller on left channel, AI on right channel
- Optional filename specification
- Synchronized audio streams for perfect alignment
- High-quality PCM recording at native sample rates
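The left/right split described above can be pictured as interleaving two mono streams sample by sample. This is a simplified sketch, assuming both streams are 16-bit PCM at the same sample rate and already time-aligned; interleaveStereo is a hypothetical helper, and the project's actual recorder may handle resampling and synchronization differently.

```typescript
// Interleave two mono 16-bit PCM streams into one stereo stream:
// caller on the left channel (even indices), AI on the right (odd indices).
// The shorter stream is padded with silence (zeros).
function interleaveStereo(caller: Int16Array, ai: Int16Array): Int16Array {
  const frames = Math.max(caller.length, ai.length);
  const stereo = new Int16Array(frames * 2);
  for (let i = 0; i < frames; i++) {
    stereo[2 * i] = caller[i] ?? 0;
    stereo[2 * i + 1] = ai[i] ?? 0;
  }
  return stereo;
}
```

The result is the sample data for a two-channel WAV file; a WAV header with the channel count set to 2 would still need to be written in front of it.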
Testing Audio Quality
🤖 AI Call Brief Processing
Why This Matters: Real-Time Voice API Needs Better Instructions
OpenAI's Real-Time Voice API is optimized for speed, not sophistication. It's great at natural conversation but struggles with complex, goal-oriented tasks without very specific instructions. Here's the problem:
❌ What doesn't work well:
❌ What's tedious and error-prone:
✅ What works brilliantly:
How It Works
The system uses OpenAI's o3-mini reasoning model (a small reasoning model - smart but fast) to automatically generate detailed, sophisticated instructions from your simple brief. The o3-mini model:
- Analyzes your brief and understands the goal
- Creates conversation states and flow logic
- Generates specific instructions for each phase of the call
- Handles edge cases like voicemail, objections, and alternatives
- Adapts language and tone based on context
- Provides fallback strategies when things don't go as planned
Call Flow Sequence
Before/After Example
Your simple input:
What o3-mini generates (excerpt):
Automatic Adaptations
The o3-mini brief processor automatically:
- Detects language from your brief and generates instructions in that language
- Creates conversation flow with logical states and transitions
- Handles cultural context (German restaurants vs. American vs. Japanese)
- Generates appropriate examples with real phrases (no placeholders)
- Provides voicemail scripts for when nobody answers
- Plans for objections and alternative solutions
When to Use Each Approach
- Use --brief for 95% of calls - it's easier and produces better results
- Use --instructions only when you need very specific, custom behavior
- Brief processing is perfect for: reservations, appointments, business calls, customer service
- Direct instructions are better for: highly specialized scenarios, testing, or when you've already perfected your prompt
🎤 Voice Selection
The AI agent supports 10 different voices from OpenAI's Realtime API, each with unique characteristics. By default, the system uses auto mode where o3-mini intelligently selects the optimal voice based on your call's context.
Available Voices
Voice | Gender | Description | Best For |
---|---|---|---|
marin | Female | Clear, professional feminine voice | All-purpose: business calls, customer support, negotiations |
cedar | Male | Natural masculine voice with warm undertones | All-purpose: professional calls, consultations, service interactions |
alloy | Neutral | Professional voice with good adaptability | Technical discussions, business contexts, general inquiries |
echo | Male | Conversational masculine voice | Casual to formal interactions, versatile tone |
shimmer | Female | Warm, expressive feminine voice | Empathetic conversations, sales, professional contexts |
coral | Female | Warm and friendly feminine voice | Customer interactions, consultations, support calls |
sage | Neutral | Calm and thoughtful voice | Medical consultations, advisory roles, serious discussions |
ash | Neutral | Clear and precise voice | Technical explanations, instructions, educational content |
ballad | Female | Melodic and smooth feminine voice | Presentations, storytelling, engaging conversations |
verse | Neutral | Versatile and expressive voice | Dynamic conversations, adaptable to any context |
Auto Voice Selection (Recommended)
The auto mode (default) uses o3-mini to analyze your call context and select the most appropriate voice:
Manual Voice Selection
You can override auto selection when you have specific requirements:
Configuration Options
Set default voice in your config file or environment:
Voice Selection Guidelines
The auto mode considers these factors:
- Formality Level: High (cedar, marin, sage) → Medium (alloy, verse) → Low (echo, coral, shimmer)
- Industry Context: Healthcare (sage, shimmer), Finance (cedar, sage), Retail (coral, echo), Tech (alloy, ash)
- Goal Type: Authority needed (cedar, sage), Friendliness (coral, shimmer), Efficiency (marin, alloy)
- Language: Voices adapt to detected language from your call brief
MCP Integration
The MCP tools strongly recommend auto mode but support manual override:
🔄 Advanced Features
Smart Connection Management
- Automatic Reconnection: Exponential backoff with provider-specific error handling
- Transport Fallback: UDP → TCP → TLS based on what works
- Provider-Aware Error Recovery: Different strategies for Fritz!Box vs. Asterisk vs. Cisco
- Network Change Handling: Adapts to network connectivity changes
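For readers unfamiliar with the terms, here is a small sketch of exponential backoff and of the UDP → TCP → TLS fallback order. The base delay of 1 second, the 30 second cap, and both function names are assumptions picked for the example; they are not values taken from the project.

```typescript
// Delay before reconnect attempt N (0-based): doubles each time, up to a cap.
function backoffDelayMs(attempt: number, baseMs = 1000, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Fallback order from the list above.
const TRANSPORT_ORDER = ["udp", "tcp", "tls"] as const;
type Transport = (typeof TRANSPORT_ORDER)[number];

// Returns the transport to try after `current` fails, or undefined when
// every transport has been tried.
function nextTransport(current: Transport): Transport | undefined {
  const index = TRANSPORT_ORDER.indexOf(current);
  return TRANSPORT_ORDER[index + 1];
}
```

With these defaults the waits grow as 1s, 2s, 4s, 8s and so on, which keeps a flaky network from hammering the SIP server with reconnects.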
Enhanced SIP Protocol Support
- STUN/TURN Integration: NAT traversal for cloud and enterprise deployments
- Session Timers (RFC 4028): Connection stability for long calls
- PRACK Support (RFC 3262): Reliable provisional responses for enterprise systems
- Multiple Transports: UDP, TCP, TLS with intelligent fallback
Configuration Intelligence
- Provider Auto-Detection: Identifies provider from SIP domain/IP
- Requirements Validation: Ensures all provider-specific needs are met
- Network Testing: Real connectivity tests to SIP servers and STUN servers
- Optimization Suggestions: Actionable recommendations for better performance
🛠️ Development & Testing
Build Commands
Configuration Testing
Project Structure
📊 Validation & Diagnostics
The built-in validation system provides comprehensive analysis:
Configuration Report Example
Network Diagnostics
- Real SIP Server Testing: Actual UDP/TCP connectivity tests
- STUN Server Validation: Tests NAT traversal capability
- Latency Measurement: Network performance assessment
- Provider-Specific Recommendations: Tailored advice based on detected issues
🔧 Troubleshooting
Configuration Issues
- Run validation first:
- Check provider compatibility:
- Get specific fix suggestions:
Network Connectivity
- Fritz!Box: Usually works with UDP on local network
- Cloud/Enterprise: May need STUN servers for NAT traversal
- Firewall Issues: Ensure SIP port (5060) and RTP ports are open
Audio Quality
- Verify G.722 is available:
- Check codec negotiation in logs:
- Network issues: High latency/packet loss affects audio quality
Build Problems
- Native compilation fails:
- Provider-specific issues: Check validation recommendations for your provider
MCP Integration Issues
- Server won't start:
- Claude Code not connecting:
- Verify MCP server configuration in Claude Code settings
- Check that the working directory path is correct
- Ensure the server is running and accessible
📈 What I Built
This is a personal project that includes:
- 🌐 Fritz!Box Support: Actually tested and works
- 🤷 Other SIP Configs: Vibe-coded based on documentation reading
- 🔄 Connection Handling: Seems to work, has retry logic
- ✅ Config Validation: Catches obvious mistakes
- 📊 Network Testing: Basic connectivity checks
- 🎯 Provider Profiles: Research-based guesses about different systems
- 🔌 MCP Server: Works with Claude Code (tested)
- 📚 TypeScript API: Clean interfaces for programmatic use
- 📝 Call Brief Processing: Uses o3-mini to generate instructions (works well)
- 🎵 Optional Call Recording: Stereo WAV files with left/right channels
- 📋 Transcript Capture: Real-time conversation logs
⚠️ Important Disclaimer
This project is vibe-coded! 🚀
This means:
- ✅ Works on Fritz!Box - that's what I actually tested
- 🤷 Other providers - I tried to make it more useful but can't promise anything
- 🤷 Advanced features - seemed like good ideas based on research, but who knows
- ⚠️ YMMV - your setup is probably different than mine
- ⚠️ No warranties - use at your own risk
What This Means for You
- Fritz!Box users: Should work great! ✅
- Other providers: The configuration profiles are educated guesses based on research - they might work, they might not
- Enterprise users: I tried to add the features that seemed important, but I have no idea if they actually work correctly
- Issues & PRs: I'll accept pull requests, but I can't promise to fix issues I can't reproduce or test
If You Want to Contribute
- ✅ Test it on your setup and let others know what works
- ✅ Share working configs if you get something else working
- ✅ Fix stuff that's broken and submit PRs
- ✅ Tell me if my assumptions were wrong about how providers work
The validation tools might help debug issues, but honestly, the real test is whether you can make actual calls.
📜 License
MIT License - see LICENSE for details.
Third-Party Components
- G.722 Codec: Public domain and BSD licensed implementations
- SIP Protocol: Based on sipjs-udp (MIT licensed)
- Dependencies: Various open source licenses (see package.json)
🤝 Contributing
- Fork the repository
- Create a feature branch
- Add/update validation for new providers
- Test with npm run validate:detailed
- Submit a pull request
📞 Support
- Configuration Issues: Use npm run validate:detailed for diagnostics
- Provider Support: Check compatibility matrix above
- Build Problems: See troubleshooting section
- Feature Requests: You can open GitHub issues, but they're unlikely to get attention anytime soon. Pull requests are much preferred!
Ready to get started? Copy an example config, run npm run validate:detailed, and start making AI-powered voice calls! 🚀