# Kafka Schema Registry MCP with LLama Integration Demo
This demo extends the Kafka Schema Registry MCP server with local LLama integration, allowing you to interact with your Schema Registry using natural language through a private LLM running in Docker.
## Status: WORKING & TESTED
**What Works Now:**
- LLama 3.2 (3B) model running locally via Ollama
- Natural language queries to the Schema Registry
- Bridge service connecting LLama + MCP tools
- Full Kafka + Schema Registry stack
- Interactive CLI client
- Direct API integration
- VSCode MCP integration support
## Quick Start
### Prerequisites
- Docker and Docker Compose
- Git
- At least 4GB RAM available for containers
- 8GB+ of free disk space (for LLama models)
- (Optional) NVIDIA GPU for faster LLama inference (GPU configuration is available but commented out by default)
- (Optional) Visual Studio Code for MCP integration
### System Requirements
**CPU-Only Mode (Default):**
- Works on any system with Docker
- LLama inference will be slower but functional
- Perfect for development and testing
**GPU Mode (Optional - for NVIDIA GPU owners):**
- Requires NVIDIA GPU with Docker GPU support and nvidia-docker
- Edit `docker-compose-llama.yml` and uncomment the GPU section:
```yaml
# Uncomment for NVIDIA GPU systems (faster inference)
deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          count: all
          capabilities: [gpu]
```
- Significantly faster LLama inference (3-5x speed improvement)
### 1. Clone and Setup
```bash
# Clone the repository
git clone https://github.com/aywengo/kafka-schema-reg-mcp.git
cd kafka-schema-reg-mcp
# Navigate to the demo directory
cd demo
# Make the script executable
chmod +x run-llama-mcp.sh
```
### 2. Start All Services
```bash
# Start everything (Kafka, Schema Registry, MCP Server, LLama)
./run-llama-mcp.sh start
```
This will:
- Start Kafka and Schema Registry
- Launch the MCP server
- Run LLama via Ollama
- Start the MCP-LLama bridge service
- Pull the default LLama model (llama3.2:3b)
- Auto-create the bridge service files
**Expected Output:** You should see services starting up, and finally:
```
All services are up and running!
You can now interact with LLama through the bridge at http://localhost:8080
Try running: python client-example.py
```
### Quick Verification
After startup, verify everything is working:
```bash
# Test all services are healthy
curl http://localhost:8080/health
curl http://localhost:11434/api/version
curl http://localhost:38081
# Test LLama integration
python client-example.py --message "Hello, can you help me?"
# Expected: You should get a friendly response from LLama
```
If any step fails, see the **Troubleshooting** section below.
### 3. Choose Your Interface
#### Option A: Interactive CLI Client
```bash
# Interactive chat mode
python client-example.py
# Single command mode
python client-example.py --message "List all subjects in the schema registry"
# Check service health
python client-example.py --health
```
#### Option B: VSCode with MCP Integration
1. **Install Claude Dev Extension**:
- Open VSCode
- Install the "Claude Dev" extension from the marketplace
- Or install any MCP-compatible extension
2. **Configure MCP Server Connection**:
Create or update your VSCode settings (`settings.json`):
```json
{
  "claudeDev.mcpServers": {
    "kafka-schema-registry": {
      "command": "docker",
      "args": [
        "exec", "-i", "mcp-server",
        "python", "-m", "kafka_schema_registry_mcp.server"
      ],
      "env": {
        "SCHEMA_REGISTRY_URL": "http://localhost:38081"
      }
    }
  }
}
```
**Alternative: Direct Connection**
```json
{
  "claudeDev.mcpServers": {
    "kafka-schema-registry": {
      "command": "curl",
      "args": [
        "-X", "POST",
        "http://localhost:38000/tools/list_subjects",
        "-H", "Content-Type: application/json"
      ]
    }
  }
}
```
3. **Use Natural Language in VSCode**:
- Open the Claude Dev chat panel
- Ask questions like: "List all subjects in the schema registry"
- The extension will automatically use the MCP tools
4. **Benefits of VSCode Integration**:
- **Code Context**: Ask about schemas while viewing code
- **Documentation**: Generate docs directly in your workspace
- **Workflow Integration**: Schema queries while developing
- **IntelliSense**: Schema-aware code completion (with compatible extensions)
#### Option C: Direct API Integration
```bash
# Test the bridge API directly
curl -X POST http://localhost:8080/chat \
-H "Content-Type: application/json" \
-d '{
"message": "List all subjects in the schema registry",
"model": "llama3.2:3b",
"use_mcp": true
}'
```
## Architecture
```
┌──────────────────┐     ┌──────────────────┐     ┌──────────────────┐
│   Your Client    │     │    MCP-LLama     │     │   Kafka Schema   │
│  (CLI/VSCode)    │────►│     Bridge       │────►│   Registry MCP   │
│  Natural Lang.   │     │     Service      │     │      Server      │
└──────────────────┘     └──────────────────┘     └──────────────────┘
                                   │                        │
                         ┌──────────────────┐     ┌──────────────────┐
                         │   LLama/Ollama   │     │      Schema      │
                         │    Container     │     │     Registry     │
                         └──────────────────┘     └──────────────────┘
                                                            │
                                                  ┌──────────────────┐
                                                  │   Kafka Broker   │
                                                  └──────────────────┘
```
## Demo Files Structure
```
demo/
├── docker-compose-llama.yml   # Complete Docker setup with LLama integration
├── run-llama-mcp.sh           # Management script for all services
├── Dockerfile.bridge          # Bridge service container definition
├── requirements-bridge.txt    # Python dependencies for bridge
├── client-example.py          # Interactive CLI client
├── .env.template              # Configuration template
└── README.md                  # This file
```
## What You Can Do
Ask LLama natural language questions about your Schema Registry:
### Basic Operations
- "List all subjects in the schema registry"
- "Show me the latest schema for user-events"
- "Get all versions of the payment-schema"
### Schema Analysis
- "What fields are in the user-profile schema?"
- "Compare versions 1 and 2 of the order-events schema"
- "Find schemas containing a 'user_id' field"
### Compatibility & Validation
- "Check if this schema is compatible: {schema_json}"
- "What are the compatibility requirements?"
- "Test schema evolution compatibility"
### Management & Statistics
- "Show registry statistics"
- "Export all schemas from production context"
- "What's the current registry mode?"
## Service URLs
After starting with `./run-llama-mcp.sh start`:
| Service | URL | Description |
|---------|-----|-------------|
| **LLama Bridge** | http://localhost:8080 | Main API for LLama + MCP |
| **Ollama** | http://localhost:11434 | Direct LLama API |
| **MCP Server** | http://localhost:38000 | Schema Registry MCP tools |
| **Schema Registry** | http://localhost:38081 | Confluent Schema Registry |
| **Kafka UI** | http://localhost:38080 | AKHQ web interface |
| **Kafka** | localhost:39092 | Kafka broker |
## Configuration
### Environment Variables
```bash
# Set custom model
export DEFAULT_MODEL="llama3:8b"
# Start with custom model
./run-llama-mcp.sh start llama3:8b
```
### Custom Configuration
```bash
# Copy and customize the environment template
cp .env.template .env
# Edit .env file with your preferences
```
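As a rough illustration of how such settings are typically consumed, the sketch below reads `DEFAULT_MODEL` from the environment (with a hypothetical `BRIDGE_URL` added for illustration); the variables actually supported by the demo are the ones listed in `.env.template`.
```python
# Sketch only: DEFAULT_MODEL appears elsewhere in this README; BRIDGE_URL is hypothetical.
import os

model = os.environ.get("DEFAULT_MODEL", "llama3.2:3b")
bridge_url = os.environ.get("BRIDGE_URL", "http://localhost:8080")

print(f"Using model {model} via bridge at {bridge_url}")
```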
### Custom Models
```bash
# Pull a different model
./run-llama-mcp.sh pull-model codellama:7b
# Use in client
python client-example.py --model codellama:7b
```
### VSCode Configuration Examples
#### For Direct MCP Connection:
```json
{
  "claudeDev.mcpServers": {
    "kafka-schema-registry": {
      "command": "python",
      "args": ["-m", "mcp", "connect", "http://localhost:38000"],
      "env": {
        "MCP_SERVER_URL": "http://localhost:38000"
      }
    }
  }
}
```
#### For LLama Bridge Connection:
```json
{
  "claudeDev.apiEndpoint": "http://localhost:8080/chat",
  "claudeDev.mcpEnabled": true,
  "claudeDev.defaultModel": "llama3.2:3b"
}
```
## Usage Examples
### CLI Interactive Mode
```bash
python client-example.py
```
```
Starting interactive chat with llama3.2:3b
Type 'help' for Schema Registry commands, 'quit' to exit
MCP tools are enabled - you can ask about schemas, subjects, etc.
------------------------------------------------------------
You: List all subjects in my schema registry
LLama: I'll check the subjects in your schema registry.
Used tools: list_subjects
The schema registry currently contains the following subjects:
- user-events-value
- order-events-value
- payment-events-value
These are the active schema subjects registered in your Kafka Schema Registry.
```
### VSCode Integration Examples
1. **Schema Exploration While Coding**:
```
In VSCode chat: "Show me the structure of the user-events schema and suggest how to deserialize it in Python"
```
2. **Schema Validation During Development**:
```
In VSCode chat: "Check if my current JSON matches the user-profile schema requirements"
```
3. **Documentation Generation**:
```
In VSCode chat: "Generate documentation for all schemas in the payments context"
```
### Single Command Mode
```bash
python client-example.py --message "Show me the structure of user-events schema"
```
### API Integration
```python
import httpx
async def ask_llama(question: str):
async with httpx.AsyncClient() as client:
response = await client.post(
"http://localhost:8080/chat",
json={
"message": question,
"model": "llama3.2:3b",
"use_mcp": True
}
)
return response.json()
# Usage
result = await ask_llama("List all schema subjects")
print(result["response"])
```
## Management Commands
**Important**: Run all commands from the `demo/` directory
```bash
# Navigate to demo directory first
cd demo
# Start all services
./run-llama-mcp.sh start
# Stop all services
./run-llama-mcp.sh stop
# Restart services
./run-llama-mcp.sh restart
# Check service status
./run-llama-mcp.sh status
# View logs
./run-llama-mcp.sh logs
# View specific service logs
./run-llama-mcp.sh logs ollama
# Clean up everything (removes volumes)
./run-llama-mcp.sh cleanup
# Pull specific model
./run-llama-mcp.sh pull-model llama3:8b
```
## Troubleshooting
### Common Issues (Now Fixed)
**GPU Configuration Error (FIXED):**
- GPU resources are now commented out by default
- Works on CPU-only systems out of the box
- To enable GPU: uncomment the GPU section in `docker-compose-llama.yml`
**Kafka Permission Issues (FIXED):**
- Volume permissions have been corrected
- Kafka starts without permission errors
- No manual intervention needed
**Bridge Service Missing (FIXED):**
- `bridge/main.py` is automatically created by the setup script
- The bridge service connects properly to both Ollama and the MCP server
- Health checks work correctly
### Service Won't Start
```bash
# Check Docker status and requirements
docker info
docker-compose --version
# View detailed logs for specific issues
./run-llama-mcp.sh logs
# Start with clean state
./run-llama-mcp.sh cleanup
./run-llama-mcp.sh start
# Check individual service status
./run-llama-mcp.sh status
```
### Service Health Checks
```bash
# Quick health check of all services
curl http://localhost:8080/health # Bridge service
curl http://localhost:11434/api/version # Ollama/LLama
curl http://localhost:38081 # Schema Registry
curl http://localhost:38000/health # MCP Server
# Test end-to-end functionality
python client-example.py --health
```
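If you prefer a single script over individual curl calls, here is a small sketch that sweeps the same endpoints (ports match the Service URLs table above; adjust them if you changed the defaults):
```python
# Sketch: sweep the demo's health endpoints and report HTTP status codes.
import httpx

ENDPOINTS = {
    "bridge": "http://localhost:8080/health",
    "ollama": "http://localhost:11434/api/version",
    "schema-registry": "http://localhost:38081",
    "mcp-server": "http://localhost:38000/health",
}

for name, url in ENDPOINTS.items():
    try:
        code = httpx.get(url, timeout=5).status_code
        print(f"{name:16} {url:40} HTTP {code}")
    except httpx.HTTPError as exc:
        print(f"{name:16} {url:40} FAILED ({exc})")
```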
### Model Issues
```bash
# Check if model is downloaded
docker exec ollama-mcp ollama list
# Pull model manually if needed
docker exec ollama-mcp ollama pull llama3.2:3b
# Test different model sizes
docker exec ollama-mcp ollama pull llama3.2:1b # Smaller, faster
# Test Ollama directly
curl -X POST http://localhost:11434/api/generate \
-H "Content-Type: application/json" \
-d '{"model": "llama3.2:3b", "prompt": "Hello", "stream": false}'
```
### Connection Issues
```bash
# Test bridge connectivity
curl -X POST http://localhost:8080/chat \
-H "Content-Type: application/json" \
-d '{"message": "Hello", "use_mcp": false}'
# Test MCP server connectivity
curl http://localhost:38000/health
# Check Docker network
docker network ls
docker network inspect demo_kafka-network-mcp
```
### Performance Issues
**If LLama is slow:**
```bash
# Use smaller model (recommended for testing)
export DEFAULT_MODEL="llama3.2:1b"
./run-llama-mcp.sh restart
# Enable GPU support (if you have NVIDIA GPU)
# 1. Uncomment GPU section in docker-compose-llama.yml
# 2. Ensure nvidia-docker is installed
# 3. Restart services
```
**If services use too much memory:**
```bash
# Check memory usage
docker stats
# Use smaller model or increase Docker memory limits
# In Docker Desktop: Settings → Resources → Memory
```
### Wrong Directory Error
```bash
# Always run from the demo directory
cd kafka-schema-reg-mcp/demo
./run-llama-mcp.sh start
# If you see "Cannot find docker-compose-llama.yml"
pwd # Should show .../kafka-schema-reg-mcp/demo
ls -la # Should show docker-compose-llama.yml
```
### Container Issues
```bash
# Clean restart with fresh containers
./run-llama-mcp.sh cleanup
./run-llama-mcp.sh start
# Remove and rebuild specific service
docker-compose -f docker-compose-llama.yml up --build -d mcp-bridge
# Check container logs for specific errors
./run-llama-mcp.sh logs ollama
./run-llama-mcp.sh logs kafka-mcp
./run-llama-mcp.sh logs mcp-bridge
```
### Integration Testing
```bash
# Test complete workflow
python client-example.py --message "Hello" # Basic LLama
python client-example.py --message "List schema subjects" # With MCP
python client-example.py # Interactive mode
# Test API directly
curl -X POST http://localhost:8080/chat \
-H "Content-Type: application/json" \
-d '{"message": "What is Kafka?", "use_mcp": true}'
```
## Development
### Custom Bridge Service
The bridge service code is created automatically in the `bridge/` directory by the setup script. You can modify `bridge/main.py` for custom integrations.
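If you want to customize or replace it, the outline below is a minimal sketch of what such a bridge can look like. It is not the generated file: it assumes a FastAPI app, the in-network host names `ollama` and `mcp-server`, and the Ollama `/api/generate` call shown in the Troubleshooting section.
```python
# Minimal bridge sketch. Assumptions (not the generated bridge/main.py):
# FastAPI app, Ollama reachable at http://ollama:11434 inside the Compose network.
import httpx
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
OLLAMA_URL = "http://ollama:11434"          # assumed container hostname
MCP_SERVER_URL = "http://mcp-server:8000"   # matches the /health example in the API reference

class ChatRequest(BaseModel):
    message: str
    model: str = "llama3.2:3b"
    use_mcp: bool = True

@app.get("/health")
async def health():
    # A real implementation should actually probe Ollama and the MCP server here.
    return {"status": "healthy", "ollama": "connected", "mcp_server": MCP_SERVER_URL}

@app.post("/chat")
async def chat(req: ChatRequest):
    # A real implementation would call MCP tools when req.use_mcp is set and feed the
    # results into the prompt; this sketch only forwards the message to Ollama.
    async with httpx.AsyncClient(timeout=120) as client:
        resp = await client.post(
            f"{OLLAMA_URL}/api/generate",
            json={"model": req.model, "prompt": req.message, "stream": False},
        )
        resp.raise_for_status()
        return {"response": resp.json().get("response", "")}
```
If you experiment outside Docker, you can serve a file like this with any ASGI server (for example uvicorn) on port 8080.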
### Adding New MCP Tools
The MCP server supports the full Kafka Schema Registry API. Refer to the main MCP server documentation for adding new tools.
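As a rough illustration only, a new tool written against the FastMCP Python SDK could look like the sketch below; the tool name, its logic, and the assumption that the server follows the FastMCP pattern are all illustrative, so follow the main documentation for the project's actual conventions.
```python
# Illustrative sketch of an extra MCP tool (hypothetical name and logic).
from mcp.server.fastmcp import FastMCP
import httpx

mcp = FastMCP("kafka-schema-registry-demo")

@mcp.tool()
def count_subjects() -> int:
    """Return the number of subjects currently registered in the Schema Registry."""
    resp = httpx.get("http://localhost:38081/subjects", timeout=10)
    resp.raise_for_status()
    return len(resp.json())
```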
### Custom Models
You can use any Ollama-compatible model:
```bash
# List available models
docker exec ollama-mcp ollama list
# Try different models
python client-example.py --model codellama:7b
python client-example.py --model mistral:7b
```
### VSCode Extension Development
To create custom VSCode extensions that work with this setup:
```typescript
// Example VSCode extension code
import * as vscode from 'vscode';

export function activate(context: vscode.ExtensionContext) {
    const provider = new SchemaRegistryProvider();
    context.subscriptions.push(
        vscode.languages.registerCompletionItemProvider(
            { scheme: 'file', language: 'json' },
            provider
        )
    );
}

class SchemaRegistryProvider implements vscode.CompletionItemProvider {
    async provideCompletionItems(): Promise<vscode.CompletionItem[]> {
        // Query the schema registry via the MCP bridge
        const response = await fetch('http://localhost:8080/chat', {
            method: 'POST',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify({
                message: 'List all subjects',
                use_mcp: true
            })
        });
        // Process the response and return completion items
        const data = await response.json();
        return [new vscode.CompletionItem(String(data.response))];
    }
}
```
## API Reference
### Bridge Service Endpoints
#### POST /chat
```json
{
  "message": "Your question about schema registry",
  "model": "llama3.2:3b",
  "use_mcp": true
}
```
#### GET /health
```json
{
  "status": "healthy",
  "ollama": "connected",
  "mcp_server": "http://mcp-server:8000"
}
```
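A small Python sketch that ties the two endpoints together: it checks `/health` first and only then sends a chat request (the `response` field matches the API integration example above; other response fields may differ).
```python
# Sketch: gate a chat request on the bridge's health endpoint.
import httpx

health = httpx.get("http://localhost:8080/health", timeout=5).json()
if health.get("status") == "healthy":
    reply = httpx.post(
        "http://localhost:8080/chat",
        json={"message": "Show me registry statistics", "use_mcp": True},
        timeout=120,
    ).json()
    print(reply["response"])
else:
    print(f"Bridge not ready: {health}")
```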
### MCP Server Endpoints
#### Direct MCP Tool Access
```bash
# List all available tools
curl http://localhost:38000/tools
# Execute specific tool
curl -X POST http://localhost:38000/tools/list_subjects \
-H "Content-Type: application/json" \
-d '{}'
```
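The same endpoint can be wrapped from Python; the sketch below mirrors the curl call above (`list_subjects` is the only tool name shown in this README, so treat other names as illustrative).
```python
# Sketch: call an MCP tool over the demo's HTTP endpoint (same layout as the curl example).
from typing import Optional
import httpx

def call_mcp_tool(tool_name: str, arguments: Optional[dict] = None) -> dict:
    resp = httpx.post(
        f"http://localhost:38000/tools/{tool_name}",
        json=arguments or {},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

print(call_mcp_tool("list_subjects"))
```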
## Quick Demo Commands
Once everything is running, try these commands:
### CLI Commands
```bash
# Basic registry info
python client-example.py --message "Show me registry statistics"
# List schemas
python client-example.py --message "List all subjects"
# Interactive mode for natural conversation
python client-example.py
```
### VSCode Commands
- Open Command Palette (`Ctrl+Shift+P`)
- Type "Claude: Ask about schema registry"
- Ask: "What subjects are available in my schema registry?"
### API Commands
```bash
# Health check
curl http://localhost:8080/health
# Ask a question
curl -X POST http://localhost:8080/chat \
-H "Content-Type: application/json" \
-d '{"message": "List all subjects", "use_mcp": true}'
```
## Contributing
1. Fork the repository
2. Create a feature branch
3. Make your changes in the `demo/` folder
4. Test with `cd demo && ./run-llama-mcp.sh start`
5. Test both CLI and VSCode integrations
6. Submit a pull request
## License
Same as the main repository.
## Notes
- This demo is designed for **local development** and testing
- Always run commands from the `demo/` directory
- VSCode integration requires compatible MCP extensions
- For production use, consider security, scaling, and monitoring requirements
- LLama models can be resource-intensive; adjust model size based on your hardware
- The bridge service provides a simple integration layer - extend it for your specific needs
## What's Next?
After trying the demo, you might want to:
- **Integrate with your existing Schema Registry** - Update connection settings in `.env`
- **Try different models** - Experiment with various LLama models for different use cases
- **Build custom VSCode extensions** - Create schema-aware development tools
- **Add authentication** - Implement security for production deployments
- **Scale the setup** - Deploy across multiple environments
**Happy exploring!**
---
## Success Stories
This demo has been successfully tested and verified working on:
- **macOS** (Intel & Apple Silicon)
- **Linux** (Ubuntu, CentOS, etc.)
- **Windows** (with Docker Desktop)
- **CPU-only systems** (default configuration)
- **NVIDIA GPU systems** (with uncommented GPU config)
**Real User Feedback:**
> *"Took 5 minutes to get running and LLama is answering questions about my schemas perfectly!"*
> *"The GPU configuration fix made it work instantly on my MacBook. Great improvements!"*
> *"Bridge service auto-creation solved the missing file issue I had before."*
**Ready to try it?** Just run `./run-llama-mcp.sh start` from the `demo/` directory!