Enables deployment of the MCP server in a containerized environment with Docker Compose for easy setup and isolation
Provides Git-based change tracking for efficient incremental re-indexing of code repositories
Allows scanning and analyzing GitHub repositories, including private repositories with access tokens, for code analysis and search capabilities
Supports generating context maps in Mermaid format for visualizing domain relationships and bounded contexts
Provides a reproducible development environment using Nix with flakes for consistent tooling and dependencies
Leverages OpenAI embeddings for semantic code search and understanding, enabling natural language queries for code
Uses PostgreSQL with pgvector extension for efficient vector storage and similarity search capabilities
Integrates pre-commit hooks for automatic code quality checks during development
Offers full support for Python code analysis with plans to add more languages in the future
Incorporates Ruff for code linting as part of the development workflow
Supports generating context maps in PlantUML format for visualizing domain relationships and bounded contexts
MCP Code Analysis Server
An intelligent MCP (Model Context Protocol) server that provides advanced code analysis and search capabilities for large codebases. Built as a pure FastMCP implementation, it uses TreeSitter for parsing, PostgreSQL with pgvector for vector storage, and OpenAI embeddings for semantic search.
Features
- 🔍 Semantic Code Search: Natural language queries to find relevant code
- 🏛️ Domain-Driven Analysis: Extract business entities and bounded contexts using an LLM
- 📊 Code Structure Analysis: Hierarchical understanding of modules, classes, and functions
- 🔄 Incremental Updates: Git-based change tracking for efficient re-indexing
- 🎯 Smart Code Explanations: AI-powered explanations with context aggregation
- 🔗 Dependency Analysis: Understand code relationships and dependencies
- 🌐 Knowledge Graph: Build semantic graphs with community detection (Leiden algorithm)
- 💡 DDD Refactoring: Domain-Driven Design suggestions and improvements
- 🚀 High Performance: Handles codebases with millions of lines of code
- 🐍 Python Support: Full support for Python with more languages coming
MCP Tools Available
Core Search Tools
- search_code: Search for code using natural language queries with semantic understanding
- find_definition: Find where symbols (functions, classes, modules) are defined
- find_similar_code: Find code patterns similar to a given snippet using vector similarity
- get_code_structure: Get the hierarchical structure of a code file
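To make "vector similarity" concrete, here is a minimal in-memory sketch of ranking indexed snippets by cosine similarity to a query embedding. It is only an illustration: the server delegates this work to pgvector inside PostgreSQL, and the names `cosine_similarity`, `rank_snippets`, and `INDEX`, as well as the three-dimensional toy vectors, are assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_snippets(query_vec, indexed, top_k=2):
    """Return the top_k (name, score) pairs, most similar first."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in indexed.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy embeddings; real embeddings have hundreds or thousands of dimensions.
INDEX = {
    "auth.login": [0.9, 0.1, 0.0],
    "db.connect": [0.0, 0.2, 0.9],
    "auth.logout": [0.8, 0.3, 0.1],
}

results = rank_snippets([1.0, 0.0, 0.0], INDEX)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

A database-backed search would run the same comparison as a query over stored embeddings rather than looping in Python.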
Code Analysis Tools
- explain_code: Get hierarchical explanations of code elements (modules, classes, functions)
- suggest_refactoring: Get AI-powered refactoring suggestions for code improvements
- analyze_dependencies: Analyze dependencies and relationships between code entities
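As a rough idea of what import-level dependency analysis involves, the sketch below collects the modules each file imports and answers "which files import module X?". It uses Python's standard `ast` module as a stand-in for the TreeSitter parsing the server relies on; the helper names and the `SOURCES` sample are hypothetical.

```python
import ast


def imported_modules(source):
    """Return the set of absolute module names imported by a piece of Python source."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            # Purely relative imports ("from . import x") have no module name and are skipped.
            modules.add(node.module)
    return modules


def files_importing(target, sources):
    """Return the sorted paths whose source imports `target` or one of its submodules."""
    return sorted(
        path
        for path, src in sources.items()
        if any(m == target or m.startswith(target + ".") for m in imported_modules(src))
    )


# Hypothetical in-memory "repository": path -> file contents.
SOURCES = {
    "app/service.py": "import utils\nfrom db import pool\n",
    "app/models.py": "from dataclasses import dataclass\n",
    "app/cli.py": "from utils.text import slugify\n",
}

importers = files_importing("utils", SOURCES)
print(importers)
```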
Repository Management Tools
- sync_repository: Manually trigger synchronization for a specific repository
Domain-Driven Design Analysis Tools
- extract_domain_model: Extract domain entities and relationships using LLM analysis
- find_aggregate_roots: Find aggregate roots in the codebase using domain analysis
- analyze_bounded_context: Analyze a bounded context and its relationships
- suggest_ddd_refactoring: Suggest Domain-Driven Design refactoring improvements
- find_bounded_contexts: Find all bounded contexts in the codebase
- generate_context_map: Generate context maps (JSON, Mermaid, PlantUML)
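For readers unfamiliar with Mermaid context maps, the following sketch renders a list of context relationships as a Mermaid flowchart. The output shape that generate_context_map actually produces is not documented here, so the `to_mermaid` function, the tuple layout, and the sample contexts are all assumptions.

```python
def to_mermaid(relationships):
    """Render (upstream, downstream, label) tuples as a left-to-right Mermaid graph."""
    lines = ["graph LR"]
    for upstream, downstream, label in relationships:
        lines.append(f"    {upstream} -->|{label}| {downstream}")
    return "\n".join(lines)


# Hypothetical bounded contexts and relationship labels.
RELATIONSHIPS = [
    ("Billing", "Orders", "supplier"),
    ("Identity", "Orders", "conformist"),
]

diagram = to_mermaid(RELATIONSHIPS)
print(diagram)
```

The printed text can be pasted into any Mermaid renderer to view the graph.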
Advanced Analysis Tools
- analyze_coupling: Analyze coupling between bounded contexts with metrics
- suggest_context_splits: Suggest how to split large bounded contexts
- detect_anti_patterns: Detect DDD anti-patterns (anemic models, god objects, etc.)
- analyze_domain_evolution: Track domain model changes over time
- get_domain_metrics: Get comprehensive domain health metrics and insights
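The README does not say which coupling metrics the server reports. As one plausible example, the sketch below computes afferent coupling (incoming dependencies), efferent coupling (outgoing dependencies), and the classic instability ratio Ce / (Ca + Ce) per context. The function name, the input format, and the sample contexts are assumptions.

```python
from collections import defaultdict


def coupling_metrics(dependencies):
    """Compute afferent/efferent coupling and instability for each context.

    `dependencies` is an iterable of (source_context, target_context) pairs,
    meaning "source depends on target".
    """
    efferent = defaultdict(set)  # context -> contexts it depends on
    afferent = defaultdict(set)  # context -> contexts that depend on it
    contexts = set()
    for source, target in dependencies:
        contexts.update((source, target))
        if source != target:  # ignore self-dependencies
            efferent[source].add(target)
            afferent[target].add(source)

    metrics = {}
    for ctx in sorted(contexts):
        ce, ca = len(efferent[ctx]), len(afferent[ctx])
        total = ca + ce
        metrics[ctx] = {
            "afferent": ca,
            "efferent": ce,
            "instability": ce / total if total else 0.0,
        }
    return metrics


# Hypothetical dependencies between bounded contexts.
DEPENDENCIES = [("Orders", "Billing"), ("Orders", "Identity"), ("Billing", "Identity")]

metrics = coupling_metrics(DEPENDENCIES)
for name, values in metrics.items():
    print(name, values)
```

An instability near 1.0 marks a context that depends on others but has no dependents; a value near 0.0 marks one that many others rely on.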
Quick Start
Prerequisites
- Docker and Docker Compose
- OpenAI API key (for semantic search capabilities)
- Nix with flakes (recommended for development)
Docker Deployment (Recommended)
The easiest way to get started is to use Docker Compose, which provides a complete, isolated environment with PostgreSQL and pgvector.
- Clone the repository:
- Set up environment variables:
- Configure repositories:
Create a config.yaml file to specify which repositories to track:
- Start the services with Docker Compose:
This will:
- Start PostgreSQL with pgvector extension
- Build and start the MCP Code Analysis Server
- Initialize the database with required schemas
- Begin scanning configured repositories automatically
The server runs as a pure MCP implementation and can be accessed via any MCP-compatible client.
Development Environment (Local)
For development work, use the Nix development environment, which provides all necessary tools and dependencies:
The Nix environment includes:
- Python 3.11 with all dependencies
- Code formatting tools (black, isort)
- Linters (ruff, pylint, bandit)
- Type checker (mypy)
- Dead code detection (vulture)
- Test runner (pytest)
- Pre-commit hooks
Configuration
Edit config.yaml to customize:
Usage Examples
Using the MCP Tools
Once the server is running, you can use the tools via any MCP client:
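Under the hood, an MCP client invokes a tool by sending a JSON-RPC 2.0 request with the method `tools/call`. The sketch below only builds such a message for search_code; it does not talk to a server. The `query` argument name is an assumption, since the tool's input schema is not shown in this README, and `build_tool_call` is a hypothetical helper.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# "query" is an assumed parameter name; check the tool's published schema.
request = build_tool_call(
    1, "search_code", {"query": "functions that handle authentication"}
)
payload = json.dumps(request, indent=2)
print(payload)
```

In practice an MCP client library constructs and sends this message for you, after first listing the available tools and their schemas.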
With Claude Desktop
Configure the MCP server in your Claude Desktop settings:
For stdio mode (when running locally):
For HTTP mode (when using Docker):
Then in Claude Desktop:
- "Search for functions that handle authentication"
- "Show me the implementation of the UserService class"
- "Find all usages of the database connection pool"
- "What files import the utils module?"
Development
Running Tests
Code Quality
The project uses comprehensive code quality tools integrated into the Nix development environment:
Pre-commit Hooks
Install the pre-commit hooks for automatic code quality checks:
Architecture
The server consists of several key components:
- Scanner Module: Monitors and synchronizes Git repositories with incremental updates
- Parser Module: Extracts code structure using TreeSitter for accurate AST parsing
- Embeddings Module: Generates semantic embeddings via OpenAI for vector search
- Database Module: PostgreSQL with pgvector extension for efficient vector storage
- Query Module: Processes natural language queries and symbol lookup
- MCP Server: Pure FastMCP implementation exposing code analysis tools
- Domain Module: Extracts domain entities and relationships for DDD analysis
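The incremental update performed by the Scanner Module can be pictured as follows: given the output of `git diff --name-status` between the last indexed commit and the current one, decide which files to re-index and which to drop. This is a sketch under that assumption; the real scanner may use a Git library instead of parsing command output, and `classify_changes` is a hypothetical name.

```python
def classify_changes(name_status_output):
    """Split `git diff --name-status` output into files to (re)index and files to remove.

    Each line is tab-separated: a status code, then one path, or two paths
    (old, new) for renames (R<score>) and copies (C<score>).
    """
    to_index, to_remove = set(), set()
    for line in name_status_output.splitlines():
        if not line.strip():
            continue
        status, *paths = line.split("\t")
        if status.startswith("D"):
            to_remove.add(paths[0])
        elif status.startswith(("R", "C")):
            old_path, new_path = paths
            if status.startswith("R"):
                to_remove.add(old_path)  # a rename retires the old path
            to_index.add(new_path)
        else:
            # Added (A), modified (M), and other statuses: index the file again.
            to_index.add(paths[0])
    return to_index, to_remove


# Hypothetical diff output between two commits.
SAMPLE = (
    "M\tsrc/app.py\n"
    "A\tsrc/new_module.py\n"
    "D\tsrc/old.py\n"
    "R100\tsrc/a.py\tsrc/b.py\n"
)

to_index, to_remove = classify_changes(SAMPLE)
print(sorted(to_index), sorted(to_remove))
```

Only the files in `to_index` need to be parsed and embedded again, which is what keeps updates cheap compared with a full scan.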
Performance
- Initial indexing: ~1000 files/minute with parallel processing
- Incremental updates: <10 seconds for 100 changed files using Git tracking
- Query response: <2 seconds for semantic search with pgvector
- Scalability: Supports codebases up to 10M+ lines of code
- Memory efficiency: Optimized database sessions and batch processing
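For capacity planning, the figures above can be turned into back-of-the-envelope estimates. The sketch assumes throughput scales linearly with file count, which the README does not claim, and it treats the "<10 seconds for 100 changed files" figure as an upper bound. Both function names are hypothetical.

```python
def estimate_initial_indexing_minutes(file_count, files_per_minute=1000):
    """Estimated minutes for a first full index at the quoted ~1000 files/minute."""
    return file_count / files_per_minute


def estimate_incremental_seconds(changed_files, seconds_per_100_files=10):
    """Upper-bound seconds for an incremental update, assuming linear scaling."""
    return changed_files / 100 * seconds_per_100_files


initial = estimate_initial_indexing_minutes(25_000)
incremental = estimate_incremental_seconds(250)
print(f"Initial index: about {initial:.0f} minutes")
print(f"Incremental update: under about {incremental:.0f} seconds")
```

Real numbers depend on file sizes, embedding API latency, and database load, so treat these as rough guides only.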
Contributing
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Author
Johann-Peter Hartmann
- Email: johann-peter.hartmann@mayflower.de
- GitHub: @johannhartmann
Key Technologies
- FastMCP: Pure MCP protocol implementation
- TreeSitter: Robust code parsing and AST generation
- pgvector: High-performance vector similarity search
- OpenAI Embeddings: Semantic understanding of code
- PostgreSQL: Reliable data persistence and complex queries
- Nix: Reproducible development environment
- Docker: Containerized deployment and isolation