# Performance Tuning Guide

This document provides recommendations for optimizing Files-DB-MCP performance in various environments and scenarios.

## Performance Overview

Files-DB-MCP performance is affected by several key factors:

1. **Embedding Model**: Size and complexity of the embedding model
2. **Hardware Resources**: CPU, GPU, memory availability
3. **Project Size**: Number and size of files being indexed
4. **Indexing Strategy**: Full vs. incremental indexing
5. **Query Complexity**: Number and complexity of search queries
6. **Configuration Settings**: Quantization, device selection, etc.

## Hardware Recommendations

### Minimum Requirements

- **CPU**: 2 cores
- **RAM**: 4GB (8GB recommended)
- **Disk**: 1GB + ~20% of project size
- **Network**: Low latency connection to clients

### Recommended for Large Codebases

- **CPU**: 4+ cores
- **RAM**: 16GB+
- **GPU**: NVIDIA GPU with 4GB+ VRAM or Apple Silicon with 8GB+ unified memory
- **Disk**: SSD with 5GB + ~20% of project size
- **Network**: Gigabit connection

## Embedding Model Selection

Choosing the right embedding model significantly impacts performance:

| Model | Size | Speed | Accuracy | Memory Usage | Recommendation |
|-------|------|-------|----------|--------------|----------------|
| `sentence-transformers/all-MiniLM-L6-v2` | Small | Very Fast | Good | Low | Resource-constrained environments |
| `BAAI/bge-small-en-v1.5` | Small | Fast | Good | Low | Development environments |
| `jinaai/jina-embeddings-v2-base-code` | Medium | Medium | Very Good | Medium | General code search (balanced) |
| `Alibaba-NLP/gte-large-en-v1.5` | Large | Slow | Excellent | High | Multilingual codebases |
| `Salesforce/SFR-Embedding-2_R` | Large | Slow | Superior | High | High-accuracy requirements |

### Performance Impact

| Model | CPU (i7) | GPU (RTX 3060) | Apple M1 |
|-------|----------|----------------|----------|
| Small models | ~200ms/file | ~50ms/file | ~80ms/file |
| Medium models | ~500ms/file | ~120ms/file | ~200ms/file |
| Large models | ~1.2s/file | ~250ms/file | ~400ms/file |

## Optimization Techniques

### 1. GPU Acceleration

Using GPU acceleration can provide a 4-10x performance improvement for embedding generation:

```bash
# Enable CUDA for NVIDIA GPUs
python -m src.main --model-config '{"device": "cuda"}'

# Enable MPS for Apple Silicon
python -m src.main --model-config '{"device": "mps"}'
```

### 2. Quantization

Quantization reduces model size and memory usage with minimal accuracy impact:

```bash
# Enable 8-bit quantization (recommended for most cases)
python -m src.main --model-config '{"quantization": "int8"}'

# Enable 4-bit quantization (for extreme memory constraints)
python -m src.main --model-config '{"quantization": "int4"}'
```

Memory usage reduction:

- 8-bit quantization: ~50% reduction
- 4-bit quantization: ~75% reduction

### 3. Incremental Indexing

Incremental indexing significantly improves performance for subsequent runs:

```bash
# Ensure incremental indexing is enabled (default)
python -m src.main

# Force full reindexing when needed
python -m src.main --force-reindex
```

Performance comparison:

- Full indexing: O(n) time complexity
- Incremental indexing: O(m) time complexity, where m is the number of changed files

### 4. Ignore Patterns

Properly configured ignore patterns reduce unnecessary file processing:

```bash
# Add custom ignore patterns
python -m src.main --ignore "*.log" "tmp/*" "build/*"
```
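The gain from ignore patterns comes from shrinking the set of files that ever reaches the embedding model. As a rough illustration only (not the project's actual implementation), a glob-based filter along the following lines is enough to drop excluded paths before indexing; the pattern list and walk logic here are assumptions for the sketch:

```python
# Illustrative sketch of glob-style ignore filtering (not Files-DB-MCP's
# actual code): paths matching any pattern are skipped before embedding.
import fnmatch
import os

IGNORE_PATTERNS = ["*.log", "tmp/*", "build/*", "node_modules/*", ".git/*"]

def is_ignored(rel_path: str) -> bool:
    """Return True if the project-relative path matches any ignore pattern."""
    return any(fnmatch.fnmatch(rel_path, pattern) for pattern in IGNORE_PATTERNS)

def files_to_index(project_root: str):
    """Walk the project and yield only the paths that survive the filter."""
    for dirpath, _dirnames, filenames in os.walk(project_root):
        for name in filenames:
            rel_path = os.path.relpath(os.path.join(dirpath, name), project_root)
            if not is_ignored(rel_path.replace(os.sep, "/")):
                yield rel_path

if __name__ == "__main__":
    print(sum(1 for _ in files_to_index(".")), "files selected for indexing")
```

Every path dropped at this stage is a file that never has to be read, chunked, or embedded, which is why broad ignore patterns pay off on large repositories.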
### 5. Search Optimization

Optimize search queries for better performance:

```json
{
  "function": "search_files",
  "parameters": {
    "query": "user authentication function",
    "limit": 5,
    "file_extensions": ["js", "ts"],
    "threshold": 0.7,
    "search_params": {
      "exact": false,
      "hnsw_ef": 128
    }
  }
}
```

| Parameter | Impact | Recommendation |
|-----------|--------|----------------|
| `limit` | Lower limits improve response time | 5-10 for typical searches |
| `threshold` | Higher thresholds reduce results but improve relevance | 0.65-0.75 for balanced performance |
| `file_extensions` | Filtering reduces search space | Always specify when file type is known |
| `hnsw_ef` | Controls search accuracy vs. speed | 64-128 for balanced performance |
| `exact` | Exact search is slower but more accurate | Use `false` for better performance |

## Environment-Specific Tuning

### Docker Environment

In Docker, configure container resources:

```yaml
services:
  files-db-mcp:
    image: files-db-mcp:latest
    deploy:
      resources:
        limits:
          cpus: '2'
          memory: 4G
        reservations:
          cpus: '1'
          memory: 2G
```

### CI/CD Environment

For CI/CD environments where speed is critical:

```bash
python -m src.main \
  --embedding-model "BAAI/bge-small-en-v1.5" \
  --model-config '{"device": "cpu", "quantization": "int8"}' \
  --disable-sse
```

### Development Environment

For development environments where fast startup is desired:

```bash
python -m src.main \
  --embedding-model "jinaai/jina-embeddings-v2-base-code" \
  --model-config '{"device": "auto", "quantization": "int8"}'
```

### Production Environment

For production environments where accuracy is prioritized:

```bash
python -m src.main \
  --embedding-model "Salesforce/SFR-Embedding-2_R" \
  --model-config '{"device": "cuda", "normalize_embeddings": true}'
```

## Benchmarking

You can benchmark your setup using the built-in tools:

```bash
python -m src.tools.benchmark --model "jinaai/jina-embeddings-v2-base-code" --files 1000
```

Typical benchmarks for a medium-sized project (10,000 files):

| Configuration | Initial Indexing | Incremental Update | Search Latency |
|---------------|------------------|--------------------|----------------|
| Small model, CPU | 15-25 minutes | 2-5 seconds/file | 100-300ms |
| Medium model, CPU | 30-45 minutes | 5-10 seconds/file | 200-500ms |
| Medium model, GPU | 8-15 minutes | 1-3 seconds/file | 100-250ms |
| Large model, GPU | 15-30 minutes | 2-5 seconds/file | 150-350ms |

## Memory Usage Optimization

For environments with limited memory:

1. **Use smaller models with quantization**:

   ```bash
   python -m src.main --embedding-model "BAAI/bge-small-en-v1.5" --model-config '{"quantization": "int8"}'
   ```

2. **Process files in smaller batches**:

   Edit `.files-db-mcp/config.json` to add:

   ```json
   "indexing_config": {
     "batch_size": 5,
     "max_workers": 2
   }
   ```

3. **Limit content size**:

   ```json
   "indexing_config": {
     "max_content_chars": 3000
   }
   ```

## Scaling for Large Codebases

For very large codebases (100K+ files):

1. **Use selective indexing**:

   ```bash
   python -m src.main --ignore "test/*" "docs/*" "*.md" "*.txt"
   ```

2. **Distribute across multiple instances** (a minimal launcher sketch follows):

   - Instance 1: `python -m src.main --project-path /path/to/project/frontend`
   - Instance 2: `python -m src.main --project-path /path/to/project/backend`
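   The sketch below shows one way to launch such a split. It only uses the `--project-path` and `--data-dir` flags documented in this guide; the `frontend`/`backend` layout and the per-instance data directories are illustrative assumptions, not a required structure.

   ```python
   # Hypothetical launcher: one Files-DB-MCP instance per sub-project, each
   # with its own data directory so the indexes do not interfere.
   import subprocess

   SUB_PROJECTS = [
       "/path/to/project/frontend",
       "/path/to/project/backend",
   ]

   processes = [
       subprocess.Popen([
           "python", "-m", "src.main",
           "--project-path", path,
           "--data-dir", f"{path}/.files-db-mcp",
       ])
       for path in SUB_PROJECTS
   ]

   # Block until both instances exit (Ctrl+C stops them together).
   for proc in processes:
       proc.wait()
   ```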
3. **Use vector database persistence**:

   ```bash
   python -m src.main --data-dir "/persistent/storage/.files-db-mcp"
   ```

## Performance Monitoring

Monitor system performance:

```bash
curl http://localhost:8000/metrics
```

Key metrics to monitor:

- Indexing rate (files/second)
- Search latency (ms)
- Memory usage (MB)
- CPU/GPU utilization (%)

## Troubleshooting Performance Issues

### High Memory Usage

1. Check if you're using a large model without quantization
2. Reduce batch size for indexing
3. Use a smaller model with quantization

### Slow Indexing

1. Check hardware resources (CPU/GPU utilization)
2. Verify ignore patterns to exclude unnecessary files
3. Consider using a smaller embedding model
4. Enable GPU acceleration if available

### Slow Search Performance

1. Add more specific filters to your queries
2. Reduce the value of `hnsw_ef` in search parameters
3. Use a lower search result limit
4. Consider increasing the similarity threshold

### High CPU/GPU Usage

1. Limit the number of concurrent searches
2. Adjust batch size and worker count
3. Use a smaller model with lower resource requirements

## Conclusion

Optimizing Files-DB-MCP performance involves balancing accuracy, resource usage, and response time. Start with the recommendations in this guide and adjust based on your specific requirements and environment constraints.

For most users, the auto-detected configuration will provide a good balance of performance and accuracy, but fine-tuning using the techniques in this guide can yield significant improvements for specific use cases.
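As a closing reference point, the `"device": "auto"` setting used in the examples above typically resolves to CUDA, MPS, or CPU based on what PyTorch reports at startup. The sketch below shows one common way this is done; it is an illustration of the general technique, not necessarily how Files-DB-MCP implements it.

```python
# Sketch of a typical "auto" device resolution for PyTorch-backed embedding
# models; Files-DB-MCP's own logic may differ.
import torch

def resolve_device(requested: str = "auto") -> str:
    """Map a requested device string to a concrete torch device name."""
    if requested != "auto":
        return requested                  # honour an explicit setting
    if torch.cuda.is_available():
        return "cuda"                     # NVIDIA GPU
    if hasattr(torch.backends, "mps") and torch.backends.mps.is_available():
        return "mps"                      # Apple Silicon (PyTorch 1.12+)
    return "cpu"

print(resolve_device())                   # e.g. "cuda" on a GPU machine
```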
