# TASK-007: Implement Incremental Indexing

## Status: COMPLETE

## Description

Implement incremental indexing to improve performance for large codebases. Instead of re-indexing all files on startup, only process files that have been added, modified, or deleted since the last indexing operation.

## Implementation Details

- [x] Add file metadata tracking (hashes, modification times, sizes)
- [x] Implement efficient file change detection
- [x] Add incremental indexing mode (default) with full reindex option
- [x] Update MCP interface to expose reindexing controls
- [x] Add CLI option for forcing full reindex
- [x] Handle deleted files properly

## Technical Notes

The implementation uses the following approach:

1. **File Metadata Tracking**:
   - Store file hashes, modification times, and sizes in state file
   - Compute SHA-256 hash for text files (with size limit for performance)
   - For large files, use size+mtime combination as a proxy for content hash

2. **Change Detection Algorithm**:
   - On startup, scan all files in project directory
   - Compare against previously indexed files to find:
     - New files to add
     - Modified files to update (based on hash/mtime/size)
     - Deleted files to remove
   - Only process files that need updating

3. **Performance Optimizations**:
   - Skip hash computation for large files (>10MB)
   - Process files in parallel with thread pool
   - Early filtering of ignored files and directories
   - Efficient state file management

4. **API Enhancements**:
   - Added `trigger_reindex` MCP command to force reindexing
   - Added `get_indexing_status` MCP command to check progress
   - Added proper metadata handling in vector storage

5. **CLI Options**:
   - Added `--force-reindex` flag to command line interface
   - Incremental indexing is enabled by default

## Benefits

- Significantly faster startup times for large codebases
- Reduced system resource usage during indexing
- Better handling of large files
- More accurate change detection
- User control over indexing behavior

## Limitations

- File change detection outside the application requires file system events
- Large binary files still rely on mtime/size instead of content hash
- State file can grow large for repositories with many files

## Related Tasks

- TASK-006: Project Initialization Process
- TASK-018: Beta Release Preparation

## References

- [Qdrant Client Documentation](https://qdrant.github.io/qdrant/docs/reference/api/)
- [File Hashing Best Practices](https://en.wikipedia.org/wiki/Hash_function)
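
The metadata tracking and change detection steps described in the Technical Notes can be illustrated with a minimal sketch. This is not the project's actual code: the names `file_fingerprint`, `is_modified`, `scan`, and `detect_changes` are hypothetical, the state is assumed to be a plain dictionary keyed by relative path, and the 10 MB threshold is taken from the notes above. Ignore rules, the thread pool, and state file persistence are left out.

```python
import hashlib
from pathlib import Path

# Assumed threshold, mirroring the ">10MB" rule in the notes above.
HASH_SIZE_LIMIT = 10 * 1024 * 1024


def file_fingerprint(path: Path) -> dict:
    """Collect size, mtime and (for small enough files) a SHA-256 hash."""
    st = path.stat()
    meta = {"size": st.st_size, "mtime": st.st_mtime, "hash": None}
    if st.st_size <= HASH_SIZE_LIMIT:
        digest = hashlib.sha256()
        with path.open("rb") as f:
            # Read in chunks so the whole file is never held in memory.
            for chunk in iter(lambda: f.read(65536), b""):
                digest.update(chunk)
        meta["hash"] = digest.hexdigest()
    return meta


def is_modified(old: dict, new: dict) -> bool:
    """Compare hashes when both exist, else fall back to size+mtime."""
    if old["hash"] is not None and new["hash"] is not None:
        return old["hash"] != new["hash"]
    return (old["size"], old["mtime"]) != (new["size"], new["mtime"])


def scan(root: Path) -> dict:
    """Map each file's path (relative to root) to its fingerprint."""
    return {
        p.relative_to(root).as_posix(): file_fingerprint(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def detect_changes(previous: dict, current: dict):
    """Return (added, modified, deleted) path lists between two scans."""
    prev_paths, cur_paths = set(previous), set(current)
    added = sorted(cur_paths - prev_paths)
    deleted = sorted(prev_paths - cur_paths)
    modified = sorted(
        p for p in prev_paths & cur_paths
        if is_modified(previous[p], current[p])
    )
    return added, modified, deleted
```

In this sketch, an indexer would load the previous scan from its state file, call `scan` on the project directory, and pass both to `detect_changes`. Only the `added` and `modified` paths would then be re-embedded, and the `deleted` paths removed from vector storage.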
