Document Retrieval MCP Server
A Model Context Protocol (MCP) server that enables AI agents to search and retrieve relevant document content from existing embeddings stored in a Supabase vector database. The server is retrieval-only: it embeds incoming queries (via OpenAI) but never generates or stores new document embeddings.
Overview
The Document Retrieval MCP Server queries pre-generated document embeddings stored in Supabase PostgreSQL with the pgvector extension. It provides semantic search capabilities for AI agents to find relevant document chunks based on similarity to query text.
Features
🔍 Semantic Search: Query documents using natural language with OpenAI embeddings
📄 Document Context Retrieval: Get full document content or specific chunks
📋 Document Listing: Browse available documents with pagination
🔗 Similarity Matching: Find related chunks based on existing embeddings
🚀 High Performance: Connection pooling, TTL caching, and optimized vector queries
🔒 Multi-tenant Security: User/session/project-based access control
🎯 MCP Protocol Compliant: Full compatibility with Claude Desktop and other MCP clients
Prerequisites
Python 3.10 or higher
Supabase project with pgvector extension enabled
Existing document embeddings in Supabase (generated using OpenAI text-embedding-3-small)
OpenAI API key for query embedding generation
Supabase service role key
Database Schema
The server expects the following tables in your Supabase database:
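The DDL itself is not reproduced here; the sketch below shows a plausible shape consistent with the description above — a documents table plus a chunks table carrying a 1536-dimension vector column (the output size of text-embedding-3-small). Table and column names are illustrative assumptions, not the server's actual schema:

```sql
-- Illustrative schema sketch; actual table/column names may differ.
create extension if not exists vector;

create table documents (
    id uuid primary key default gen_random_uuid(),
    user_id text not null,
    session_id text not null,
    project_id text not null default '-',
    title text,
    created_at timestamptz default now()
);

create table document_chunks (
    id uuid primary key default gen_random_uuid(),
    document_id uuid references documents(id) on delete cascade,
    content text not null,
    embedding vector(1536),  -- text-embedding-3-small dimension
    chunk_index int
);

-- IVFFlat index for fast approximate cosine-similarity search
create index on document_chunks using ivfflat (embedding vector_cosine_ops);
```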
Installation
Option 1: Install from source
Option 2: Install as package
Configuration
Copy the environment template:
Edit `.env` with your configuration:
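The template contents are not shown here; a minimal `.env` sketch would look something like the following. The variable names are assumptions — check the shipped `.env.example` for the exact names:

```
# Illustrative .env sketch; variable names are assumptions
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_SERVICE_ROLE_KEY=your-service-role-key
OPENAI_API_KEY=your-openai-api-key
LOG_LEVEL=INFO
```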
Usage
Running the Server
Claude Desktop Integration
Add the following to your Claude Desktop configuration (`~/Library/Application Support/Claude/claude_desktop_config.json` on macOS):
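The configuration snippet was omitted; a typical entry follows Claude Desktop's `mcpServers` convention. The `command`/`args` values (and the module name `document_retrieval_mcp`) are assumptions — substitute the actual entry point from your installation:

```json
{
  "mcpServers": {
    "document-retrieval": {
      "command": "python",
      "args": ["-m", "document_retrieval_mcp"],
      "env": {
        "SUPABASE_URL": "https://your-project.supabase.co",
        "SUPABASE_SERVICE_ROLE_KEY": "your-service-role-key",
        "OPENAI_API_KEY": "your-openai-api-key"
      }
    }
  }
}
```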
Available Tools
1. search_documents
Search for documents using semantic similarity.
Parameters:
- `query` (string, required): Search query text
- `user_id` (string, required): User identifier
- `session_id` (string, required): Session identifier
- `project_id` (string, optional): Project filter (default: `"-"`)
- `top_k` (integer, optional): Number of results (default: 5, max: 20)
- `similarity_threshold` (float, optional): Minimum similarity (default: 0.7)
Example:
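The original example was omitted; a `search_documents` call over MCP's JSON-RPC `tools/call` method might look like this (argument values are illustrative):

```json
{
  "method": "tools/call",
  "params": {
    "name": "search_documents",
    "arguments": {
      "query": "What were the key findings on customer churn?",
      "user_id": "user-123",
      "session_id": "session-456",
      "top_k": 5,
      "similarity_threshold": 0.7
    }
  }
}
```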
2. get_document_context
Retrieve full document or specific chunks.
Parameters:
- `document_id` (string, required): Document UUID
- `user_id` (string, required): User identifier
- `session_id` (string, required): Session identifier
- `chunk_ids` (array, optional): Specific chunk UUIDs to retrieve
Example:
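The original example was omitted; a `get_document_context` call might look like this (the UUID and identifiers are illustrative):

```json
{
  "method": "tools/call",
  "params": {
    "name": "get_document_context",
    "arguments": {
      "document_id": "550e8400-e29b-41d4-a716-446655440000",
      "user_id": "user-123",
      "session_id": "session-456"
    }
  }
}
```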
3. list_user_documents
List all documents accessible to the user.
Parameters:
- `user_id` (string, required): User identifier
- `session_id` (string, required): Session identifier
- `project_id` (string, optional): Project filter
- `page` (integer, optional): Page number (default: 1)
- `per_page` (integer, optional): Items per page (default: 20, max: 100)
Example:
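The original example was omitted; a `list_user_documents` call might look like this (identifiers are illustrative):

```json
{
  "method": "tools/call",
  "params": {
    "name": "list_user_documents",
    "arguments": {
      "user_id": "user-123",
      "session_id": "session-456",
      "page": 1,
      "per_page": 20
    }
  }
}
```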
4. get_similar_chunks
Find chunks similar to a reference chunk.
Parameters:
- `chunk_id` (string, required): Reference chunk UUID
- `user_id` (string, required): User identifier
- `session_id` (string, required): Session identifier
- `top_k` (integer, optional): Number of results (default: 3, max: 10)
Example:
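The original example was omitted; a `get_similar_chunks` call might look like this (the UUID and identifiers are illustrative):

```json
{
  "method": "tools/call",
  "params": {
    "name": "get_similar_chunks",
    "arguments": {
      "chunk_id": "660e8400-e29b-41d4-a716-446655440111",
      "user_id": "user-123",
      "session_id": "session-456",
      "top_k": 3
    }
  }
}
```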
Resources
The server provides two informational resources:
1. resource://server-info
Returns current server status and configuration.
2. resource://schema-info
Returns database schema information.
Usage Examples with Claude
After configuring the server in Claude Desktop, you can use natural language:
Performance Optimization
The server implements several optimization strategies:
Connection Pooling: Maintains 5-20 database connections
TTL Caching: Caches metadata for 5 minutes
Vector Indexes: Uses IVFFlat indexes for fast similarity search
Query Optimization: Efficient SQL with proper WHERE clauses
Async Operations: Full async/await for concurrent requests
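The TTL-caching strategy above can be sketched in a few lines of Python. This is an illustrative stand-in, not the server's actual implementation (production code would more likely use a library such as cachetools):

```python
import time


class TTLCache:
    """Minimal TTL cache sketch: entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float = 300):  # 5-minute default, as above
        self.ttl = ttl_seconds
        self._store: dict = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() > expires_at:
            del self._store[key]  # evict stale entry on read
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)


# Usage: cache document metadata so repeated lookups skip the database
cache = TTLCache(ttl_seconds=300)
cache.set("doc-meta:123", {"title": "Quarterly report"})
print(cache.get("doc-meta:123"))
```

Expiring on read keeps the implementation lock-free for single-threaded async use; a background sweeper would be needed only if memory growth from never-read keys became a concern.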
Security
Service Role Key: Bypasses RLS for performance
Application-Level Security: WHERE clauses filter by user/session/project
Input Validation: JSON Schema validation on all parameters
Error Sanitization: No sensitive data in error messages
Environment Variables: Secrets never hardcoded
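The application-level filtering described above can be sketched as follows. The column names are assumptions for illustration; the important property is that tenant values travel as bind parameters (`$1`, `$2`, ...) rather than being interpolated into the SQL string, which rules out SQL injection:

```python
def build_chunk_filter(user_id: str, session_id: str, project_id: str = "-"):
    """Return a parameterized WHERE clause and its bind values.

    Column names are illustrative; tenant values are always passed as
    bind parameters, never formatted into the SQL text.
    """
    clause = "WHERE user_id = $1 AND session_id = $2 AND project_id = $3"
    return clause, [user_id, session_id, project_id]


clause, params = build_chunk_filter("user-123", "session-456")
print(clause)  # → WHERE user_id = $1 AND session_id = $2 AND project_id = $3
print(params)  # → ['user-123', 'session-456', '-']
```

Because the service role key bypasses RLS, every query path must go through a filter like this; forgetting it on even one query would expose other tenants' data.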
Troubleshooting
Common Issues
"Missing required environment variables"
Ensure all required variables are set in `.env` or the environment
"Connection pool timeout"
Check Supabase URL and API key
Verify network connectivity
"No results above similarity threshold"
Lower the similarity_threshold parameter
Ensure embeddings exist for the user/session
"Document not found or access denied"
Verify user_id and session_id match existing records
Check document_id is valid
Logging
Enable debug logging by setting:
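The variable to set was omitted from the original text; a common convention (and an assumption here — check the server's config module for the exact name) is:

```shell
# LOG_LEVEL is an assumed variable name; verify against the server's config
export LOG_LEVEL=DEBUG
```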
Development
Running Tests
Code Quality
Architecture
The server follows a layered architecture:
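The layers themselves are not enumerated here; a plausible breakdown, inferred from the feature list above (MCP tools, caching and pooling, pgvector queries), would be:

```
MCP Protocol Layer   — tool/resource registration, JSON Schema input validation
Service Layer        — search, context retrieval, TTL caching
Data Access Layer    — connection pool, parameterized pgvector queries
Supabase             — PostgreSQL + pgvector
```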
Contributing
Contributions are welcome! Please:
Fork the repository
Create a feature branch
Add tests for new functionality
Ensure all tests pass
Submit a pull request
License
MIT License - see LICENSE file for details.
Support
For issues and questions:
GitHub Issues: https://github.com/your-org/document-retrieval-mcp/issues
Documentation: https://docs.your-org.com/mcp/document-retrieval
Acknowledgments
Built with the Model Context Protocol
Embeddings by OpenAI