Skip to main content
Glama

DataPilot MCP Server

CI/CD Pipeline Coverage Status Python Version License: MIT Code style: black Security: bandit Pre-commit

Navigate your data with AI guidance. A comprehensive Model Context Protocol (MCP) server for interacting with Snowflake using natural language and AI. Built with FastMCP 2.0 and OpenAI integration.

Features

šŸ—„ļø Core Database Operations

  • execute_sql - Execute SQL queries with results

  • list_databases - List all accessible databases

  • list_schemas - List schemas in a database

  • list_tables - List tables in a database/schema

  • describe_table - Get detailed table column information

  • get_table_sample - Retrieve sample data from tables

šŸ­ Warehouse Management

  • list_warehouses - List all available warehouses

  • get_warehouse_status - Get current warehouse, database, and schema status

šŸ¤– AI-Powered Features

  • natural_language_to_sql - Convert natural language questions to SQL queries

  • analyze_query_results - AI-powered analysis of query results

  • suggest_query_optimizations - Get optimization suggestions for SQL queries

  • explain_query - Plain English explanations of SQL queries

  • generate_table_insights - AI-generated insights about table data

šŸ“Š Resources (Data Access)

  • snowflake://databases - Access database list

  • snowflake://schemas/{database} - Access schema list

  • snowflake://tables/{database}/{schema} - Access table list

  • snowflake://table/{database}/{schema}/{table} - Access table details

šŸ“ Prompts (Templates)

  • sql_analysis_prompt - Templates for SQL analysis

  • data_exploration_prompt - Templates for data exploration

  • sql_optimization_prompt - Templates for query optimization

Related MCP server: Snowflake MCP Service

Installation

  1. Clone and setup the project:

    git clone <repository-url> cd datapilot python -m venv venv source venv/bin/activate # On Windows: venv\Scripts\activate
  2. Install dependencies:

    pip install -r requirements.txt
  3. Configure environment variables:

    cp env.template .env # Edit .env with your credentials

Configuration

Environment Variables

Create a .env file with the following configuration:

# Required: Snowflake Connection # Account examples: # - ACCOUNT-LOCATOR.snowflakecomputing.com (recommended) # - ACCOUNT-LOCATOR.region.cloud # - organization-account_name SNOWFLAKE_ACCOUNT=ACCOUNT-LOCATOR.snowflakecomputing.com SNOWFLAKE_USER=your_username SNOWFLAKE_PASSWORD=your_password # Optional: Default Snowflake Context SNOWFLAKE_WAREHOUSE=your_warehouse_name SNOWFLAKE_DATABASE=your_database_name SNOWFLAKE_SCHEMA=your_schema_name SNOWFLAKE_ROLE=your_role_name # Required: OpenAI API OPENAI_API_KEY=your_openai_api_key OPENAI_MODEL=gpt-4 # Optional, defaults to gpt-4

Snowflake Account Setup

  1. Get your Snowflake account identifier - Multiple formats supported:

    • Recommended: ACCOUNT-LOCATOR.snowflakecomputing.com (e.g., SCGEENJ-UR66679.snowflakecomputing.com)

    • Regional: ACCOUNT-LOCATOR.region.cloud (e.g., xy12345.us-east-1.aws)

    • Legacy: organization-account_name

  2. Ensure your user has appropriate permissions:

    • USAGE on warehouses, databases, and schemas

    • SELECT on tables for querying

    • SHOW privileges for listing objects

Usage

Running the Server

Method 1: Direct execution

python -m src.main

Method 2: Using FastMCP CLI

fastmcp run src/main.py

Method 3: Development mode with auto-reload

fastmcp dev src/main.py

Connecting to MCP Clients

Claude Desktop

Add to your Claude Desktop configuration:

{ "mcpServers": { "datapilot": { "command": "python", "args": ["-m", "src.main"], "cwd": "/path/to/datapilot", "env": { "SNOWFLAKE_ACCOUNT": "your_account", "SNOWFLAKE_USER": "your_user", "SNOWFLAKE_PASSWORD": "your_password", "OPENAI_API_KEY": "your_openai_key" } } } }

Using FastMCP Client

from fastmcp import Client async def main(): async with Client("python -m src.main") as client: # List databases databases = await client.call_tool("list_databases") print("Databases:", databases) # Natural language to SQL result = await client.call_tool("natural_language_to_sql", { "question": "Show me the top 10 customers by revenue", "database": "SALES_DB", "schema": "PUBLIC" }) print("Generated SQL:", result)

Example Usage

1. Natural Language Query

# Ask a question in natural language question = "What are the top 5 products by sales volume last month?" sql = await client.call_tool("natural_language_to_sql", { "question": question, "database": "SALES_DB", "schema": "PUBLIC" }) print(f"Generated SQL: {sql}")

2. Execute and Analyze

# Execute a query and get AI analysis analysis = await client.call_tool("analyze_query_results", { "query": "SELECT product_name, SUM(quantity) as total_sales FROM sales GROUP BY product_name ORDER BY total_sales DESC LIMIT 10", "results_limit": 100, "analysis_type": "summary" }) print(f"Analysis: {analysis}")

3. Table Insights

# Get AI-powered insights about a table insights = await client.call_tool("generate_table_insights", { "table_name": "SALES_DB.PUBLIC.CUSTOMERS", "sample_limit": 50 }) print(f"Table insights: {insights}")

4. Query Optimization

# Get optimization suggestions optimizations = await client.call_tool("suggest_query_optimizations", { "query": "SELECT * FROM large_table WHERE date_column > '2023-01-01'" }) print(f"Optimization suggestions: {optimizations}")

Architecture

ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” │ MCP Client │ │ FastMCP │ │ Snowflake │ │ (Claude/etc) │◄──►│ Server │◄──►│ Database │ ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ │ ā–¼ ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” │ OpenAI API │ │ (GPT-4) │ ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜

Project Structure

datapilot/ ā”œā”€ā”€ src/ │ ā”œā”€ā”€ __init__.py │ ā”œā”€ā”€ main.py # Main FastMCP server │ ā”œā”€ā”€ models.py # Pydantic data models │ ā”œā”€ā”€ snowflake_client.py # Snowflake connection & operations │ └── openai_client.py # OpenAI integration ā”œā”€ā”€ requirements.txt # Python dependencies ā”œā”€ā”€ env.template # Environment variables template └── README.md # This file

Development

Adding New Tools

  1. Define your tool function in src/main.py:

@mcp.tool() async def my_new_tool(param: str, ctx: Context) -> str: """Description of what the tool does""" await ctx.info(f"Processing: {param}") # Your logic here return "result"
  1. Add appropriate error handling and logging

  2. Test with FastMCP dev mode: fastmcp dev src/main.py

Adding New Resources

@mcp.resource("snowflake://my-resource/{param}") async def my_resource(param: str) -> Dict[str, Any]: """Resource description""" # Your logic here return {"data": "value"}

Troubleshooting

Common Issues

  1. Connection Errors

    • Verify Snowflake credentials in .env

    • Check network connectivity

    • Ensure user has required permissions

  2. OpenAI Errors

    • Verify OPENAI_API_KEY is set correctly

    • Check API quota and billing

    • Ensure model name is correct

  3. Import Errors

    • Activate virtual environment

    • Install all requirements: pip install -r requirements.txt

    • Run from project root directory

Logging

Enable debug logging:

LOG_LEVEL=DEBUG

Contributing

  1. Fork the repository

  2. Create a feature branch

  3. Make your changes

  4. Add tests if applicable

  5. Submit a pull request

License

This project is licensed under the MIT License.

Support

For issues and questions:

  • Check the troubleshooting section

  • Review FastMCP documentation: https://gofastmcp.com/

  • Open an issue in the repository

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/rickyb30/datapilot-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server