Skip to main content
Glama

MCP Iceberg Catalog

by ahodroj

MCP Iceberg Catalog

A MCP (Model Context Protocol) server implementation for interacting with Apache Iceberg. This server provides a SQL interface for querying and managing Iceberg tables through Claude desktop.

Claude Desktop as your Iceberg Data Lake Catalog

image

Related MCP server: PostgreSQL Query MCP Server

How to Install in Claude Desktop

Installing via Smithery

To install MCP Iceberg Catalog for Claude Desktop automatically via Smithery:

npx -y @smithery/cli install @ahodroj/mcp-iceberg-service --client claude
  1. Prerequisites

    • Python 3.10 or higher

    • UV package installer (recommended) or pip

    • Access to an Iceberg REST catalog and S3-compatible storage

  2. How to install in Claude Desktop Add the following configuration to claude_desktop_config.json:

{ "mcpServers": { "iceberg": { "command": "uv", "args": [ "--directory", "PATH_TO_/mcp-iceberg-service", "run", "mcp-server-iceberg" ], "env": { "ICEBERG_CATALOG_URI" : "http://localhost:8181", "ICEBERG_WAREHOUSE" : "YOUR ICEBERG WAREHOUSE NAME", "S3_ENDPOINT" : "OPTIONAL IF USING S3", "AWS_ACCESS_KEY_ID" : "YOUR S3 ACCESS KEY", "AWS_SECRET_ACCESS_KEY" : "YOUR S3 SECRET KEY" } } } }

Design

Architecture

The MCP server is built on three main components:

  1. MCP Protocol Handler

    • Implements the Model Context Protocol for communication with Claude

    • Handles request/response cycles through stdio

    • Manages server lifecycle and initialization

  2. Query Processor

    • Parses SQL queries using sqlparse

    • Supports operations:

      • LIST TABLES

      • DESCRIBE TABLE

      • SELECT

      • INSERT

  3. Iceberg Integration

    • Uses pyiceberg for table operations

    • Integrates with PyArrow for efficient data handling

    • Manages catalog connections and table operations

PyIceberg Integration

The server utilizes PyIceberg in several ways:

  1. Catalog Management

    • Connects to REST catalogs

    • Manages table metadata

    • Handles namespace operations

  2. Data Operations

    • Converts between PyIceberg and PyArrow types

    • Handles data insertion through PyArrow tables

    • Manages table schemas and field types

  3. Query Execution

    • Translates SQL to PyIceberg operations

    • Handles data scanning and filtering

    • Manages result set conversion

Further Implementation Needed

  1. Query Operations

    • Implement UPDATE operations

    • Add DELETE support

    • Support for CREATE TABLE with schema definition

    • Add ALTER TABLE operations

    • Implement table partitioning support

  2. Data Types

    • Support for complex types (arrays, maps, structs)

    • Add timestamp with timezone handling

    • Support for decimal types

    • Add nested field support

  3. Performance Improvements

    • Implement batch inserts

    • Add query optimization

    • Support for parallel scans

    • Add caching layer for frequently accessed data

  4. Security Features

    • Add authentication mechanisms

    • Implement role-based access control

    • Add row-level security

    • Support for encrypted connections

  5. Monitoring and Management

    • Add metrics collection

    • Implement query logging

    • Add performance monitoring

    • Support for table maintenance operations

  6. Error Handling

    • Improve error messages

    • Add retry mechanisms for transient failures

    • Implement transaction support

    • Add data validation

One-click Deploy
A
security – no known vulnerabilities
F
license - not found
A
quality - confirmed to work

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/ahodroj/mcp-iceberg-service'

If you have feedback or need assistance with the MCP directory API, please join our Discord server