Skip to main content
Glama
by pablontiv

PDF Reader MCP Server

CI/CD Pipeline Node.js Version TypeScript License: MIT npm package

A Model Context Protocol (MCP) server for extracting and processing content from PDF documents. This server provides secure, efficient, and flexible PDF content extraction capabilities following the MCP specification.

Features

  • Text Extraction: Extract plain text from PDF documents with formatting preservation

  • Metadata Extraction: Extract document metadata (title, author, dates, page count, etc.)

  • Page-Level Processing: Extract content from specific pages or page ranges

  • PDF Validation: Validate PDF file integrity and readability

  • Security-First: Input validation and sandboxed processing

  • Type-Safe: Full TypeScript implementation with comprehensive type definitions

Related MCP server: MCP Docling Server

Why Choose This MCP Server?

🎯 Specialized PDF Tools

  • 4 dedicated tools for different PDF processing needs (text, metadata, pages, validation)

  • Granular control - extract specific pages, preserve formatting, or get structured output

  • Flexible page ranges - support for "1-5", "1,3,5", or "all" syntax

🛡️ Enterprise-Grade Security

  • Directory traversal protection prevents unauthorized file access

  • File size limits (configurable up to 100MB by default)

  • Processing timeouts prevent resource exhaustion

  • Memory usage controls (500MB limit by default)

  • No temporary file persistence - secure processing without data leakage

Production-Ready Architecture

  • Robust error handling with standardized MCP error codes (-32602 to -32605)

  • Structured logging with Winston for monitoring and debugging

  • Comprehensive input validation using Zod schemas

  • Type-safe TypeScript implementation with full type definitions

  • Concurrent processing support for multiple PDF operations

🔧 Developer Experience

  • Easy configuration via environment variables

  • Flexible deployment - works with 70+ MCP-compatible clients

  • Clear documentation with real-world examples

  • Modern tech stack - TypeScript, pdf-parse, pdf-lib

  • Test coverage with Vitest for reliability

📊 Performance Optimized

  • Efficient PDF processing optimized for text-based documents

  • Configurable resource limits to match your infrastructure

  • Minimal dependencies for faster startup and lower memory footprint

  • Streaming support for large document processing

Installation

npm install npm run build

Usage

As MCP Server

Start the server:

npm start

Client Configuration

This MCP server can be used with various AI applications and development tools. Below are configuration instructions for the most popular clients:

Claude Desktop

Add this configuration to your Claude Desktop config file:

Windows: %APPDATA%\Claude\claude_desktop_config.json macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

{ "mcpServers": { "pdf-reader": { "command": "node", "args": ["/path/to/pdf-reader-mcp/dist/index.js"], "env": { "PDF_MAX_FILE_SIZE": "104857600", "LOG_LEVEL": "info" } } } }

VS Code and VS Code-based Editors

For VS Code, Cursor, Windsurf, and other VS Code-based editors, install an MCP extension:

  1. Install the MCP extension from the marketplace

  2. Add this configuration to your settings.json:

{ "mcp.servers": { "pdf-reader": { "command": "node", "args": ["/path/to/pdf-reader-mcp/dist/index.js"] } } }

ChatGPT Desktop

For ChatGPT Desktop (available since OpenAI's MCP adoption in March 2025):

  1. Go to Settings → Integrations → MCP Servers

  2. Add new server with:

    • Name: PDF Reader

    • Command: node /path/to/pdf-reader-mcp/dist/index.js

Claude Code

Option 1: Command Line (Recommended)

Unix/macOS:

# Add the MCP server directly via command line claude mcp add pdf-reader node /path/to/pdf-reader-mcp/dist/index.js # With environment variables claude mcp add pdf-reader -e PDF_MAX_FILE_SIZE=104857600 -e LOG_LEVEL=info -- node /path/to/pdf-reader-mcp/dist/index.js # Set scope (optional: --scope local|project|user) claude mcp add --scope project pdf-reader node /path/to/pdf-reader-mcp/dist/index.js

Windows:

rem Add the MCP server directly via command line claude mcp add pdf-reader node C:\path\to\pdf-reader-mcp\dist\index.js rem With environment variables claude mcp add pdf-reader -e PDF_MAX_FILE_SIZE=104857600 -e LOG_LEVEL=info -- node C:\path\to\pdf-reader-mcp\dist\index.js rem Set scope (optional: --scope local|project|user) claude mcp add --scope project pdf-reader node C:\path\to\pdf-reader-mcp\dist\index.js

Option 2: Configuration File Configure in your project's .claude/settings.json:

{ "mcp": { "servers": { "pdf-reader": { "command": "node", "args": ["/path/to/pdf-reader-mcp/dist/index.js"] } } } }

Other Clients

For other MCP-compatible applications (Microsoft Copilot Studio, Replit, Zed, etc.), refer to the official MCP documentation for client-specific configuration instructions.

Available Tools

1. extract_pdf_text

Extract text content from PDF documents.

Parameters:

  • file_path (required): Path to the PDF file

  • pages (optional): Page range ("1-5", "1,3,5", or "all")

  • preserve_formatting (optional): Whether to preserve text formatting

  • include_metadata (optional): Whether to include document metadata

2. extract_pdf_metadata

Extract metadata and document information from PDF files.

Parameters:

  • file_path (required): Path to the PDF file

3. extract_pdf_pages

Extract content from specific pages or page ranges.

Parameters:

  • file_path (required): Path to the PDF file

  • page_range (required): Page range to extract

  • output_format (optional): "text" or "structured"

4. validate_pdf

Validate PDF file integrity and readability.

Parameters:

  • file_path (required): Path to the PDF file

Configuration

Environment variables:

  • PDF_MAX_FILE_SIZE: Maximum file size in bytes (default: 104857600 = 100MB)

  • PDF_PROCESSING_TIMEOUT: Processing timeout in milliseconds (default: 60000)

  • PDF_MAX_MEMORY_USAGE: Maximum memory usage in bytes (default: 524288000 = 500MB)

  • LOG_LEVEL: Logging level (default: 'info')

Security

  • Input validation for all file paths

  • Directory traversal protection

  • File size and memory limits

  • Processing timeouts

  • No temporary file persistence

Error Handling

The server provides comprehensive error handling with specific error codes:

  • -32602: Validation errors

  • -32603: File access errors

  • -32604: Size/resource errors

  • -32605: Format errors

Performance

  • Supports files up to 100MB

  • Memory usage limited to 500MB

  • Concurrent processing support

  • Optimized for text-based PDFs

License

MIT

One-click Deploy
A
security – no known vulnerabilities
F
license - not found
A
quality - confirmed to work

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/pablontiv/pdf-reader-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server