Supports processing JPEG/JPG files to extract content for large language models
Supports processing Org files to extract content for large language models
Supports processing SVG files to extract content for large language models
Supports processing XML files to extract content for large language models
A Model Context Protocol server that provides unstructured document processing capabilities. This server enables LLMs to extract and use content from an unstructured document.
This repo is a work in progress; proceed with caution :)
Supported file types:
Prerequisites. You'll need:
Unstructured API key. Learn how to obtain one here
Claude Desktop installed locally
Quick TLDR on how to add this MCP to your Claude Desktop:
Clone the repo and set up the UV environment.
Create a .env file in the root directory and add the following env variable: UNSTRUCTURED_API_KEY.
Run the MCP server: uv run doc_processor.py
Go to ~/Library/Application Support/Claude/ and create a claude_desktop_config.json. In that file add:
Restart Claude Desktop. You should now be able to use the MCP.
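The contents of claude_desktop_config.json are not shown in the steps above. As a rough sketch only, the snippet below builds an entry in the general mcpServers layout that Claude Desktop reads, assuming the server is launched with uv from the cloned repo. The server name unstructured_doc_processor, the placeholder repo path, and the helper build_claude_config are hypothetical and not taken from this repo; check the project's own README for the exact values.

```python
import json


def build_claude_config(
    repo_dir: str,
    uv_path: str = "uv",
    server_name: str = "unstructured_doc_processor",  # hypothetical name
) -> dict:
    """Return a Claude Desktop config dict with one MCP server entry.

    Assumption: the server is started with `uv --directory <repo> run doc_processor.py`.
    """
    return {
        "mcpServers": {
            server_name: {
                # Executable that Claude Desktop launches
                "command": uv_path,
                # Run doc_processor.py from inside the cloned repo
                "args": ["--directory", repo_dir, "run", "doc_processor.py"],
            }
        }
    }


if __name__ == "__main__":
    # Placeholder path: replace with the absolute path to your clone.
    config = build_claude_config("/ABSOLUTE/PATH/TO/REPO")
    print(json.dumps(config, indent=2))
```

Running the script prints JSON that can be pasted into claude_desktop_config.json after replacing the placeholder path. If Claude Desktop cannot find uv on its PATH, passing the absolute path to the uv executable as uv_path may help.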