The Ref MCP server provides tools for accessing and managing technical documentation in a token-efficient way:
- Search Documentation: Use ref_search_documentation to search for technical documentation across public APIs, libraries, private repos, or the web. Ideal for finding facts, code snippets, or detailed information about frameworks, services, and databases.
- Read URLs: Use ref_read to fetch content from a URL and convert it to markdown for easy reading, particularly useful with search results.
- Web Search: Use ref_search_web as a fallback when other tools don't yield the desired results.
Allows access to npm package documentation through the ref_search_documentation tool, enabling AI agents to look up and retrieve information about npm packages, their APIs, and usage.
Ref MCP
A ModelContextProtocol server that gives your AI coding tool or agent access to documentation for APIs, services, libraries etc. It's your one-stop-shop to keep your agent up-to-date on documentation in a fast and token-efficient way.
For more info, see ref.tools
Agentic search for exactly the right context
Ref's tools are designed to match how models search while using as little context as possible to reduce context rot. The goal is to find exactly the context your coding agent needs to be successful while using the minimum number of tokens.
Depending on the complexity of the prompt, LLM coding agents like Claude Code will typically do one or more searches and then choose a few resources to read in more depth.
For a simple query about Figma's Comment REST API, it will make a couple of calls to get exactly what it needs.
For more complex situations, the LLM will try to refine its prompt as it reads results.
Ref takes advantage of MCP sessions to track search trajectory and minimize context usage. There are a lot more ideas cooking, but here's what we've implemented so far.
1. Filtering search results
For repeated similar searches in a session, Ref will never return repeated results. Traditionally, you dig further into search results by paging to the next set of results, but this approach allows the agent to page AND adjust the prompt at the same time.
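The idea of filtering by session can be illustrated with a minimal sketch. This is not Ref's implementation; the SearchSession class, the filter_new method, and the placeholder result paths are all hypothetical, and the sketch assumes results are identified by URL.

```python
from dataclasses import dataclass, field


@dataclass
class SearchSession:
    """Tracks which result URLs were already returned in this session (hypothetical)."""
    seen_urls: set[str] = field(default_factory=set)

    def filter_new(self, ranked_urls: list[str], limit: int = 3) -> list[str]:
        """Return up to `limit` ranked results that the session has not seen yet."""
        fresh = [url for url in ranked_urls if url not in self.seen_urls][:limit]
        self.seen_urls.update(fresh)
        return fresh


session = SearchSession()
# Placeholder ranked results for two similar queries in the same session.
first = session.filter_new(["docs/a", "docs/b", "docs/c", "docs/d"], limit=2)
second = session.filter_new(["docs/b", "docs/a", "docs/d", "docs/e"], limit=2)
print(first)   # ['docs/a', 'docs/b']
print(second)  # ['docs/d', 'docs/e']
```

The second query is worded differently and ranks results differently, yet it only surfaces results the agent has not seen, which is the "page and adjust the prompt at the same time" behavior described above.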
2. Fetching the part of the page that matters
When reading a page of documentation, Ref will use the agent's session search history to drop less relevant sections and return the most relevant 5k tokens. This helps Ref avoid a big problem with standard fetch() web scraping: when it hits a large documentation page, you can easily end up pulling 20k+ tokens into context, most of which are irrelevant.
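As a rough illustration of budgeted reading, the sketch below scores page sections against the session's earlier queries and keeps the best ones that fit a token budget. It is an assumption-laden toy, not how Ref ranks content: the keyword-overlap score, the whitespace word count used as a token estimate, and the function names are all hypothetical.

```python
def score(section: str, queries: list[str]) -> int:
    """Count query words that appear in the section (a crude relevance proxy)."""
    words = set(section.lower().split())
    return sum(1 for q in queries for w in set(q.lower().split()) if w in words)


def select_sections(sections: list[str], queries: list[str], budget: int) -> list[str]:
    """Greedily keep the highest-scoring sections that fit in `budget`, in page order."""
    ranked = sorted(range(len(sections)),
                    key=lambda i: score(sections[i], queries), reverse=True)
    kept, used = set(), 0
    for i in ranked:
        cost = len(sections[i].split())  # whitespace words as a rough token estimate
        if used + cost <= budget:
            kept.add(i)
            used += cost
    return [sections[i] for i in sorted(kept)]


page = [
    "Authentication uses OAuth tokens passed in a header",
    "Comments endpoint lists comments on a file",
    "Changelog and release notes for older versions",
]
history = ["list comments on a file", "comments endpoint pagination"]
selected = select_sections(page, history, budget=10)
print(selected)  # ['Comments endpoint lists comments on a file']
```

A real system would use a proper tokenizer and a stronger relevance signal, but the shape is the same: the session history decides what survives the budget.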
Why does minimizing tokens from documentation context matter?
1. More context makes models dumber
It's well documented, as of July 2025, that models get dumber as you put in more tokens. You might have heard that models are great with long context now, and that's kind of true, but it's not the whole picture. For a quick primer on some of the research, check out this video from the team at Chroma.
2. Tokens cost $$$
Imagine you are using Claude Opus as a background agent, and you start by having the agent pull in documentation context. Suppose it pulls in 10,000 tokens of context, with 4,000 being relevant and 6,000 being extra noise. At API pricing, those 6k tokens cost about $0.09 PER STEP. If one prompt ends up taking 11 steps with Opus, you've spent $1 for no reason.
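The arithmetic can be checked with a few lines. The per-token price here is an assumption: the $0.09 figure implies roughly $15 per million input tokens, so that is the rate the sketch uses, and it also assumes the noise tokens are re-sent as input on every step.

```python
# Assumed input price in USD per million tokens, implied by the $0.09 figure above.
PRICE_PER_MILLION_INPUT_TOKENS = 15.00

total_tokens = 10_000
relevant_tokens = 4_000
wasted_tokens = total_tokens - relevant_tokens  # 6,000 tokens of noise

cost_per_step = wasted_tokens * PRICE_PER_MILLION_INPUT_TOKENS / 1_000_000
total_cost = cost_per_step * 11  # the same noise is paid for on each of 11 steps

print(f"per step: ${cost_per_step:.2f}, over 11 steps: ${total_cost:.2f}")
# per step: $0.09, over 11 steps: $0.99
```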
Setup
There are two options for setting up Ref as an MCP server, either via the streamable-http server (recommended) or local stdio server (legacy).
This repo contains the legacy stdio server.
Streamable HTTP (recommended)
stdio
Tools
The Ref MCP server provides all the documentation-related tools your agent needs.
ref_search_documentation
A powerful search tool to check technical documentation. Great for finding facts or code snippets. Can be used to search public documentation on the web or GitHub, as well as private resources like repos and PDFs.
Parameters:
- query (required): Query to search for relevant documentation. This should be a full sentence or question.
ref_read_url
A tool that fetches content from a URL and converts it to markdown for easy reading with Ref. This is powerful when used in conjunction with the ref_search_documentation tool, which returns URLs of relevant content.
Parameters:
- url (required): The URL of the webpage to read.
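To show how the two tools fit together, here is a minimal sketch of the JSON-RPC messages an MCP client would send to call them. The tools/call method and the name/arguments layout follow the MCP specification; the tool_call helper, the example query, and the example.com URL are hypothetical placeholders, and session initialization and transport are left out.

```python
import json
from itertools import count

_ids = count(1)  # simple incrementing JSON-RPC request ids


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP tools/call request for the given tool (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Step 1: search with a full-sentence query, as the parameter description asks.
search = tool_call(
    "ref_search_documentation",
    {"query": "How do I list comments on a file with the Figma REST API?"},
)
# Step 2: read one of the URLs returned by the search (placeholder URL here).
read = tool_call("ref_read_url", {"url": "https://example.com/docs/comments"})

print(json.dumps(search, indent=2))
print(json.dumps(read, indent=2))
```

In practice an MCP client library builds these messages for you; the sketch only makes the required parameters visible.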
OpenAI deep research support
Ref can be used as a source for deep research. OpenAI requires specific tool definitions so when used with an OpenAI client, Ref will provide the same tools with slightly different naming.
Development
Running with Inspector
For development and debugging purposes, you can use the MCP Inspector tool. The Inspector provides a visual interface for testing and monitoring MCP server interactions.
Visit the Inspector documentation for detailed setup instructions.
To test locally with Inspector:
Or run both the watcher and inspector:
Local Development
- Clone the repository
- Install dependencies:
- Build the project:
- For development with auto-rebuilding:
License
MIT