Session-Based Browser-Use FastMCP Server
English
A modern Model Context Protocol (MCP) server that provides advanced browser automation capabilities using the FastMCP framework. Features session-based instance management, TTL cleanup, PDF generation, file downloads, cookie management, and comprehensive browser configuration options. All browser operations are implemented via browser-use.
🎯 Key Features
- Session-Based Management: Each MCP session gets its own isolated browser instance automatically
- Advanced Browser Control: Full browser automation with Playwright backend (via browser-use)
- PDF Generation: Convert web pages to PDF with custom formatting options
- File Operations: Download/upload files, manage file system, and access all temp files
- Cookie Management: Set, get, and manage browser cookies for authentication
- Screenshot Capture: Take full-page, viewport, or element screenshots
- Tab Management: Create, switch, and close browser tabs
- Content Extraction: Extract and search page content
- Session Persistence: Automatic cleanup with configurable TTL
- Multi-Instance Support: Run multiple isolated browser sessions
- Configurable Security: All browser security settings are configurable via API
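The session isolation and TTL cleanup described above can be pictured with a small in-memory registry. This is a minimal sketch, not the server's actual implementation: the `SessionRegistry` class, its method names, and the use of a plain `dict` as a stand-in for a browser instance are all assumptions made for illustration.

```python
import time
import uuid


class SessionRegistry:
    """Toy registry: one isolated entry per session, dropped after `ttl_seconds` of inactivity."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self.ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be exercised without sleeping
        self._sessions = {}  # session_id -> {"browser": object, "last_used": float}

    def create(self, browser_factory=dict) -> str:
        """Create a new session with its own (stand-in) browser object."""
        session_id = uuid.uuid4().hex
        self._sessions[session_id] = {
            "browser": browser_factory(),  # hypothetical stand-in for a real browser
            "last_used": self._clock(),
        }
        return session_id

    def get(self, session_id: str):
        """Return the session's browser and reset its inactivity timer."""
        entry = self._sessions[session_id]  # KeyError for unknown or expired ids
        entry["last_used"] = self._clock()
        return entry["browser"]

    def cleanup_expired(self) -> list:
        """Remove sessions idle for longer than the TTL and return their ids."""
        now = self._clock()
        expired = [
            sid for sid, entry in self._sessions.items()
            if now - entry["last_used"] > self.ttl
        ]
        for sid in expired:
            del self._sessions[sid]
        return expired
```

A real server would also close the browser process and delete the session's temp files when an entry expires; the sketch only shows the bookkeeping.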
🚀 Quick Start
- Install Dependencies: Using uv (recommended): For development and testing:
- Install the Browser:
- Start the Server: Using uv (recommended): Or using python directly:
- Basic Usage:
🛠️ Run Tests
Install test dependencies and run all tests with coverage:
🛠️ Core Tools (API)
Session Management
create_chrome_instance(headless, viewport_width, viewport_height)
→ Create a new browser session; returns session_id
close_instance(session_id)
→ Close a specific session
get_instance_info(session_id)
→ Get info for a session
get_browser_status()
→ List all sessions
close_all_instances()
→ Close all sessions
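On the wire, an MCP client invokes each of these tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such requests by hand to show the shape of a session's lifecycle. The helper name `tool_call_request` and the argument values are illustrative assumptions, and in practice an MCP client library would construct and send these messages for you.

```python
import itertools
import json

_request_ids = itertools.count(1)  # simple monotonically increasing request ids


def tool_call_request(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` JSON-RPC 2.0 request for the named tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Illustrative argument values; the README does not state the defaults.
create = tool_call_request(
    "create_chrome_instance",
    {"headless": True, "viewport_width": 1280, "viewport_height": 720},
)

# Placeholder id: a real session_id comes from the result of the create call.
close = tool_call_request("close_instance", {"session_id": "<session_id from create>"})

print(json.dumps(create, indent=2))
```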
Browser Configuration
set_browser_config(session_id, headless, no_sandbox, user_agent, viewport_width, viewport_height, disable_web_security)
→ Set browser config (restarts the browser if needed)
get_browser_config(session_id)
→ Get current config
Navigation & Page Control
navigate_to(session_id, url, new_tab=False)
→ Go to any URL (optionally in a new tab)
navigate_back(session_id) / navigate_forward(session_id)
→ History navigation
get_page_state(session_id)
→ List interactive elements with indices
Tab Management
get_tabs_info(session_id)
→ List all open tabs
switch_tab(session_id, page_id)
→ Switch between tabs
close_tab(session_id, page_id)
→ Close a specific tab
Element Interaction
click_element(session_id, index)
→ Click element by index
click_element_by_xpath(session_id, xpath)
→ Click element by XPath
input_text(session_id, index, text)
→ Type into form fields
set_element_value(session_id, index, value)
→ Set input/select value directly
get_element_info(session_id, index=None, xpath=None)
→ Get element info (by index or XPath)
send_keys(session_id, keys)
→ Send keyboard shortcuts
upload_file(session_id, index, file_path)
→ Upload files to forms
get_dropdown_options(session_id, index)
→ Inspect select elements
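The index-based tools presumably rely on the element list that get_page_state returns, so a typical interaction runs navigate, inspect, then act. The sketch below lays that order out as data. The function name `fill_and_submit_plan` is hypothetical, and the element indices are passed in as fixed numbers only to keep the example short; in a real run they would be read from the get_page_state result.

```python
def fill_and_submit_plan(
    session_id: str,
    url: str,
    field_index: int,
    submit_index: int,
    text: str,
) -> list:
    """Return ordered (tool_name, arguments) pairs for a simple fill-and-submit flow."""
    return [
        ("navigate_to", {"session_id": session_id, "url": url, "new_tab": False}),
        # Inspect the page first so the indices used below refer to known elements.
        ("get_page_state", {"session_id": session_id}),
        ("input_text", {"session_id": session_id, "index": field_index, "text": text}),
        ("click_element", {"session_id": session_id, "index": submit_index}),
    ]


plan = fill_and_submit_plan("s1", "https://example.com/login", 3, 5, "alice")
for tool_name, arguments in plan:
    print(tool_name, arguments)
```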
Media & Files
take_screenshot(session_id, target=None, width=None, height=None, full_page=True, quality=90, format="png")
→ Capture screenshots
generate_pdf(session_id, url=None, html_content=None, output_filename=None, ...)
→ Save page as PDF
download_file(session_id, url, output_filename=None, timeout=30)
→ Download files from URLs
download_image(session_id, image_url, output_filename=None, timeout=30)
→ Download images specifically
Cookie & Session Management
set_cookie(session_id, name, value, domain, path, http_only, secure, same_site, expires, max_age)
→ Set browser cookies
get_cookies(session_id, domain=None)
→ Retrieve current cookies
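Because set_cookie takes many parameters, it can help to assemble and check the arguments before making the call. The sketch below uses the parameter names listed above; the default values, the integer types for `expires` and `max_age`, and the capitalised `same_site` spellings are assumptions, since the README does not document them.

```python
from dataclasses import asdict, dataclass
from typing import Optional

# Values defined for the SameSite cookie attribute.
SAME_SITE_VALUES = {"Strict", "Lax", "None"}


@dataclass
class CookieArgs:
    """Arguments for a set_cookie call; defaults here are illustrative assumptions."""

    session_id: str
    name: str
    value: str
    domain: str
    path: str = "/"
    http_only: bool = True
    secure: bool = True
    same_site: str = "Lax"
    expires: Optional[int] = None
    max_age: Optional[int] = None

    def to_arguments(self) -> dict:
        if self.same_site not in SAME_SITE_VALUES:
            raise ValueError(f"same_site must be one of {sorted(SAME_SITE_VALUES)}")
        if self.same_site == "None" and not self.secure:
            # Modern browsers reject SameSite=None cookies that are not Secure.
            raise ValueError("SameSite=None cookies must also be Secure")
        # Omit unset optional fields from the call.
        return {key: val for key, val in asdict(self).items() if val is not None}


args = CookieArgs("s1", "token", "abc123", "example.com").to_arguments()
print(args)
```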
Utilities
scroll_page(session_id, direction="down")
→ Scroll up/down
extract_content(session_id, query)
→ Extract text content
wait(seconds)
→ Pause execution
browser_tips()
→ Get automation best practices
search_bing(session_id, query)
→ Bing search
📚 Resources (REST-style)
browser://status
→ Manager and sessions status
browser://instances
→ All sessions info
browser://instance/{id}/page
→ Session page info
browser://instance/{id}/tabs
→ Session tabs
browser://instance/{id}/screenshots
→ Session screenshots
browser://instance/{id}/status
→ Session status (detailed)
browser://instance/{id}/files
→ Session temp files
browser://instance/{id}/cookies
→ Session cookies
browser://instance/{id}/file/{relative_path}
→ Read a file in session temp
browser://help
→ This help
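The per-session resources all follow one URI template, so a client can build them from a session id. The helpers below are a hypothetical convenience, not part of the server; percent-encoding the id and the file path is an assumption about what the server accepts.

```python
from urllib.parse import quote

# Per-session resource names taken from the list above.
SESSION_RESOURCES = {"page", "tabs", "screenshots", "status", "files", "cookies"}


def session_resource_uri(session_id: str, resource: str) -> str:
    """Build browser://instance/{id}/{resource} for a known resource name."""
    if resource not in SESSION_RESOURCES:
        raise ValueError(f"unknown session resource: {resource!r}")
    return f"browser://instance/{quote(session_id, safe='')}/{resource}"


def session_file_uri(session_id: str, relative_path: str) -> str:
    """Build browser://instance/{id}/file/{relative_path}, keeping '/' separators."""
    return f"browser://instance/{quote(session_id, safe='')}/file/{quote(relative_path)}"


print(session_resource_uri("abc", "tabs"))
print(session_file_uri("abc", "reports/out 1.pdf"))
```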
🔧 Configuration
Configure the server using environment variables:
📝 Prompts
Built-in prompts for common automation scenarios:
web_testing(url, test_scenario)
→ Web testing workflows
data_extraction(url, data_type)
→ Data extraction strategies
form_filling(url, form_data)
→ Automated form filling (returns conversation)
automation_troubleshooting()
→ Debugging help
🔌 MCP Integration
Using with Claude Desktop
- Add to Claude Desktop Configuration: Edit your Claude Desktop configuration file (usually at ~/Library/Application Support/Claude/claude_desktop_config.json on macOS): Or using python directly:
- Restart Claude Desktop to load the MCP server
- Start Using: The browser automation tools will now be available in your Claude conversations
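Claude Desktop reads MCP servers from the `mcpServers` object in that configuration file, where each entry names a command and its arguments. The sketch below generates such an entry. The server name `browser-use` and the script path are placeholders, because this copy of the README does not show the project's actual launch command.

```python
import json


def claude_desktop_entry(server_name: str, command: str, args: list) -> dict:
    """Return a config fragment with one server under the `mcpServers` key."""
    return {"mcpServers": {server_name: {"command": command, "args": args}}}


config = claude_desktop_entry(
    "browser-use",                    # hypothetical display name
    "python",
    ["/absolute/path/to/server.py"],  # placeholder; use the project's real entry point
)

# Merge this fragment into claude_desktop_config.json rather than overwriting the file.
print(json.dumps(config, indent=2))
```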
Using with MCP Client
💡 Use Cases
- Web Testing: Automated functional, security, and performance testing
- Data Scraping: Extract structured data from websites
- Form Automation: Fill and submit web forms programmatically
- Content Monitoring: Track changes in web content
- Screenshot Documentation: Capture visual evidence for reports
- PDF Generation: Convert web pages to PDF documents
- Session Management: Handle authenticated workflows
🔒 Security Features
- Session isolation between MCP clients
- Secure cookie management with HttpOnly and Secure flags
- Configurable browser security settings (CORS, sandbox, etc.)
- Automatic cleanup of temporary files
- TTL-based session expiration
🐳 Docker Usage
Build the image:
Run the server (default: port 8000, SSE transport):
You can override startup parameters via environment variables:
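As a rough illustration of how such overrides are usually resolved, the sketch below reads startup settings from the environment and falls back to the defaults stated above (port 8000, SSE transport). The variable names `PORT` and `TRANSPORT` are hypothetical; this copy of the README does not list the names the server actually reads.

```python
import os
from typing import Mapping, Optional


def load_startup_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Resolve startup settings from env vars, defaulting to port 8000 and SSE."""
    env = os.environ if env is None else env
    port = int(env.get("PORT", "8000"))  # hypothetical variable name
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    transport = env.get("TRANSPORT", "sse").lower()  # hypothetical variable name
    return {"port": port, "transport": transport}


print(load_startup_settings({}))
print(load_startup_settings({"PORT": "9000"}))
```

With Docker, values like these would typically be passed with `docker run -e NAME=value`.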