The Peekaboo MCP server provides macOS screen capture and AI-powered image analysis for integration with AI assistants like Claude Desktop and Cursor IDE.
- Capture Screenshots: Take screenshots of entire screens, specific app windows, or all windows of an app, with control over output format (file or Base64 data).
- AI Image Analysis: Analyze captured or existing images using AI models like GPT-4 Vision, Claude, or local models via Ollama.
- System Information: List running applications, their open windows (including details like position, size, and IDs), and server status.
- Integration: Seamlessly integrate with AI assistants for visual context and debugging workflows.
- Privacy Options: Support for local AI analysis through Ollama for privacy-focused use cases.
- Advanced Targeting: Fuzzy matching and precise targeting for apps, windows, and screens.
Enables comprehensive screen capture capabilities on macOS, including capturing entire screens, specific application windows, or all windows of an app with various formatting options.
Enables local AI image analysis of screenshots through Ollama, supporting models like LLaVA and Qwen2-VL for vision tasks without sending data to the cloud.
Provides integration with OpenAI's vision models (like GPT-4o) for analyzing captured screenshots through the OpenAI API.
Peekaboo MCP: Lightning-fast macOS Screenshots & GUI Automation 🚀
🎉 NEW in v3: Complete GUI automation framework with AI Agent! Click, type, scroll, and automate any macOS application using natural language. Plus comprehensive menu bar extraction without clicking! See the GUI Automation section and AI Agent section for details.
Peekaboo is a powerful macOS utility for capturing screenshots, analyzing them with AI vision models, and now automating GUI interactions. It works both as a standalone CLI tool (recommended) and as an MCP server for AI assistants like Claude Desktop and Cursor.
🎯 Choose Your Path
🖥️ CLI Tool (Recommended for Most Users)
Perfect for:
- Command-line workflows and automation
- Shell scripts and CI/CD pipelines
- Quick screenshots and AI analysis
- System administration tasks
🤖 MCP Server (For AI Assistants)
Perfect for:
- Claude Desktop integration
- Cursor IDE workflows
- AI agents that need visual context
- Interactive AI debugging sessions
What is Peekaboo?
Peekaboo bridges the gap between visual content on your screen and AI understanding. It provides:
- Lightning-fast screenshots of screens, applications, or specific windows
- AI-powered image analysis using GPT-4.1 Vision, Claude, Grok, or local models (Ollama)
- Complete GUI automation (v3) - Click, type, scroll, and interact with any macOS app
- Natural language automation (v3) - AI agent that understands tasks like "Open TextEdit and write a poem"
- Smart UI element detection - Automatically identifies buttons, text fields, links, and more with precise coordinate mapping
- Menu bar extraction (v3) - Discover all menus and keyboard shortcuts without clicking or opening menus
- Automatic session resolution - Commands intelligently use the most recent session (no manual tracking!)
- Window and application management with smart fuzzy matching
- Multi-screen support - List which screen windows are on and move them between displays
- Privacy-first operation with local AI options via Ollama
- Non-intrusive capture without changing window focus
- Automation scripting - Chain commands together for complex workflows
🏗️ Architecture
Peekaboo uses a modern service-based architecture:
- PeekabooCore - Shared services for screen capture, UI automation, window management, and more
- CLI - Command-line interface that uses PeekabooCore services directly
- Mac App - Native macOS app with 100x+ performance improvement over CLI spawning
- MCP Server - Model Context Protocol server for AI assistants
All components share the same core services, ensuring consistent behavior and optimal performance. See Service API Reference for detailed documentation.
🚀 Quick Start: CLI Tool
Installation
Basic Usage
Debugging with Verbose Mode
All Peekaboo commands support the --verbose or -v flag for detailed logging:
Verbose logs are written to stderr with timestamps:
This is invaluable for:
- Debugging automation scripts
- Understanding why elements aren't found
- Performance optimization
- Learning Peekaboo's internals
Configuration
Peekaboo uses a unified configuration directory at ~/.peekaboo/ for better discoverability:
Managing API Keys Securely
Example Configuration
~/.peekaboo/config.json:
~/.peekaboo/credentials (auto-created with proper permissions):
Common Workflows
🤖 MCP Server Setup
For AI assistants like Claude Desktop and Cursor, Peekaboo provides a Model Context Protocol (MCP) server.
For Claude Desktop
- Open Claude Desktop Settings (from the menubar, not the in-app settings)
- Navigate to Developer → Edit Config
- Add the Peekaboo MCP server configuration:
- Save and restart Claude Desktop
For Claude Code
Run the following command:
Alternatively, if you've already installed the server via Claude Desktop, you can import it:
Local Development
For local development, use the built MCP server directly:
For Cursor IDE
Add to your Cursor settings:
🔗 MCP Client Integration
Peekaboo v3 now functions as both an MCP server (exposing its tools) and an MCP client (consuming external tools). This enables powerful workflows that combine Peekaboo's native automation with tools from the broader MCP ecosystem.
Default Integration: BrowserMCP
Peekaboo ships with BrowserMCP enabled by default, providing browser automation capabilities via Puppeteer:
Managing External MCP Servers
Configuration
External servers are configured in ~/.peekaboo/config.json. To disable BrowserMCP:
Available External Tools
All external tools are prefixed with their server name:
- browser - Navigate to URL (BrowserMCP)
- browser - Click elements on webpage (BrowserMCP)
- browser - Take webpage screenshot (BrowserMCP)
- github - Create GitHub issues (GitHub server)
- files - Read files (Filesystem server)
The AI agent automatically uses the best combination of native and external tools for each task.
See docs/mcp-client.md for complete documentation.
MCP Tools Available
Core Tools
- image - Capture screenshots (with optional AI analysis via question parameter)
- list - List applications, windows, or check server status
- analyze - Analyze existing images with AI vision models (MCP-only tool, use peekaboo image --analyze in CLI)
UI Automation Tools
- see - Capture screen and identify UI elements
- click - Click on UI elements or coordinates
- type - Type text into UI elements (supports escape sequences)
- press - Press individual keys (return, tab, escape, arrows, etc.)
- scroll - Scroll content in any direction
- hotkey - Press keyboard shortcuts
- swipe - Perform swipe/drag gestures
- move - Move mouse cursor to specific position or element
- drag - Perform drag and drop operations
Application & Window Management
- app - Launch, quit, focus, hide, and manage applications
- window - Manipulate windows (close, minimize, maximize, move, resize, focus)
- menu - Interact with application menus and system menu extras
- dock - Launch apps from dock and manage dock items
- dialog - Handle dialog windows (click buttons, input text)
- space - Manage macOS Spaces (virtual desktops)
Utility Tools
- run - Execute automation scripts from .peekaboo.json files
- sleep - Pause execution for specified duration
- clean - Clean up session cache and temporary files
- permissions - Check system permissions (screen recording, accessibility)
- agent - Execute complex automation tasks using AI
🚀 GUI Automation with Peekaboo v3
Peekaboo v3 introduces powerful GUI automation capabilities, transforming it from a screenshot tool into a complete UI automation framework for macOS. This enables AI assistants to interact with any application through natural language commands.
How It Works
The v3 automation system uses a see-then-interact workflow:
- See - Capture the screen and identify UI elements
- Interact - Click, type, scroll, or perform other actions
- Verify - Capture again to confirm the action succeeded
🎯 The see Tool - UI Element Discovery
The see tool is the foundation of GUI automation. It captures a screenshot and identifies all interactive UI elements, assigning them unique Peekaboo IDs.
Discovering Available Screens
Before capturing specific screens, you can list all connected displays:
This command shows:
- Screen index: Use with see --screen-index or image --screen-index
- Display name: Built-in, External, or specific model names
- Resolution: Full screen resolution
- Position: Coordinates in the unified desktop space
- Scale factor: Retina display information
- Visible area: Usable area (excluding menu bar on primary screen)
Multi-Screen Capture
When capturing multiple screens, Peekaboo automatically saves each screen as a separate file:
- Primary screen: screenshot.png
- Additional screens: screenshot_screen1.png, screenshot_screen2.png, etc.
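The naming scheme for multi-screen captures can be sketched in Python. This is an illustration only: screen_capture_paths is a hypothetical helper, not part of Peekaboo, and it assumes the numeric suffix simply counts the additional screens starting at 1.

```python
from pathlib import Path


def screen_capture_paths(base: str, screen_count: int) -> list[str]:
    """Return one output path per screen, following the documented pattern.

    Hypothetical helper: the first screen keeps the base name and each further
    screen gets a "_screenN" suffix before the file extension.
    """
    path = Path(base)
    names = [str(path)]
    for n in range(1, screen_count):
        names.append(str(path.with_name(f"{path.stem}_screen{n}{path.suffix}")))
    return names


print(screen_capture_paths("screenshot.png", 3))
# ['screenshot.png', 'screenshot_screen1.png', 'screenshot_screen2.png']
```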
Display information (name, resolution) is shown for each captured screen:
Note: Annotation is automatically disabled for full screen captures due to performance constraints.
Element ID Format
- B1, B2... - Buttons
- T1, T2... - Text fields/areas
- L1, L2... - Links
- G1, G2... - Groups/containers
- I1, I2... - Images
- S1, S2... - Sliders
- C1, C2... - Checkboxes/toggles
- M1, M2... - Menu items
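When post-processing see output in your own scripts, it can help to split an element ID into its role and number. The following is a minimal sketch under that assumption; parse_element_id and ROLE_BY_PREFIX are hypothetical names, and only the prefix letters come from the list above.

```python
import re

# Prefix letters as documented in the Element ID Format list.
ROLE_BY_PREFIX = {
    "B": "button",
    "T": "text field",
    "L": "link",
    "G": "group",
    "I": "image",
    "S": "slider",
    "C": "checkbox",
    "M": "menu item",
}

# One uppercase letter followed by a positive number without leading zeros.
_ID_PATTERN = re.compile(r"([A-Z])([1-9]\d*)")


def parse_element_id(element_id: str) -> tuple[str, int]:
    """Split an ID such as "B12" into ("button", 12). Hypothetical helper."""
    match = _ID_PATTERN.fullmatch(element_id)
    if match is None or match.group(1) not in ROLE_BY_PREFIX:
        raise ValueError(f"not a recognised element ID: {element_id!r}")
    return ROLE_BY_PREFIX[match.group(1)], int(match.group(2))


print(parse_element_id("B1"))   # ('button', 1)
print(parse_element_id("T12"))  # ('text field', 12)
```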
🖱️ The click Tool
Click on UI elements using various targeting methods:
⌨️ The type Tool
Type text with support for escape sequences:
Supported Escape Sequences
- \n - Newline/return
- \t - Tab
- \b - Backspace/delete
- \e - Escape
- \\ - Literal backslash
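To make the escape sequences concrete, here is a rough Python sketch of how a typed string could be split into literal text and key presses. It is not Peekaboo's implementation: split_escapes is a hypothetical function, and leaving unknown sequences such as \x untouched is an assumption.

```python
# Escape letters from the list above, mapped to the key each one stands for.
KEY_BY_ESCAPE = {"n": "return", "t": "tab", "b": "delete", "e": "escape"}


def split_escapes(raw: str) -> list[tuple[str, str]]:
    """Turn 'user\\tpass\\n' style input into ("text", ...) and ("key", ...) steps."""
    actions: list[tuple[str, str]] = []
    text: list[str] = []

    def flush() -> None:
        # Emit any literal text collected so far as a single step.
        if text:
            actions.append(("text", "".join(text)))
            text.clear()

    i = 0
    while i < len(raw):
        ch = raw[i]
        nxt = raw[i + 1] if i + 1 < len(raw) else ""
        if ch == "\\" and nxt in KEY_BY_ESCAPE:
            flush()
            actions.append(("key", KEY_BY_ESCAPE[nxt]))
            i += 2
        elif ch == "\\" and nxt == "\\":
            text.append("\\")  # doubled backslash becomes one literal backslash
            i += 2
        else:
            text.append(ch)  # ordinary character, or an unrecognised escape
            i += 1
    flush()
    return actions


print(split_escapes(r"user\tpass\n"))
# [('text', 'user'), ('key', 'tab'), ('text', 'pass'), ('key', 'return')]
```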
🔑 The press Tool
Press individual keys or key sequences:
Available Keys
- Navigation: up, down, left, right, home, end, pageup, pagedown
- Editing: delete (backspace), forward_delete, clear
- Control: return, enter, tab, escape, space
- Function: f1-f12
- Special: caps_lock, help
📜 The scroll Tool
Scroll content in any direction:
⌨️ The hotkey Tool
Press keyboard shortcuts:
👆 The swipe Tool
Perform swipe or drag gestures:
🖱️ The move Tool
Move the mouse cursor to specific positions or UI elements:
🎯 The drag Tool
Perform drag and drop operations between UI elements or coordinates:
🔐 The permissions Tool
Check macOS system permissions required for automation:
📝 The run Tool - Automation Scripts
Execute complex automation workflows from JSON script files:
Script Format (.peekaboo.json)
🎯 Automatic Window Focus Management
Peekaboo v3 includes intelligent window focus management that ensures your automation commands target the correct window, even across different macOS Spaces (virtual desktops).
How Focus Management Works
All interaction commands (click, type, scroll, menu, hotkey, drag) automatically:
- Track window identity - Using stable window IDs that persist across interactions
- Detect window location - Find which Space contains the target window
- Switch Spaces if needed - Automatically switch to the window's Space
- Focus the window - Ensure the window is frontmost before interaction
- Verify focus - Confirm the window is ready before proceeding
Focus Options
All interaction commands support these focus-related flags:
Space Management Commands
Peekaboo provides dedicated commands for managing macOS Spaces:
Window Focus Command
For explicit window focus control:
Focus Behavior
By default, Peekaboo:
- Automatically focuses windows before any interaction
- Switches Spaces when the target window is on a different desktop
- Waits for focus to ensure the window is ready
- Retries if needed with exponential backoff
This ensures reliable automation across complex multi-window, multi-Space workflows without manual window management.
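The retry-with-exponential-backoff behaviour can be pictured with a short Python sketch. The attempt count and base delay below are illustrative assumptions (this document does not state Peekaboo's actual values), and retry_with_backoff is a hypothetical helper.

```python
import time
from typing import Callable


def retry_with_backoff(
    attempt: Callable[[], bool],
    max_attempts: int = 4,        # assumed value, for illustration only
    base_delay: float = 0.1,      # assumed value, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call attempt() until it returns True, doubling the wait after each failure."""
    for n in range(max_attempts):
        if attempt():
            return True
        if n < max_attempts - 1:
            # Waits of base_delay, 2x, 4x, ... between consecutive attempts.
            sleep(base_delay * (2 ** n))
    return False


# Usage: a stand-in focus check that succeeds on the third try.
outcomes = iter([False, False, True])
print(retry_with_backoff(lambda: next(outcomes)))  # True
```

Passing sleep as a parameter keeps the sketch easy to test without real waiting.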
🤖 AI Agent Automation
Peekaboo v3 introduces an AI-powered agent that can understand and execute complex automation tasks using natural language. The agent uses OpenAI's Chat Completions API with streaming support to break down your instructions into specific Peekaboo commands.
Setting Up the Agent
Two Ways to Use the Agent
1. Direct Natural Language (Default)
When you provide a text argument without a subcommand, Peekaboo automatically uses the agent:
2. Explicit Agent Command
Use the agent subcommand for more control and options:
How the Agent Works
- Understands Your Intent - The AI agent analyzes your natural language request
- Plans the Steps - Breaks down the task into specific actions
- Executes Commands - Uses Peekaboo's automation tools to perform each step
- Verifies Results - Takes screenshots to confirm actions succeeded
- Handles Errors - Can retry failed actions or adjust approach
Real-World Examples
Agent Options
- --verbose - See the agent's reasoning and planning process
- --dry-run - Preview what the agent would do without executing
- --max-steps <n> - Limit the number of actions (default: 20)
- --model <model> - Choose OpenAI model (default: gpt-4-turbo)
- --json-output - Get structured JSON output
- --resume - Resume the latest unfinished agent session
- --resume <session-id> - Resume a specific session by ID
Agent Capabilities
The agent has access to all Peekaboo commands:
- Visual Understanding - Can see and understand what's on screen
- UI Interaction - Click buttons, fill forms, navigate menus
- Text Entry - Type text, use keyboard shortcuts
- Window Management - Open, close, minimize, arrange windows
- Application Control - Launch apps, switch between them
- File Operations - Save files, handle dialogs
- Complex Workflows - Chain multiple actions together
- Multiple AI Models - Supports OpenAI (GPT-4o, o3), Anthropic (Claude), and Grok (xAI)
Understanding Agent Execution
When you run an agent command, here's what happens behind the scenes:
Example Workflow
Debugging Agent Actions
Use --verbose to see exactly what the agent is doing:
Tips for Best Results
- Be Specific - "Click the blue Submit button" works better than "submit"
- One Task at a Time - Break complex workflows into smaller tasks
- Verify State - The agent works best when it can see the current screen
- Use Verbose Mode - Add --verbose to understand what the agent is doing
- Set Reasonable Limits - Use --max-steps to prevent runaway automation
Resuming Agent Sessions
The agent supports resuming interrupted or incomplete sessions, maintaining full conversation context:
How Resume Works
- Session Persistence - Each agent run creates a session with a unique ID
- Thread Continuity - Uses OpenAI's thread persistence to maintain conversation history
- Context Preservation - The AI remembers all previous interactions in the session
- Smart Recovery - Can continue from any point, understanding what was already done
Resume Examples
⏸️ The sleep Tool
Pause execution between actions:
🪟 The window Tool
Comprehensive window manipulation for any application:
Window Actions
- close - Close the window (animated if has close button)
- minimize - Minimize to dock
- maximize - Maximize/zoom window
- move - Move to specific coordinates
- resize - Change window dimensions
- set-bounds - Set position and size in one operation
- focus - Bring window to front and focus
Targeting Options
- app - Target by application name (fuzzy matching supported)
- title - Target by window title (substring matching)
- index - Target by index (0-based, front to back order)
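How the three targeting options narrow down a window can be illustrated with a small Python sketch. Peekaboo's real matching rules are not described here, so everything below is an assumption: fuzzy matching is approximated with the standard library's difflib, title matching is treated as case-insensitive, and Window and find_window are hypothetical names.

```python
import difflib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Window:
    app: str
    title: str


def find_window(
    windows: list[Window],          # assumed to be ordered front to back
    app: str,
    title: Optional[str] = None,
    index: Optional[int] = None,
) -> Optional[Window]:
    """Pick a window by fuzzy app name, then optional title substring or index."""
    by_lower = {w.app.lower(): w.app for w in windows}
    close = difflib.get_close_matches(app.lower(), list(by_lower), n=1, cutoff=0.6)
    if not close:
        return None
    candidates = [w for w in windows if w.app == by_lower[close[0]]]
    if title is not None:
        candidates = [w for w in candidates if title.lower() in w.title.lower()]
    if index is not None:
        return candidates[index] if 0 <= index < len(candidates) else None
    return candidates[0] if candidates else None


windows = [
    Window("Safari", "Apple"),
    Window("Safari", "GitHub - peekaboo"),
    Window("TextEdit", "Untitled"),
]
print(find_window(windows, "safri", title="github"))
# Window(app='Safari', title='GitHub - peekaboo')
```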
🖥️ Multi-Screen Support
Peekaboo v3 includes comprehensive multi-screen support for window management across multiple displays. When listing windows, Peekaboo shows which screen each window is on, and provides powerful options for moving windows between screens.
Screen Identification
When listing windows, each window shows its screen location:
Moving Windows Between Screens
Using Screen Index (0-based):
Using Screen Presets:
Combined Screen and Window Operations
You can combine screen movement with window positioning:
How It Works
- Unified Coordinate System: macOS uses a single coordinate space across all screens
- Smart Positioning: When moving windows between screens without explicit coordinates, windows maintain their relative position (e.g., a window at 25% from the left edge stays at 25% on the new screen)
- Screen Detection: Windows are assigned to screens based on their center point
- 0-Based Indexing: Screens are indexed starting from 0, matching macOS's internal ordering
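The center-point rule and the relative-position rule can be written down in a few lines of Python. This is a simplified sketch, not Peekaboo's code: Rect, screen_index_for, and move_to_screen are hypothetical names, and the screen sizes in the example are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rect:
    x: float
    y: float
    width: float
    height: float

    def contains(self, px: float, py: float) -> bool:
        return self.x <= px < self.x + self.width and self.y <= py < self.y + self.height


def screen_index_for(window: Rect, screens: list[Rect]) -> Optional[int]:
    """Return the 0-based index of the screen that contains the window's center."""
    cx = window.x + window.width / 2
    cy = window.y + window.height / 2
    for index, screen in enumerate(screens):
        if screen.contains(cx, cy):
            return index
    return None


def move_to_screen(window: Rect, source: Rect, target: Rect) -> Rect:
    """Keep the window's fractional offset from the screen origin; size is unchanged."""
    rel_x = (window.x - source.x) / source.width
    rel_y = (window.y - source.y) / source.height
    return Rect(
        target.x + rel_x * target.width,
        target.y + rel_y * target.height,
        window.width,
        window.height,
    )


# Two screens side by side in one unified coordinate space (example sizes).
screens = [Rect(0, 0, 1440, 900), Rect(1440, 0, 1920, 1080)]
window = Rect(360, 225, 800, 600)  # 25% from the left and top of screen 0
moved = move_to_screen(window, screens[0], screens[1])
print(screen_index_for(window, screens))  # 0
print(screen_index_for(moved, screens))   # 1
```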
Multi-Screen with AI Agent
The AI agent understands multi-screen commands:
📋 The menu Tool
Interact with application menu bars and system menu extras:
Menu Subcommands
- list - List all menus and their items (including keyboard shortcuts)
- list-all - List menus for the frontmost application
- click - Click a menu item (default if not specified)
- click-extra - Click system menu extras in the status bar
Key Features
- Pure Accessibility - Extracts menu structure without clicking or opening menus
- Full Hierarchy - Discovers all submenus and nested items
- Keyboard Shortcuts - Shows all available keyboard shortcuts
- Smart Discovery - AI agents can use list to discover available options
🚀 The app Tool
Control applications - launch, quit, focus, hide, and switch between apps:
🎯 The dock Tool
Interact with the macOS Dock:
💬 The dialog Tool
Handle system dialogs and alerts:
🧹 The clean Tool
Clean up session cache and temporary files:
Session Management
Peekaboo v3 uses sessions to maintain UI state across commands:
- Sessions are created automatically by the see tool
- Each session stores screenshot data and element mappings
- Sessions persist in ~/.peekaboo/session/<PID>/
- Element IDs remain consistent within a session
- Sessions are automatically cleaned up on process exit
Best Practices
- Always start with see - Capture the current UI state before interacting
- Use element IDs when possible - More reliable than coordinate clicking
- Add delays for animations - Use sleep after actions that trigger animations
- Verify actions - Call see again to confirm actions succeeded
- Handle errors gracefully - Check if elements exist before interacting
- Clean up sessions - Use the clean tool periodically
Example Workflows
Login Automation
Web Search
Form Filling
Troubleshooting
- Elements not found - Ensure the UI is visible and not obscured
- Clicks not working - Try increasing the wait_for timeout
- Wrong element clicked - Use specific element IDs instead of queries
- Session errors - Run the clean tool to clear corrupted sessions
- Permissions denied - Grant Accessibility permission in System Settings
Debugging with Logs
Peekaboo uses macOS's unified logging system. Use pblog to monitor logs:
Note: macOS redacts log values by default, showing <private>. See docs/pblog-guide.md and docs/logging-profiles/README.md for solutions.
🔧 Configuration
Configuration Precedence
Settings follow this precedence (highest to lowest):
- Command-line arguments
- Environment variables
- Credentials file (~/.peekaboo/credentials)
- Configuration file (~/.peekaboo/config.json)
- Built-in defaults
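The precedence order amounts to a first-match lookup across the five sources. The sketch below shows the idea in Python; resolve_setting is a hypothetical function, and using one normalized key name for every source is a simplifying assumption (the real config file keys and environment variable names differ, as the table below shows).

```python
from typing import Mapping, Optional


def resolve_setting(
    name: str,
    cli_args: Mapping[str, str],
    environment: Mapping[str, str],
    credentials: Mapping[str, str],
    config_file: Mapping[str, str],
    defaults: Mapping[str, str],
) -> Optional[str]:
    """Return the value from the highest-precedence source that defines it."""
    # Sources are checked in the documented order, highest precedence first.
    for source in (cli_args, environment, credentials, config_file, defaults):
        value = source.get(name)
        if value is not None:
            return value
    return None


print(
    resolve_setting(
        "log_level",
        cli_args={},
        environment={"log_level": "debug"},
        credentials={},
        config_file={"log_level": "warn"},
        defaults={"log_level": "info"},
    )
)  # debug
```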
Available Options
Setting | Config File | Environment Variable | Description |
---|---|---|---|
AI Providers | aiProviders.providers | PEEKABOO_AI_PROVIDERS | Comma-separated list (e.g., "openai/gpt-4.1,anthropic/claude,grok/grok-4,ollama/llava") |
OpenAI API Key | Use credentials file | OPENAI_API_KEY | Required for OpenAI provider |
Anthropic API Key | Use credentials file | ANTHROPIC_API_KEY | Required for Claude models |
Grok API Key | Use credentials file | X_AI_API_KEY or XAI_API_KEY | Required for Grok (xAI) models |
Ollama URL | aiProviders.ollamaBaseUrl | PEEKABOO_OLLAMA_BASE_URL | Default: http://localhost:11434 |
Default Save Path | defaults.savePath | PEEKABOO_DEFAULT_SAVE_PATH | Where screenshots are saved (default: current directory) |
Log Level | logging.level | PEEKABOO_LOG_LEVEL | trace, debug, info, warn, error, fatal |
Log Path | logging.path | PEEKABOO_LOG_FILE | Log file location |
CLI Binary Path | - | PEEKABOO_CLI_PATH | Override bundled Swift CLI path (advanced usage) |
Environment Variable Details
API Key Storage Best Practices
For security, Peekaboo supports three methods for API key storage (in order of recommendation):
- Environment Variables (Most secure for automation)
- Credentials File (Best for interactive use)
- Config File (Not recommended - use credentials file instead)
AI Provider Configuration
- PEEKABOO_AI_PROVIDERS: Comma-separated list of AI providers to use for image analysis
  - Format: provider/model,provider/model
  - Example: "openai/gpt-4.1,anthropic/claude-opus-4,grok/grok-4,ollama/llava:latest"
  - The first available provider will be used
  - Default: "openai/gpt-4.1,ollama/llava:latest"
  - Supported providers: openai, anthropic, grok, ollama
- OPENAI_API_KEY: Your OpenAI API key for GPT-4.1 Vision
  - Required when using the openai provider
  - Get your key at: https://platform.openai.com/api-keys
- ANTHROPIC_API_KEY: Your Anthropic API key for Claude models
  - Required when using the anthropic provider
  - Get your key at: https://console.anthropic.com/
- X_AI_API_KEY or XAI_API_KEY: Your xAI API key for Grok models
  - Required when using the grok provider
  - Get your key at: https://console.x.ai/
  - Both environment variable names are supported
- PEEKABOO_OLLAMA_BASE_URL: Base URL for your Ollama server
  - Default: http://localhost:11434
  - Use for custom Ollama installations or remote servers
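The provider/model format and the first-available rule can be sketched in Python as follows. This is illustrative only: parse_providers and first_available are hypothetical helpers, and what counts as "available" (for example, an API key being set or an Ollama server responding) is an assumption represented here by a plain set of provider names.

```python
from typing import Optional


def parse_providers(spec: str) -> list[tuple[str, str]]:
    """Split "openai/gpt-4.1,ollama/llava:latest" into (provider, model) pairs."""
    entries = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        # Split on the first slash only, so model tags such as "llava:latest" survive.
        provider, sep, model = item.partition("/")
        if not sep or not provider or not model:
            raise ValueError(f"expected provider/model, got {item!r}")
        entries.append((provider, model))
    return entries


def first_available(
    entries: list[tuple[str, str]], available: set[str]
) -> Optional[tuple[str, str]]:
    """Return the first entry whose provider is in the available set."""
    for provider, model in entries:
        if provider in available:
            return provider, model
    return None


entries = parse_providers("openai/gpt-4.1,ollama/llava:latest")
print(first_available(entries, {"ollama"}))  # ('ollama', 'llava:latest')
```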
Default Behavior
- PEEKABOO_DEFAULT_SAVE_PATH: Default directory for saving screenshots
  - Default: Current working directory
  - Supports tilde expansion (e.g., ~/Desktop/Screenshots)
  - Created automatically if it doesn't exist
Logging and Debugging
- PEEKABOO_LOG_LEVEL: Control logging verbosity
  - Options: trace, debug, info, warn, error, fatal
  - Default: info
  - Use debug or trace for troubleshooting
- PEEKABOO_LOG_FILE: Custom log file location
  - Default: /tmp/peekaboo-mcp.log (MCP server)
  - For CLI, logs are written to stderr by default
Advanced Options
- PEEKABOO_CLI_PATH: Override the bundled Swift CLI binary path
  - Only needed if using a custom-built CLI binary
  - Default: Uses the bundled binary
Using Environment Variables
Environment variables can be set in multiple ways:
🎨 Setting Up Local AI with Ollama
For privacy-focused local AI analysis:
Ollama Model Support
Models with Tool Calling (✅ Recommended for automation):
- llama3.3 - Best overall for agent tasks
- llama3.2 - Good alternative
Vision Models (❌ No tool calling):
- llava - Image analysis only
- bakllava - Alternative vision model
Note: For agent automation tasks, use llama3.3. Vision models like llava can analyze images but cannot perform GUI automation.
📋 Requirements
- macOS 14.0+ (Sonoma or later)
- Screen Recording Permission (required)
- Accessibility Permission (optional, for window focus control)
Granting Permissions
- Screen Recording (Required):
- System Settings → Privacy & Security → Screen & System Audio Recording
- Enable for Terminal, Claude Desktop, or your IDE
- Performance Benefit: Enables fast window enumeration using CGWindowList API
- Without this permission, window operations may be slower
- Accessibility (Optional):
- System Settings → Privacy & Security → Accessibility
- Enable for better window focus control and UI automation
Check permissions status:
Performance Optimizations
Peekaboo v3 includes significant performance improvements:
- Hybrid Window Enumeration: Automatically uses the faster CGWindowList API when screen recording permission is granted, with seamless fallback to accessibility APIs
- Built-in Timeout Protection: All window and menu operations have configurable timeouts (default 2s) to prevent hangs
- Smart API Selection: Automatically chooses the fastest available API based on your permissions
- Parallel Processing: Window data is fetched concurrently when possible
These optimizations ensure that operations that previously could hang for 2+ minutes now complete in seconds.
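The idea behind timeout protection is to stop waiting on a slow call rather than block indefinitely. Below is a generic Python sketch of that pattern, not Peekaboo's Swift implementation; call_with_timeout and slow_operation are hypothetical names, and only the 2-second default comes from the text above.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_timeout(operation: Callable[[], T], timeout_seconds: float = 2.0) -> T:
    """Run operation in a worker thread and give up waiting after the timeout."""
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(operation)
    try:
        return future.result(timeout=timeout_seconds)
    except FutureTimeout:
        raise TimeoutError(f"operation exceeded {timeout_seconds}s") from None
    finally:
        # Do not block on the worker; a timed-out call keeps running in the background.
        executor.shutdown(wait=False)


def slow_operation() -> str:
    """Stand-in for a window query that takes too long."""
    time.sleep(0.3)
    return "done"


print(call_with_timeout(lambda: "fast"))  # fast
try:
    call_with_timeout(slow_operation, timeout_seconds=0.05)
except TimeoutError as error:
    print(error)  # operation exceeded 0.05s
```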
🏗️ Building from Source
Prerequisites
- macOS 14.0+ (Sonoma or later)
- Node.js 20.0+ and npm
- Xcode 16.4+ with Command Line Tools (xcode-select --install)
- Swift 6.0+ (included with Xcode 16.4+)
Build Commands
Creating Release Binaries
The release script creates:
- peekaboo-macos-universal.tar.gz - Standalone CLI binary (universal)
- @steipete-peekaboo-mcp-{version}.tgz - npm package
- checksums.txt - SHA256 checksums for verification
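A downloaded artifact can be checked against checksums.txt with a few lines of Python. This sketch assumes the file uses the common "<hex digest>  <filename>" layout produced by tools such as shasum or sha256sum; the actual layout is not specified here, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_checksums(text: str) -> dict[str, str]:
    """Map filename to digest, assuming '<hex digest>  <filename>' per line."""
    expected = {}
    for line in text.splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) == 2:
            digest, name = parts
            # Some tools prefix the name with '*' to mark binary mode.
            expected[name.strip().lstrip("*")] = digest.lower()
    return expected


def verify(artifact: Path, checksums: Path) -> bool:
    """True only if the artifact is listed and its SHA256 digest matches."""
    expected = parse_checksums(checksums.read_text())
    return expected.get(artifact.name) == sha256_of(artifact)
```

For example, verify(Path("peekaboo-macos-universal.tar.gz"), Path("checksums.txt")) returns False both for a digest mismatch and for a file that is not listed.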
Debug Build Staleness Detection
For development, enable automatic staleness detection to ensure you're always using the latest built CLI version: git config peekaboo.check-build-staleness true. This is recommended when working with AI assistants that frequently modify source code, as it prevents using outdated binaries.
👻 Poltergeist - Swift CLI Auto-rebuild Watcher
Poltergeist is a helpful ghost that watches your Swift files and automatically rebuilds the CLI when they change. Perfect for development workflows!
Installation
First, install Watchman (required):
Usage
Run these commands from the project root:
What It Does
Poltergeist monitors:
- Core/PeekabooCore/**/*.swift
- Core/AXorcist/**/*.swift
- Apps/CLI/**/*.swift
- All Package.swift and Package.resolved files
When changes are detected, it automatically:
- Rebuilds the Swift CLI using npm run build:swift
- Copies the binary to the project root for easy access
- Logs all activity to .poltergeist.log
Features
- 👻 Smart Rebuilding - Only rebuilds when Swift files actually change
- 🔒 Single Instance - Prevents multiple concurrent builds
- 📝 Activity Logging - Track all rebuild activity with timestamps
- ⚡ Native Performance - Uses macOS FSEvents for minimal overhead
- 🎯 Persistent Watches - Survives terminal sessions
🧪 Testing
Running Tests
Peekaboo uses Swift Testing framework (Swift 6.0+) for all test suites:
Testing the CLI
📚 Documentation
🐛 Troubleshooting
Issue | Solution |
---|---|
Permission denied | Grant Screen Recording permission in System Settings |
Window not found | Try using fuzzy matching or list windows first |
AI analysis failed | Check API keys and provider configuration |
Command not found | Ensure Peekaboo is in your PATH or use full path |
Enable debug logging for more details:
For step-by-step debugging, use the verbose flag:
🛠️ Development
Poltergeist - Automatic CLI Builder
Peekaboo includes Poltergeist, an automatic build system that watches Swift source files and rebuilds the CLI in the background. This ensures your CLI binary is always up-to-date during development.
Key features:
- Watches all Swift source files automatically
- Smart wrapper script (./scripts/peekaboo-wait.sh) handles build coordination
- Exit code 42 indicates build failure - fix immediately
- See Poltergeist repository for full documentation
Building from Source
🤝 Contributing
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Open a Pull Request
📝 License
MIT License - see LICENSE file for details.
👤 Author
Created by Peter Steinberger - @steipete
🙏 Acknowledgments
- Apple's ScreenCaptureKit for blazing-fast captures
- The MCP team for the Model Context Protocol
- The Swift and TypeScript communities