LangExtract MCP Server

by larsenweigle

extract_from_url

Extract structured data from web content at a URL. Downloads the page text and uses a Large Language Model to extract the specific entities described by the provided instructions and examples.

Instructions

Extract structured information from text content at a URL.

Downloads text from the specified URL and extracts structured information using Large Language Models. Ideal for processing web articles, documents, or any text content accessible via HTTP/HTTPS.

Args:
  url: URL to download text from (must start with http:// or https://)
  prompt_description: Clear instructions for what to extract
  examples: List of example extractions to guide the model
  config: Configuration parameters for the extraction

Returns: Dictionary containing extracted entities with source locations and metadata

Raises: ToolError: If URL is invalid, download fails, or extraction fails
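
For illustration, a minimal arguments object for this tool might look like the sketch below. The top-level field names come from the input schema; the URL, prompt, and example content are invented for this sketch, and the keys inside each extraction object (extraction_class, extraction_text, attributes) follow LangExtract's usual convention rather than anything fixed by the schema, which allows arbitrary properties there.

{
  "url": "https://example.com/press-release.html",
  "prompt_description": "Extract company names and the funding amounts they raised.",
  "examples": [
    {
      "text": "Acme Robotics raised $12 million in a Series A round.",
      "extractions": [
        {
          "extraction_class": "funding_event",
          "extraction_text": "Acme Robotics raised $12 million",
          "attributes": { "company": "Acme Robotics", "amount": "$12 million" }
        }
      ]
    }
  ]
}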

Input Schema

Name                 Required   Description                                                        Default
config               No         Configuration parameters for the extraction                       ExtractionConfig defaults
examples             Yes        List of example extractions to guide the model
prompt_description   Yes        Clear instructions for what to extract
url                  Yes        URL to download text from (must start with http:// or https://)

Input Schema (JSON Schema)

{ "$defs": { "ExtractionConfig": { "description": "Configuration for extraction parameters.", "properties": { "extraction_passes": { "default": 1, "description": "Number of extraction passes for better recall", "title": "Extraction Passes", "type": "integer" }, "max_char_buffer": { "default": 1000, "description": "Max characters per chunk", "title": "Max Char Buffer", "type": "integer" }, "max_workers": { "default": 10, "description": "Max parallel workers", "title": "Max Workers", "type": "integer" }, "model_id": { "default": "gemini-2.5-flash", "description": "LLM model to use", "title": "Model Id", "type": "string" }, "temperature": { "default": 0.5, "description": "Sampling temperature (0.0-1.0)", "title": "Temperature", "type": "number" } }, "title": "ExtractionConfig", "type": "object" }, "ExtractionExample": { "description": "Model for extraction examples.", "properties": { "extractions": { "description": "Expected extractions", "items": { "additionalProperties": true, "type": "object" }, "title": "Extractions", "type": "array" }, "text": { "description": "Example text", "title": "Text", "type": "string" } }, "required": [ "text", "extractions" ], "title": "ExtractionExample", "type": "object" } }, "properties": { "config": { "$ref": "#/$defs/ExtractionConfig", "default": { "extraction_passes": 1, "max_char_buffer": 1000, "max_workers": 10, "model_id": "gemini-2.5-flash", "temperature": 0.5 }, "title": "Config" }, "examples": { "items": { "$ref": "#/$defs/ExtractionExample" }, "title": "Examples", "type": "array" }, "prompt_description": { "title": "Prompt Description", "type": "string" }, "url": { "title": "Url", "type": "string" } }, "required": [ "url", "prompt_description", "examples" ], "type": "object" }

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/larsenweigle/langextract-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.