Turn any prompt into structured web data. Send a POST request with a natural language prompt and an optional JSON schema, and get back clean, structured results scraped from the web by an AI agent powered by Firecrawl.
POST /webhook/scrape-agent
Receive Scrape Request receives a POST request with prompt and an optional output_schema.
Validate Output Schema checks the schema. If none is provided, it falls back to a permissive default. If the schema is malformed, it returns a clear error via Return Schema Error.
Research & Extract Web Data takes the prompt and uses the full Firecrawl toolkit to research the web:
/search): Finds relevant pages and sources across the web./scrape): Extracts clean, structured content from any URL.This combination gives the AI agent complete web navigation capabilities. It can discover sources, read pages, and interact with dynamic content autonomously.
Format Response to Schema (Structured Output Parser) formats the agent's response to match the provided (or default) schema.
Return Structured Results sends the structured JSON back to the caller.
POST https://your-n8n-instance/webhook/scrape-agent
| Field | Type | Required | Description |
|---|---|---|---|
prompt |
string | Yes | Natural language instruction for the agent |
output_schema |
object | No | JSON Schema defining the desired output structure |
Returns a JSON object matching the provided schema, or a flexible object if no schema was given.
The agent decides the output structure on its own.
curl -X POST "https://your-n8n-instance/webhook/scrape-agent" \
-H "Content-Type: application/json" \
-d '{
"prompt": "Find the latest pricing for Firecrawl"
}' | jq
Expected output: A JSON object with whatever structure the agent finds most appropriate for the data. Since no schema was provided, the internal default ({ "type": "object", "additionalProperties": true }) is used.
You define exactly the shape of data you want back.
curl -X POST "https://your-n8n-instance/webhook/scrape-agent" \
-H "Content-Type: application/json" \
-d '{
"prompt": "Find the latest pricing for Firecrawl",
"output_schema": {
"type": "object",
"properties": {
"source": { "type": "string" },
"plans": {
"type": "array",
"items": {
"type": "object",
"properties": {
"name": { "type": "string" },
"price": { "type": "string" },
"credits": { "type": "string" },
"highlights": {
"type": "array",
"items": { "type": "string" }
}
}
}
}
}
}
}' | jq
Expected output:
{
"output": {
"source": "https://www.firecrawl.dev/pricing",
"plans": [
{
"name": "Free",
"price": "$0 (one-time)",
"credits": "500 credits (one-time)",
"highlights": [
"Scrape up to 500 pages",
"2 concurrent requests",
"Low rate limits",
"No credit card required"
]
},
{
"name": "Hobby",
"price": "$16/month (billed yearly, save $38)",
"credits": "3,000 credits / month",
"highlights": [
"Scrape up to 3,000 pages",
"5 concurrent requests",
"Basic support",
"$9 per extra 1k credits"
]
}
]
}
}
curl -X POST "https://your-n8n-instance/webhook/scrape-agent" \
-H "Content-Type: application/json" \
-d '{
"prompt": "Find the latest pricing for Firecrawl",
"output_schema": "not a valid schema"
}' | jq
Expected output:
{
"error": true,
"message": "Invalid output_schema: must be a JSON object with a valid 'type' property (object, array, string, number, boolean)",
"example_schema": {
"type": "object",
"properties": {
"name": { "type": "string" },
"price": { "type": "number" }
}
}
}
curl -X POST "https://your-n8n-instance/webhook/scrape-agent" \
-H "Content-Type: application/json" \
-d '{
"prompt": "Find the latest pricing for Firecrawl",
"output_schema": [1, 2, 3]
}' | jq
Expected output: Same error response as above.
type Property)curl -X POST "https://your-n8n-instance/webhook/scrape-agent" \
-H "Content-Type: application/json" \
-d '{
"prompt": "Find the latest pricing for Firecrawl",
"output_schema": {
"properties": {
"name": { "type": "string" }
}
}
}' | jq
Expected output: Same error response as above.
type Value)curl -X POST "https://your-n8n-instance/webhook/scrape-agent" \
-H "Content-Type: application/json" \
-d '{
"prompt": "Find the latest pricing for Firecrawl",
"output_schema": {
"type": "banana"
}
}' | jq
Expected output: Same error response as above.
Receive Scrape Request (POST)
|
v
Validate Output Schema
|--- Error --> Return Schema Error (error JSON)
|--- Success --> Research & Extract Web Data (AI Agent)
|
|--- Primary Chat Model
|--- Fallback Chat Model
|--- Search & Scrape:
| - Search the Web (/search)
| - Scrape Webpage Content (/scrape)
|--- Browser Tool:
| - Get Browser Context
| - Create Browser Session
| - Execute Browser Code
| - List Browser Sessions
| - Delete Browser Session
|
v
Return Structured Results
|
|--- Format Response to Schema (Output Parser)
|
|--- Parser Chat Model
The Validate Output Schema node runs this validation before passing data to the agent:
output_schema is missing or null, the default permissive schema is used: { "type": "object", "additionalProperties": true }.output_schema is present, it must be a JSON object (not a string, array, or primitive).type property with a valid value: object, array, string, number, or boolean.{{ JSON.stringify($('Validate Output Schema').item.json.output_schema) }} handles this conversion.