This workflow contains community nodes that are only compatible with the self-hosted version of n8n.
Extract clean and structured text from any webpage with optional fallback to an anti-bot scraping service. Ideal for AI tools and content workflows.
This sub-workflow enables reliable and clean scraping of any public webpage by simply passing a url parameter. It is designed to be embedded into other workflows or used as a tool for AI agents.
It supports two output modes, selected by the fulltext parameter:
fulltext = true: returns { title, text } with the full page content
fulltext = false: returns { title, url, content } with a short excerpt
💡 If the site is protected by anti-bot systems (like Cloudflare), the workflow automatically falls back to Scrape.do, a scraping API with a generous free plan.
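The fallback behaviour can be pictured with a small Python sketch. This is an illustration only, not the workflow's actual node logic: the set of "blocked" status codes, the function names, and the fetcher signature are assumptions, and the Scrape.do endpoint shape (token and url as query parameters) should be checked against their documentation.

```python
from typing import Callable, Tuple
from urllib.parse import urlencode

# Status codes treated as "blocked by anti-bot protection" (an assumption).
BLOCKED_STATUSES = {403, 429, 503}

# A fetcher takes a URL and returns (status_code, html). Hypothetical interface.
Fetcher = Callable[[str], Tuple[int, str]]


def scrape_do_url(target_url: str, token: str) -> str:
    """Build a Scrape.do request URL (endpoint shape assumed, verify in their docs)."""
    return "https://api.scrape.do/?" + urlencode({"token": token, "url": target_url})


def fetch_with_fallback(url: str, direct: Fetcher, fallback: Fetcher) -> Tuple[str, str]:
    """Return (html, source); the fallback is called only if the direct fetch fails."""
    try:
        status, html = direct(url)
        if status not in BLOCKED_STATUSES and html.strip():
            return html, "direct"
    except OSError:
        pass  # network-level failure: fall through to the fallback
    status, html = fallback(url)
    if status != 200:
        raise RuntimeError(f"fallback fetch failed with status {status}")
    return html, "fallback"
```

Because the fallback fetcher is only invoked after the direct attempt fails, pages that scrape normally never consume Scrape.do credits.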
🧩 This template requires the n8n-nodes-webpage-content-extractor community node, so it only works in self-hosted n8n environments.
Perfect for chatbots, summarization workflows, or RSS/feed enrichment. Empowers your AI Agent with the ability to browse and extract readable content from websites automatically.
Input parameters:
url (string): the webpage URL to scrape
fulltext (boolean): set true for full page content, false for summarized output
The Scrape.do API is only used as a fallback when conventional scraping fails, helping you preserve your API credits.
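To make the input parameters and the two output shapes concrete, here is a minimal Python sketch of how a caller might model the result. The function name and the excerpt length are hypothetical; the workflow does not state how long its excerpt is, and only the field names come from the description above.

```python
from typing import Dict

# Hypothetical excerpt length; the workflow's real limit is not documented here.
EXCERPT_CHARS = 300


def shape_output(url: str, title: str, text: str, fulltext: bool) -> Dict[str, str]:
    """Return the result in one of the two documented shapes."""
    if fulltext:
        # fulltext = true: { title, text } with the full page content
        return {"title": title, "text": text}
    # fulltext = false: { title, url, content } with a short excerpt
    excerpt = text[:EXCERPT_CHARS].rstrip()
    if len(text) > EXCERPT_CHARS:
        excerpt += "..."
    return {"title": title, "url": url, "content": excerpt}
```

A calling workflow or AI agent can rely on the presence of the text key to tell the full-content result apart from the excerpt result.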