This n8n workflow provides a reusable solution for scraping data from paginated websites. Instead of building a new series of nodes for every site, you only update a small JSON configuration in the initial Input Node, which makes scraping tasks faster to set up and more consistent from site to site.
The core purpose of this template is to automate the extraction of structured data (e.g., product details, quotes, articles) from websites with multiple pages. It is designed to be fully recursive: it follows the "next page" link until no link is found, aggregates the results from all pages, and cleanly structures the final output into a single list of items.
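The pagination behaviour described above can be sketched outside n8n as a small Python loop. This is only an illustration of the idea, not the workflow's actual node logic: it is written as a plain loop rather than literal recursion, and the names `scrape_all_pages`, `parse_page`, `max_pages`, and the in-memory `FAKE_SITE` are hypothetical. The visited-URL check and page cap are added safety assumptions that the template description does not mention.

```python
from urllib.parse import urljoin


def scrape_all_pages(start_url, parse_page, max_pages=100):
    """Follow "next page" links from start_url and collect every item.

    parse_page(url) is a caller-supplied (hypothetical) function returning
    (items_on_page, next_href), where next_href is None on the last page.
    """
    items = []
    visited = set()
    url = start_url
    # Stop when there is no next link, when a URL repeats, or at the page cap.
    while url and url not in visited and len(visited) < max_pages:
        visited.add(url)
        page_items, next_href = parse_page(url)
        items.extend(page_items)
        # Resolve relative links such as "/page/2" against the current URL.
        url = urljoin(url, next_href) if next_href else None
    return items


# Stand-in for real HTTP requests and CSS extraction: url -> (items, next_href).
FAKE_SITE = {
    "https://example.com/page/1": ([{"quote": "A"}, {"quote": "B"}], "/page/2"),
    "https://example.com/page/2": ([{"quote": "C"}], "/page/3"),
    "https://example.com/page/3": ([{"quote": "D"}], None),
}

all_items = scrape_all_pages("https://example.com/page/1", FAKE_SITE.__getitem__)
print(all_items)
```

In the workflow itself, fetching and extraction are handled by n8n nodes; here `parse_page` stands in for both so the loop and the aggregation into a single list are easy to see.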
startUrl: The URL of the first page to begin scraping.
nextPageSelector: The CSS selector for the "Next" or "Continue" link element that leads to the next page. This is crucial for the pagination loop.
fields: An array of objects defining the data to extract on each page. For each field, specify the name (the output key), the selector (the CSS selector pointing to the data), and the value (the HTML attribute to pull, usually text or href).
For a detailed breakdown of the internal logic, including how the loop is constructed using the Set, If, and HTTP Request nodes, please refer to the original blog post: Flexible Web Scraping with n8n: A Configurable, Multi-Page Template
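To make the three configuration keys (startUrl, nextPageSelector, fields) concrete, here is a possible configuration embedded in a short Python snippet that parses it and checks its shape. The URL and CSS selectors are placeholders, not taken from the template, and the exact JSON layout the Input Node expects is inferred from the key descriptions rather than confirmed. `validate_config` is a hypothetical helper, not part of n8n.

```python
import json

# Example configuration; the URL and selectors are illustrative placeholders.
CONFIG_JSON = """
{
  "startUrl": "https://example.com/quotes/page/1",
  "nextPageSelector": "li.next a",
  "fields": [
    {"name": "quote", "selector": "span.text", "value": "text"},
    {"name": "authorLink", "selector": "a.author", "value": "href"}
  ]
}
"""


def validate_config(config: dict) -> list[str]:
    """Return a list of problems; an empty list means the shape looks right."""
    problems = []
    for key in ("startUrl", "nextPageSelector", "fields"):
        if key not in config:
            problems.append(f"missing key: {key}")
    for i, field in enumerate(config.get("fields", [])):
        for key in ("name", "selector", "value"):
            if key not in field:
                problems.append(f"fields[{i}] missing key: {key}")
    return problems


config = json.loads(CONFIG_JSON)
problems = validate_config(config)
print(problems or "config looks complete")
```

Only the JSON inside `CONFIG_JSON` would go into the workflow; the surrounding Python is just a quick way to catch a missing key before running a scrape.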