Automate niche research with Wikipedia, GPT-4o-mini, and Google Sheets

Created by

Ian Kerins

Last update

Last update 3 days ago

Overview

This n8n template automates the process of researching niche topics. It searches for a topic on Wikipedia, scrapes the relevant page using ScrapeOps, extracts the history or background section, and uses AI to generate a concise summary and timeline. The results are automatically saved to Google Sheets for easy content planning.

Who is this for?

Content Creators: Quickly gather background info for videos or articles.
Marketers: Research niche markets and product histories.
Educators/Students: Generate timelines and summaries for study topics.
Researchers: Automate the initial data gathering phase.

What problems it solves

Time Consumption: Manually reading and summarizing Wikipedia pages takes time.
Blocking: Scraping Wikipedia directly can sometimes lead to IP blocks; ScrapeOps handles this.
Unstructured Data: Raw HTML is hard to use; this workflow converts it into a clean, structured format (JSON/CSV).

How it works

Define Topic: You set a keyword in the workflow.
Locate Page: The workflow queries the Wikipedia API to find the correct page URL.
Smart Scraping: It uses the ScrapeOps Proxy API to fetch the page content reliably.
Extraction: A code node intelligently parses the HTML to find "History", "Origins", or "Background" sections.
AI Processing: GPT-4o-mini summarizes the text and extracts key dates for a timeline.
Storage: The structured data is appended to a Google Sheet.

Setup steps (~ 5-10 minutes)

ScrapeOps Account:
- Register for a free API key at ScrapeOps.
- Configure the ScrapeOps Scraper node with your API key.
OpenAI Account:
- Add your OpenAI credentials to the Message a model node.
Google Sheets:
- Create a Google Sheet. You can duplicate this Template Sheet (copy the headers).
- Connect your Google account to the Append row in sheet node and select your new sheet.

Pre-conditions

An active ScrapeOps account.
An OpenAI API key (or another LLM credential).
A Google account for Sheets access.

Disclaimer

This template uses ScrapeOps as a community node. You are responsible for complying with Wikipedia's Terms of Use, robots directives, and applicable laws in your jurisdiction. Scraping targets may change at any time; adjust render/scroll/wait settings and parsers as needed. Use responsibly for legitimate business purposes.