How it Works
This workflow transforms natural language queries into research reports through a five-stage AI pipeline. When triggered via webhook (typically from Google Sheets using the companion google-apps-script.js (GitHub gist), it first checks Redis cache for instant results.
For new queries, GPT-4o breaks complex questions into focused sub-queries, optimizes them for search, then uses Bright Data's MCP Tool to find the top 5 credible sources (official sites, news, financial reports). URLs are scraped in parallel, bypassing bot detection.
GPT-4o extracts structured data from each source: answers, facts, entities, sentiment, quotes, and dates. GPT-4o-mini validates source credibility and filters unreliable content. Valid results aggregate into a final summary with confidence scores, key insights, and extended analysis.
Results cache for 1 hour and output via webhook, Slack, email, and DataTable—all in 30-90 seconds with 60 requests/minute rate limiting.
Who is this for?
- Research teams needing automated multi-source intelligence
 
- Content creators and journalists requiring fact-checked information
 
- Due diligence professionals conducting competitive intelligence
 
- Google Sheets power users wanting AI research in spreadsheets
 
- Teams managing large research volumes needing caching and rate limiting
 
Setup Steps
Setup time: 30-45 minutes
Requirements:
- Bright Data account (Web Scraping API + MCP token)
 
- OpenAI API key (GPT-4o and GPT-4o-mini access)
 
- Redis instance
 
- Slack workspace (optional)
 
- SMTP email provider (optional)
 
- Google account (optional for Sheets integration)
 
Core Setup:
- Get Bright Data Web Scraping API token and MCP token
 
- Get OpenAI API key
 
- Set up Redis instance
 
- Configure critical nodes:
- Webhook Entry: Add Header Auth token
 
- Bright Data MCP Tool: Add MCP endpoint with token
 
- Parallel Web Scraping: Add Bright Data API credentials
 
- Redis Nodes: Add connection credentials
 
- All GPT Nodes: Add OpenAI API key (5 nodes)
 
- Slack/Email: Add credentials if using
 
 
Google Sheets Integration:
- Create Google Sheet
 
- Open Extensions → Apps Script
 
- Paste the companion 
google-apps-script.js code 
- Update webhook URL and auth token
 
- Save and authorize
 
Test: {"prompt": "What is the population of Tokyo?", "source": "Test", "language": "English"}
Customization Guidance
- Source Count: Change from 5 to 3-10 URLs per query
 
- Cache Duration: Adjust from 1 hour to 24 hours for stable info
 
- Rate Limits: Modify 60/minute based on usage needs
 
- Character Limits: Adjust 400-char main answer to 200-1000
 
- AI Models: Swap GPT-4o for Claude or use GPT-4o-mini for all stages
 
- Geographic Targeting: Add more regions beyond us/il
 
- Output Channels: Add Notion, Airtable, Discord, Teams
 
- Temperature: Lower (0.1-0.2) for facts, higher (0.4-0.6) for analysis
 
Once configured, this workflow handles all web research, from fact-checking to complex analysis—delivering validated intelligence in seconds with automatic caching.
Built by Daniel Shashko
Connect on LinkedIn