Trustpilot Insights Scraper: Auto Reviews via Bright Data + Google Sheets Sync
Overview
A comprehensive n8n automation that scrapes Trustpilot business reviews using Bright Data and automatically stores structured data in Google Sheets.
Workflow Architecture
1. 📝 Form Trigger Node
Purpose: Manual input interface for users
- Type: n8n-nodes-base.formTrigger
- Configuration:
  - Form Title: "Website URL"
  - Field: "Trustpilot Website URL"
- Function: Accepts a Trustpilot URL from the user to initiate the scraping process
2. 🌐 HTTP Request (Trigger Scraping)
Purpose: Initiates scraping on Bright Data platform
- Type: n8n-nodes-base.httpRequest
- Method: POST
- Endpoint: https://api.brightdata.com/datasets/v3/trigger
- Configuration:
  - Query Parameters:
    - dataset_id: gd_lm5zmhwd2sni130p
    - include_errors: true
    - limit_multiple_results: 2
  - Headers:
    - Authorization: Bearer BRIGHT_DATA_API_KEY
- Body: JSON containing the input URL and the 35+ custom output fields listed below (see the request sketch after the field list)
Custom Output Fields
The workflow extracts the following data points:
- Company Information: company_name, company_logo, company_overall_rating, company_total_reviews, company_about, company_email, company_phone, company_location, company_country, company_category, company_id, company_website
- Review Data: review_id, review_date, review_rating, review_title, review_content, review_date_of_experience, review_url, date_posted
- Reviewer Information: reviewer_name, reviewer_location, reviews_posted_overall
- Review Metadata: is_verified_review, review_replies, review_useful_count
- Rating Distribution: 5_star, 4_star, 3_star, 2_star, 1_star
- Additional Fields: url, company_rating_name, is_verified_company, breadcrumbs, company_other_categories
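A minimal sketch of the trigger request this node sends, assuming Node.js 18+ (global fetch) and a BRIGHT_DATA_API_KEY environment variable. The placeholder Trustpilot URL, the abbreviated field list, and the exact JSON body shape are illustrative assumptions based on the node description above, not a verified Bright Data schema:

```typescript
// Hypothetical sketch of the Bright Data trigger call (body shape assumed from the
// node description; consult the Bright Data docs for the exact schema).
async function triggerScrape(trustpilotUrl: string): Promise<string> {
  const url = new URL("https://api.brightdata.com/datasets/v3/trigger");
  url.searchParams.set("dataset_id", "gd_lm5zmhwd2sni130p");
  url.searchParams.set("include_errors", "true");
  url.searchParams.set("limit_multiple_results", "2");

  const res = await fetch(url, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.BRIGHT_DATA_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      input: [{ url: trustpilotUrl }],
      // Abbreviated; the full workflow requests all 35+ fields listed above.
      custom_output_fields: ["company_name", "company_overall_rating", "review_id", "review_rating", "review_content"],
    }),
  });
  if (!res.ok) throw new Error(`Trigger failed: ${res.status}`);
  const { snapshot_id } = await res.json();
  return snapshot_id; // consumed by the progress check and download steps
}
```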
3. ⌛ Snapshot Progress Check
Purpose: Monitors scraping job status
- Type: n8n-nodes-base.httpRequest
- Method: GET
- Endpoint: https://api.brightdata.com/datasets/v3/progress/{{ $json.snapshot_id }}
- Configuration:
  - Query Parameters: format=json
  - Headers: Authorization: Bearer BRIGHT_DATA_API_KEY
- Function: Receives the snapshot_id from the previous step and checks whether the data is ready
4. ✅ IF Node (Status Check)
Purpose: Determines next action based on scraping status
- Type: n8n-nodes-base.if
- Condition: $json.status === "ready"
- Logic:
  - If true: proceeds to data download
  - If false: triggers the wait cycle
5. 🕒 Wait Node
Purpose: Implements polling delay for incomplete jobs
- Type: n8n-nodes-base.wait
- Duration: 1 minute
- Function: Pauses execution before re-checking the snapshot status
6. 🔄 Loop Logic
Purpose: Continuous monitoring until completion
- Flow: Wait → Check Status → Evaluate → (Loop or Proceed); a code sketch of this polling cycle follows below
- Prevents: unnecessary requests and rate-limit violations
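A minimal sketch of the polling cycle implemented by the progress check, the status condition, and the 1-minute wait, again assuming Node.js 18+ and the BRIGHT_DATA_API_KEY environment variable:

```typescript
// Poll the progress endpoint until the snapshot reports "ready".
const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

async function waitUntilReady(snapshotId: string): Promise<void> {
  const headers = { Authorization: `Bearer ${process.env.BRIGHT_DATA_API_KEY}` };
  for (;;) {
    const res = await fetch(
      `https://api.brightdata.com/datasets/v3/progress/${snapshotId}?format=json`,
      { headers },
    );
    const { status } = await res.json();
    if (status === "ready") return; // IF node: true branch → proceed to download
    await sleep(60_000);            // Wait node: 1-minute delay before re-checking
  }
}
```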
7. 📥 Snapshot Download
Purpose: Retrieves completed scraped data
- Type: n8n-nodes-base.httpRequest
- Method: GET
- Endpoint: https://api.brightdata.com/datasets/v3/snapshot/{{ $json.snapshot_id }}
- Configuration:
  - Query Parameters: format=json
  - Headers: Authorization: Bearer BRIGHT_DATA_API_KEY
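A matching sketch of the download step, returning the raw review records once the snapshot is ready (same runtime assumptions as the sketches above):

```typescript
// Fetch the completed snapshot as JSON; each record carries the custom output fields.
async function downloadSnapshot(snapshotId: string): Promise<Record<string, unknown>[]> {
  const res = await fetch(
    `https://api.brightdata.com/datasets/v3/snapshot/${snapshotId}?format=json`,
    { headers: { Authorization: `Bearer ${process.env.BRIGHT_DATA_API_KEY}` } },
  );
  if (!res.ok) throw new Error(`Snapshot download failed: ${res.status}`);
  return res.json();
}
```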
8. 📊 Google Sheets Integration
Purpose: Stores extracted data in spreadsheet
- Type: n8n-nodes-base.googleSheets
- Operation: Append
- Configuration:
  - Document ID: 1yQ10Q2qSjm-hhafHF2sXu-hohurW5_KD8fIv4IXEA3I
  - Sheet Name: "Trustpilot"
  - Mapping: Auto-map all 35+ fields
- Credentials: Google OAuth2 integration
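For reference, a rough equivalent of this append using the googleapis Node.js client; in the workflow itself the Google Sheets node and its stored OAuth2 credential handle all of this, so the auth argument and the column ordering below are simplifications:

```typescript
import { google } from "googleapis";

// Append scraped review rows to the "Trustpilot" sheet of the target spreadsheet.
async function appendReviews(auth: any, rows: Record<string, unknown>[]): Promise<void> {
  const sheets = google.sheets({ version: "v4", auth });
  await sheets.spreadsheets.values.append({
    spreadsheetId: "1yQ10Q2qSjm-hhafHF2sXu-hohurW5_KD8fIv4IXEA3I",
    range: "Trustpilot!A1",   // target sheet name from the node configuration
    valueInputOption: "RAW",  // no type conversion; values stored as raw strings
    requestBody: {
      // A real mapping would order values to match the sheet's column headers;
      // Object.values is used here only to keep the sketch short.
      values: rows.map((row) => Object.values(row).map(String)),
    },
  });
}
```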
Data Flow
User Input (URL)
↓
Bright Data API Call
↓
Snapshot ID Generated
↓
Status Check Loop
↓
Data Ready Check
↓
Download Complete Dataset
↓
Append to Google Sheets
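Tying the node sketches above together, the whole data flow could be expressed roughly as follows (triggerScrape, waitUntilReady, downloadSnapshot, and appendReviews are the hypothetical helpers from the earlier sections; auth is a placeholder for a Google OAuth2 client):

```typescript
// End-to-end equivalent of the workflow's data flow, reusing the earlier sketches.
async function run(auth: any, trustpilotUrl: string): Promise<void> {
  const snapshotId = await triggerScrape(trustpilotUrl); // Bright Data API call → snapshot ID
  await waitUntilReady(snapshotId);                      // status check loop
  const rows = await downloadSnapshot(snapshotId);       // download complete dataset
  await appendReviews(auth, rows);                       // append to Google Sheets
}
```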
Technical Specifications
Authentication
- Bright Data: Bearer token authentication
- Google Sheets: OAuth2 integration
Error Handling
- include_errors=true surfaces scrape errors in Bright Data responses
- Conditional status checks exit the polling loop as soon as the snapshot is ready
- 1-minute wait periods keep request frequency within API rate limits
Data Processing
- Mapping Mode: Auto-map input data
- Schema: 35+ predefined fields with string types
- Conversion: No type conversion (preserves raw data)
Setup Requirements
Prerequisites
- Bright Data Account: Active account with API access
- Google Account: With Sheets API enabled
- n8n Instance: Self-hosted or cloud version
Configuration Steps
- API Keys: Configure Bright Data bearer token
- OAuth Setup: Connect Google Sheets credentials
- Dataset ID: Verify correct Bright Data dataset ID
- Sheet Access: Ensure proper permissions for target spreadsheet
Environment Variables
- BRIGHT_DATA_API_KEY: Your Bright Data API authentication token
Use Cases
Business Intelligence
- Competitor analysis and market research
- Customer sentiment monitoring
- Brand reputation tracking
Data Analytics
- Review trend analysis
- Rating distribution studies
- Customer feedback aggregation
Automation Benefits
- Scalability: Handle multiple URLs sequentially
- Reliability: Built-in error handling and retry logic
- Efficiency: Automated data collection and storage
- Consistency: Standardized data format across all scrapes
Limitations and Considerations
Rate Limits
- Bright Data API has usage limitations
- 1-minute wait periods help manage request frequency
Data Volume
- Limited to 2 results per request (configurable)
- Large datasets may require multiple workflow runs
Compliance
- Ensure compliance with Trustpilot's terms of service
- Respect robots.txt and rate limiting guidelines
Monitoring and Maintenance
Status Tracking
- Monitor workflow execution logs
- Check Google Sheets for data accuracy
- Review Bright Data usage statistics
Regular Updates
- Update API keys as needed
- Verify dataset ID remains valid
- Test workflow functionality periodically
Workflow Metadata
- Version ID: dd3afc3c-91fc-474e-99e0-1b25e62ab392
- Instance ID: bc8ca75c203589705ae2e446cad7181d6f2a7cc1766f958ef9f34810e53b8cb2
- Execution Order: v1
- Active Status: Currently inactive (requires manual activation)
- Template Status: Credentials setup completed
For any questions or support, please contact us by email or fill out the support form.