Workflow Overview
This workflow automates the process of scraping Trustpilot reviews, extracting key details, analyzing sentiment, and saving the results to Google Sheets. It uses OpenAI for sentiment analysis and HTML parsing for review extraction.
How It Works
1. Scrape Trustpilot Reviews
- HTTP Request:
- Fetches review pages from Trustpilot (
https://it.trustpilot.com/review/{{company_id}}
).
- Paginates through pages (up to
max_page
limit).
- HTML Parsing:
- Extracts review URLs using CSS selectors
- Splits the URLs into individual review links.
2. Extract Review Details
- Information Extractor:
- Uses DeepSeek to extract structured data from the review:
- Author: Name of the reviewer.
- Rating: Numeric rating (1-5).
- Date: Review date in
YYYY-MM-DD
format.
- Title: Review title.
- Text: Full review text.
- Total Reviews: Number of reviews by the user.
- Country: Reviewer’s country (2-letter code).
3. Sentiment Analysis
4. Save to Google Sheets
- Google Sheets Node:
- Appends or updates the extracted data to a Google Sheet
Set Up Steps
1. Configure Trustpilot Scraping
- Edit Fields1 Node:
- Set
company_id
to the Trustpilot company name
- Set
max_page
to limit the number of pages scraped.
2. Configure Google Sheets
- Google Sheets Node:
- Update the
documentId
with your Google Sheet ID
- Ensure the sheet has the required columns (
Id
, Data
, Nome
, etc.).
3. Configure OpenAI
- OpenAI Chat Model Node:
- Sentiment Analysis Node:
- Ensure the
categories
match your desired sentiment labels (Positive, Neutral, Negative
).
Key Components
- Nodes:
- HTTP Request/HTML: Scrape and parse Trustpilot reviews.
- Information Extractor: Extract structured review data using DeepSeek.
- Sentiment Analysis: Classify review sentiment.
- Google Sheets: Save and update review data.
- Credentials:
- OpenAI API key.
- DeepSeek API key.
- Google Sheets OAuth2.