This workflow scrapes a website, cleans the HTML, extracts structured information with GPT-4o-mini, and stores the results together with SEO keywords in Airtable. It is ideal for building keyword lists and organizing web content for SEO research.
Ensure your Airtable table has the following fields:
Field Name | Type | Notes |
---|---|---|
Website Name | String | Name or URL of the website |
Data | String | Cleaned website text |
Keyword | String | Extracted SEO keyword list |
Status | Options | Values: Todo, In progress, Done |
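
For orientation, here is a minimal sketch of a record that fits the fields above, written as the request body for Airtable's REST upsert. The n8n Airtable node builds this request itself, so none of this code is needed in the workflow. The helper name `build_upsert_payload`, the choice of `Website Name` as the merge field, and the default status `Todo` are assumptions made for illustration.

```python
import json

# Allowed values for the Status field, as listed in the table above.
ALLOWED_STATUS = {"Todo", "In progress", "Done"}


def build_upsert_payload(website, data, keywords, status="Todo"):
    """Build an Airtable upsert request body (hypothetical helper)."""
    if status not in ALLOWED_STATUS:
        raise ValueError(f"Unknown status: {status!r}")
    return {
        # Assumption: records are matched on the "Website Name" field.
        "performUpsert": {"fieldsToMergeOn": ["Website Name"]},
        "records": [
            {
                "fields": {
                    "Website Name": website,
                    "Data": data,
                    # Keyword is a String field, so the list is joined.
                    "Keyword": ", ".join(keywords),
                    "Status": status,
                }
            }
        ],
    }


payload = build_upsert_payload(
    "https://example.com",
    "Cleaned website text goes here.",
    ["seo tools", "keyword research"],
    status="Done",
)
print(json.dumps(payload, indent=2))
```

Matching on `Website Name` means re-running the workflow for the same site updates the existing row instead of creating a duplicate; pick a different merge field if your table uses another unique key.
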
✅ Form Trigger:
Collects website URL from the user.
✅ HTTP Request:
Fetches the website content.
✅ HTML Cleaner (Code Node):
Strips out styles, tags, and extra whitespace to produce clean text.
✅ Topic Extractor (AI Agent + GPT-4o-mini):
Extracts topic-wise information from the cleaned website content.
✅ Text Cleaner (Code Node):
Removes unwanted symbols like ### and **.
✅ Keyword Extractor (AI Agent + GPT-4o-mini):
Generates a list of 90 important SEO keywords.
✅ Airtable Upsert:
Stores the cleaned data, keywords, and status in Airtable.
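
The two Code Node steps above (HTML Cleaner and Text Cleaner) can be sketched roughly as follows. This is an illustration of the logic, not the code shipped in the workflow: n8n Code nodes are typically written in JavaScript, and the function names `html_to_text` and `strip_markdown_symbols` are hypothetical. The regex approach is a simplification that works for typical pages but is not a full HTML parser.

```python
import html
import re


def html_to_text(raw_html: str) -> str:
    """Reduce an HTML page to plain text (rough, regex-based)."""
    # Drop <script> and <style> blocks together with their contents.
    text = re.sub(r"(?is)<(script|style)\b.*?</\1\s*>", " ", raw_html)
    # Drop HTML comments.
    text = re.sub(r"(?s)<!--.*?-->", " ", text)
    # Replace every remaining tag with a space so words stay separated.
    text = re.sub(r"(?s)<[^>]+>", " ", text)
    # Decode entities such as &amp;.
    text = html.unescape(text)
    # Collapse runs of whitespace into single spaces.
    return re.sub(r"\s+", " ", text).strip()


def strip_markdown_symbols(text: str) -> str:
    """Remove heading markers (such as ###) and bold markers (**)."""
    # Assumption: any run of 1 to 6 '#' at the start of a line is a heading.
    text = re.sub(r"(?m)^[ \t]*#{1,6}[ \t]*", "", text)
    return text.replace("**", "")


page = (
    "<html><head><style>p{color:red}</style></head>"
    "<body><h1>Hello</h1><p>World &amp; more</p></body></html>"
)
print(html_to_text(page))
print(strip_markdown_symbols("### Pricing\n**Plan A** costs $10"))
```

The first function corresponds to the cleaning done before the Topic Extractor, and the second to the cleanup applied to the model's Markdown-style output before keyword extraction.
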
✅ Automatic website content scraping
✅ Clean HTML and extract plain text
✅ Use GPT-4o-mini for topic-wise information extraction
✅ Generate 90-keyword SEO lists
✅ Store and manage data in Airtable
Current Name | Suggested Name |
---|---|
Website Name | Website URL Input Form |
HTTP Request | Fetch Website Content |
Code | HTML to Plain Text Cleaner |
Split Out1 | Clean Text Splitter |
AI Agent1 | Topic Extractor (GPT-4o-mini) |
Code1 | Text Cleanup Formatter |
Split Out2 | Final Text Splitter |
AI Agent | Keyword Extractor (GPT-4o-mini) |
Airtable | Airtable Data Upsert |
Wait1 | Delay Before Merge |
Merge | Combine Data for Airtable |