The news site of Colt, a telecom company, does not offer an RSS feed, so web scraping is used to extract and process the news.
The goal is to get only the newest posts, along with a summary and the (technical) keywords of each post.
Note that the news site offers links to the individual posts, but not their content. We therefore first collect the link and date of each post, and then extract only the newest ones.
The result is sent to a SQL database, in this case a NocoDB database.
This process runs once a week through a cron job.
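The "collect links and dates first, then keep only the newest" step can be sketched in a few lines of Python. This is only an illustration of the filtering logic, not the workflow's actual node code: the function name `newest_posts`, the `link`/`date` field names, the ISO date format, the `example.com` URLs and the 7-day window (chosen to match the weekly schedule) are all assumptions.

```python
from datetime import date, timedelta


def newest_posts(posts, today, max_age_days=7):
    """Keep posts published within the last `max_age_days` days, newest first.

    `posts` is a list of dicts with hypothetical keys "link" and "date",
    where "date" is an ISO string such as "2024-05-13".
    """
    cutoff = today - timedelta(days=max_age_days)
    # The cutoff day itself is included, so a weekly run does not skip a day.
    recent = [
        p for p in posts
        if cutoff <= date.fromisoformat(p["date"]) <= today
    ]
    # ISO date strings sort chronologically, so a plain string sort is enough.
    return sorted(recent, key=lambda p: p["date"], reverse=True)


# Hypothetical sample data standing in for the scraped links and dates.
posts = [
    {"link": "https://example.com/news/a", "date": "2024-05-01"},
    {"link": "https://example.com/news/b", "date": "2024-05-10"},
    {"link": "https://example.com/news/c", "date": "2024-05-13"},
]

recent = newest_posts(posts, today=date(2024, 5, 14))
print([p["link"] for p in recent])
```

Only the posts that pass this filter would then need to be fetched, summarized and tagged with keywords, which keeps the number of page requests per weekly run small.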
Requirements:
Assumptions:
"Warnings"