Task:
Create a simple API endpoint using the Webhook and Respond to Webhook nodes
Why:
You can prototype or replace a backend process with a single workflow
Main use cases:
Replace backend logic with a workflow
Task:
Merge two datasets into one based on matching rules
Why:
A powerful capability of n8n is to easily branch out the workflow in order to process different datasets. Even more powerful is the ability to join them back together with SQL-like joining logic.
Main use cases:
Appending data sets
Keep only new items
Keep only existing items
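The three use cases above can be sketched in plain Python. This is only an illustration of the matching logic, not the Merge node's implementation; the function names, the `email` key, and the sample data are assumptions made for the example.

```python
from typing import Any

Item = dict[str, Any]


def append(first: list[Item], second: list[Item]) -> list[Item]:
    """Appending data sets: all items from both inputs, in order."""
    return first + second


def keep_only_new(existing: list[Item], incoming: list[Item], key: str) -> list[Item]:
    """Keep incoming items whose key value is not present in `existing`."""
    seen = {item[key] for item in existing}
    return [item for item in incoming if item[key] not in seen]


def keep_only_existing(existing: list[Item], incoming: list[Item], key: str) -> list[Item]:
    """Keep incoming items whose key value is already present in `existing`."""
    seen = {item[key] for item in existing}
    return [item for item in incoming if item[key] in seen]


# Hypothetical sample data for illustration.
crm = [{"email": "a@example.com"}, {"email": "b@example.com"}]
signups = [{"email": "b@example.com"}, {"email": "c@example.com"}]

new_items = keep_only_new(crm, signups, "email")
known_items = keep_only_existing(crm, signups, "email")
```

Here the matching rule is a single shared field, which is the simplest form of the SQL-like join described above.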
This workflow backs up your workflows to GitHub. It uses the n8n node to export all of the workflow data through the public API.
It then loops over the exported workflows and checks GitHub for a file named after each workflow. If the file exists and differs, the workflow updates it; if the file doesn't exist, the workflow creates it; and if the file is identical, it is left untouched.
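The create, update, or ignore decision described above can be sketched as a small function. This is a minimal sketch, assuming both versions are JSON strings and that "the same" means equal after parsing; `decide_action` and its return values are hypothetical names, not part of the workflow.

```python
import json
from typing import Optional


def decide_action(existing_file: Optional[str], exported_workflow: str) -> str:
    """Return "create", "update", or "skip" for one workflow.

    existing_file is the file content found in the repository,
    or None when no file with the workflow name exists yet.
    """
    if existing_file is None:
        return "create"
    # Compare parsed JSON so that whitespace differences do not count as changes.
    if json.loads(existing_file) == json.loads(exported_workflow):
        return "skip"
    return "update"
```

Comparing parsed JSON rather than raw text is a design choice for this sketch; it avoids needless commits when only formatting differs.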
Config Options
repo_owner - GitHub owner
repo_name - GitHub repository name
repo_path - Path within the GitHub repository
>This workflow has been updated to use the n8n node and the Code node, so it requires n8n version 0.198.0 or later.
This workflow extracts data from a multi-page website.
The workflow:
1) Starts in a country list at https://www.theswiftcodes.com/browse-by-country/.
2) Loads every country page (https://www.theswiftcodes.com/albania/)
3) Paginates every page in the country page.
4) Extracts data from the country page.
5) Saves data to MongoDB.
6) Paginates through all pages in all countries.
It uses the getWorkflowStaticData('global') method to recover the next page (saved while processing the previous page), and it continues until all pages have been processed.
The first section retrieves and extracts the country list.
Next, the workflow checks whether a locally cached copy of the page is available and, if so, reads it from disk.
Finally, it saves the data to MongoDB and paginates through all pages of each country, for all countries.
I have applied a cache system that saves each visited page to the n8n local disk. If the workflow is relaunched, it checks whether a cache file exists and skips requests to the website that are not required.
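The cache check described above can be sketched as follows. This is an illustration only: `cached_fetch`, the hashed file name, and the injected `fetch` callable are assumptions, and the demo uses a fake fetcher so that no request is sent.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable


def cached_fetch(url: str, cache_dir: Path, fetch: Callable[[str], str]) -> str:
    """Return the page for `url`, reading it from disk when a cache file exists."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the URL to get a file name that is safe on any file system.
    name = hashlib.sha256(url.encode("utf-8")).hexdigest() + ".html"
    path = cache_dir / name
    if path.exists():
        return path.read_text(encoding="utf-8")
    html = fetch(url)
    path.write_text(html, encoding="utf-8")
    return html


# Demo with a fake fetcher that records how often it is called.
calls: list[str] = []


def fake_fetch(url: str) -> str:
    calls.append(url)
    return "<html>Albania</html>"


page_url = "https://www.theswiftcodes.com/albania/"
with tempfile.TemporaryDirectory() as tmp:
    first = cached_fetch(page_url, Path(tmp), fake_fetch)
    second = cached_fetch(page_url, Path(tmp), fake_fetch)
```

The second call is served from disk, so the fetcher runs only once per URL until the cache is cleared.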
If the data present in the website changes, you can apply a Cron node to check the website once per week.
Finally, before inserting data in MongoDB, the best way to avoid duplicates is to check that swift_code (the primary value of the collection) doesn't exist.
I recommend using a proxy for all requests to avoid IP blocks. A good solution for proxy plus IP rotation is scrapoxy.io.
This workflow is well suited to small data requirements. If you need to scrape dynamic data, you can use a headless browser or any other service.
If you want to scrape huge lists of URIs, I recommend using Scrapy + Scrapoxy.
This workflow lets you scrape Google Maps data efficiently using SerpAPI.
You'll get Google Maps data at a lower cost than through the Google Maps API.
Provide your Google Maps search URL as input and you'll get a list of places with many data points, such as:
phone number
website
rating
reviews
address
And much more.
The full guide to implementing the workflow is here:
https://lempire.notion.site/Scrape-Google-Maps-places-with-n8n-b7f1785c3d474e858b7ee61ad4c21136?pvs=4
Temporary solution for backups to Google Drive using the undocumented REST API.
Please note that this workflow has issues. It does not support versioning, so it creates a new copy of each workflow on every run; if you run it daily, the folder will grow quickly. Once I figure out how to version files in Google Drive, I'll update it here.
Video Guide
I prepared a detailed guide explaining how to set up and implement this scenario, enabling you to chat with your documents stored in Supabase using n8n.
Who is this for?
This workflow is ideal for researchers, analysts, business owners, or anyone managing a large collection of documents. It's particularly beneficial for those who need quick contextual information retrieval from text-heavy files stored in Supabase, without needing additional services like Google Drive.
What problem does this workflow solve?
Manually retrieving and analyzing specific information from large document repositories is time-consuming and inefficient. This workflow automates the process by vectorizing documents and enabling AI-powered interactions, making it easy to query and retrieve context-based information from uploaded files.
What this workflow does
The workflow integrates Supabase with an AI-powered chatbot to process, store, and query text and PDF files. The steps include:
Fetching and comparing files to avoid duplicate processing.
Handling file downloads and extracting content based on the file type.
Converting documents into vectorized data for contextual information retrieval.
Storing and querying vectorized data from a Supabase vector store.
File Extraction and Processing: Automates handling of multiple file formats (e.g., PDFs, text files), and extracts document content.
Vectorized Embeddings Creation: Generates embeddings for processed data to enable AI-driven interactions.
Dynamic Data Querying: Allows users to query their document repository conversationally using a chatbot.
Setup
n8n Workflow
Fetch File List from Supabase:
Use Supabase to retrieve the stored file list from a specified bucket.
Add logic to manage empty folder placeholders returned by Supabase, avoiding incorrect processing.
Compare and Filter Files:
Aggregate the files retrieved from storage and compare them to the existing list in the Supabase files table.
Exclude duplicates and skip placeholder files to ensure only unprocessed files are handled.
Handle File Downloads:
Download new files using detailed storage configurations for public/private access.
Adjust the storage settings and GET requests to match your Supabase setup.
File Type Processing:
Use a Switch node to target specific file types (e.g., PDFs or text files).
Employ relevant tools to process the content:
For PDFs, extract embedded content.
For text files, directly process the text data.
Content Chunking:
Break large text data into smaller chunks using the Text Splitter node.
Define chunk size (default: 500 tokens) and overlap to retain necessary context across chunks.
Vector Embedding Creation:
Generate vectorized embeddings for the processed content using OpenAI's embedding tools.
Ensure metadata, such as file ID, is included for easy data retrieval.
Store Vectorized Data:
Save the vectorized information into a dedicated Supabase vector store.
Use the default schema and table provided by Supabase for seamless setup.
AI Chatbot Integration:
Add a chatbot node to handle user input and retrieve relevant document chunks.
Use metadata like file ID for targeted queries, especially when multiple documents are involved.
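The Content Chunking step above can be illustrated with a short sketch. It is not the Text Splitter node's implementation: it treats whitespace-separated words as a rough stand-in for tokens, and the overlap default of 50 is an assumption (only the chunk size of 500 comes from the description).

```python
def split_into_chunks(
    tokens: list[str], chunk_size: int = 500, overlap: int = 50
) -> list[list[str]]:
    """Split tokens into chunks where neighbours share `overlap` tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")
    step = chunk_size - overlap
    chunks: list[list[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the input.
        if start + chunk_size >= len(tokens):
            break
    return chunks


words = "one two three four five six seven eight nine ten".split()
small_chunks = split_into_chunks(words, chunk_size=4, overlap=1)
```

With a chunk size of 4 and an overlap of 1, each chunk starts with the last word of the previous one, which is how context is retained across chunk boundaries.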
Testing
Upload sample files to your Supabase bucket.
Verify if files are processed and stored successfully in the vector store.
Ask simple conversational questions about your documents using the chatbot (e.g., "What does Chapter 1 say about the Roman Empire?").
Test for accuracy and contextual relevance of retrieved results.
This creates a git backup of the workflows and credentials.
It uses the n8n export command together with git diff, so you can run it as often as you want; a commit is created only when there are changes.
Setup
You need some access to the server.
Create a repository in some remote place to host your project, like GitHub, GitLab, or your favorite private repo.
Clone the repository on the server in a location that n8n can access. In the example, the location is . and the repository name is repo. Change these in the commands and in the workflow commands (you can set them as a variable in the workflow). Check out another branch if you won't use master.
cd .
git clone repository
Or you could git init and then add the remote (git remote add origin YOUR_REPO_URL), whatever pleases you more.
On the server, check that everything is in place for committing. Very likely you'll need to set up the user email and name. Try to create a commit and push it upstream, and anything you still need (such as configuring a user to commit) will show up along the way. I strongly suggest also testing with the export commands to guarantee that they work too.
cd ./repo
git commit -m "Initial commit" --allow-empty
# -u is the same as --set-upstream
git push -u origin master
Test pushing to upstream with the first exported data:
npx n8n export:workflow --backup --output ./repo/workflows/
npx n8n export:credentials --backup --output repo/credentials/
cd ./repo
git add .
git commit -m "manual backup: first export"
git push
After that, if everything is ok, the workflow should work just fine.
Adjustments
Adjust the path used in the workflow. Note that git -C PATH is the same as cd PATH; git ....
Also, adjust the cron to run as often as you need. As I said in the beginning, you can run it even every minute, but it will create commits only when there are changes.
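The "commit only when there are changes" behaviour can be sketched like this. It is one possible approach, not necessarily the exact commands used in the workflow: it stages everything and relies on `git diff --cached --quiet`, which exits with 0 when nothing is staged and 1 when staged changes exist. `commit_if_changed` and the injectable `run` parameter are hypothetical names for illustration.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], subprocess.CompletedProcess]


def run_git(args: Sequence[str]) -> subprocess.CompletedProcess:
    """Run a command and capture its output without raising on failure."""
    return subprocess.run(list(args), capture_output=True, text=True, check=False)


def commit_if_changed(repo_path: str, message: str, run: Runner = run_git) -> bool:
    """Stage all files, then commit and push only if something changed."""
    git = ["git", "-C", repo_path]  # same effect as: cd repo_path; git ...
    run(git + ["add", "."])
    staged = run(git + ["diff", "--cached", "--quiet"])
    if staged.returncode == 0:
        return False  # nothing staged, so no commit is created
    run(git + ["commit", "-m", message])
    run(git + ["push"])
    return True
```

A call such as `commit_if_changed("./repo", "automatic backup")` after the two export commands would then be safe to schedule at any frequency, assuming the repository already has an initial commit and an upstream branch as set up above.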
Credentials encryption
By default, credentials are exported encrypted. You can add the --decrypted flag to the n8n export:credentials command if you need to save them in plain text. As a general rule, though, it's better to save the encryption key, which you only need to do once, and then export the credentials safely encrypted.
This n8n workflow demonstrates how to build a simple uptime monitoring service using scheduled triggers.
Useful for webmasters with a handful of sites who want a cost-effective solution without the need for all the bells and whistles.
How it works
A scheduled trigger reads a list of website URLs from a Google Sheet every 5 minutes.
Each website URL is checked using the HTTP node, which determines whether the website is in the UP or DOWN state.
An email and Slack message are sent for websites which are in the DOWN state.
The Google Sheet is updated with the website's state and a log entry is created.
Logs can be used to determine total % of UP and DOWN time over a period.
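As an illustration of the last point, the percentage of UP time can be estimated from the log rows. This is a minimal sketch, assuming each log row has a `url` and a `state` column (hypothetical names) and that checks run at a fixed interval, so the share of UP checks approximates the share of UP time.

```python
from collections import Counter


def uptime_percentages(log_rows: list[dict[str, str]]) -> dict[str, float]:
    """Return the percentage of checks in the UP state for each URL."""
    totals: Counter[str] = Counter()
    ups: Counter[str] = Counter()
    for row in log_rows:
        totals[row["url"]] += 1
        if row["state"] == "UP":
            ups[row["url"]] += 1
    return {url: 100.0 * ups[url] / totals[url] for url in totals}


# Hypothetical log rows for illustration.
sample_log = [
    {"url": "https://example.com", "state": "UP"},
    {"url": "https://example.com", "state": "UP"},
    {"url": "https://example.com", "state": "DOWN"},
    {"url": "https://example.com", "state": "UP"},
]
report = uptime_percentages(sample_log)
```

The DOWN percentage for a site is simply 100 minus its UP percentage.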
Requirements
Google Sheet for storing websites to monitor and their states
Gmail for email alerts
Slack for channel alerts
Customising the workflow
Don't use Google Sheets? This can easily be exchanged with Excel or Airtable.
Video Guide
I prepared a detailed guide that shows the whole process of building a resume analyzer.
Who is this for?
This workflow is ideal for developers, data analysts, and business owners who want to enable conversational interactions with their database. It's particularly useful for cases where users need to extract, analyze, or aggregate data without writing SQL queries manually.
What problem does this workflow solve?
Accessing and analyzing database data often requires SQL expertise or dedicated reports, which can be time-consuming. This workflow empowers users to interact with a database conversationally through an AI-powered agent. It dynamically generates SQL queries based on user requests, streamlining data retrieval and analysis.
What this workflow does
This workflow integrates OpenAI with a Supabase database, enabling users to interact with their data via an AI agent. The agent can:
Retrieve records from the database.
Extract and analyze JSON data stored in tables.
Provide summaries, aggregations, or specific data points based on user queries.
Dynamic SQL Querying: The agent uses user prompts to create and execute SQL queries on the database.
Understand JSON Structure: The workflow identifies JSON schema from sample records, enabling the agent to parse and analyze JSON fields effectively.
Database Schema Exploration: It provides the agent with tools to retrieve table structures, column details, and relationships for precise query generation.
Setup
Preparation
Create Accounts:
n8n: For workflow automation.
Supabase: For database hosting and management.
OpenAI: For building the conversational AI agent.
Configure Database Connection:
Set up a PostgreSQL database in Supabase.
Use appropriate credentials (username, password, host, and database name) in your workflow.
n8n Workflow
AI agent with tools:
Code Tool:
Execute SQL queries based on user input.
Database Schema Tool:
Retrieve a list of all tables in the database.
Use a predefined SQL query to fetch table definitions, including column names, types, and references.
Table Definition:
Retrieve a list of columns with types for one table.
This workflow is a modification of the previous template on how to create an SQL agent with LangChain and SQLite.
The key difference: the agent has access only to the database schema, not to the actual data. To achieve this, SQL queries are made outside the AI Agent node, and the results are never passed back to the agent.
This approach allows the agent to generate SQL queries based on the structure of tables and their relationships, without having to access the actual data.
This makes the process more secure and efficient, especially in cases where data confidentiality is crucial.
Setup
To get started with this workflow, you'll need to set up a free MySQL server and import your database (check Steps 1 and 2 in this tutorial).
Of course, you can switch MySQL to another SQL database such as PostgreSQL; the principle remains the same. The key is to download the schema once and save it locally to avoid repeated remote connections.
Run the top part of the workflow once to download and store the MySQL chinook database schema file on the server.
With this approach, we avoid the need to repeatedly connect to a remote db4free database and fetch the schema every time. As a result, we reach greater processing speed and efficiency.
Chat with your data
Start a chat: send a message in the chat window.
The workflow loads the locally saved MySQL database schema, without having the ability to touch the actual data. The file contains the full structure of your MySQL database for analysis.
The LangChain AI Agent receives the schema and your input, and begins to work.
The AI Agent generates SQL queries and brief comments based solely on the schema and the user's message.
An IF node checks whether the AI Agent has generated a query. When:
Yes: the AI Agent passes the SQL query to the next MySQL node for execution.
No: You get a direct answer from the Agent without further action.
The workflow formats the results of the SQL query, ensuring they are convenient to read and easy to understand.
Once formatted, you get both the Agent answer and the query result in the chat window.
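The IF node check in the steps above can be sketched as a function that looks for a query in the agent's reply. The reply format is an assumption made for this example: the sketch expects a read-only query that starts with SELECT and ends with a semicolon, and `extract_sql` is a hypothetical helper rather than part of the workflow.

```python
import re
from typing import Optional

# Assumed format: a SELECT statement terminated by the first semicolon.
SQL_PATTERN = re.compile(r"\bSELECT\b.*?;", re.IGNORECASE | re.DOTALL)


def extract_sql(agent_reply: str) -> Optional[str]:
    """Return the first SELECT statement in the reply, or None if absent."""
    match = SQL_PATTERN.search(agent_reply)
    return match.group(0).strip() if match else None


reply_with_query = (
    "Here are the customers from Germany.\n"
    "SELECT * FROM Customer WHERE Country = 'Germany';"
)
reply_without_query = "The tables are linked through foreign keys."

query = extract_sql(reply_with_query)
no_query = extract_sql(reply_without_query)
```

When a query is found, it would go to the database node; otherwise the agent's answer is returned as is. Note that this simple pattern would cut a query short if a string literal contained a semicolon.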
Example queries
Try these sample queries to see the schema-driven AI Agent in action:
Would you please list me all customers from Germany?
What are the music genres in the database?
What tables are available in the database?
Please describe the relationships between tables. - In this example, the AI Agent does not need to create the SQL query.
And if you prefer to keep the data private, you can manually execute the generated SQL query in your own environment using any database client or tool you trust.
The AI Agent memory node does not store the actual data, because the SQL queries run outside the agent. It contains the database schema, the user's questions, and the initial Agent reply. Actual SQL query results are passed to the chat window, but the values are not stored in the Agent memory.