This n8n template from Intuz automates the extraction of key information from faxes and other PDF documents.
It uses Google Gemini's multimodal capabilities to read the document, identify key fields, and organize the data into a structured format, which is then saved directly to a Google Sheet.
Who's this workflow for?
- Healthcare Administrators
- Medical Billing Teams
- Legal Assistants
- Data Entry Professionals
- Office Managers
How it works
1. Upload via Web Form: The process starts when a user uploads a fax (as a PDF file) through a simple, secure web form generated by n8n.
2. AI Document Analysis: The PDF is sent directly to Google Gemini's multimodal model, which reads the entire document, including text, tables, and form fields, and extracts the relevant information based on a detailed prompt.
3. AI Data Structuring: The raw extracted text is then passed to a second AI step. This step cleans the information and structures it into a predictable JSON format with fixed fields (e.g., Patient ID, Name, DOB).
4. Save to Google Sheets: The final, structured data is automatically appended as a new, clean row in your designated Google Sheet, creating an organized and usable dataset from the unstructured fax.
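To make the structuring step more concrete, here is a minimal Python sketch of what "strictly structures it into a predictable JSON format" can mean in practice. It is not the template's own code: the field list and the function name `structure_reply` are hypothetical, and the real schema lives in the "Basic LLM Chain" node.

```python
import json

# Hypothetical field list for illustration; the template defines its own schema.
EXPECTED_FIELDS = ["Patient ID", "Patient Name", "Date of Birth"]

# Markdown code-fence marker, built here so it does not appear literally.
FENCE = "`" * 3


def structure_reply(reply_text, fields=EXPECTED_FIELDS):
    """Parse a model reply into a dict holding exactly the expected fields."""
    text = reply_text.strip()
    # Models often wrap JSON in a Markdown code fence; drop the fence lines.
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]  # remove the opening fence line
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]  # remove the closing fence line
        text = "\n".join(lines)
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    # Keep only known fields; fill anything missing with an empty string.
    return {name: data.get(name, "") for name in fields}
```

Dropping unknown keys and filling missing ones with empty strings keeps every record the same shape, which is what allows each fax to land as one uniform row in the sheet.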
Key Requirements to Use This Template
1. n8n Instance & Required Nodes:
- An active n8n account (Cloud or self-hosted).
- This workflow uses the official n8n LangChain integration (@n8n/n8n-nodes-langchain). If you are using a self-hosted version of n8n, please ensure this package is installed.
2. Google Accounts:
- Google Drive Account: For temporarily storing the uploaded file.
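- Google Gemini API Key: A Google account with access to the Gemini API (the Generative Language API, for example through Google AI Studio) and an associated API key. This is the key used by the "Google Palm API" credential type in Step 4.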
- Google Sheets Account: A pre-made Google Sheet with columns that match the data you want to extract.
Customer Setup Guide:
Here is a detailed, step-by-step guide to help you configure and run this workflow.
1. Before You Begin: Prerequisites
Please ensure you have the following ready:
- The FAX-Content-Extraction.json file we provided.
- Active accounts for n8n, Google Drive, Google Cloud (for Gemini AI), and Google Sheets.
- A Google Sheet created with header columns that match the data you want to extract (e.g., Patient ID, Patient Name, Date of Birth).
2. Step-by-Step Configuration
Step 1: Import the Workflow
- Open your n8n canvas.
- Click "Import from File" and select the FAX-Content-Extraction.json file.
- The workflow will appear on your canvas.
Step 2: Set Up the Form Trigger
The workflow starts with the "On form submission" node.
- Click on this node.
- In the settings panel, you will see a "Form URL". Copy this URL. This is the link to the web form where you will upload your fax files.
Step 3: Configure the Google Drive Node
- Click on the "Upload file" (Google Drive) node.
- Credentials: Select your Google Drive account from the "Credentials" dropdown or click "Create New" to connect your account.
- Folder ID: In the "Folder ID" field, choose the specific Google Drive folder where you want the uploaded faxes to be saved.
Step 4: Configure the Google Gemini AI Nodes (Very Important)
This workflow uses AI in two places, and both need to be connected.
- First AI Call (PDF Reading):
  - Click on the "Call Gemini 2.0 Flash with PDF Capabilities" (HTTP Request) node.
  - Under "Authentication", make sure "Predefined Credential Type" is selected.
  - For "Credential Type", choose "Google Palm API".
  - In the "Credentials" dropdown, select your Google Gemini API key or click "Create New" to add it.
- Second AI Call (Data Structuring):
  - Click on the "Google Gemini Chat Model" node (it's connected below the "Basic LLM Chain" node).
  - In the "Credentials" dropdown, select the same Google Gemini API key you used before.
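For readers curious about what the first AI call sends, the sketch below builds a request body in the shape the public Gemini `generateContent` REST endpoint accepts for a PDF plus a text prompt. This is an illustration under assumptions, not the node's actual configuration: the endpoint URL and the helper name `build_gemini_request` are not taken from the template, and the model name is inferred from the node title.

```python
import base64
import json

# Assumed endpoint of the public Gemini API; model name inferred from the node title.
GEMINI_URL = (
    "https://generativelanguage.googleapis.com/v1beta/"
    "models/gemini-2.0-flash:generateContent"
)


def build_gemini_request(pdf_bytes, prompt):
    """Return (url, json_body) for a request carrying a PDF and a prompt."""
    body = {
        "contents": [
            {
                "parts": [
                    {
                        # The PDF travels inline as base64-encoded bytes.
                        "inline_data": {
                            "mime_type": "application/pdf",
                            "data": base64.b64encode(pdf_bytes).decode("ascii"),
                        }
                    },
                    # The extraction instructions follow as plain text.
                    {"text": prompt},
                ]
            }
        ]
    }
    return GEMINI_URL, json.dumps(body)
```

The body would be sent as an HTTP POST with the API key attached. In n8n, the selected credential takes care of that, which is why the key never needs to be typed into the node itself.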
Step 5: (Optional) Customize What Data is Extracted
You have full control over what information the AI looks for.
- To change the extraction rules: Click on the "Define Prompt" node. You can edit the text in the "Value" field to tell the AI what to look for (e.g., "Extract only the patient's name and medication list").
- To change the final output columns: Click on the "Basic LLM Chain" node. In the "Text" field, you can edit the JSON schema to add, remove, or rename the fields you want in your final output. The keys here MUST match the column headers in your Google Sheet.
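Because a mismatch between schema keys and column headers is an easy mistake to make, a quick check like the following can help before you run the workflow. It is a standalone Python sketch with a hypothetical helper name (`compare_keys`) and made-up field names; paste in your own keys and headers.

```python
def compare_keys(schema_keys, sheet_headers):
    """Report schema keys missing from the sheet and headers missing from the schema."""
    keys = set(schema_keys)
    headers = set(sheet_headers)
    return {
        "missing_in_sheet": sorted(keys - headers),
        "missing_in_schema": sorted(headers - keys),
    }


# Example with illustrative names: "Name" and "Patient Name" do not match.
report = compare_keys(
    ["Patient ID", "Name", "DOB"],
    ["Patient ID", "Patient Name", "DOB"],
)
print(report)
```

Both lists should be empty before you activate the workflow. The comparison is exact, so differences in spelling, capitalization, or spacing count as mismatches.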
Step 6: Configure the Final Google Sheets Node
- Click on the "Append row in sheet" node.
- Credentials: Select your Google Sheets account from the "Credentials" dropdown.
- Document ID: Select your target spreadsheet from the "Document" dropdown list.
- Sheet Name: Select the specific sheet within that document.
- Columns: Ensure that the fields listed here match the columns in your sheet and the schema from the "Basic LLM Chain" node.
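Conceptually, appending a row means laying out the structured record in the same order as the sheet's header columns. The "Append row in sheet" node handles this mapping for you; the short Python sketch below only illustrates the idea, with a hypothetical helper name (`to_row`) and example field names.

```python
def to_row(record, headers):
    """Order a record's values by the sheet headers, using "" for missing values."""
    row = []
    for header in headers:
        value = record.get(header)
        # Missing or null fields become empty cells; everything else is stringified.
        row.append("" if value is None else str(value))
    return row
```

For example, a record that lacks a "DOB" value still produces a row with three cells, so later columns do not shift left.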
3. Running the Workflow
- Save and Activate: Click "Save" and then toggle the workflow to "Active".
- Open the Form: Open the Form URL you copied in Step 2 in a new browser tab.
- Upload a File: Upload a sample fax PDF and submit the form.
- Check Your Sheet: Within a minute or so, a new row with the extracted data should appear in your Google Sheet.
Connect with us
For Custom Workflow Automation
Click here: Get Started