
HTTP Request and Information Extractor integration

Save yourself the work of writing custom integrations for HTTP Request and Information Extractor and use n8n instead. Build adaptable and scalable Development, Core Nodes, AI, and Langchain workflows that work with your technology stack. All within a building experience you will love.

How to connect HTTP Request and Information Extractor

  • Step 1: Create a new workflow
  • Step 2: Add and configure nodes
  • Step 3: Connect
  • Step 4: Customize and extend your integration
  • Step 5: Test and activate your workflow

Step 1: Create a new workflow and add the first step

In n8n, click the "Add workflow" button in the Workflows tab to create a new workflow. Add the starting point – a trigger that determines when your workflow should run: an app event, a schedule, a webhook call, another workflow, an AI chat, or a manual trigger. Sometimes, the HTTP Request node might already serve as your starting point.
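If you start from a Webhook trigger, n8n gives the node a test URL you can call while you build. As a rough sketch (Node.js 18+, run as an ES module; the webhook ID is the same placeholder used in the examples further down this page), a request to that trigger could look like this:

// Minimal sketch: call an n8n Webhook trigger's test URL from your own code.
// Copy the real test URL from your Webhook node before running.
const response = await fetch("http://localhost:5678/webhook-test/yourwebhookid", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ message: "Hello from my app" }),
});
console.log(response.status, await response.text());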


Step 2: Add and configure HTTP Request and Information Extractor nodes

You can find HTTP Request and Information Extractor in the nodes panel. Drag them onto your workflow canvas and select their actions. Click each node, choose a credential, and authenticate to grant n8n access. Configure the HTTP Request and Information Extractor nodes one by one: input data on the left, parameters in the middle, and output data on the right.
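For example, the Information Extractor needs some input text to work on. A common pattern is to point its text parameter at the HTTP Request node's output using an n8n expression; the exact property (body, data, and so on) depends on the API you call, so check the node's input panel first. A minimal sketch:

{{ $('HTTP Request').item.json.body }}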


Step 3: Connect HTTP Request and Information Extractor

A connection establishes a link between HTTP Request and Information Extractor (or vice versa) to route data through the workflow. Data flows from the output of one node to the input of another. You can have single or multiple connections for each node.
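Under the hood, every connection carries a list of items, and each item wraps its fields in a json key. A simplified sketch of what the HTTP Request node might hand to the Information Extractor (the field names depend entirely on the API you call):

[
  {
    "json": {
      "statusCode": 200,
      "body": "<html>…the page content the Information Extractor will read…</html>"
    }
  }
]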


Step 4: Customize and extend your HTTP Request and Information Extractor integration

Use n8n's core nodes such as If, Split Out, Merge, and others to transform and manipulate data. Write custom JavaScript or Python in the Code node and run it as a step in your workflow. Connect HTTP Request and Information Extractor with any of n8n’s 1000+ integrations, and incorporate advanced AI logic into your workflows.
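As a small, hypothetical sketch of the Code node in JavaScript mode: it reads all incoming items, keeps the ones whose followers field (an illustrative field, not something either node produces by default) exceeds a threshold, and adds a flag for downstream nodes:

// Code node (JavaScript, "Run Once for All Items" mode).
const items = $input.all();

return items
  .filter((item) => (item.json.followers ?? 0) > 1000)
  .map((item) => ({
    json: {
      ...item.json,
      popular: true, // illustrative flag for downstream nodes
    },
  }));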


Step 5: Test and activate your HTTP Request and Information Extractor workflow

Save and run the workflow to see if everything works as expected. Based on your configuration, data should flow from HTTP Request to Information Extractor or vice versa. Debugging is straightforward: you can check past executions to isolate and fix any mistakes. Once you've tested everything, save your workflow and activate it.


Ultimate Scraper Workflow for n8n

What this template does
The Ultimate Scraper for n8n uses Selenium and AI to retrieve any information displayed on a webpage. You can also use session cookies to log in to the targeted webpage for more advanced scraping needs.

⚠️ Important: This project requires specific setup instructions. Please follow the guidelines provided in the GitHub repository: n8n Ultimate Scraper Setup: https://github.com/Touxan/n8n-ultimate-scraper/tree/main.

The workflow version on n8n and the GitHub project may differ; however, the most up-to-date version will always be the one available in the GitHub repository: https://github.com/Touxan/n8n-ultimate-scraper/tree/main.

How to use
Deploy the project with all of its requirements, then send a request to your webhook.

Example request:

curl -X POST http://localhost:5678/webhook-test/yourwebhookid \
  -H "Content-Type: application/json" \
  -d '{
    "subject": "Hugging Face",
    "Url": "github.com",
    "Target data": [
      {
        "DataName": "Followers",
        "description": "The number of followers of the GitHub page"
      },
      {
        "DataName": "Total Stars",
        "description": "The total numbers of stars on the different repos"
      }
    ],
    "cookie": []
  }'

Or, to just scrape a URL:

curl -X POST http://localhost:5678/webhook-test/67d77918-2d5b-48c1-ae73-2004b32125f0 \
  -H "Content-Type: application/json" \
  -d '{
    "Target Url": "https://github.com",
    "Target data": [
      {
        "DataName": "Followers",
        "description": "The number of followers of the GitHub page"
      },
      {
        "DataName": "Total Stars",
        "description": "The total numbers of stars on the different repo"
      }
    ],
    "cookies": []
  }'
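The /webhook-test/ path above is generally only active while you run the workflow manually in the editor. Once the workflow is saved and activated, the production URL uses the /webhook/ path instead; a minimal Node.js sketch of the same call (same placeholder webhook ID as above):

// Production call once the workflow is active – note /webhook/ instead of /webhook-test/.
const res = await fetch("http://localhost:5678/webhook/yourwebhookid", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    "Target Url": "https://github.com",
    "Target data": [
      { "DataName": "Followers", "description": "The number of followers of the GitHub page" }
    ],
    "cookies": []
  }),
});
console.log(await res.json());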


Popular HTTP Request and Information Extractor workflows


API Schema Extractor

This workflow automates the process of discovering and extracting APIs from various services, followed by generating custom schemas. It works in three distinct stages: research, extraction, and schema generation, with each stage tracking progress in a Google Sheet. 🙏 Jim Le deserves major kudos for helping to build this sophisticated three-stage workflow that cleverly automates API documentation processing using a smart combination of web scraping, vector search, and LLM technologies.

How it works

Stage 1 - Research:
  • Fetches pending services from a Google Sheet
  • Uses Google search to find API documentation
  • Employs Apify for web scraping to filter relevant pages
  • Stores webpage contents and metadata in Qdrant (vector database)
  • Updates progress status in the Google Sheet (pending, ok, or error)

Stage 2 - Extraction:
  • Processes services that completed research successfully
  • Queries the vector store to identify products and offerings
  • Runs further queries for relevant API documentation
  • Uses Gemini (LLM) to extract API operations
  • Records extracted operations in the Google Sheet
  • Updates progress status (pending, ok, or error)

Stage 3 - Generation:
  • Takes services with successful extraction
  • Retrieves all API operations from the database
  • Combines and groups operations into a custom schema
  • Uploads the final schema to Google Drive
  • Updates the final status in the sheet with the file location

Ideal for:
  • Development teams needing to catalog multiple APIs
  • API documentation initiatives
  • Creating standardized API schema collections
  • Automating API discovery and documentation

Accounts required:
  • Google account (for Sheets and Drive access)
  • Apify account (for web scraping)
  • Qdrant database
  • Gemini API access

Set up instructions:
  • Prepare your Google Sheets document with the services information. Here's an example of a Google Sheet – you can copy it and change or remove the values under the columns. Also, make sure to update the Google Sheets nodes with the correct Google Sheet ID.
  • Configure Google Sheets OAuth2 credentials, the required third-party services (Apify, Qdrant), and Gemini.
  • Ensure proper permissions for Google Drive access.

Deduplicate Scraping AI Grants for Eligibility using AI

This n8n template scrapes a list of AI grants from grants.gov and qualifies them using AI, determining interest and eligibility for the business. It then sends an email alert of interesting items to team members. The template also shows how you can use the "Remove Duplicates" node to simplify deduplication of external listings without the need to manage this yourself. Not particularly interested in AI grants? This template works for other tender websites as long as you're able to scrape them.

How it works
  • A scheduled trigger is set to fetch the list of AI grants posted on the grants.gov website in the past day.
  • A Remove Duplicates node is used to track Grant IDs and filter out those already processed by the workflow.
  • New grants are summarized and analysed by AI nodes to determine eligibility and interest, which is then saved to an Airtable database.
  • Another scheduled trigger starts a little later than the first to collect and summarize the new grants.
  • The results are then compiled into an email template using the HTML node, in the form of a newsletter designed to alert and brief team members on new AI grants.
  • This email is then sent to a list of subscribers using the Gmail node.

How to use
  • Make a copy of the sample Airtable here: https://airtable.com/appiNoPRvhJxz9crl/shrRdP6zstgsxjDKL
  • The filter for fetching grants is currently set to the "AI" category. Feel free to change this to include more categories.
  • Not interested in grants? This template works for other sources of leads – just change the endpoint and how you define the item ID to track.

Requirements
  • Airtable for the database
  • OpenAI for the LLM
  • Note: these are not hard requirements and can be exchanged for services available to you.

Customising the workflow
  • The "Eligibility" criteria at this stage may be better served by identifying hard blockers instead, i.e. certifications, geographical considerations, or certain legal checks. Be sure to mention any hard blockers in the Eligibility prompt.
  • Not particularly interested in AI grants? This template works for other tender websites as long as you're able to scrape them.

Scale Deal Flow with a Pitch Deck AI Vision, Chatbot and QDrant Vector Store

Are you a popular tech startup accelerator (named after a particular higher-order function) overwhelmed with 1000s of pitch decks on a daily basis? Wish you could filter through them quickly using AI, but the decks are unparseable through conventional means? Then you're in luck! This n8n template uses multimodal LLMs to parse and extract valuable data from even the most overly designed pitch decks in quick fashion. Not only that, it'll also create the foundations of a RAG chatbot at the end, so you or your colleagues can drill down into the details if needed. With this template, you'll scale your capacity to find interesting companies you'd otherwise miss!

Requires n8n v1.62.1+

How it works
  • Airtable is used as the pitch deck database and PDF decks are downloaded from it.
  • An AI Vision model is used to transcribe each page of the pitch deck into markdown.
  • An Information Extractor is used to generate a report from the transcribed markdown and update the required information back into the pitch deck database.
  • The transcribed markdown is also uploaded to a vector store to build an AI chatbot which can be used to ask questions about the pitch deck.

Check out the sample Airtable here: https://airtable.com/appCkqc2jc3MoVqDO/shrS21vGqlnqzzNUc

How to use
  • This template depends on the availability of the Airtable – make a duplicate of the Airtable (linked above) and its columns before running the workflow.
  • When a new pitch deck is received, enter the company name into the Name column and upload the PDF into the File column. Leave all other columns blank.
  • If you have the Airtable trigger active, the execution should start immediately once the file is uploaded. Otherwise, click the manual test trigger to start the workflow. When manually triggered, all "new" pitch decks will be handled by the workflow as separate executions.

Requirements
  • OpenAI for the LLM
  • Airtable for the database and interface
  • Qdrant for the vector store

Customising this workflow
  • Extend this starter template by adding more AI agents to validate claims made in the pitch deck, e.g. LinkedIn profiles, page visits, reviews, etc.

Transcribing Bank Statements To Markdown Using Gemini Vision AI

This n8n workflow demonstrates an approach to parsing bank statement PDFs with multimodal LLMs as an alternative to traditional OCR. This allows for much more accurate data extraction from the document, especially when it comes to tables and complex layouts.

Multimodal parsing is better than traditional OCR because:
  • It reduces complexity and overhead by avoiding the need to preprocess the document into a text format such as markdown before passing it to the LLM.
  • It handles non-standard PDF formats which may produce garbled output via traditional OCR text conversion.
  • It's orders of magnitude cheaper than premium OCR models that still require post-processing cleanup and formatting. LLMs can format to any schema or language you desire!

How it works
  • You can use the example bank statement created specifically for this workflow here: https://drive.google.com/file/d/1wS9U7MQDthj57CvEcqG_Llkr-ek6RqGA/view?usp=sharing
  • A PDF bank statement is imported via Google Drive. For this demo, I've created a mock bank statement which includes complex table layouts of 5 columns. Typically, OCR will be unable to align the columns correctly and mistake some deposits for withdrawals.
  • Because multimodal LLMs do not accept PDFs directly, we'll have to convert the PDF to a series of images. We can achieve this by using a tool such as Stirling PDF. Stirling PDF is self-hostable, which is handy for sensitive data such as bank statements.
  • Stirling PDF will return our PDF as a series of JPGs (one for each page) in a zipped file. We can use n8n's decompress node to extract the images and ensure they are ordered by using the Sort node.
  • Next, we'll resize each page using the Edit Image node to ensure the right balance between resolution limits and processing speed.
  • Each resized page image is then passed into the Basic LLM node, which will use our multimodal LLM of choice – Gemini 1.5 Pro. In the LLM node's options, we'll add a "user message" of type binary (data), which is how we add our image data as an input.
  • Our prompt will instruct the multimodal LLM to transcribe each page to markdown. Note, you do not need to do this – you can just ask for the data points to extract directly! Our goal for this template is to demonstrate the LLM's ability to accurately read the page.
  • Finally, with our markdown version of all pages, we can pass this to another LLM node to extract the required data, such as deposit line items.

Requirements
  • Google Gemini API for the multimodal LLM
  • Google Drive access for document storage
  • A Stirling PDF instance for PDF-to-image conversion

Customising the workflow
  • At time of writing, Gemini 1.5 Pro is the most accurate at text document parsing with a relatively low cost. If you are not using Google Gemini, you can switch to other multimodal LLMs such as OpenAI GPT or Anthropic Claude.
  • If you don't need the markdown, simply asking for what to extract directly in the LLM's prompt is also acceptable and saves a few extra steps.
  • Not parsing any bank statements any time soon? This template also works for invoices, inventory lists, contracts, legal documents, etc.

Build your own HTTP Request and Information Extractor integration

Create custom HTTP Request and Information Extractor workflows by choosing triggers and actions. Nodes come with global operations and settings, as well as app-specific parameters that can be configured. You can also use the HTTP Request node to query data from any app or service with a REST API.
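For reference, the kind of raw REST call the HTTP Request node makes on your behalf looks roughly like this (a Node.js 18+ sketch against GitHub's public users endpoint, mirroring the Followers example earlier on this page; the endpoint and field belong to GitHub's API, not to either node):

// What the HTTP Request node does for you: a plain GET against a REST API.
const res = await fetch("https://api.github.com/users/huggingface", {
  headers: { Accept: "application/vnd.github+json" },
});
const profile = await res.json();
console.log(profile.followers); // the follower count used in the scraper example above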

HTTP Request and Information Extractor integration details

Use case

Save engineering resources

Reduce time spent on customer integrations, engineer faster POCs, and keep customer-specific functionality separate from your product – all without having to code.

Learn more

FAQs

  • Can HTTP Request connect with Information Extractor?

  • Can I use HTTP Request’s API with n8n?

  • Can I use Information Extractor’s API with n8n?

  • Is n8n secure for integrating HTTP Request and Information Extractor?

  • How do I get started with the HTTP Request and Information Extractor integration in n8n?

Need help setting up your HTTP Request and Information Extractor integration?

Discover our community's latest recommendations and join the discussions about the HTTP Request and Information Extractor integration.

Looking to integrate HTTP Request and Information Extractor in your company?

Over 3000 companies switch to n8n every single week

Why use n8n to integrate HTTP Request with Information Extractor

Build complex workflows, really fast

Handle branching, merging and iteration easily.
Pause your workflow to wait for external events.

Code when you need it, UI when you don't

Simple debugging

Your data is displayed alongside your settings, making edge cases easy to track down.

Use templates to get started fast

Use 1000+ workflow templates available from our core team and our community.

Reuse your work

Copy and paste, easily import and export workflows.

Implement complex processes faster with n8n
