Dropbox node
HTTP Request node
Merge node
+5

Manipulate PDF with Adobe developer API

Published 1 month ago

Created by

valerian
digi-stud.io

Template description

Adobe developer API

Did you know that Adobe provides an API to perform all sort of manipulation on PDF files :

  • Split PDF, Combine PDF
  • OCR
  • Insert page, delete page, replace page, reorder page
  • Content extraction (text content, tables, pictures)
  • ...

The free tier allows up to 500 PDF operation / month. As it comes directly from Adobe, it works often better than other alternatives.

Adobe documentation:

What does this workflow do

The API is a bit painful to use. To perform a transformation on a PDF it requires to

  • Authenticate and get a temporal token
  • Register a new asset (file)
  • Upload you PDF to the registered asset
  • Perform a query according to the transformation requested
  • Wait for the query to be proccessed by Adobe backend
  • Download the result

This workflow is a generic wrapper to perform all these steps for any transformation endpoint. I usually use it from other workflow with an Execute Workflow node.

Examples are given in the workflow.

Example use case

This service is useful for example to clean PDF data for an AI / RAG system.

My favorite use-case is to extract table as images and forward images to an AI for image recognition / description which is often more accuarate than feedind raw tabular data to a LLM.

Share Template

More Building Blocks workflow templates

Webhook node
Respond to Webhook node

Creating an API endpoint

Task: Create a simple API endpoint using the Webhook and Respond to Webhook nodes Why: You can prototype or replace a backend process with a single workflow Main use cases: Replace backend logic with a workflow
jon-n8n
Jonathan
Customer Datastore (n8n training) node

Very quick quickstart

Want to learn the basics of n8n? Our comprehensive quick quickstart tutorial is here to guide you through the basics of n8n, step by step. Designed with beginners in mind, this tutorial provides a hands-on approach to learning n8n's basic functionalities.
deborah
Deborah
HTTP Request node
Item Lists node

Pulling data from services that n8n doesn’t have a pre-built integration for

You still can use the app in a workflow even if we don’t have a node for that or the existing operation for that. With the HTTP Request node, it is possible to call any API point and use the incoming data in your workflow Main use cases: Connect with apps and services that n8n doesn’t have integration with Web scraping How it works This workflow can be divided into three branches, each serving a distinct purpose: 1.Splitting into Items (HTTP Request - Get Mock Albums): The workflow initiates with a manual trigger (On clicking 'execute'). It performs an HTTP request to retrieve mock albums data from "https://jsonplaceholder.typicode.com/albums." The obtained data is split into items using the Item Lists node, facilitating easier management. 2.Data Scraping (HTTP Request - Get Wikipedia Page and HTML Extract): Another branch of the workflow involves fetching a random Wikipedia page using an HTTP request to "https://en.wikipedia.org/wiki/Special:Random." The HTML Extract node extracts the article title from the fetched Wikipedia page. 3.Handling Pagination (The final branch deals with handling pagination for a GitHub API request): It sends an HTTP request to "https://api.github.com/users/that-one-tom/starred," with parameters like the page number and items per page dynamically set by the Set node. The workflow uses conditions (If - Are we finished?) to check if there are more pages to retrieve and increments the page number accordingly (Set - Increment Page). This process repeats until all pages are fetched, allowing for comprehensive data retrieval.
jon-n8n
Jonathan

More AI workflow templates

OpenAI Chat Model node
SerpApi (Google Search) node

AI agent chat

This workflow employs OpenAI's language models and SerpAPI to create a responsive, intelligent conversational agent. It comes equipped with manual chat triggers and memory buffer capabilities to ensure seamless interactions. To use this template, you need to be on n8n version 1.50.0 or later.
n8n-team
n8n Team
HTTP Request node
Merge node
+7

Scrape and summarize webpages with AI

This workflow integrates both web scraping and NLP functionalities. It uses HTML parsing to extract links, HTTP requests to fetch essay content, and AI-based summarization using GPT-4o. It's an excellent example of an end-to-end automated task that is not only efficient but also provides real value by summarizing valuable content. Note that to use this template, you need to be on n8n version 1.50.0 or later.
n8n-team
n8n Team
HTTP Request node
Markdown node
+5

AI agent that can scrape webpages

⚙️🛠️🚀🤖🦾 This template is a PoC of a ReAct AI Agent capable of fetching random pages (not only Wikipedia or Google search results). On the top part there's a manual chat node connected to a LangChain ReAct Agent. The agent has access to a workflow tool for getting page content. The page content extraction starts with converting query parameters into a JSON object. There are 3 pre-defined parameters: url** – an address of the page to fetch method** = full / simplified maxlimit** - maximum length for the final page. For longer pages an error message is returned back to the agent Page content fetching is a multistep process: An HTTP Request mode tries to get the page content. If the page content was successfuly retrieved, a series of post-processing begin: Extract HTML BODY; content Remove all unnecessary tags to recude the page size Further eliminate external URLs and IMG scr values (based on the method query parameter) Remaining HTML is converted to Markdown, thus recuding the page lengh even more while preserving the basic page structure The remaining content is sent back to an Agent if it's not too long (maxlimit = 70000 by default, see CONFIG node). NB: You can isolate the HTTP Request part into a separate workflow. Check the Workflow Tool description, it guides the agent to provide a query string with several parameters instead of a JSON object. Please reach out to Eduard is you need further assistance with you n8n workflows and automations! Note that to use this template, you need to be on n8n version 1.19.4 or later.
eduard
Eduard

Implement complex processes faster with n8n

red icon yellow icon red icon yellow icon