pdfseparate from poppler-utils, and custom command execution. Make sure to install all required dependencies locally.
This template is designed for developers, back-office teams, and automation builders (especially in Thailand or Thai-speaking environments) who need to process multi-file, multi-page Thai PDFs and automatically export structured results to Google Sheets.
It is ideal for:
Typhoon OCR is one of the most accurate OCR tools for Thai text, but integrating it into an end-to-end workflow usually requires manual scripting and extra handling for multi-page PDFs. This template solves that by:
- Reading PDFs from the doc/multipage folder
- Using pdfinfo and pdfseparate to break PDFs into pages
Install Requirements
- typhoon-ocr: pip install typhoon-ocr
- pdfinfo, pdfseparate (from poppler-utils)
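The two Poppler tools listed above are usually used together: pdfinfo reports the page count and pdfseparate writes one file per page. The following is a minimal Python sketch of that step, not the template's own code. The function names (missing_tools, parse_page_count, split_pdf) and the output file naming are hypothetical, and it assumes pdfinfo prints a "Pages:" line and that pdfseparate accepts a %d page-number pattern, as in standard poppler-utils builds.

```python
import re
import shutil
import subprocess
from pathlib import Path


def missing_tools(tools=("pdfinfo", "pdfseparate")):
    """Return the names of required CLI tools that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def parse_page_count(pdfinfo_output: str) -> int:
    """Extract the page count from the text that pdfinfo prints."""
    match = re.search(r"^Pages:\s+(\d+)", pdfinfo_output, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no 'Pages:' line found in pdfinfo output")
    return int(match.group(1))


def split_pdf(pdf_path: Path, out_dir: Path) -> int:
    """Split pdf_path into one PDF per page inside out_dir; return the page count."""
    info = subprocess.run(
        ["pdfinfo", str(pdf_path)], capture_output=True, text=True, check=True
    )
    pages = parse_page_count(info.stdout)
    out_dir.mkdir(parents=True, exist_ok=True)
    # pdfseparate replaces %d in the pattern with the page number.
    # Hypothetical naming scheme: <original name>_page_<n>.pdf
    pattern = out_dir / f"{pdf_path.stem}_page_%d.pdf"
    subprocess.run(["pdfseparate", str(pdf_path), str(pattern)], check=True)
    return pages
```

Calling missing_tools() before the first run gives a quick check that poppler-utils is installed; an empty list means both commands were found.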
Create folders
- /doc/multipage for incoming files
- /doc/tmp for split pages
- /doc/multipage/Completed for processed files
Google Sheet
book_id | date | subject | to | attach | detail | signed_by | signed_by2 | contact_phone | contact_email | contact_fax | download_url
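As an illustration of how extracted fields could be lined up with these columns, here is a small Python sketch. The to_row helper and the shape of the input record are assumptions made for this example; the template itself may build rows differently inside n8n.

```python
# Column order copied from the sheet header above.
COLUMNS = [
    "book_id", "date", "subject", "to", "attach", "detail",
    "signed_by", "signed_by2", "contact_phone", "contact_email",
    "contact_fax", "download_url",
]


def to_row(record: dict) -> list:
    """Return the record's values in sheet column order.

    Missing or None fields become empty strings; keys that are not
    sheet columns are ignored.
    """
    row = []
    for column in COLUMNS:
        value = record.get(column)
        row.append("" if value is None else str(value))
    return row
```

Keeping the column list in one place means the row order cannot drift from the sheet header when a field is missing from the extracted data.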
API Keys
- Set TYPHOON_OCR_API_KEY and OPENAI_API_KEY (or use credentials in n8n)
Typhoon is a multilingual LLM and NLP toolkit optimized for Thai. It includes typhoon-ocr, a Python OCR package designed for Thai-centric documents. It is open-source, highly accurate, and works well in automation pipelines, which makes it a good fit for government paperwork, PDF reports, and multi-language documents in Southeast Asia.
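To show how typhoon-ocr might sit in a pipeline, the sketch below runs OCR over a list of already split page files and joins the results. It assumes the package exposes an ocr_document function that takes a file path and returns text, as in its README; check the exact signature against the installed version. The ocr_pages wrapper and its ocr_fn parameter are hypothetical and exist only so the OCR call can be swapped out.

```python
import os
from pathlib import Path
from typing import Callable, Iterable, Optional


def ocr_pages(
    page_paths: Iterable[Path],
    ocr_fn: Optional[Callable[[str], str]] = None,
) -> str:
    """Run OCR on each page file in the given order and join the text."""
    if ocr_fn is None:
        # The real OCR call needs one of the API keys described above.
        if not (os.environ.get("TYPHOON_OCR_API_KEY") or os.environ.get("OPENAI_API_KEY")):
            raise RuntimeError("set TYPHOON_OCR_API_KEY or OPENAI_API_KEY first")
        # Imported lazily so the module loads even without the package installed.
        # Assumption: typhoon_ocr provides ocr_document(path) -> str.
        from typhoon_ocr import ocr_document
        ocr_fn = ocr_document
    texts = [ocr_fn(str(path)) for path in page_paths]
    # Separate pages with a blank line so page boundaries stay visible.
    return "\n\n".join(texts)
```

Passing a stand-in function as ocr_fn makes it possible to dry-run the rest of the workflow without spending API calls.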
You can also deploy this workflow easily using the Docker image provided in my GitHub repository: https://github.com/Jaruphat/n8n-ffmpeg-typhoon-ollama
This Docker setup already bundles n8n, ffmpeg, Typhoon OCR, and Ollama, so you can run the whole environment without installing each dependency manually.