Host Your Own AI Deep Research Agent with n8n, Apify and OpenAI o3

Created by

Jimleuk

Last update 7 months ago

How it works

A form first captures the user's research query and how deep they'd like the researcher to go.
Once the form is submitted, a blank Notion page is created to hold the final report, and the researcher gets to work.
The user's query goes through a recursive series of web searches and web scraping to collect data on the research topic and generate partial learnings.
Once complete, all learnings are combined and given to a reasoning LLM to generate the final report.
The report is then written to the placeholder Notion page created earlier.
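
The recursive step above can be sketched in a few lines of Python. This is only an illustration of the idea, not the workflow's actual nodes: the function name `deep_research` and the three injected callables (`generate_queries`, `search_and_scrape`, `extract_learnings`) are hypothetical stand-ins for the LLM and Apify calls, and the sketch assumes breadth stays the same at every level.

```python
from typing import Callable, List


def deep_research(
    query: str,
    depth: int,
    breadth: int,
    generate_queries: Callable[[str, int], List[str]],
    search_and_scrape: Callable[[str], List[str]],
    extract_learnings: Callable[[str, List[str]], List[str]],
) -> List[str]:
    """Collect partial learnings for `query`, recursing `depth` levels deep.

    All three callables are hypothetical placeholders for the workflow's
    LLM and web search/scraping steps.
    """
    if depth <= 0:
        return []

    learnings: List[str] = []
    # Each level fans out into `breadth` search queries.
    for sub_query in generate_queries(query, breadth):
        pages = search_and_scrape(sub_query)
        learnings.extend(extract_learnings(sub_query, pages))
        # Dig one level deeper using the sub-query as the new starting point.
        learnings.extend(
            deep_research(
                sub_query,
                depth - 1,
                breadth,
                generate_queries,
                search_and_scrape,
                extract_learnings,
            )
        )
    return learnings
```

Once the recursion returns, the combined list of learnings is what would be handed to the reasoning model to write the final report.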

How to use

Duplicate this Notion database template and make sure all Notion-related nodes point to it.
Sign up for an Apify.com API key for web search and scraping services.
Ensure you have access to OpenAI's o3-mini model. Alternatively, switch this out for an o1-series model.
You must publish this workflow and ensure the form URL is publicly accessible.

On depth & breadth configuration

For more detailed reports, increase depth and breadth, but be warned that the workflow will take exponentially longer and cost more to complete. The recommended defaults are usually good enough.

Depth=1 & Breadth=2 - will take about 5-10 mins.
Depth=1 & Breadth=3 - will take about 15-20 mins.
Depth=3 & Breadth=5 - will take about 2+ hours!
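
To get a feel for why the cost grows so quickly, here is a rough count of search queries under a simplifying assumption: every query at one level spawns `breadth` follow-up queries at the next, and breadth stays constant. The function name `total_search_queries` is made up for this sketch, and the actual workflow may narrow breadth at deeper levels, so treat the numbers as an estimate rather than a measurement.

```python
def total_search_queries(depth: int, breadth: int) -> int:
    """Estimate the number of web searches, assuming constant breadth per level."""
    # Level 1 runs `breadth` queries, level 2 runs breadth**2, and so on.
    return sum(breadth ** level for level in range(1, depth + 1))


print(total_search_queries(1, 2))  # 2
print(total_search_queries(1, 3))  # 3
print(total_search_queries(3, 5))  # 155 (5 + 25 + 125)
```

Each of those searches also triggers scraping and at least one LLM call, which is where the extra time and cost come from.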

Customising this workflow

I deliberately chose not to use AI-powered scrapers like Firecrawl as I felt these were quite costly and quotas would be quickly exhausted. However, feel free to switch to web search and scraping services that suit your environment.
Maybe you don't want to source the web at all and would rather collect data from internal documents. This template gives you the freedom to change this.
Experiment with different reasoning/thinking models such as DeepSeek and Google's Gemini 2.0.
Finally, the LLM prompts could definitely be improved. Refine them to fit your use-case.
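
As an illustration of swapping the data source, here is a minimal sketch of a search step that reads from internal documents instead of the web. Everything in it is an assumption for the example: the function name `search_internal_docs`, the in-memory `docs` dictionary, and the naive word-overlap scoring, which a real setup would likely replace with a vector store or a proper search index.

```python
from typing import Dict, List


def search_internal_docs(query: str, docs: Dict[str, str], limit: int = 3) -> List[str]:
    """Return the text of up to `limit` documents sharing the most words with `query`.

    `docs` maps a document name to its text (a hypothetical in-memory store).
    """
    terms = set(query.lower().split())
    scored = []
    for name, text in docs.items():
        overlap = len(terms & set(text.lower().split()))
        if overlap > 0:
            scored.append((overlap, name))
    # Highest overlap first; ties are broken alphabetically by document name.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [docs[name] for _, name in scored[:limit]]
```

A function like this would take the place of the web search and scraping step, returning document text for the learnings step to process in the same way as scraped pages.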

Credits

This template is largely based on the work of David Zhang (dzhng) and his open source implementation of Deep Research: https://github.com/dzhng/deep-research