
Split Test AI Prompts Using Supabase & Langchain Agent
This workflow allows you to A/B test different prompts for an AI chatbot powered by Langchain and OpenAI. It uses Supabase to persist session state and randomly assigns users to either a baseline or alternative prompt, ensuring consistent prompt usage across the conversation.
🧠 Use Case
Prompt optimization is crucial for maximizing the performance of AI assistants. This workflow helps you run controlled experiments on different prompt versions, giving you a reliable way to compare performance over time.
⚙️ How It Works
- When a message is received, the system checks whether the session already exists in the Supabase table.
- If not, it randomly assigns the session to either the baseline or alternative prompt and stores that choice (see the sketch after this list).
- The selected prompt is passed into a Langchain Agent using the OpenAI Chat Model.
- Postgres is used as chat memory for multi-turn conversation support.
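For reference, the lookup-or-assign step can be expressed in a few lines of TypeScript with the supabase-js client. This is a minimal sketch of what the workflow's Supabase nodes do, not the nodes themselves; the table and column names match the setup below, while `getPromptVariant` and the environment variable names are illustrative:

```ts
import { createClient } from "@supabase/supabase-js";

// Assumed environment variables; use your own project URL and API key.
const supabase = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_ANON_KEY!
);

// Look up the session; if it is new, flip a coin and persist the result so
// every later message in the same session sees the same prompt.
async function getPromptVariant(
  sessionId: string
): Promise<"baseline" | "alternative"> {
  const { data, error } = await supabase
    .from("split_test_sessions")
    .select("show_alternative")
    .eq("session_id", sessionId)
    .maybeSingle();
  if (error) throw error;

  // Existing session: reuse the stored assignment.
  if (data !== null) {
    return data.show_alternative ? "alternative" : "baseline";
  }

  // New session: 50/50 random assignment, persisted for consistency.
  const showAlternative = Math.random() < 0.5;
  const { error: insertError } = await supabase
    .from("split_test_sessions")
    .insert({ session_id: sessionId, show_alternative: showAlternative });
  if (insertError) throw insertError;

  return showAlternative ? "alternative" : "baseline";
}
```

The n8n nodes implement the same branch visually without custom code; the sketch is only meant to make the control flow explicit.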
🧪 Features
- Randomized A/B split test per session
- Supabase database for session persistence
- Langchain Agent + OpenAI GPT-4o integration
- PostgreSQL memory for maintaining chat context
- Fully documented with sticky notes
🛠️ Setup Instructions
- Create a Supabase table named `split_test_sessions` with the following columns:
  - `session_id` (text)
  - `show_alternative` (boolean)
- Add credentials for:
- Supabase
- OpenAI
- PostgreSQL (for chat memory)
- Modify the "Define Path Values" node to set your baseline and alternative prompts (a sketch of the values follows this list).
- Activate the workflow.
- Send messages to test both prompt paths in action.
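To illustrate the "Define Path Values" step: the node only needs to expose the two system prompts so the agent receives the one matching the session's assignment. Here is a minimal sketch of that shape in TypeScript; the prompt texts and property names are placeholders, not the template's actual values:

```ts
// Placeholder prompt texts; replace with your own baseline and alternative.
const prompts = {
  baseline:
    "You are a helpful support assistant. Answer clearly and concisely.",
  alternative:
    "You are a friendly support assistant. Answer step by step and end " +
    "with a short follow-up question.",
};

// Resolve the system prompt from the variant stored in Supabase.
function systemPromptFor(variant: "baseline" | "alternative"): string {
  return prompts[variant];
}
```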
🔄 Next Steps
- Add tracking for conversions or feedback scores to compare outcomes.
- Adjust the prompt content or model settings (e.g., temperature or model version).
- Expand to multi-variant tests beyond A/B (one weighted-assignment approach is sketched below).
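A boolean `show_alternative` column only supports two arms. One way to go multi-variant, sketched below under the assumption that the column is replaced with a text `variant` field, is weighted random assignment; the variant names and weights here are placeholders:

```ts
// Hypothetical N-variant setup; assumes the Supabase column becomes
// `variant` (text) instead of `show_alternative` (boolean).
const VARIANTS: { name: string; weight: number }[] = [
  { name: "baseline", weight: 0.5 },
  { name: "friendly", weight: 0.25 },
  { name: "concise", weight: 0.25 },
];

// Weighted random pick; weights are assumed to sum to 1.
function pickVariant(): string {
  const roll = Math.random();
  let cumulative = 0;
  for (const v of VARIANTS) {
    cumulative += v.weight;
    if (roll < cumulative) return v.name;
  }
  return VARIANTS[VARIANTS.length - 1].name; // guard against rounding error
}
```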