Benchmark Content Safety Guardrails with Automated Test Suite & Reports

Created by

Patrick Siewert

Last update

Last update a day ago

🛡️ Evaluate Guardrails Node Accuracy with Automated Test Suite

This workflow benchmarks the n8n Guardrails node across multiple safety categories -including PII, NSFW, jailbreak attempts, secret keys, and unsafe URLs.
It runs 36 structured test cases, classifies each as PASS or VIOLATION, calculates accuracy metrics, and emails a detailed HTML report.

🔄 How it works

The workflow loops through 36 predefined test prompts.
Each prompt is checked by the Guardrails node for violations.
Results are recorded as PASS or VIOLATION.
The system calculates metrics (accuracy, precision, recall, F1).
A formatted Markdown → HTML report is generated and sent via Gmail.

⚙️ Set up steps

Add your OpenAI and Gmail credentials in n8n.
Replace YOUR_MAIL_HERE in the Gmail node with your own address.
(Optional) Change the model in the OpenAI Chat Model node.
- Default: gpt-4o-mini
- You can switch to gpt-5 or another available model if needed.
Click Execute Workflow: test cases will run automatically.
Check your inbox for the results.

🧠 Who it’s for

AI safety testers and workflow developers
n8n users experimenting with the Guardrails node
Teams validating LLM moderation, filtering, or compliance setups

🧩 Requirements

n8n v1.119+
Guardrails node enabled
OpenAI credentials (optional but recommended)
Email integration (Gmail or SendGrid)

💡 Why it’s useful

Use this test suite to understand how accurately the Guardrails node identifies unsafe content across different categories. The generated metrics help you fine-tune thresholds, compare models, and strengthen AI moderation workflows.