Stop treating document review as a manual task. Let AI extract, classify, and route every contract, invoice, and NDA automatically.
Legal and financial document review is slow, inconsistent, and expensive when done by hand. This workflow accepts any document via webhook, runs it through Claude Sonnet 4.6 for structured extraction and risk classification, logs the result to a compliance audit sheet, and fires alerts before a human ever opens the file.
The workflow operates in six stages:
1. Ingestion & Early Validation
The webhook receives the document payload and immediately validates and deduplicates before responding. Invalid payloads return a 400. Duplicate submissions within a 30-day window (matched by SHA-256 fingerprint) return a 200 without reprocessing. Only clean, unique requests receive a 202 Accepted and continue to processing.
2. Document Acquisition
Routes to one of two paths depending on the payload. URL submissions pass through an SSRF guard (blocks private ranges, hex/octal/decimal IP obfuscation, and non-HTTPS schemes), download the file, check it stays under 10 MB, and extract text via PDF parser. Inline text submissions pass directly to the prompt builder.
3. LLM Extraction & Classification
The prompt is assembled with explicit prompt-injection defences using document delimiters, then sent to Claude Sonnet 4.6 via the Basic LLM Chain. The response is parsed and normalised into a structured output item covering document type, parties, dates, obligations, governing law, total value, and a plain-English summary.
4. Risk Routing & Alerting
Classifies the document as LOW, MEDIUM, or HIGH risk based on explicit criteria embedded in the prompt (unlimited liability, GDPR exposure, lock-in periods, penalty clauses, and more). HIGH risk triggers an email and Slack alert chain in parallel with logging. A separate branch monitors token usage and fires a Slack nudge if input tokens exceed 40,000.
5. Compliance Logging
Every processed document is appended to a Google Sheets audit log with job ID, risk level, parties, risk factors, governing law, token counts, and a truncated summary. Both the HIGH risk alert path and the LOW/MEDIUM path write to the same sheet.
6. Delivery & Callback
If the original payload included a callback URL, the full extraction result is POSTed back to the upstream system after passing a second SSRF guard. The callback includes the job ID for end-to-end traceability.
GSHEETS_SPREADSHEET_ID, GSHEETS_SHEET_NAME, ALERT_FROM_EMAIL, and ALERT_TO_EMAILALLOWED_DOWNLOAD_DOMAINS as a comma-separated list of trusted document domainsprocess-documentPrepare Claude Prompt node to match your organisation's legal thresholdsSnapshot Parsed Result to add Teams messages, PagerDuty alerts, or HubSpot ticket creation for HIGH risk documentsPrepare Claude Prompt to extract additional data points specific to your document types