This workflow enables GDPR-compliant document processing by detecting, masking, and securely handling personally identifiable information (PII) before AI analysis.
It replaces sensitive data with tokens before any text reaches an AI system, so the AI only ever sees masked content, while original values can still be re-injected in a controlled way when permitted. The workflow also keeps audit logs of each run for compliance and traceability.
Document Upload & Configuration
Receives documents via webhook and initializes configuration such as document ID, thresholds, and database tables.
Text Extraction
Extracts raw text from uploaded documents for processing.
Multi-Detector PII Detection
Detects email addresses, phone numbers, ID numbers, and postal addresses using regex-based and AI-based detectors.
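
As a rough illustration of the regex side of detection, the sketch below scans text for email addresses and phone numbers and records where each match sits. The patterns, the Detection fields, and the fixed confidence value are assumptions made for this example, not the workflow's actual detector configuration, and real phone and ID formats vary by country.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    pii_type: str      # e.g. "email" or "phone"
    start: int         # offset of the first matched character
    end: int           # offset just past the last matched character
    value: str         # the matched text
    confidence: float  # illustrative fixed score for regex hits
    detector: str      # which detector produced the hit


# Illustrative patterns only (assumed for this sketch).
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def detect_with_regex(text):
    """Return all regex detections, ordered by position in the text."""
    detections = []
    for pii_type, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            detections.append(
                Detection(pii_type, match.start(), match.end(),
                          match.group(), 0.9, "regex")
            )
    return sorted(detections, key=lambda d: d.start)
```

Keeping character offsets alongside each value lets later steps merge hits from several detectors and replace the exact spans.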
PII Aggregation & Conflict Resolution
Merges detections, resolves overlaps, removes duplicates, and builds a unified PII map.
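
One possible way to resolve conflicts is a greedy pass that keeps the strongest detection for each region of text. The sketch below assumes each detection carries a type, character offsets, and a confidence score, and it prefers higher confidence and then longer spans; the workflow's actual resolution rules may differ.

```python
from typing import NamedTuple


class Span(NamedTuple):
    pii_type: str
    start: int
    end: int
    confidence: float


def resolve_overlaps(spans):
    """Drop exact duplicates and keep one span per overlapping region."""
    # Rank candidates: highest confidence first, then longest, then earliest.
    ranked = sorted(
        set(spans),
        key=lambda s: (-s.confidence, -(s.end - s.start), s.start, s.pii_type),
    )
    kept = []
    for candidate in ranked:
        # Keep the candidate only if it does not overlap anything already kept.
        if all(candidate.end <= k.start or candidate.start >= k.end for k in kept):
            kept.append(candidate)
    return sorted(kept, key=lambda s: s.start)
```

For example, if a regex detector and an AI detector both flag the same email address, and the AI detector also flags part of it as a name with lower confidence, only the email span survives.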
Tokenization & Vault Storage
Replaces sensitive data with secure tokens and stores original values in a database vault.
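
A minimal sketch of tokenization with a vault is shown below. It uses an in-memory SQLite database as a stand-in for the real vault; the table name pii_vault, its columns, and the [TYPE_hex] token format are hypothetical choices for this example. A production vault would typically also encrypt the stored values.

```python
import secrets
import sqlite3


def create_vault():
    """Create an in-memory stand-in for the vault table (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pii_vault ("
        "token TEXT PRIMARY KEY, document_id TEXT NOT NULL, "
        "pii_type TEXT NOT NULL, original_value TEXT NOT NULL)"
    )
    return conn


def tokenize(text, spans, document_id, vault):
    """Replace each (pii_type, start, end) span with a random token.

    Spans must not overlap. Originals are stored in the vault, keyed by token.
    """
    masked = text
    # Work from the end of the text so earlier offsets remain valid.
    for pii_type, start, end in sorted(spans, key=lambda s: s[1], reverse=True):
        token = f"[{pii_type.upper()}_{secrets.token_hex(4)}]"
        vault.execute(
            "INSERT INTO pii_vault VALUES (?, ?, ?, ?)",
            (token, document_id, pii_type, text[start:end]),
        )
        masked = masked[:start] + token + masked[end:]
    vault.commit()
    return masked
```

Because the tokens are random rather than derived from the value, they reveal nothing about the original and can only be reversed through a vault lookup.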
Masking & Validation
Generates masked text and verifies that all PII has been successfully removed before AI processing.
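
The validation step could look roughly like the following: it checks that none of the known original values remain in the masked text and re-runs a detector pattern as a second line of defense. The MaskingError name and the single email pattern are assumptions for this sketch; a real check would re-run every detector.

```python
import re

# Illustrative pattern (assumed); a full check would cover all PII types.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class MaskingError(Exception):
    """Raised when PII is still present after masking."""


def validate_masking(masked_text, original_values):
    """Return True if the text is clean, otherwise raise MaskingError."""
    # Values that were detected earlier but still appear verbatim.
    leaked = {value for value in original_values if value in masked_text}
    # Anything a detector can still find, even if it was missed earlier.
    leaked |= set(EMAIL_RE.findall(masked_text))
    if leaked:
        # Report only a count so the error message itself exposes no PII.
        raise MaskingError(f"{len(leaked)} PII value(s) survived masking")
    return True
```

Raising an exception rather than returning False makes it harder for a later step to send unvalidated text to the AI by accident.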
AI Processing (Masked Data)
Processes the document using AI while preserving tokens to prevent exposure of sensitive information.
Re-Injection Controller
Determines, based on permissions, which fields may have their original PII values restored.
Secure Retrieval & Restoration
Retrieves original values from the vault and restores them only where permitted.
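
Permission-gated restoration might be sketched as below. The vault is modeled as a plain dictionary from token to (type, original value), the token format [TYPE_8 hex digits] is hypothetical, and permissions are simplified to a set of allowed PII types; the actual workflow reads from its database vault and may use finer-grained rules.

```python
import re

# Assumed token shape for this sketch, e.g. "[EMAIL_1a2b3c4d]".
TOKEN_RE = re.compile(r"\[[A-Z]+_[0-9a-f]{8}\]")


def restore(text, vault, allowed_types):
    """Swap tokens back to original values only for permitted PII types."""

    def swap(match):
        token = match.group(0)
        entry = vault.get(token)
        if entry is None:
            return token  # unknown token: leave it untouched
        pii_type, original = entry
        return original if pii_type in allowed_types else token

    return TOKEN_RE.sub(swap, text)
```

With a vault holding one email token and one phone token, calling restore with allowed_types={"email"} puts the email address back and leaves the phone token in place.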
Audit Logging
Stores metadata, detected PII types, and re-injection events for compliance tracking.
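
An audit record along these lines could capture what happened without storing the sensitive values themselves. The field names below are assumptions for illustration; the workflow's real audit table may use a different schema.

```python
import json
from collections import Counter
from datetime import datetime, timezone


def build_audit_record(document_id, detections, reinjected_tokens):
    """Build a JSON-serializable audit entry.

    detections: iterable of (pii_type, token) pairs.
    reinjected_tokens: tokens whose original values were restored.
    Only types and tokens are logged, never the original PII values.
    """
    return {
        "document_id": document_id,
        "logged_at": datetime.now(timezone.utc).isoformat(),
        "pii_counts": dict(Counter(pii_type for pii_type, _ in detections)),
        "reinjection_events": [
            {"token": token, "action": "restored"} for token in reinjected_tokens
        ],
    }


def to_log_line(record):
    """Serialize the record with stable key order for storage or comparison."""
    return json.dumps(record, sort_keys=True)
```

Logging counts per PII type plus the tokens involved is usually enough to answer what was detected and what was restored, while keeping the log itself free of personal data.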
Error Handling & Alerts
Blocks processing and triggers alerts if masking fails or compliance rules are violated.