Why GPT wrappers fail compliance review
Every pilot in regulated finance hits the same wall. The problem isn't the model — it's the architecture.
Every regulated-industry AI pilot we've seen starts the same way: wrap a foundation model in a retrieval layer, tune a system prompt, ship a demo that wows the executive team. Then legal and compliance get their turn, and the project quietly stalls.
The failure mode isn't the model's capability. It's the architecture.
The three questions that kill the pilot
Can you prove the model didn't invent a citation? A retrieval-augmented pipeline still lets the model paraphrase, hedge, or fabricate near-regulations that look plausible to a non-lawyer but fail a word-for-word match. "Most of the time it's fine" is not an answer that survives a regulatory audit.
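A word-for-word check of this kind is simple to sketch. The version below assumes the draft exposes each quoted passage keyed by a regulation ID and that the authoritative text is available locally. The function names and the ID scheme are illustrative, not part of any particular product.

```python
import re


def normalize(text: str) -> str:
    """Collapse whitespace and case so only wording differences remain."""
    return re.sub(r"\s+", " ", text).strip().lower()


def unverified_quotes(draft_quotes: dict[str, str],
                      source_texts: dict[str, str]) -> list[str]:
    """Return the regulation IDs whose quoted text is not found verbatim in the source.

    Both dictionaries are hypothetical inputs: regulation ID -> text.
    """
    failures = []
    for reg_id, quote in draft_quotes.items():
        source = source_texts.get(reg_id)
        # An unknown ID or a paraphrased quote both count as failures.
        if source is None or normalize(quote) not in normalize(source):
            failures.append(reg_id)
    return failures
```

A single changed word is enough to fail the check, which is the point: the check is binary, so it can be cited as evidence in a way that "usually accurate" cannot.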
Did any customer PII reach the provider? Even if the model is hosted in your tenancy, a retrieval step that pulls in customer records means names, balances, and account identifiers were in the model's context. That's a data-residency and privacy concern regardless of provider.
Who approved what, and can you prove it hasn't been altered? A table of approvals isn't enough. Regulators will ask for tamper-evident evidence: hash chains, WORM (write once, read many) archives, and approvals bound to a named identity rather than a shared account. Most pilots have none of this.
The architectural pivot
The fix isn't a better prompt. It's separating concerns:
- A deterministic logic engine decides which clauses go into a document. Rules are authored by compliance, versioned, and date-effective.
- An AI model only phrases the content — it never selects, omits, or reasons about what to include.
- A PII firewall keeps customer data out of the model entirely. Tokens like `{{CLIENT_NAME}}` are injected locally after generation.
- An auditor (a second model with veto power) reviews every output against the source plan and blocks discrepancies.
- A hash-chained audit trail records every step with cryptographic continuity.
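To make the first of these components concrete, here is a minimal sketch of date-effective clause selection. The `Rule` fields, the clause IDs, and the `select_clauses` function are assumptions made for illustration. A real rules engine would carry far more metadata, but the core property is the same: the same inputs on the same date always yield the same clauses.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Rule:
    # All field names here are hypothetical.
    clause_id: str
    jurisdiction: str
    intent: str
    effective_from: date
    effective_to: Optional[date] = None  # None means still in force
    version: int = 1


def select_clauses(rules: list[Rule], jurisdiction: str,
                   intent: str, on: date) -> list[str]:
    """Return the clause IDs in force on `on`, in a deterministic order."""
    active = [
        r for r in rules
        if r.jurisdiction == jurisdiction
        and r.intent == intent
        and r.effective_from <= on
        and (r.effective_to is None or on <= r.effective_to)
    ]
    # Sorting removes any dependence on the order rules were loaded in.
    return sorted(r.clause_id for r in active)
```

No model is involved at this stage. The language model only ever receives the list this function returns.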
None of this is research. It's applied software engineering. The teams shipping AI into regulated workflows today are the ones who stopped trying to make the model "safer" and started making the architecture load-bearing.
What this looks like in practice
At render time, the model has seen: the customer's jurisdiction, the document intent, the selected regulation IDs, and relational flags like "VIP customer" or "joint account holder." It has not seen: a name, a number, an address, a balance, or any identifier.
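One way to enforce that boundary is an allowlist at the point where the model's context is assembled. The sketch below is an assumption about how such a gate might look. The field names mirror the list above but are otherwise hypothetical.

```python
# Hypothetical field names for the only data the model may see.
ALLOWED_FIELDS = {"jurisdiction", "intent", "regulation_ids", "flags"}


def build_model_context(request: dict) -> dict:
    """Copy only allowlisted fields and raise on anything unexpected."""
    unexpected = set(request) - ALLOWED_FIELDS
    if unexpected:
        # Fail closed: an unknown field is treated as potential PII.
        raise ValueError(
            f"fields not allowed in model context: {sorted(unexpected)}"
        )
    return {key: request[key] for key in sorted(ALLOWED_FIELDS) if key in request}
```

An allowlist on keys does not by itself prove that the values are free of PII, so in practice it would sit alongside value-level checks. What it does provide is a single, reviewable place where the boundary is defined.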
The output is a markdown draft with placeholder tokens. A local injector replaces those tokens from an encrypted customer record. A WORM archive captures the pre-delivery state. An approver signs via an identity-bound token — their email is in the audit chain, not just "admin."
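The injection step can be very small. The following sketch assumes the customer record has already been decrypted into an in-memory dictionary and that placeholders follow the `{{UPPER_SNAKE}}` form. The `inject` function is illustrative, not a reference implementation.

```python
import re

# Matches placeholders such as {{CLIENT_NAME}}.
TOKEN = re.compile(r"\{\{([A-Z_]+)\}\}")


def inject(draft: str, record: dict[str, str]) -> str:
    """Replace {{TOKEN}} placeholders from the record; fail on unknown tokens."""
    def replace(match):
        key = match.group(1)
        if key not in record:
            # Refuse to deliver a document with an unfilled placeholder.
            raise KeyError(f"no value for placeholder {key}")
        return record[key]

    return TOKEN.sub(replace, draft)
```

Because this runs locally and after generation, the customer's details never enter the model's context.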
When a regulator asks "prove this customer received compliant disclosure on this date," you replay the hash chain, verify the seal, and show the exact rules engine snapshot that ran. You don't explain the model. You don't need to.
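Replaying a hash chain is not exotic either. Here is a minimal sketch, assuming each audit entry stores the previous entry's hash alongside a JSON-serializable payload. The entry layout and function names are hypothetical, and a production system would add signatures and external anchoring on top.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous hash together with a canonical JSON form of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain: list[dict], payload: dict) -> None:
    """Add an entry linked to the current tail of the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"prev": prev_hash, "payload": payload,
                  "hash": entry_hash(prev_hash, payload)})


def verify(chain: list[dict]) -> bool:
    """Recompute every link; any edited, reordered, or removed entry breaks it."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Changing a recorded approver from an email address to "admin" after the fact makes `verify` return `False`, because that entry's stored hash no longer matches its payload.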