AI in Production -- Part 5
Governance and Compliance: What Legal Will Ask Before Your AI Goes Live
The Problem
Someone from Legal walks over to your desk. They’ve heard about the AI feature. They have a list of questions.
What data are you sending to the model? Is any of it personal data? Where does the provider store it? For how long? Do users know their input is being processed by a third-party AI? Can they opt out? What happens if a user asks to delete their data — does that include what you sent to the AI?
If you can’t answer these questions, the feature doesn’t go live. Not because Legal is being difficult. Because in the EU, these questions have legal answers, and getting them wrong has real consequences.
This isn’t a corner case. Every AI feature that processes user input in the EU touches GDPR almost by definition. When a user types a message, pastes a document, or submits a form that gets sent to an AI model, you are processing personal data on their behalf. The AI provider is a data processor under your control. You are the controller. The obligations are yours.
Beyond GDPR, the EU AI Act adds another layer. Most enterprise AI features fall into the “limited risk” category (or build on general-purpose AI models, which carry their own provider-side obligations). They don’t require a conformity assessment, but they do carry a transparency obligation: users must know they’re interacting with AI.
The good news is that most of this is solvable by design. Build it right from the start and the compliance review is fast. Skip it and you’re doing emergency surgery on production code.

The Five Questions You Need to Answer
Before writing any compliance code, answer these clearly. The code follows from the answers.
1. What personal data goes to the AI? Map every field. User names? Email addresses? Free-text input that could contain anything? Documents that users upload? Log this explicitly — “user input (free text, may contain PII)” is an honest answer. “Nothing sensitive” is usually wrong.
2. What does the provider do with it? Read the provider’s data processing terms. Most major providers offer a Data Processing Agreement (DPA) and options for data residency. Know whether your data is used for model training (opt out if available), where it’s stored, and how long it’s retained. This goes in your Records of Processing Activities (RoPA).
3. Do users know? Your privacy policy must mention AI processing. If users interact directly with an AI feature, they need to know — the EU AI Act requires it. A small “Powered by AI” label and a privacy policy link are usually enough for limited-risk systems.
4. Can users opt out or request deletion? If a user exercises their right to erasure under GDPR, what happens to data you already sent to the AI? You can’t unsend it, but you can document what was sent, when, and under what legal basis. You can also stop sending that user’s data going forward.
5. What’s your legal basis for processing? Usually legitimate interest or contractual necessity for B2B, consent for consumer products. This determines your obligations. Pick one and document it — “we’re not sure” is not a legal basis.
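One way to keep these answers from living only in someone’s head is to pin the data map in code, next to the feature it describes. A hypothetical sketch — the record shape and field values are illustrative, not a standard:

```csharp
using System;
using System.Collections.Generic;

// One entry per field that leaves your system toward an AI provider.
public record AiDataMapEntry(
    string Field,        // what is sent, honestly described
    bool MayContainPii,
    string Provider,     // who receives it
    string Residency,    // e.g. "EU"
    string Retention,    // what the provider's DPA says
    string LegalBasis);  // e.g. "legitimate_interest"

public static class AiDataMap
{
    public static readonly IReadOnlyList<AiDataMapEntry> Entries = new[]
    {
        new AiDataMapEntry(
            Field: "user input (free text, may contain PII)",
            MayContainPii: true,
            Provider: "ExampleAI",
            Residency: "EU",
            Retention: "30 days, training opt-out",
            LegalBasis: "legitimate_interest")
    };
}
```

A list like this doubles as the raw material for your RoPA entry, and a reviewer can diff it when the feature changes.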
Execute
Scrub PII before it leaves your system
The simplest protection: don’t send what you don’t need. Before passing user input to the AI, strip or mask fields that are identifiable but irrelevant to the AI task.
public static class PiiScrubber
{
    // These patterns cover common EU PII formats.
    // Extend based on what your actual data looks like.
    private static readonly Regex EmailPattern =
        new(@"\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b",
            RegexOptions.IgnoreCase | RegexOptions.Compiled);

    private static readonly Regex IbanPattern =
        new(@"\b[A-Z]{2}\d{2}[A-Z0-9]{4,30}\b",
            RegexOptions.Compiled);

    private static readonly Regex PhonePattern =
        new(@"\b(\+\d{1,3}[\s.-])?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b",
            RegexOptions.Compiled);

    public static string Scrub(string text)
    {
        text = EmailPattern.Replace(text, "[EMAIL]");
        text = IbanPattern.Replace(text, "[IBAN]");
        text = PhonePattern.Replace(text, "[PHONE]");
        return text;
    }
}
Use it in your AI service before every call:
public async Task<string?> SummarizeAsync(
    string text,
    CancellationToken cancellationToken = default)
{
    // Scrub before it leaves your boundary
    var sanitized = PiiScrubber.Scrub(text);

    // ... proceed with sanitized input
}
This won’t catch everything — free text is unpredictable. But it removes the most common structured PII patterns and shows intent: you designed the system to minimize data exposure.
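A quick sanity check makes the behavior concrete (this assumes the PiiScrubber class above is in scope; the sample strings are made up):

```csharp
var input = "Contact john.doe@example.com or call 555-123-4567.";
Console.WriteLine(PiiScrubber.Scrub(input));
// Prints: Contact [EMAIL] or call [PHONE].
```

Running a handful of inputs like this through the scrubber in a unit test is cheap, and it documents exactly what you claim to strip.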
![A conveyor belt where documents enter a machine labeled "PII SCRUBBER". Email addresses and phone numbers are being replaced with [EMAIL] and [PHONE] tags. A small robot operates the machine with a satisfied expression.](/images/pii.jpg)
Audit every AI call
You need to be able to answer “what did we send to the AI, and when?” for any user. This means a durable audit log — not just application logs that rotate and disappear.
public record AiAuditEntry
{
    public Guid Id { get; init; } = Guid.NewGuid();
    public string UserId { get; init; } = default!;
    public string Feature { get; init; } = default!;    // e.g. "document-summary"
    public string InputHash { get; init; } = default!;  // SHA-256 of input, not the input itself
    public int EstimatedTokens { get; init; }
    public string Model { get; init; } = default!;
    public string LegalBasis { get; init; } = default!; // e.g. "legitimate_interest"
    public DateTimeOffset Timestamp { get; init; } = DateTimeOffset.UtcNow;
}
Store the hash of the input, not the input itself. This lets you prove a specific input was sent (by hashing the original again) without storing a second copy of personal data. Less data stored, less exposure.
public class AiAuditService
{
    private readonly IRepository<AiAuditEntry> _repository;

    public AiAuditService(IRepository<AiAuditEntry> repository)
        => _repository = repository;

    public async Task LogAsync(
        string userId,
        string feature,
        string originalInput,
        int estimatedTokens,
        string model,
        string legalBasis,
        CancellationToken ct = default)
    {
        var hash = Convert.ToHexString(
            SHA256.HashData(Encoding.UTF8.GetBytes(originalInput)));

        await _repository.InsertAsync(new AiAuditEntry
        {
            UserId = userId,
            Feature = feature,
            InputHash = hash,
            EstimatedTokens = estimatedTokens,
            Model = model,
            LegalBasis = legalBasis
        }, ct);
    }
}
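The hash also answers the inevitable follow-up from Legal: “prove that this specific text was sent.” Re-hash the original and compare. A minimal sketch, assuming the AiAuditEntry record above:

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class AuditVerifier
{
    // Recompute the SHA-256 hex of the claimed input and compare it
    // against the stored audit entry. Convert.ToHexString is uppercase
    // on both sides, so a direct comparison is safe.
    public static bool WasSent(string originalInput, AiAuditEntry entry)
    {
        var hash = Convert.ToHexString(
            SHA256.HashData(Encoding.UTF8.GetBytes(originalInput)));
        return hash == entry.InputHash;
    }
}
```

This is the payoff of storing hashes: you can confirm or deny a specific input without ever holding a second copy of it.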
Consent middleware
For consumer products where consent is your legal basis, check it before every AI call. Don’t assume consent persists — users can withdraw it.
public class AiConsentMiddleware
{
    private readonly RequestDelegate _next;
    private readonly IConsentStore _consent;

    public AiConsentMiddleware(RequestDelegate next, IConsentStore consent)
    {
        _next = next;
        _consent = consent;
    }

    public async Task InvokeAsync(HttpContext context)
    {
        // Only applies to AI endpoints
        if (!context.Request.Path.StartsWithSegments("/api/ai"))
        {
            await _next(context);
            return;
        }

        var userId = context.User.FindFirst("sub")?.Value;
        if (userId is null)
        {
            context.Response.StatusCode = 401;
            return;
        }

        var hasConsent = await _consent.HasAiConsentAsync(userId);
        if (!hasConsent)
        {
            context.Response.StatusCode = 403;
            await context.Response.WriteAsJsonAsync(new
            {
                error = "ai_consent_required",
                message = "Enable AI features in your privacy settings to use this."
            });
            return;
        }

        await _next(context);
    }
}
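Wiring it into the pipeline is short; the only subtlety is ordering, because the middleware reads context.User. A sketch for a minimal ASP.NET Core app — DbConsentStore stands in for whatever IConsentStore implementation you have:

```csharp
var builder = WebApplication.CreateBuilder(args);
builder.Services.AddScoped<IConsentStore, DbConsentStore>(); // your implementation

var app = builder.Build();

app.UseAuthentication();                  // must run first so the "sub" claim exists
app.UseMiddleware<AiConsentMiddleware>(); // consent gate for /api/ai/*
app.UseAuthorization();
app.MapControllers();
app.Run();
```

Placing the gate before UseAuthorization means a withdrawn consent blocks the request even for otherwise fully authorized users.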
The transparency label
The EU AI Act requires users to know they’re interacting with AI. In practice, this is a label in the UI — but include it in your API responses too so every client surface can show it:
public record AiSummarizeResponse(
    string? Summary,
    bool AiAvailable,
    // Clients must display this when AiGenerated is true
    bool AiGenerated = true,
    string AiDisclosure = "This summary was generated by an AI system."
);
The DPA checklist
Before connecting to any AI provider, verify these:
- DPA signed — you have a Data Processing Agreement with the provider
- Data residency confirmed — data stays in the EU (or you have adequate safeguards for transfers)
- Training opt-out — your data is not used to train the provider’s models (most enterprise tiers offer this)
- Retention period documented — you know how long the provider keeps request data
- Incident notification — the provider will notify you of breaches without undue delay (you, as controller, have 72 hours from becoming aware to notify your supervisory authority)
- Sub-processor list — you know who the provider shares data with
This list goes in your privacy documentation. If the provider can’t answer these questions, pick a different provider.

EU AI Act: the quick version
Most enterprise AI features are limited risk under the EU AI Act. This means:
- You must tell users they’re interacting with AI (transparency obligation)
- No conformity assessment required
- No CE marking required
- No registration in the EU database required
High-risk would apply if you’re using AI for employment decisions, credit scoring, critical infrastructure, or law enforcement. If that’s your use case, the requirements are significantly heavier and outside the scope of this article.
If you’re in scope, link to the EU AI Act text and document your classification decision. “We assessed this as limited risk because X” is what an auditor needs to see.
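That classification decision can live in the codebase too, where it’s versioned alongside the feature it covers. A hypothetical sketch — the record shape, date, and wording are illustrative, not a legal taxonomy:

```csharp
using System;

public record AiActClassification(
    string System,
    string RiskLevel,   // e.g. "minimal", "limited", "high"
    string Rationale,   // the "because X" an auditor wants to see
    DateOnly AssessedOn,
    string AssessedBy);

public static class Compliance
{
    public static readonly AiActClassification DocumentSummary = new(
        System: "document-summary",
        RiskLevel: "limited",
        Rationale: "Generates summaries for human review; makes no decisions "
                 + "about employment, credit, or access to services.",
        AssessedOn: new DateOnly(2025, 1, 15),
        AssessedBy: "Legal + Engineering");
}
```

When the feature's scope changes, the record changes in the same pull request, which is exactly the traceability an audit is looking for.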
Checklist
- Do you know exactly what personal data goes to the AI model?
- Have you signed a DPA with your AI provider?
- Is data residency in the EU confirmed?
- Do users know their input is processed by AI? (EU AI Act transparency)
- Do you have an audit log of AI calls per user?
- Is there a way for users to opt out or withdraw consent?
- Is your legal basis for AI processing documented?
- Have you classified your system under the EU AI Act?
If Legal asks you these questions next week, can you answer all of them?
Before the Next Article
Compliance handled. Now the practical question: how do you actually add an AI feature to a system that already exists? Not a greenfield project — your current production API, with its existing database, auth layer, and deployment pipeline.
That’s article 6. Integration patterns for AI in existing systems.
If this series helps you, consider buying me a coffee.
This is article 5 of the AI in Production series. Next: Integrating AI into Existing Systems — patterns for adding AI without breaking what already works.