The Three Ways in the AI Era -- Part 4
Continuous Learning When AI Writes Half Your Code
In the last article our QuantumAPI team turned a crying-wolf AI reviewer into one that only flags real issues. Small PRs from SDD, useful reviews from GOTCHA. The First and Second Ways were working.
Then, three weeks later, production went down at 2 in the morning. The key rotation job had silently crashed. The on-call engineer opened the code. It was clean. It was reviewed. It was merged by a colleague. And it was completely incomprehensible. “Why ML-KEM-768 and not 1024? Why this retry policy? What is ‘gradual rollover’?” Nobody knew. The original dev had left the team two months before.
This is the Third Way broken. The code exists, but the knowledge does not. Your team ships faster with AI, but nobody is learning. That is not a productivity win. That is technical debt wearing a tuxedo.
PROBLEM

AI code generation is fast. Human learning is not.
When your team goes from writing 50% of the code to reviewing 50% of the code, something invisible happens: the act of writing, which is how engineers internalise decisions, stops happening for half the codebase. The code lives in the repo. The knowledge lives nowhere.
Symptoms on our QuantumAPI team after 3 months of AI-heavy work:
- “Why did we choose X?” asked in Slack every other day
- Postmortems that take 2 days because nobody remembers the design decision
- Junior devs merging AI-generated code they cannot explain (as we saw in the Article 1 incident)
- Senior devs quietly rewriting AI code from scratch because it is “easier than understanding it”
Gene Kim’s Third Way is continuous learning. Postmortems without blame. Safe experiments. The team gets better over time. But if the code is generated faster than the team can read it, you are not learning. You are accumulating.
This is not a problem we fix with tools. It is a problem we fix by changing how we use the tool.
SOLUTION

The shift is simple: stop asking AI to write things and start asking it to teach them.
| Generator mode (what most teams do) | Tutor mode (what this article is about) |
|---|---|
| “Implement this spec” | “Explain each block as you implement this spec” |
| One-shot output: a full PR | Interactive: block → explanation → dev acknowledgement → next block |
| Human reads diff, maybe | Human writes a learning note the AI reviews |
| Postmortems take days | Postmortems take hours because decisions are documented |
| Bus factor: 0-1 | Bus factor: 3+ |
Three concrete techniques, in order of effort:
- Explain-before-merge — a short learning note is generated and attached to every PR touching AI code. The dev has to read it and tick a box before merging.
- AI-assisted postmortems — when something breaks, the AI reconstructs the decisions it made when generating the code, and the humans compare with the incident.
- Tutor mode — for new features and juniors, use a GOTCHA prompt tuned to explain while generating, not generate then explain.
None of these slow the team down by much. All of them raise the floor on what every dev understands. Let’s see the QuantumAPI team apply them.
EXECUTE
Step 1 — Explain-before-merge
The team adds a step to their Azure DevOps pipeline, right after the GOTCHA AI review from Article 3. A second prompt produces a learning note for the PR:
prompt = """
You are a senior engineer documenting decisions for a junior dev.
Based on the spec and the diff, answer these three questions in
15 lines total, plain English:
1. What decisions did the code make? Why?
2. What reasonable alternatives were rejected, and why?
3. Where can this break in production?
Be concrete. Name specific libraries, flags, and numbers.
"""
The output for the encryption PR gets saved to .ai/learning-notes/spec-001-encrypt-pii.md in the repo:
# Learning note — spec-001-encrypt-pii
## What decisions did the code make?
- Chose ML-KEM-768 (FIPS 203) over ML-KEM-512 and ML-KEM-1024.
768 is the NIST-recommended balance: enough post-quantum security
(category 3) without the 2x ciphertext size of 1024.
- Used the QuantumAPI .NET SDK v2.3 instead of direct REST calls.
SDK handles retries, vault ID binding, and context propagation.
- EF Core value converters wired at entity-config level, not in a
base class. Reason: easier to audit which exact fields are encrypted.
## Rejected alternatives
- Encrypting the whole row (rejected: breaks indexing on non-PII columns).
- AES-only at-rest via disk encryption (rejected: not field-level,
does not satisfy the compliance audit requirement).
- ML-KEM-1024 (rejected: the 8-week deadline does not justify the
larger keysize; we can migrate later via vault rotation).
## Where can this break?
- If the QuantumAPI endpoint is unreachable, ALL reads on 5 fields
will throw. There is no fallback. Add a circuit breaker in Spec D.
- The vault ID is read from an env var. If the var is missing, the
service fails at startup — not silent.
- Nullable encrypted fields: the value converter handles null, but
ONLY because the reviewer caught it (see PR #412 comment 2).
About 15 lines. Written by the AI. Read by the human. The pull request template now requires the reviewer to tick:
- [ ] I have read .ai/learning-notes/spec-001-encrypt-pii.md and can explain it in my own words
It is a human gate, not an automated one. The bot cannot tick it. If the reviewer lies, that is on them — but in practice, the check creates just enough friction that people actually read the note.
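The reading stays human, but the pipeline can still refuse to merge while the box is unticked. A rough sketch, assuming the pipeline script already has the PR description text (in Azure DevOps it is available via the REST API):

```python
# Check whether the reviewer ticked the learning-note box in the PR body.
# How you obtain pr_description depends on your pipeline; this only
# shows the check itself.
import re

def learning_note_acknowledged(pr_description: str) -> bool:
    """True if the PR body contains a ticked '- [x] I have read ...' box
    that points at a learning note."""
    pattern = r"-\s*\[[xX]\]\s*I have read\s+\S*learning-notes\S*"
    return re.search(pattern, pr_description) is not None
```

Failing the build on an unticked box does not verify that the note was actually read; it only guarantees a human had to stop and claim they did.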
Step 2 — AI-assisted postmortems (short template)
Three weeks after the rotation bug took down production, the team ran a postmortem. The difference from the Article 1 version: this time the learning note existed.
They used a short GOTCHA prompt to pull the decision history into the postmortem template:
# Postmortem: Key rotation crash — 2026-04-12
## Incident
ML-KEM-768 keypair rotation crashed 8 days into a 90-day cycle.
Service ran on stale keypair until detected 3 weeks later.
## What the AI chose (from learning note)
- 90-day rotation, manual trigger from a cron job
- No retry on rotation failure (not in spec)
- No alert on rotation failure (not in spec)
## What actually happened
- Scaleway had a 2-minute network blip on day 8
- Rotation request timed out
- Cron job logged the error and exited 0 (masking the failure)
## Root cause
Not the AI code. Not the human review. The spec said "rotate every
90 days" and stopped there. It never said "alert on failure" or
"retry on transient errors". The AI did exactly what the spec said.
## Actions
- [ ] Update spec template to require an "On failure" section
- [ ] Migrate to QuantumAPI's Vault Intelligence automated rotation
with gradual rollover (zero-downtime)
- [ ] Add Rotation Health metric (0-100) to team dashboard
The postmortem took 90 minutes instead of 2 days. The learning note made the “why” recoverable. The gap was not in the code — it was in the spec. That is the kind of insight that only comes when the team actually learns from its code.
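The "exited 0" masking is the cheapest fix on the actions list. A minimal sketch of what the hardened cron entry point could look like, where `rotate_keypair` and `send_alert` are hypothetical stand-ins for the SDK call and the alerting hook, not real QuantumAPI APIs:

```python
# Hypothetical hardened cron entry point for key rotation.
# rotate_keypair() and send_alert() are stand-ins, not real SDK calls.
import time

class TransientError(Exception):
    """Timeouts and network blips: safe to retry."""

def run_rotation(rotate_keypair, send_alert, retries=3, backoff_s=5.0):
    """Retry transient failures; alert and return nonzero on final failure."""
    for attempt in range(1, retries + 1):
        try:
            rotate_keypair()
            return 0  # exit code 0 now really means "rotated"
        except TransientError as exc:
            if attempt == retries:
                send_alert(f"key rotation failed after {retries} attempts: {exc}")
                return 1  # nonzero so the scheduler surfaces the failure
            time.sleep(backoff_s * attempt)  # linear backoff between tries
```

The original job logged the error and returned 0. Here a transient failure is retried, and a final failure both alerts and exits nonzero, so someone gets paged on day 8 instead of week 3.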
Step 3 — Tutor mode

A junior just joined the team. Their first task: integrate QuantumAPI’s Vault Intelligence automated key rotation with gradual rollover into the .NET 10 service. In generator mode, the AI would produce 80 lines of working code and the junior would stare at them blankly.
Instead, the junior opens the AI chat with this GOTCHA prompt:
# GOTCHA Tutor Mode
## G — Goals
Teach me how to integrate QuantumAPI Vault Intelligence automated
rotation with gradual rollover into our .NET 10 service.
Goal is for ME to be able to write the code by the end, not for
you to write it.
## O — Orchestration
1. Explain the concept in 3 sentences
2. Write ONE small block of code
3. Pause and ask: "does this block make sense?"
4. If I say yes, move on. If no, explain smaller.
5. At the end, ask me to explain back what we built.
## T — Tools
Chat only. No full-file outputs.
## C — Context
.NET 10, EF Core 10, QuantumAPI SDK v2.3. I know C# basics.
I do NOT know what "gradual rollover" means. I am new to PQC.
## H — Heuristics
- NEVER write a complete implementation in one message
- NEVER skip the "does this make sense?" pause
- If I answer wrong, back up and re-explain with a smaller step
- At the end, verify I can explain the code WITHOUT looking at it
- Use analogies from web dev if it helps (session tokens, JWT, etc.)
## A — Args
Spec: specs/004-key-rotation/spec.md
My experience level: 1 year of .NET, 0 experience with PQC
The session takes 30 minutes. Instead of a one-shot code dump, the junior gets:
- Explanation: “Gradual rollover means two keys are active at once — the old one for decrypting existing data, the new one for encrypting everything from now on. Imagine rotating the SSL certificate on a live service without dropping connections.”
- Block 1: A 5-line IKeyRotationClient interface. “Does this make sense?”
- Block 2: Implementation of BeginRotation(). “Try to guess what CompleteRotation() will do.”
- Block 3: The junior writes CompleteRotation() themselves. The AI reviews it.
- Recap: “Now explain back what gradual rollover does and why we need two keys.”
At the end, the junior understands what they shipped. They can debug it at 2am. Bus factor went from 0 to 1. Multiply across the team and you are back at 3+.
Step 4 — Results after 3 months
| Metric | Before tutor mode | After tutor mode |
|---|---|---|
| Bus factor on AI-written code | 1 | 3+ |
| Postmortem duration | 2 days | 90 minutes |
| Juniors debugging AI code without help | 10% | 70% |
| “What does this do?” questions in Slack | 15/week | 3/week |
| Learning notes in repo | 0 | 47 |
| Time added per PR | — | +4 minutes (note review) |

Third Way restored. The team ships AI-written code and understands it. Continuous learning is continuous again.
TEMPLATE
Reusable GOTCHA tutor mode prompt. Save as .ai/tutor-prompt.md:
# GOTCHA Tutor Mode — <your team>
## G — Goals
Teach the developer how to <task>. Goal is for THEM to write the
code by the end, not for the AI to write it.
## O — Orchestration
1. Explain the concept in 3 sentences.
2. Write ONE small block of code.
3. Pause and ask: "does this make sense?"
4. Continue only after confirmation.
5. Ask the dev to write the last block themselves.
6. Verify they can explain it back.
## T — Tools
Chat only. No full-file outputs.
## C — Context
<Stack, versions, dev's experience level, domain constraints>
## H — Heuristics
- NEVER write a complete implementation in one message
- NEVER skip the "does this make sense?" pause
- Back up and re-explain when the dev answers wrong
- At the end, verify understanding WITHOUT looking at the code
- Use analogies from what the dev already knows
## A — Args
- Spec path: <runtime>
- Dev experience level: <runtime>
- Topic to teach: <runtime>
Rule of thumb: if a tutor mode session produces working code in under 10 minutes, it is probably generator mode in disguise. Real learning takes 20-60 minutes per block of new knowledge. That is a feature, not a bug.
CHALLENGE
Open the last PR with AI-generated code that you merged. Can you explain every decision in it to a junior? Not read — explain. If not, that code is technical debt wearing a tuxedo. Write a learning note for it this week. Tomorrow.
In the next article we step back and look honestly at SDD, ATLAS, and GOTCHA. When does each one help? When does each one fail? Is any of them a silver bullet? (Spoiler: no.) We compare without pretending.
→ Article 5: SDD, ATLAS, GOTCHA: When to Use What (And When They All Fail) (coming soon)
If this series helps you, consider sponsoring me on GitHub or buying me a coffee.
This is part 4 of 6 in the series “The Three Ways in the AI Era”. Previous: AI Code Review That Doesn’t Cry Wolf.