The Three Ways in the AI Era -- Part 6

Your AI-Native DevOps Playbook

#devops #ai #sdd #atlas #gotcha #three-ways #playbook #dora

Five articles in. SDD for flow. GOTCHA for feedback. Tutor mode for learning. Honest comparison. The ideas are all there.

Now the last mile: how do you put everything into one playbook your team can use on Monday? Not a book. Not a framework-of-frameworks. One short file, with templates, metrics, and decision rules. This article is that file.

PROBLEM

An open toolbox on a workbench with the pieces from the series scattered around it — a spec document, a GOTCHA prompt scroll, a learning note, a small checklist, a metrics chart. A developer looking at the pieces, trying to figure out how to pack them — flat vector editorial illustration

Knowledge without a playbook is just reading.

I have seen teams read the best DevOps books, bookmark every post, attend every conference — and still work the same way on Monday. Because the gap between “I understand this” and “I do this” is a playbook. A short, concrete artifact that lives in the repo and tells you exactly what to do when a PR opens, when an incident happens, when a junior joins.

A good playbook has four properties:

  1. It fits on one page
  2. It links to the heavy templates instead of inlining them
  3. It tells you when to apply each piece, not just how
  4. It is versioned with the code, not with Confluence

The QuantumAPI team built theirs over 6 months of trial and error. They kept what worked, dropped what did not. What is left is below. Copy it, adjust it, ship.

SOLUTION

A clean diagram showing four stacked layers labeled from bottom to top: "Ritual layer", "Metrics layer", "Templates layer", "Decision layer". Arrows on the side show dependencies — decisions depend on metrics, metrics track templates, templates enable rituals — flat editorial illustration

The playbook has four layers, each answering a different question:

| Layer | Question it answers | Artifact |
| --- | --- | --- |
| Decision | When do we use which framework? | Framework Selector |
| Templates | What does each artifact look like? | Spec, GOTCHA prompt, learning note, postmortem |
| Metrics | How do we know it is working? | DORA + AI-specific |
| Ritual | Which ceremonies stay? Which go? | Keep/drop list |

You need all four. Templates without decisions become ceremony. Metrics without templates are guesses. Rituals without metrics are theatre.

EXECUTE

Layer 1 — Metrics (DORA + AI)

The team kept the four DORA metrics and added four AI-specific ones:

| DORA classic | Target | AI-specific addition | Target |
| --- | --- | --- | --- |
| Deployment frequency | Daily+ | AI PRs merged without rework | > 70% |
| Lead time for changes | < 1 day | AI review signal-to-noise | > 80% |
| Change failure rate | < 15% | Incidents traced to AI code | Track, not target |
| Mean time to restore | < 1 hour | Learning note coverage on AI PRs | > 90% |

All eight are measurable from Azure DevOps + Application Insights + repo data. No new tooling.
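As a sketch of how far plain repo data goes, the first AI metric can be computed from an exported PR list. The record fields here (`ai_generated`, `rework_commits`) are assumptions about your export format, not an Azure DevOps API:

```python
# Sketch: "AI PRs merged without rework" from exported PR records.
# Field names (ai_generated, merged, rework_commits) are assumptions
# about your own export, not any particular tool's schema.
def rework_free_rate(prs):
    ai_prs = [p for p in prs if p["ai_generated"] and p["merged"]]
    if not ai_prs:
        return None  # no AI PRs yet; nothing to measure
    clean = sum(1 for p in ai_prs if p["rework_commits"] == 0)
    return clean / len(ai_prs)

prs = [
    {"ai_generated": True,  "merged": True, "rework_commits": 0},
    {"ai_generated": True,  "merged": True, "rework_commits": 2},
    {"ai_generated": False, "merged": True, "rework_commits": 0},
    {"ai_generated": True,  "merged": True, "rework_commits": 0},
]
print(rework_free_rate(prs))  # 0.666... (2 of 3 AI PRs merged clean)
```

The other three AI metrics follow the same shape: a filter, a count, a ratio. None of them needs more than a scheduled script.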

One rule: do not target the AI-specific metrics too aggressively. If you push “AI PRs merged without rework” too hard, devs will reject AI-generated code they should have used. The metrics are a thermometer, not a dashboard to game.

Layer 2 — Templates (one file, references everywhere)

The team created one file: .ai/playbook.md. It is an index, not a book:

```markdown
# AI-Native DevOps Playbook — QuantumAPI Team

## Before you start a task
- Framework Selector → .ai/decisions/framework-selector.md
- Spec template → .ai/templates/spec.md
- Current sprint specs → specs/

## During development
- GOTCHA code review prompt → .ai/review-prompt.md
- Learning note prompt → .ai/learning-note-prompt.md
- Tutor mode prompt → .ai/tutor-prompt.md

## After merge
- Learning note per AI PR → .ai/learning-notes/<spec-id>.md
- Short postmortem template → .ai/templates/postmortem.md

## Metrics dashboards
- DORA → Azure DevOps Analytics
- AI-specific → Application Insights workbook "AI Engineering"

## Who owns what
- Framework Selector: @tech-lead
- GOTCHA prompts: @security-guild (reviewed quarterly)
- Learning notes: whoever merges the PR
- Metrics dashboards: @devops-lead
```

25 lines. Every link points to a file already in the repo. When a new dev joins, they read this once. When a question comes up in Slack, the answer is “check .ai/playbook.md”.

All the actual templates live in the articles that introduced them.

Layer 3 — Rituals (what stays, what goes)

After 6 months, the team had tried 14 different ceremonies. They kept 4 and dropped 10.

Keep:

| Ritual | Cadence | Why |
| --- | --- | --- |
| Learning note tick on every AI PR | Per PR | Lowest-cost knowledge transfer |
| AI retro (15 min) | Weekly | Prompts get stale; this is where you tune them |
| Framework audit | Quarterly | Are we using SDD/GOTCHA/ATLAS where they help? |
| Short postmortem for AI-touched incidents | Per incident | Learning notes make this 90 min, not 2 days |

Drop:

| Ritual | Why it was cut |
| --- | --- |
| Daily AI task recap in standup | Noise |
| Per-PR ATLAS write-up | Overhead > benefit below sprint scale |
| Spec for every bug fix | Spec ceremony |
| Prompt approval committee | Bottleneck with no signal |
| Monthly framework audit | Too frequent; quarterly is enough |
| Mandatory "AI assistant" slot in 1:1s | Performative |

The keep list is small on purpose. Every ritual costs focus. The bar to keep one is: does this create information the team acts on?
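The first ritual on the keep list is also the easiest to enforce in CI. A minimal sketch, assuming the `.ai/learning-notes/<spec-id>.md` convention from the playbook file:

```python
from pathlib import Path

# Sketch: list merged AI specs that still lack a learning note.
# Assumes the playbook convention .ai/learning-notes/<spec-id>.md;
# where the spec-id list comes from is up to your pipeline.
def missing_notes(spec_ids, notes_dir=".ai/learning-notes"):
    notes = Path(notes_dir)
    return [s for s in spec_ids if not (notes / f"{s}.md").exists()]
```

Wired into the pipeline, a non-empty result fails the learning-note-coverage gate, which is what keeps the > 90% target honest without anyone policing it.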

Layer 4 — The final state of the QuantumAPI team

A clean dashboard display showing metrics — "Deployment freq: 12/day", "Lead time: 4h", "Change failure rate: 3%", "AI merge-without-rework: 78%", "Bus factor on AI code: 3.2", "Juniors debugging AI code solo: 75%". Team icons around the dashboard looking focused, not celebrating — flat editorial illustration

After 6 months running the playbook:

| Metric | Baseline (Article 1) | After playbook |
| --- | --- | --- |
| Deployment frequency | 1/day | 12/day |
| Lead time | 2 days → 6 days (AI broken) | 4 hours |
| Change failure rate | 10% → 30% (AI broken) | 3% |
| AI PRs merged without rework | — | 78% |
| AI review signal-to-noise | 14% | 87% |
| Incidents traced to AI code | 5/month | 1/month |
| Bus factor on AI-written code | 1 | 3.2 |
| Juniors debugging AI code solo | 10% | 75% |

Note what happened. The lead time got worse when AI was added without structure, from 2 days to 6. After the playbook, it is 4 hours. That 36x gap between “AI with structure” and “AI without structure” is the whole story of this series.

What is NOT in the playbook (and why)

  • No AI governance board. Tried it. Became a bottleneck with no measurable value. The security guild owns the GOTCHA prompts for high-risk domains. Done.
  • No SLA for AI review. The pipeline already runs it in 70 seconds. An SLA would be ceremony on top of a working system.
  • No prompt approval committee. Prompts live in the repo. They go through code review like everything else.
  • No separate “AI usage” metric in performance reviews. AI is a tool. Performance is about outcomes.

Every one of these was tried and dropped. The common thread: if a ceremony does not produce information the team acts on, cut it. The Article 5 anti-patterns — Framework Tower, Checklist Theatre, Spec Ceremony — are always waiting to come back.

TEMPLATE

Copy this into .ai/playbook.md in your repo today:

```markdown
# AI-Native DevOps Playbook — <your team>

## Before you start a task
- Framework Selector → .ai/decisions/framework-selector.md
- Spec template → .ai/templates/spec.md
- Current sprint specs → specs/

## During development
- GOTCHA code review prompt → .ai/review-prompt.md
- Learning note prompt → .ai/learning-note-prompt.md
- Tutor mode prompt → .ai/tutor-prompt.md

## After merge
- Learning note per AI PR → .ai/learning-notes/<spec-id>.md
- Short postmortem template → .ai/templates/postmortem.md

## Metrics
- DORA: deployment freq, lead time, change failure rate, MTTR
- AI: PRs merged without rework, review signal-to-noise,
  incidents traced to AI, learning note coverage

## Rituals
Keep: learning note tick on every AI PR, weekly 15-min AI retro,
quarterly framework audit, short postmortem for AI incidents.
Drop: everything else.

## Who owns what
- Framework Selector: <role>
- GOTCHA prompts: <role or guild>
- Learning notes: whoever merges the PR
- Metrics dashboards: <role>
```

Adjust the owners. Fill in the links as you create them. That is the whole playbook.

CHALLENGE

This week: create .ai/playbook.md in your repo, even if it is empty scaffolding. The file existing is what turns ideas into a habit. Next week: fill in one section. The week after: another. In six weeks you will have what the QuantumAPI team has.
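If you want the scaffolding step to be a single command, here is a minimal sketch that creates every path named in the playbook as an empty placeholder. Run it once from the repo root; it never overwrites existing content:

```python
from pathlib import Path

# Sketch: scaffold the playbook layout from this article.
# Creates empty placeholder files only; safe to re-run.
LAYOUT = [
    ".ai/playbook.md",
    ".ai/review-prompt.md",
    ".ai/learning-note-prompt.md",
    ".ai/tutor-prompt.md",
    ".ai/decisions/framework-selector.md",
    ".ai/templates/spec.md",
    ".ai/templates/postmortem.md",
]

for rel in LAYOUT:
    path = Path(rel)
    path.parent.mkdir(parents=True, exist_ok=True)  # create .ai/ subdirs
    path.touch(exist_ok=True)                       # keep existing files intact
```

Empty files feel pointless, but they make every link in the index resolve on day one, which is exactly the nudge that gets the sections filled in.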

A neatly organised toolbox closing at the end of the day, all pieces from the series in their compartments — spec, prompt, learning note, checklist, metrics chart — a developer's hand closing the lid, calm and satisfied. Warm evening tones — flat vector editorial illustration


The DevOps Handbook turned 10 this year. The Three Ways still hold. AI did not break them. It amplified whatever you already do. Good teams with structure ship faster with AI. Teams without structure ship technical debt wearing a tuxedo.

You have the structure now. Decision layer, templates, metrics, rituals. One file in the repo. The hard part is not learning it. The hard part is keeping it simple when the next shiny framework shows up.

Ship well.


If this series helped you, consider sponsoring me on GitHub or buying me a coffee.

This is part 6 of 6 — the final article in the series “The Three Ways in the AI Era”. Previous: SDD, ATLAS, GOTCHA: When to Use What. Series index: The Three Ways in the AI Era.
