The Infrastructure Hub -- Part 4
Pipelines from Backstage — AI, Change Requests, and the Enterprise Reality
The Problem
Your infrastructure team uses three CI/CD platforms. Client A is on Azure DevOps. Client B uses GitHub Actions. Client C has everything on GitLab — self-hosted, because their compliance team said so.
Three UIs. Three authentication flows. Three ways to check logs. That’s annoying, but it’s not the real problem.
The real problem is what happens after the pipeline runs terraform plan.
In a startup, you push to main and it deploys. In an enterprise, you push to main and… nothing happens. Because before anything reaches production, you need:
- A Change Request (CR) describing what will change, why, and how to roll it back
- A risk assessment — is this a standard change or does it need CAB approval?
- Evidence — the plan output, security scan results, test results, who approved the PR
- CAB approval — the Change Advisory Board reviews high-risk changes before they go live
- A post-implementation review — did the change work? Any incidents?
Today, engineers do this manually. They copy the terraform plan output into a ServiceNow ticket. They write a risk assessment from memory. They attach screenshots of pipeline runs as evidence. They wait for the CAB meeting next Thursday. Then they deploy on Friday afternoon because that’s the first available window.
This process exists for good reasons — production changes need control. But the way most enterprises do it wastes hours of engineering time on paperwork that AI could prepare in seconds.
The Solution
We combine three things:
- Backstage as the unified pipeline UI — one place to see all pipelines across Azure DevOps, GitHub Actions, and GitLab CI
- AI that does the boring work — generates change request documentation, summarizes Terraform plans in plain English, diagnoses pipeline failures, and prepares risk assessments
- A change management workflow that adapts to the enterprise — from fully automated (low-risk standard changes) to CAB-reviewed (high-risk production changes)
The key insight: AI doesn’t replace the CAB. AI prepares everything the CAB needs to make a fast decision. The engineer pushes code. AI generates the CR, the risk assessment, the evidence package, and the rollback plan. The CAB gets a complete, well-structured request instead of a rushed ticket written at 4pm on Wednesday.
And for standard changes — the ones that your organization has pre-approved (like updating a tag value or scaling a node pool) — AI classifies them automatically and they flow through without a meeting.
Execute
Step 1: Connect pipelines to the catalog
Each module’s catalog-info.yaml already exists from article 1. Add annotations that tell Backstage where the pipeline lives:
```yaml
# Azure DevOps
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: tf-azurerm-storage-account
  title: Azure Storage Account Module
  annotations:
    dev.azure.com/project-repo: acme-org/acme-infra-modules
    dev.azure.com/build-definition: tf-storage-account-ci
    forge.io/change-policy: standard # or "cab-required" or "auto-approve"
  tags:
    - terraform-module
    - azure
    - client-acme
spec:
  type: terraform-module
  lifecycle: production
  owner: team-acme
  system: client-acme-infrastructure
```

```yaml
# GitHub Actions
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: tf-scaleway-instance
  title: Scaleway Instance Module
  annotations:
    github.com/project-slug: victorZKov/forge
    github.com/workflows: tf-scaleway-instance-ci.yml
    forge.io/change-policy: standard
spec:
  type: terraform-module
  lifecycle: production
  owner: team-globex
  system: client-globex-infrastructure
```

```yaml
# GitLab CI
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: tf-aws-vpc
  title: AWS VPC Module
  annotations:
    gitlab.com/project-slug: client-initech/infra-modules
    gitlab.com/pipeline-branch: main
    forge.io/change-policy: cab-required # VPC changes always need CAB
spec:
  type: terraform-module
  lifecycle: production
  owner: team-initech
  system: client-initech-infrastructure
```
Notice forge.io/change-policy. This annotation controls what happens after the pipeline runs. We’ll use it in Step 5.
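Since the annotation is just a string in YAML, it is worth validating when you read it, so a typo doesn't silently weaken a module's policy. A minimal sketch, assuming a local `EntityLike` shape (in a real plugin you would use `Entity` from `@backstage/catalog-model`) and assuming that falling back to the most conservative policy is what you want:

```typescript
// Minimal local shape; in a real plugin this would be Entity
// from @backstage/catalog-model.
interface EntityLike {
  metadata: { annotations?: Record<string, string> };
}

type ChangePolicy = 'auto-approve' | 'standard' | 'cab-required';

const POLICIES: readonly string[] = ['auto-approve', 'standard', 'cab-required'];

// Read forge.io/change-policy; if the annotation is missing or
// misspelled, fall back to the most conservative policy.
export function getChangePolicy(entity: EntityLike): ChangePolicy {
  const raw = entity.metadata.annotations?.['forge.io/change-policy'] ?? '';
  return POLICIES.includes(raw) ? (raw as ChangePolicy) : 'cab-required';
}
```

Failing closed here is a design choice: an unannotated module goes through CAB until someone deliberately relaxes it.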
Step 2: Install CI/CD plugins and wire the entity page
Install the community plugins for all three platforms:
```sh
# Azure DevOps
yarn --cwd packages/app add @backstage-community/plugin-azure-devops
yarn --cwd packages/backend add @backstage-community/plugin-azure-devops-backend

# GitHub Actions
yarn --cwd packages/app add @backstage-community/plugin-github-actions

# GitLab
yarn --cwd packages/app add @immobiliarelabs/backstage-plugin-gitlab
yarn --cwd packages/backend add @immobiliarelabs/backstage-plugin-gitlab-backend
```
Configure credentials in app-config.yaml:
```yaml
integrations:
  azure:
    - host: dev.azure.com
      credentials:
        - organizations:
            - acme-org
          personalAccessToken: ${AZURE_DEVOPS_PAT}
  github:
    - host: github.com
      token: ${GITHUB_TOKEN}
  gitlab:
    - host: gitlab.com
      token: ${GITLAB_TOKEN}
```
Wire the plugins into the entity page with EntitySwitch — Backstage detects which annotations are present and renders the right plugin:
```tsx
// packages/app/src/components/catalog/EntityPage.tsx
import React from 'react';
import { EmptyState } from '@backstage/core-components';
import { EntitySwitch } from '@backstage/plugin-catalog';
import {
  EntityAzurePipelinesContent,
  isAzureDevOpsAvailable,
} from '@backstage-community/plugin-azure-devops';
import {
  EntityGithubActionsContent,
  isGithubActionsAvailable,
} from '@backstage-community/plugin-github-actions';
import {
  EntityGitlabPipelinesTable,
  isGitlabAvailable,
} from '@immobiliarelabs/backstage-plugin-gitlab';

const cicdContent = (
  <EntitySwitch>
    <EntitySwitch.Case if={isAzureDevOpsAvailable}>
      <EntityAzurePipelinesContent defaultLimit={10} />
    </EntitySwitch.Case>
    <EntitySwitch.Case if={isGithubActionsAvailable}>
      <EntityGithubActionsContent />
    </EntitySwitch.Case>
    <EntitySwitch.Case if={isGitlabAvailable}>
      <EntityGitlabPipelinesTable />
    </EntitySwitch.Case>
    <EntitySwitch.Case>
      <EmptyState
        title="No CI/CD configured"
        description="Add pipeline annotations to this module's catalog-info.yaml"
        missing="info"
      />
    </EntitySwitch.Case>
  </EntitySwitch>
);
```
This is standard Backstage. Nothing new here. The real value starts now.
Step 3: AI-powered Terraform plan summaries
When a pipeline runs terraform plan, the output is technical. A 200-line plan with resource addresses, attribute changes, and force-replacement warnings. The engineer understands it. The CAB member reviewing the change request? Maybe not.
AI reads the plan and produces a summary that anyone can understand:
```typescript
// plugins/pipeline-ai-backend/src/services/PlanSummaryService.ts
import { CatalogClient } from '@backstage/catalog-client';

interface PlanSummary {
  humanReadable: string;
  riskLevel: 'low' | 'medium' | 'high' | 'critical';
  resourcesCreated: number;
  resourcesModified: number;
  resourcesDestroyed: number;
  destructiveChanges: string[];
  rollbackStrategy: string;
}

export class PlanSummaryService {
  constructor(
    private readonly aiBaseUrl: string,
    private readonly catalogClient: CatalogClient,
  ) {}

  async summarizePlan(
    planJson: string,
    entityRef: string,
  ): Promise<PlanSummary> {
    // Get module context from catalog
    const entity = await this.catalogClient.getEntityByRef(entityRef);
    const moduleDescription = entity?.metadata.description || '';
    const client = entity?.metadata.tags
      ?.find(t => t.startsWith('client-'))
      ?.replace('client-', '') || 'unknown';
    const changePolicy =
      entity?.metadata.annotations?.['forge.io/change-policy'] || 'standard';

    const response = await fetch(`${this.aiBaseUrl}/api/pipeline/summarize-plan`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        plan: planJson,
        context: {
          module: entity?.metadata.name,
          description: moduleDescription,
          client,
          changePolicy,
        },
        prompt: `You are reviewing a Terraform plan for an infrastructure module.

Module: ${entity?.metadata.name}
Client: ${client}
Description: ${moduleDescription}
Change policy: ${changePolicy}

Analyze this Terraform plan and provide:
1. A plain-English summary of what will change (2-3 sentences, understandable by a non-technical CAB reviewer)
2. Risk level: low (tags, descriptions), medium (config changes, scaling), high (networking, security groups, IAM), critical (destroy/recreate stateful resources)
3. List any destructive changes (resources being destroyed or replaced)
4. A rollback strategy specific to these changes

Be direct. No filler. If this plan destroys a database, say it clearly.`,
      }),
    });
    return response.json();
  }
}
```
The AI service (same .NET service from the IDP series) processes the plan:
```csharp
// AiService/Controllers/PipelineController.cs
using Microsoft.AspNetCore.Mvc;
using Microsoft.Extensions.AI;

namespace AiService.Controllers;

[ApiController]
[Route("api/pipeline")]
public class PipelineController : ControllerBase
{
    private readonly IChatClient _chat;
    private readonly ILogger<PipelineController> _logger;

    public PipelineController(IChatClient chat, ILogger<PipelineController> logger)
    {
        _chat = chat;
        _logger = logger;
    }

    [HttpPost("summarize-plan")]
    public async Task<IActionResult> SummarizePlan([FromBody] PlanSummaryRequest request)
    {
        var messages = new List<ChatMessage>
        {
            new(ChatRole.System, request.Prompt),
            new(ChatRole.User, $"Terraform plan output:\n\n{request.Plan}"),
        };

        var response = await _chat.GetResponseAsync(messages);

        // Parse structured output from AI response
        var summary = ParsePlanSummary(response.Text, request.Plan);

        _logger.LogInformation(
            "Plan summary for {Module}: risk={Risk}, destroy={Destroy}",
            request.Context.Module,
            summary.RiskLevel,
            summary.ResourcesDestroyed);

        return Ok(summary);
    }

    private PlanSummary ParsePlanSummary(string aiResponse, string rawPlan)
    {
        // Count resources from the raw plan
        var created = CountPattern(rawPlan, "will be created");
        var modified = CountPattern(rawPlan, "will be updated");
        var destroyed = CountPattern(rawPlan, "will be destroyed");
        var replaced = CountPattern(rawPlan, "must be replaced");

        // AI determines risk level and writes the summary
        var riskLevel = destroyed + replaced > 0 ? "critical"
            : aiResponse.Contains("high", StringComparison.OrdinalIgnoreCase) ? "high"
            : modified > 3 ? "medium"
            : "low";

        return new PlanSummary
        {
            HumanReadable = aiResponse,
            RiskLevel = riskLevel,
            ResourcesCreated = created,
            ResourcesModified = modified,
            ResourcesDestroyed = destroyed + replaced,
        };
    }

    private static int CountPattern(string text, string pattern) =>
        text.Split(pattern).Length - 1;
}
```
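Counting free-text phrases like "will be created" is brittle: the wording can shift between Terraform versions. If your pipeline can also run `terraform show -json tfplan`, the counts can come from the documented `resource_changes` array instead. A sketch of that alternative (the `ChangeCounts` shape is illustrative; the `actions` field follows Terraform's JSON plan format, where a replacement appears as `["delete", "create"]` or `["create", "delete"]`):

```typescript
interface ResourceChange {
  address: string;
  change: { actions: string[] }; // e.g. ["create"], ["update"], ["delete", "create"]
}

interface ChangeCounts {
  created: number;
  modified: number;
  destroyed: number; // includes replacements
}

// Derive counts from the resource_changes array of
// `terraform show -json tfplan` output.
export function countChanges(resourceChanges: ResourceChange[]): ChangeCounts {
  const counts: ChangeCounts = { created: 0, modified: 0, destroyed: 0 };
  for (const rc of resourceChanges) {
    const actions = rc.change.actions;
    if (actions.includes('delete')) counts.destroyed++; // destroy or replace
    else if (actions.includes('create')) counts.created++;
    else if (actions.includes('update')) counts.modified++;
  }
  return counts;
}
```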
The pipeline posts this summary as a PR comment. Here’s what it looks like:
Terraform Plan Summary (AI-generated)
This plan modifies the AKS cluster node pool for Client ACME. It will scale the default node pool from 3 to 5 nodes and update the Kubernetes version from 1.29 to 1.30. The cluster will perform a rolling upgrade — no downtime expected, but pods will be rescheduled during the process.
| Metric | Value |
|---|---|
| Resources created | 0 |
| Resources modified | 2 |
| Resources destroyed | 0 |
| Risk level | Medium |

Rollback strategy: Scale node pool back to 3 nodes via `terraform apply` with previous variable values. Kubernetes downgrade from 1.30 to 1.29 is not supported — would need cluster recreation.
The CAB reviewer reads this in 30 seconds instead of parsing 200 lines of Terraform output.
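Rendering that comment is a small, deterministic formatting step that you don't need AI for. A sketch of a formatter producing the markdown above (the `PlanSummary` fields match the interface from the service; the exact layout is a choice, not a requirement):

```typescript
interface PlanSummary {
  humanReadable: string;
  riskLevel: 'low' | 'medium' | 'high' | 'critical';
  resourcesCreated: number;
  resourcesModified: number;
  resourcesDestroyed: number;
  rollbackStrategy: string;
}

// Turn a PlanSummary into the markdown body of a PR comment.
export function formatPrComment(s: PlanSummary): string {
  return [
    '## Terraform Plan Summary (AI-generated)',
    '',
    s.humanReadable,
    '',
    '| Metric | Value |',
    '|---|---|',
    `| Resources created | ${s.resourcesCreated} |`,
    `| Resources modified | ${s.resourcesModified} |`,
    `| Resources destroyed | ${s.resourcesDestroyed} |`,
    `| Risk level | ${s.riskLevel} |`,
    '',
    `**Rollback strategy:** ${s.rollbackStrategy}`,
  ].join('\n');
}
```

Each platform has its own comment API (Azure DevOps PR threads, GitHub issue comments, GitLab merge request notes), but all three accept a markdown string like this one.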
Step 4: AI pipeline failure diagnosis
When a pipeline fails, the engineer opens the logs, scrolls through 500 lines, and figures out what went wrong. Sometimes it’s obvious (syntax error). Sometimes it takes 20 minutes (provider version conflict, state lock, permission issue).
AI reads the logs, the module code, and the recent changes — and tells you what happened:
```typescript
// plugins/pipeline-ai-backend/src/services/FailureDiagnosisService.ts
import { CatalogClient } from '@backstage/catalog-client';
import { FailureDiagnosis } from '../types';

export class FailureDiagnosisService {
  constructor(
    private readonly aiBaseUrl: string,
    private readonly catalogClient: CatalogClient,
  ) {}

  async diagnoseFailure(input: {
    entityRef: string;
    pipelineLogs: string;
    recentCommits: string[];
    platform: string;
  }): Promise<FailureDiagnosis> {
    const entity = await this.catalogClient.getEntityByRef(input.entityRef);

    const response = await fetch(`${this.aiBaseUrl}/api/pipeline/diagnose`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        logs: input.pipelineLogs,
        module: entity?.metadata.name,
        description: entity?.metadata.description,
        recentCommits: input.recentCommits,
        platform: input.platform,
        prompt: `You are diagnosing a CI/CD pipeline failure for a Terraform module.

Module: ${entity?.metadata.name}
Platform: ${input.platform}
Recent commits: ${input.recentCommits.join('\n')}

Analyze the pipeline logs and provide:
1. Root cause (one sentence)
2. Evidence (the specific log lines that confirm the cause)
3. Fix (specific steps to resolve — not generic advice)
4. Prevention (what to add to the pipeline or CLAUDE.md to prevent this)

If the root cause is in the recent commits, say which commit caused it.`,
      }),
    });
    return response.json();
  }
}
```
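One practical detail: `pipelineLogs` can easily be hundreds of kilobytes, far more than you want to send in a prompt. Trimming to the lines that matter keeps the call cheap and the diagnosis focused. A minimal sketch, where the marker keyword list and the tail size are assumptions you would tune per platform:

```typescript
const ERROR_MARKERS = ['error', 'fatal', 'failed', 'denied', 'timeout'];

// Keep error-looking lines plus the tail of the log, capped at maxLines.
// The tail matters because the actual failure is usually near the end.
export function trimLogsForPrompt(logs: string, maxLines = 200): string {
  const lines = logs.split('\n');
  const errorLines = lines.filter(line =>
    ERROR_MARKERS.some(marker => line.toLowerCase().includes(marker)),
  );
  const tail = lines.slice(-50);
  // Deduplicate while preserving order, then cap the total.
  const picked = [...new Set([...errorLines, ...tail])].slice(0, maxLines);
  return picked.join('\n');
}
```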
The diagnosis shows up in the Backstage entity page, right next to the failed pipeline run:
```tsx
// plugins/pipeline-dashboard/src/components/FailureDiagnosisCard.tsx
import React, { useEffect, useState } from 'react';
import { InfoCard, WarningPanel } from '@backstage/core-components';
import { useApi, discoveryApiRef, fetchApiRef } from '@backstage/core-plugin-api';
import { useEntity } from '@backstage/plugin-catalog-react';

interface Diagnosis {
  rootCause: string;
  evidence: string[];
  fix: string[];
  prevention: string;
}

export const FailureDiagnosisCard = ({ runId }: { runId: string }) => {
  const { entity } = useEntity();
  const discoveryApi = useApi(discoveryApiRef);
  const fetchApi = useApi(fetchApiRef);
  const [diagnosis, setDiagnosis] = useState<Diagnosis | null>(null);
  const [loading, setLoading] = useState(true);

  useEffect(() => {
    const load = async () => {
      const baseUrl = await discoveryApi.getBaseUrl('pipeline-ai');
      const res = await fetchApi.fetch(
        `${baseUrl}/diagnose/${entity.metadata.name}/${runId}`,
      );
      if (res.ok) {
        setDiagnosis(await res.json());
      }
      setLoading(false);
    };
    load();
  }, [discoveryApi, fetchApi, entity.metadata.name, runId]);

  if (loading) return <InfoCard title="AI Diagnosis">Analyzing failure...</InfoCard>;
  if (!diagnosis) return null;

  return (
    <WarningPanel
      title="AI Failure Diagnosis"
      message={diagnosis.rootCause}
      severity="error"
    >
      <div>
        <h4>Evidence</h4>
        <pre>{diagnosis.evidence.join('\n')}</pre>
        <h4>How to fix</h4>
        <ol>
          {diagnosis.fix.map((step, i) => <li key={i}>{step}</li>)}
        </ol>
        <h4>Prevention</h4>
        <p>{diagnosis.prevention}</p>
      </div>
    </WarningPanel>
  );
};
```
Real example of what the diagnosis produces:
Root cause: Azure provider 4.x removed the `enable_rbac` argument from `azurerm_kubernetes_cluster`. It’s now always enabled.

Evidence: `Error: Unsupported argument "enable_rbac" on main.tf line 47, in resource "azurerm_kubernetes_cluster":`

Fix:
- Remove `enable_rbac = true` from `main.tf` line 47
- Run `terraform plan` to confirm no state changes
- Commit: “Remove deprecated enable_rbac argument (always true in provider 4.x)”

Prevention: Add a `terraform validate` step before `terraform plan` in the pipeline. The scaffolder templates from article 2 already include this — this module was created before the golden path existed.
Step 5: The enterprise change management workflow
Here’s where it gets real. In a startup, steps 3 and 4 are enough. But in an enterprise — a bank, an energy company, a healthcare provider — you don’t just deploy because the plan looks good. You need a Change Request.
The workflow depends on the organization. Some companies have three tiers. Some have five. Some require a CAB meeting for everything. Others have pre-approved standard changes that flow automatically. AI adapts to whatever your organization needs.
Here’s the model we use:
| Change type | Risk | What happens |
|---|---|---|
| Standard | Low — tags, descriptions, scaling within limits | Auto-approved. AI creates the CR, attaches evidence, closes it. No human in the loop. |
| Normal | Medium — config changes, new resources, security group rules | AI prepares the full CR package. Team lead approves in Backstage. No CAB meeting needed. |
| Emergency | Unplanned — incident fix, hotfix | Deploy first, document after. AI creates the post-hoc CR with all evidence. |
| CAB-required | High — networking, IAM, destroy/recreate, cross-client | AI prepares the CR + risk assessment + rollback plan. Goes to CAB queue. CAB reviews in Backstage, not in a meeting room. |
The forge.io/change-policy annotation in the catalog tells AI the default policy for each module. But AI can override it — if a “standard” module has a plan that destroys resources, AI escalates it to “cab-required” automatically.
```typescript
// plugins/change-management-backend/src/services/ChangeRequestService.ts
import { CatalogClient } from '@backstage/catalog-client';
import { PlanSummary } from '../types';

interface ChangeRequest {
  id: string;
  module: string;
  client: string;
  type: 'standard' | 'normal' | 'emergency' | 'cab-required';
  status: 'draft' | 'pending-approval' | 'approved' | 'rejected' | 'implemented' | 'closed';
  summary: string;
  riskAssessment: RiskAssessment;
  evidence: Evidence;
  rollbackPlan: string;
  createdBy: string;
  approvedBy?: string;
  implementedAt?: string;
}

interface RiskAssessment {
  level: string;
  factors: string[];
  blastRadius: string;
  affectedServices: string[];
}

interface Evidence {
  planSummary: PlanSummary;
  prUrl: string;
  prApprovers: string[];
  securityScan: string;
  testResults: string;
  pipelineRunUrl: string;
}

export class ChangeRequestService {
  constructor(
    private readonly aiBaseUrl: string,
    private readonly catalogClient: CatalogClient,
    private readonly db: any,
  ) {}

  async createFromPipeline(input: {
    entityRef: string;
    planSummary: PlanSummary;
    prUrl: string;
    prApprovers: string[];
    pipelineRunUrl: string;
    triggeredBy: string;
  }): Promise<ChangeRequest> {
    const entity = await this.catalogClient.getEntityByRef(input.entityRef);
    const defaultPolicy =
      entity?.metadata.annotations?.['forge.io/change-policy'] || 'normal';

    // AI decides the actual change type based on the plan
    const changeType = this.determineChangeType(defaultPolicy, input.planSummary);

    // AI generates the risk assessment
    const riskAssessment = await this.generateRiskAssessment(
      entity, input.planSummary,
    );

    // AI generates the rollback plan
    const rollbackPlan = await this.generateRollbackPlan(
      entity, input.planSummary,
    );

    const cr: ChangeRequest = {
      id: `CR-${Date.now()}`,
      module: entity?.metadata.name || 'unknown',
      client: entity?.metadata.tags
        ?.find(t => t.startsWith('client-'))
        ?.replace('client-', '') || 'unknown',
      type: changeType,
      status: changeType === 'standard' ? 'approved' : 'pending-approval',
      summary: input.planSummary.humanReadable,
      riskAssessment,
      evidence: {
        planSummary: input.planSummary,
        prUrl: input.prUrl,
        prApprovers: input.prApprovers,
        securityScan: 'passed', // from pipeline
        testResults: 'passed', // from pipeline
        pipelineRunUrl: input.pipelineRunUrl,
      },
      rollbackPlan,
      createdBy: input.triggeredBy,
      approvedBy: changeType === 'standard' ? 'auto-approved' : undefined,
    };

    await this.db.changeRequests.insert(cr);
    return cr;
  }

  private determineChangeType(
    defaultPolicy: string,
    plan: PlanSummary,
  ): ChangeRequest['type'] {
    // AI can escalate but never downgrade
    if (plan.riskLevel === 'critical' || plan.resourcesDestroyed > 0) {
      return 'cab-required';
    }
    if (plan.riskLevel === 'high') {
      return defaultPolicy === 'cab-required' ? 'cab-required' : 'normal';
    }
    if (plan.riskLevel === 'low' && defaultPolicy === 'standard') {
      return 'standard';
    }
    return defaultPolicy as ChangeRequest['type'];
  }

  private async generateRiskAssessment(
    entity: any,
    plan: PlanSummary,
  ): Promise<RiskAssessment> {
    const response = await fetch(`${this.aiBaseUrl}/api/pipeline/risk-assessment`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        module: entity?.metadata.name,
        client: entity?.metadata.tags?.find((t: string) => t.startsWith('client-')),
        plan,
        prompt: `Assess the risk of this infrastructure change.

Consider:
- Blast radius: how many services or users are affected if this goes wrong?
- Reversibility: can we undo this change quickly?
- Timing: is this a high-traffic period?
- Dependencies: do other modules depend on this one?

Return: risk level, risk factors (list), blast radius (sentence), affected services (list).

Be honest. If it's low risk, say so. Don't inflate risk to look thorough.`,
      }),
    });
    return response.json();
  }

  private async generateRollbackPlan(
    entity: any,
    plan: PlanSummary,
  ): Promise<string> {
    const response = await fetch(`${this.aiBaseUrl}/api/pipeline/rollback-plan`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        module: entity?.metadata.name,
        plan,
        prompt: `Write a rollback plan for this Terraform change.

Be specific. Include:
1. Exact steps to rollback (terraform commands, git commands)
2. Expected time to rollback
3. What to verify after rollback
4. Any data that cannot be recovered (if resources are destroyed)

Keep it short. An engineer at 2am should be able to follow this.`,
      }),
    });
    const data = await response.json();
    return data.rollbackPlan;
  }
}
```
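The escalate-but-never-downgrade rule is the safety-critical part of this service, so it is worth pinning with tests. Here is the same classification logic restated as a standalone function with its types inlined, which makes it trivial to unit-test without mocking the catalog or the database:

```typescript
type ChangeType = 'standard' | 'normal' | 'emergency' | 'cab-required';

interface PlanRisk {
  riskLevel: 'low' | 'medium' | 'high' | 'critical';
  resourcesDestroyed: number;
}

// Same rules as ChangeRequestService.determineChangeType:
// destructive or critical plans always escalate to CAB,
// and the module's default policy is never weakened.
export function determineChangeType(
  defaultPolicy: string,
  plan: PlanRisk,
): ChangeType {
  if (plan.riskLevel === 'critical' || plan.resourcesDestroyed > 0) {
    return 'cab-required';
  }
  if (plan.riskLevel === 'high') {
    return defaultPolicy === 'cab-required' ? 'cab-required' : 'normal';
  }
  if (plan.riskLevel === 'low' && defaultPolicy === 'standard') {
    return 'standard';
  }
  return defaultPolicy as ChangeType;
}
```

Keeping classification as a pure function is a design choice: the AI proposes a risk level, but the routing decision itself stays deterministic and auditable.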
Step 6: The CAB review UI in Backstage
The CAB doesn’t need to open ServiceNow. They don’t need a meeting room. They review changes in Backstage, where all the context lives:
```tsx
// plugins/change-management/src/components/ChangeRequestReview.tsx
import React, { useEffect, useState } from 'react';
import {
  Page, Header, Content, InfoCard,
  Table, TableColumn, StatusOK, StatusError,
  StatusPending, StatusWarning,
} from '@backstage/core-components';
import { Button, Chip, Typography } from '@material-ui/core';
import { useApi, discoveryApiRef, fetchApiRef, identityApiRef } from '@backstage/core-plugin-api';

interface ChangeRequest {
  id: string;
  module: string;
  client: string;
  type: string;
  status: string;
  summary: string;
  riskAssessment: {
    level: string;
    factors: string[];
    blastRadius: string;
  };
  evidence: {
    prUrl: string;
    prApprovers: string[];
    securityScan: string;
    testResults: string;
    pipelineRunUrl: string;
  };
  rollbackPlan: string;
  createdBy: string;
}

const RiskChip = ({ level }: { level: string }) => {
  const colors: Record<string, 'default' | 'primary' | 'secondary'> = {
    low: 'default',
    medium: 'primary',
    high: 'secondary',
    critical: 'secondary',
  };
  return <Chip label={level.toUpperCase()} color={colors[level] || 'default'} size="small" />;
};

export const ChangeRequestReview = () => {
  const discoveryApi = useApi(discoveryApiRef);
  const fetchApi = useApi(fetchApiRef);
  const identityApi = useApi(identityApiRef);
  const [requests, setRequests] = useState<ChangeRequest[]>([]);

  useEffect(() => {
    const load = async () => {
      const baseUrl = await discoveryApi.getBaseUrl('change-management');
      const res = await fetchApi.fetch(`${baseUrl}/requests?status=pending-approval`);
      const data = await res.json();
      setRequests(data.requests);
    };
    load();
  }, [discoveryApi, fetchApi]);

  const handleApprove = async (crId: string) => {
    const { userEntityRef } = await identityApi.getBackstageIdentity();
    const baseUrl = await discoveryApi.getBaseUrl('change-management');
    await fetchApi.fetch(`${baseUrl}/requests/${crId}/approve`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ approvedBy: userEntityRef }),
    });
    setRequests(prev => prev.filter(r => r.id !== crId));
  };

  const handleReject = async (crId: string, reason: string) => {
    const baseUrl = await discoveryApi.getBaseUrl('change-management');
    await fetchApi.fetch(`${baseUrl}/requests/${crId}/reject`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ reason }),
    });
    setRequests(prev => prev.filter(r => r.id !== crId));
  };

  return (
    <Page themeId="tool">
      <Header
        title="Change Advisory Board"
        subtitle={`${requests.length} changes pending review`}
      />
      <Content>
        {requests.map(cr => (
          <InfoCard
            key={cr.id}
            title={`${cr.id} — ${cr.module}`}
            subheader={`Client: ${cr.client} | Type: ${cr.type} | By: ${cr.createdBy}`}
          >
            <Typography variant="body1" paragraph>
              {cr.summary}
            </Typography>

            <Typography variant="h6">Risk Assessment</Typography>
            <RiskChip level={cr.riskAssessment.level} />
            <Typography variant="body2">
              Blast radius: {cr.riskAssessment.blastRadius}
            </Typography>
            <ul>
              {cr.riskAssessment.factors.map((f, i) => (
                <li key={i}>{f}</li>
              ))}
            </ul>

            <Typography variant="h6">Evidence</Typography>
            <ul>
              <li>PR: <a href={cr.evidence.prUrl}>View PR</a> (approved by {cr.evidence.prApprovers.join(', ')})</li>
              <li>Security scan: {cr.evidence.securityScan}</li>
              <li>Tests: {cr.evidence.testResults}</li>
              <li>Pipeline: <a href={cr.evidence.pipelineRunUrl}>View run</a></li>
            </ul>

            <Typography variant="h6">Rollback Plan</Typography>
            <pre style={{ background: '#f5f5f5', padding: '12px', borderRadius: '4px' }}>
              {cr.rollbackPlan}
            </pre>

            <div style={{ marginTop: '16px', display: 'flex', gap: '8px' }}>
              <Button
                variant="contained"
                color="primary"
                onClick={() => handleApprove(cr.id)}
              >
                Approve
              </Button>
              <Button
                variant="outlined"
                color="secondary"
                onClick={() => handleReject(cr.id, 'Needs more context')}
              >
                Reject
              </Button>
            </div>
          </InfoCard>
        ))}
      </Content>
    </Page>
  );
};
```
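On the other side of those `/requests/:id/approve` and `/requests/:id/reject` calls, the backend is essentially a state machine. A sketch of the transitions as pure functions, which keeps the route handlers thin; the `StoredCr` shape and the rule that only pending requests can be decided are assumptions, not prescribed by Backstage:

```typescript
interface StoredCr {
  id: string;
  status: 'draft' | 'pending-approval' | 'approved' | 'rejected';
  approvedBy?: string;
  rejectionReason?: string;
}

// Only a pending CR can be approved; anything else is a client error.
export function approve(cr: StoredCr, approvedBy: string): StoredCr {
  if (cr.status !== 'pending-approval') {
    throw new Error(`Cannot approve CR ${cr.id} in status ${cr.status}`);
  }
  return { ...cr, status: 'approved', approvedBy };
}

// Rejection records the reason so the engineer knows what to change.
export function reject(cr: StoredCr, reason: string): StoredCr {
  if (cr.status !== 'pending-approval') {
    throw new Error(`Cannot reject CR ${cr.id} in status ${cr.status}`);
  }
  return { ...cr, status: 'rejected', rejectionReason: reason };
}
```

Guarding the transition server-side matters because the UI above optimistically removes the card; two CAB members clicking Approve on the same CR should produce one approval and one clean error, not a double write.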
Step 7: The full pipeline flow
Here’s how it all connects. When an engineer pushes a change to a Terraform module:
1. Push to branch → PR created
2. Pipeline runs: terraform fmt → terraform validate → terraform plan
3. AI reads the plan → generates human-readable summary → posts as PR comment
4. PR approved by team → merge to main
5. Main pipeline runs: terraform plan (again, for the CR)
6. AI creates Change Request:
- Reads plan summary (from step 3)
- Reads module context from catalog
- Generates risk assessment
- Generates rollback plan
- Attaches all evidence (PR, approvers, scan, tests, pipeline URL)
- Determines change type (standard / normal / cab-required)
7. Route based on type:
- Standard → auto-approved → terraform apply
- Normal → team lead approves in Backstage → terraform apply
- CAB-required → CAB reviews in Backstage → approve/reject → terraform apply
8. Post-implementation:
- AI verifies the apply succeeded
- CR status updated to "implemented"
- If apply fails → AI diagnoses failure → CR updated with incident details
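The routing in step 7 boils down to a single dispatch on the change type. A sketch, where the four callbacks are placeholders for your actual pipeline triggers and queues (their names are illustrative):

```typescript
type ChangeType = 'standard' | 'normal' | 'emergency' | 'cab-required';

interface RoutingActions {
  deploy: () => string;           // trigger terraform apply
  queueForTeamLead: () => string; // normal: approval page in Backstage
  queueForCab: () => string;      // cab-required: CAB review queue
  documentPostHoc: () => string;  // emergency: change already deployed
}

// Route an approved-or-classified change to the right next step.
export function route(type: ChangeType, actions: RoutingActions): string {
  switch (type) {
    case 'standard':
      return actions.deploy(); // auto-approved, no human in the loop
    case 'normal':
      return actions.queueForTeamLead();
    case 'cab-required':
      return actions.queueForCab();
    case 'emergency':
      return actions.documentPostHoc();
  }
}
```

The exhaustive `switch` is deliberate: if you add a fifth change type, the TypeScript compiler flags every routing site that doesn't handle it.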
The enterprise reality spectrum
Not every organization is the same. Here’s how the same system adapts:
Startup / small team: Skip steps 5-7. The plan summary and failure diagnosis are enough. You deploy on merge.
Mid-size company: Use “standard” and “normal” change types. Team leads approve normal changes. No CAB meetings. AI documentation gives you audit trail for compliance without the overhead.
Regulated enterprise (bank, energy, healthcare): Full CAB workflow. But the CAB meets less often because AI prepares everything. A change that took 3 days to get through CAB now takes 3 hours — because the CR is complete, well-structured, and includes risk assessment and rollback plan. The CAB reviewer spends 2 minutes reading instead of 20 minutes asking questions.
MSP managing multiple clients: Each client can have different change policies. Client ACME wants CAB for everything. Client Globex trusts auto-approval for standard changes. The forge.io/change-policy annotation per module handles this — same Backstage, different rules.
The point is: AI doesn’t remove the process. AI removes the paperwork. The decisions stay with humans. But humans get better information, faster.
The unified dashboard
The pipeline dashboard from the platform team’s perspective now includes change request status:
```typescript
// Extended PipelineRun interface
interface PipelineRun {
  module: string;
  client: string;
  platform: 'azure-devops' | 'github' | 'gitlab';
  status: 'success' | 'failed' | 'running' | 'pending';
  branch: string;
  startedAt: string;
  duration: string;
  url: string;

  // New: change management fields
  changeRequest?: {
    id: string;
    type: string;
    status: string;
    riskLevel: string;
  };
  aiDiagnosis?: string; // populated when status === 'failed'
}
```
One table. All pipelines. All platforms. All clients. With change request status, risk level, and AI diagnosis for failures. No more switching between ServiceNow, Azure DevOps, GitHub, and GitLab.
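Building that one table means normalizing three different APIs into a single `PipelineRun[]`. A sketch of the aggregation over a trimmed-down `PipelineRun` (the per-platform fetchers are assumptions; each would wrap the corresponding plugin's client and map its response into this shape):

```typescript
// Trimmed-down run shape for the sketch; the full interface adds
// client, branch, duration, url, and the change management fields.
interface PipelineRun {
  module: string;
  platform: 'azure-devops' | 'github' | 'gitlab';
  status: 'success' | 'failed' | 'running' | 'pending';
  startedAt: string; // ISO timestamp
}

type Fetcher = () => Promise<PipelineRun[]>;

// Query all platforms in parallel. Promise.allSettled means one
// platform being down degrades the table instead of breaking it.
// Newest runs come first.
export async function fetchAllRuns(fetchers: Fetcher[]): Promise<PipelineRun[]> {
  const results = await Promise.allSettled(fetchers.map(f => f()));
  return results
    .flatMap(r => (r.status === 'fulfilled' ? r.value : []))
    .sort((a, b) => b.startedAt.localeCompare(a.startedAt));
}
```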
Checklist
- Every Terraform module has pipeline annotations in `catalog-info.yaml`
- `forge.io/change-policy` annotation set per module
- CI/CD plugins installed for all platforms used
- AI plan summary posts as PR comment on every `terraform plan`
- AI failure diagnosis triggers automatically on pipeline failures
- Change Request created automatically after merge to main
- Standard changes auto-approve and deploy
- Normal changes route to team lead for approval
- CAB-required changes appear in the CAB review page
- Rollback plan generated for every change request
- Evidence package complete: PR, approvers, scan, tests, pipeline URL
Challenge
Before the next article:
- Add `forge.io/change-policy` to one of your modules
- Set up the AI plan summary — even without the full CR workflow, having readable plan summaries in your PRs is a quick win
- Think about your organization’s change types — what would be “standard” (auto-approve), “normal” (team lead), and “cab-required”?
In the next article, we build Secrets and Post-Quantum Identities — manage secrets, rotate credentials, and use AI to detect secret sprawl and expired tokens across your infrastructure. Because your API keys in Key Vault are fine today, but they won’t be forever.
The full code is on GitHub.
If this series helps you, consider buying me a coffee.
This is article 4 of the Infrastructure Hub series. Previous: Multi-tenant Infrastructure. Next: Secrets and Post-Quantum Identities — protect your infrastructure credentials for the quantum era.