The Infrastructure Hub -- Part 4

Pipelines from Backstage — AI, Change Requests, and the Enterprise Reality

#platform-engineering #backstage #pipelines #ci-cd #azure-devops #github-actions #gitlab-ci #change-management #ai

The Problem

Your infrastructure team uses three CI/CD platforms. Client A is on Azure DevOps. Client B uses GitHub Actions. Client C has everything on GitLab — self-hosted, because their compliance team said so.

Three UIs. Three authentication flows. Three ways to check logs. That’s annoying, but it’s not the real problem.

The real problem is what happens after the pipeline runs terraform plan.

In a startup, you push to main and it deploys. In an enterprise, you push to main and… nothing happens. Because before anything reaches production, you need:

  1. A Change Request (CR) describing what changes, why, and what the rollback plan is
  2. A risk assessment — is this a standard change or does it need CAB approval?
  3. Evidence — the plan output, security scan results, test results, who approved the PR
  4. CAB approval — the Change Advisory Board reviews high-risk changes before they go live
  5. A post-implementation review — did the change work? Any incidents?

Today, engineers do this manually. They copy the terraform plan output into a ServiceNow ticket. They write a risk assessment from memory. They attach screenshots of pipeline runs as evidence. They wait for the CAB meeting next Thursday. Then they deploy on Friday afternoon because that’s the first available window.

This process exists for good reasons — production changes need control. But the way most enterprises do it wastes hours of engineering time on paperwork that AI could prepare in seconds.

The Solution

We combine three things:

  1. Backstage as the unified pipeline UI — one place to see all pipelines across Azure DevOps, GitHub Actions, and GitLab CI
  2. AI that does the boring work — generates change request documentation, summarizes Terraform plans in plain English, diagnoses pipeline failures, and prepares risk assessments
  3. A change management workflow that adapts to the enterprise — from fully automated (low-risk standard changes) to CAB-reviewed (high-risk production changes)

The key insight: AI doesn’t replace the CAB. AI prepares everything the CAB needs to make a fast decision. The engineer pushes code. AI generates the CR, the risk assessment, the evidence package, and the rollback plan. The CAB gets a complete, well-structured request instead of a rushed ticket written at 4pm on Wednesday.

And for standard changes — the ones that your organization has pre-approved (like updating a tag value or scaling a node pool) — AI classifies them automatically and they flow through without a meeting.

Execute

Step 1: Connect pipelines to the catalog

Each module’s catalog-info.yaml already exists from article 1. Add annotations that tell Backstage where the pipeline lives:

# Azure DevOps
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: tf-azurerm-storage-account
  title: Azure Storage Account Module
  annotations:
    dev.azure.com/project-repo: acme-org/acme-infra-modules
    dev.azure.com/build-definition: tf-storage-account-ci
    forge.io/change-policy: standard  # or "normal" or "cab-required"
  tags:
    - terraform-module
    - azure
    - client-acme
spec:
  type: terraform-module
  lifecycle: production
  owner: team-acme
  system: client-acme-infrastructure

---
# GitHub Actions
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: tf-scaleway-instance
  title: Scaleway Instance Module
  annotations:
    github.com/project-slug: victorZKov/forge
    github.com/workflows: tf-scaleway-instance-ci.yml
    forge.io/change-policy: standard
spec:
  type: terraform-module
  lifecycle: production
  owner: team-globex
  system: client-globex-infrastructure

---
# GitLab CI
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: tf-aws-vpc
  title: AWS VPC Module
  annotations:
    gitlab.com/project-slug: client-initech/infra-modules
    gitlab.com/pipeline-branch: main
    forge.io/change-policy: cab-required  # VPC changes always need CAB
spec:
  type: terraform-module
  lifecycle: production
  owner: team-initech
  system: client-initech-infrastructure

Notice forge.io/change-policy. This annotation controls what happens after the pipeline runs. We’ll use it in Step 5.
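Because a typo here ("cab_required", "standart") would otherwise fall back silently to a default later, it's worth validating the annotation value wherever you first read it. A minimal sketch — the helper name and the "normal" fallback are our assumptions, and the valid values mirror the change types used in Step 5:

```typescript
// Hypothetical helper for reading forge.io/change-policy off an entity.
type ChangePolicy = 'standard' | 'normal' | 'cab-required';

const POLICIES: readonly string[] = ['standard', 'normal', 'cab-required'];

export function readChangePolicy(
  annotations: Record<string, string> | undefined,
): ChangePolicy {
  const raw = annotations?.['forge.io/change-policy'];
  // Unannotated modules get the middle tier, not auto-approval
  if (raw === undefined) return 'normal';
  if (!POLICIES.includes(raw)) {
    throw new Error(`Invalid forge.io/change-policy: "${raw}"`);
  }
  return raw as ChangePolicy;
}
```

Failing loudly at catalog ingestion is cheaper than discovering at deploy time that a module was never routed through the CAB.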

Step 2: Install CI/CD plugins and wire the entity page

Install the community plugins for all three platforms:

# Azure DevOps
yarn --cwd packages/app add @backstage-community/plugin-azure-devops
yarn --cwd packages/backend add @backstage-community/plugin-azure-devops-backend

# GitHub Actions
yarn --cwd packages/app add @backstage-community/plugin-github-actions

# GitLab
yarn --cwd packages/app add @immobiliarelabs/backstage-plugin-gitlab
yarn --cwd packages/backend add @immobiliarelabs/backstage-plugin-gitlab-backend

Configure credentials in app-config.yaml:

integrations:
  azure:
    - host: dev.azure.com
      credentials:
        - organizations:
            - acme-org
          personalAccessToken: ${AZURE_DEVOPS_PAT}
  github:
    - host: github.com
      token: ${GITHUB_TOKEN}
  gitlab:
    - host: gitlab.com
      token: ${GITLAB_TOKEN}

Wire the plugins into the entity page with EntitySwitch — Backstage detects which annotations are present and renders the right plugin:

// packages/app/src/components/catalog/EntityPage.tsx
import { EmptyState } from '@backstage/core-components';
import { EntitySwitch } from '@backstage/plugin-catalog';
import {
  EntityAzurePipelinesContent,
  isAzureDevOpsAvailable,
} from '@backstage-community/plugin-azure-devops';
import {
  EntityGithubActionsContent,
  isGithubActionsAvailable,
} from '@backstage-community/plugin-github-actions';
import {
  EntityGitlabPipelinesTable,
  isGitlabAvailable,
} from '@immobiliarelabs/backstage-plugin-gitlab';

const cicdContent = (
  <EntitySwitch>
    <EntitySwitch.Case if={isAzureDevOpsAvailable}>
      <EntityAzurePipelinesContent defaultLimit={10} />
    </EntitySwitch.Case>
    <EntitySwitch.Case if={isGithubActionsAvailable}>
      <EntityGithubActionsContent />
    </EntitySwitch.Case>
    <EntitySwitch.Case if={isGitlabAvailable}>
      <EntityGitlabPipelinesTable />
    </EntitySwitch.Case>
    <EntitySwitch.Case>
      <EmptyState
        title="No CI/CD configured"
        description="Add pipeline annotations to this module's catalog-info.yaml"
        missing="info"
      />
    </EntitySwitch.Case>
  </EntitySwitch>
);

This is standard Backstage. Nothing new here. The real value starts now.

Step 3: AI-powered Terraform plan summaries

When a pipeline runs terraform plan, the output is technical. A 200-line plan with resource addresses, attribute changes, and force-replacement warnings. The engineer understands it. The CAB member reviewing the change request? Maybe not.

AI reads the plan and produces a summary that anyone can understand:

// plugins/pipeline-ai-backend/src/services/PlanSummaryService.ts
import { CatalogClient } from '@backstage/catalog-client';

interface PlanSummary {
  humanReadable: string;
  riskLevel: 'low' | 'medium' | 'high' | 'critical';
  resourcesCreated: number;
  resourcesModified: number;
  resourcesDestroyed: number;
  destructiveChanges: string[];
  rollbackStrategy: string;
}

export class PlanSummaryService {
  constructor(
    private readonly aiBaseUrl: string,
    private readonly catalogClient: CatalogClient,
  ) {}

  async summarizePlan(
    planJson: string,
    entityRef: string,
  ): Promise<PlanSummary> {
    // Get module context from catalog
    const entity = await this.catalogClient.getEntityByRef(entityRef);
    const moduleDescription = entity?.metadata.description || '';
    const client = entity?.metadata.tags
      ?.find(t => t.startsWith('client-'))
      ?.replace('client-', '') || 'unknown';
    const changePolicy = entity?.metadata.annotations?.['forge.io/change-policy'] || 'standard';

    const response = await fetch(`${this.aiBaseUrl}/api/pipeline/summarize-plan`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        plan: planJson,
        context: {
          module: entity?.metadata.name,
          description: moduleDescription,
          client,
          changePolicy,
        },
        prompt: `You are reviewing a Terraform plan for an infrastructure module.

Module: ${entity?.metadata.name}
Client: ${client}
Description: ${moduleDescription}
Change policy: ${changePolicy}

Analyze this Terraform plan and provide:

1. A plain-English summary of what will change (2-3 sentences, understandable by a non-technical CAB reviewer)
2. Risk level: low (tags, descriptions), medium (config changes, scaling), high (networking, security groups, IAM), critical (destroy/recreate stateful resources)
3. List any destructive changes (resources being destroyed or replaced)
4. A rollback strategy specific to these changes

Be direct. No filler. If this plan destroys a database, say it clearly.`,
      }),
    });

    return response.json();
  }
}

The AI service (same .NET service from the IDP series) processes the plan:

// AiService/Controllers/PipelineController.cs
using Microsoft.AspNetCore.Mvc;
using Microsoft.Extensions.AI;

namespace AiService.Controllers;

[ApiController]
[Route("api/pipeline")]
public class PipelineController : ControllerBase
{
    private readonly IChatClient _chat;
    private readonly ILogger<PipelineController> _logger;

    public PipelineController(IChatClient chat, ILogger<PipelineController> logger)
    {
        _chat = chat;
        _logger = logger;
    }

    [HttpPost("summarize-plan")]
    public async Task<IActionResult> SummarizePlan([FromBody] PlanSummaryRequest request)
    {
        var messages = new List<ChatMessage>
        {
            new(ChatRole.System, request.Prompt),
            new(ChatRole.User, $"Terraform plan output:\n\n{request.Plan}"),
        };

        var response = await _chat.GetResponseAsync(messages);

        // Parse structured output from AI response
        var summary = ParsePlanSummary(response.Text, request.Plan);

        _logger.LogInformation(
            "Plan summary for {Module}: risk={Risk}, destroy={Destroy}",
            request.Context.Module,
            summary.RiskLevel,
            summary.ResourcesDestroyed);

        return Ok(summary);
    }

    private PlanSummary ParsePlanSummary(string aiResponse, string rawPlan)
    {
        // Count resources from the raw plan
        var created = CountPattern(rawPlan, "will be created");
        var modified = CountPattern(rawPlan, "will be updated");
        var destroyed = CountPattern(rawPlan, "will be destroyed");
        var replaced = CountPattern(rawPlan, "must be replaced");

        // Heuristic risk level: resource counts from the raw plan,
        // cross-checked against the AI's written assessment
        var riskLevel = destroyed + replaced > 0 ? "critical"
            : aiResponse.Contains("high", StringComparison.OrdinalIgnoreCase) ? "high"
            : modified > 3 ? "medium"
            : "low";

        return new PlanSummary
        {
            HumanReadable = aiResponse,
            RiskLevel = riskLevel,
            ResourcesCreated = created,
            ResourcesModified = modified,
            ResourcesDestroyed = destroyed + replaced,
        };
    }

    private static int CountPattern(string text, string pattern) =>
        text.Split(pattern).Length - 1;
}

The pipeline posts this summary as a PR comment. Here’s what it looks like:

Terraform Plan Summary (AI-generated)

This plan modifies the AKS cluster node pool for Client ACME. It will scale the default node pool from 3 to 5 nodes and update the Kubernetes version from 1.29 to 1.30. The cluster will perform a rolling upgrade — no downtime expected, but pods will be rescheduled during the process.

| Metric | Value |
| --- | --- |
| Resources created | 0 |
| Resources modified | 2 |
| Resources destroyed | 0 |
| Risk level | Medium |

Rollback strategy: Scale node pool back to 3 nodes via terraform apply with previous variable values. Kubernetes downgrade from 1.30 to 1.29 is not supported — would need cluster recreation.

The CAB reviewer reads this in 30 seconds instead of parsing 200 lines of Terraform output.
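Assembling that comment body from the PlanSummary is the easy part; a sketch of the formatting step is below (the interface is trimmed to the fields the comment needs). Posting it is then one call to the platform's PR-comment API — on GitHub, `POST /repos/{owner}/{repo}/issues/{pr}/comments`.

```typescript
// Build the PR comment markdown from a PlanSummary (shape from Step 3,
// trimmed to what the comment renders).
interface PlanSummary {
  humanReadable: string;
  riskLevel: 'low' | 'medium' | 'high' | 'critical';
  resourcesCreated: number;
  resourcesModified: number;
  resourcesDestroyed: number;
  rollbackStrategy: string;
}

export function formatPlanComment(s: PlanSummary): string {
  return [
    '### Terraform Plan Summary (AI-generated)',
    '',
    s.humanReadable,
    '',
    '| Metric | Value |',
    '| --- | --- |',
    `| Resources created | ${s.resourcesCreated} |`,
    `| Resources modified | ${s.resourcesModified} |`,
    `| Resources destroyed | ${s.resourcesDestroyed} |`,
    `| Risk level | ${s.riskLevel} |`,
    '',
    `**Rollback strategy:** ${s.rollbackStrategy}`,
  ].join('\n');
}
```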

Step 4: AI pipeline failure diagnosis

When a pipeline fails, the engineer opens the logs, scrolls through 500 lines, and figures out what went wrong. Sometimes it’s obvious (syntax error). Sometimes it takes 20 minutes (provider version conflict, state lock, permission issue).

AI reads the logs, the module code, and the recent changes — and tells you what happened:

// plugins/pipeline-ai-backend/src/services/FailureDiagnosisService.ts
export class FailureDiagnosisService {
  constructor(
    private readonly aiBaseUrl: string,
    private readonly catalogClient: CatalogClient,
  ) {}

  async diagnoseFailure(input: {
    entityRef: string;
    pipelineLogs: string;
    recentCommits: string[];
    platform: string;
  }): Promise<FailureDiagnosis> {
    const entity = await this.catalogClient.getEntityByRef(input.entityRef);

    const response = await fetch(`${this.aiBaseUrl}/api/pipeline/diagnose`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        logs: input.pipelineLogs,
        module: entity?.metadata.name,
        description: entity?.metadata.description,
        recentCommits: input.recentCommits,
        platform: input.platform,
        prompt: `You are diagnosing a CI/CD pipeline failure for a Terraform module.

Module: ${entity?.metadata.name}
Platform: ${input.platform}
Recent commits: ${input.recentCommits.join('\n')}

Analyze the pipeline logs and provide:
1. Root cause (one sentence)
2. Evidence (the specific log lines that confirm the cause)
3. Fix (specific steps to resolve — not generic advice)
4. Prevention (what to add to the pipeline or CLAUDE.md to prevent this)

If the root cause is in the recent commits, say which commit caused it.`,
      }),
    });

    return response.json();
  }
}

The diagnosis shows up in the Backstage entity page, right next to the failed pipeline run:

// plugins/pipeline-dashboard/src/components/FailureDiagnosisCard.tsx
import React, { useEffect, useState } from 'react';
import { InfoCard, WarningPanel } from '@backstage/core-components';
import { useApi, discoveryApiRef, fetchApiRef } from '@backstage/core-plugin-api';
import { useEntity } from '@backstage/plugin-catalog-react';

interface Diagnosis {
  rootCause: string;
  evidence: string[];
  fix: string[];
  prevention: string;
}

export const FailureDiagnosisCard = ({ runId }: { runId: string }) => {
  const { entity } = useEntity();
  const discoveryApi = useApi(discoveryApiRef);
  const fetchApi = useApi(fetchApiRef);
  const [diagnosis, setDiagnosis] = useState<Diagnosis | null>(null);
  const [loading, setLoading] = useState(true);

  useEffect(() => {
    const load = async () => {
      const baseUrl = await discoveryApi.getBaseUrl('pipeline-ai');
      const res = await fetchApi.fetch(
        `${baseUrl}/diagnose/${entity.metadata.name}/${runId}`,
      );
      if (res.ok) {
        setDiagnosis(await res.json());
      }
      setLoading(false);
    };
    load();
  }, [discoveryApi, fetchApi, entity.metadata.name, runId]);

  if (loading) return <InfoCard title="AI Diagnosis">Analyzing failure...</InfoCard>;
  if (!diagnosis) return null;

  return (
    <WarningPanel
      title="AI Failure Diagnosis"
      message={diagnosis.rootCause}
      severity="error"
    >
      <div>
        <h4>Evidence</h4>
        <pre>{diagnosis.evidence.join('\n')}</pre>
        <h4>How to fix</h4>
        <ol>
          {diagnosis.fix.map((step, i) => <li key={i}>{step}</li>)}
        </ol>
        <h4>Prevention</h4>
        <p>{diagnosis.prevention}</p>
      </div>
    </WarningPanel>
  );
};

Real example of what the diagnosis produces:

Root cause: Azure provider 4.x removed the enable_rbac argument from azurerm_kubernetes_cluster. It’s now always enabled.

Evidence:

Error: Unsupported argument "enable_rbac"
  on main.tf line 47, in resource "azurerm_kubernetes_cluster":

Fix:

  1. Remove enable_rbac = true from main.tf line 47
  2. Run terraform plan to confirm no state changes
  3. Commit: “Remove deprecated enable_rbac argument (always true in provider 4.x)”

Prevention: Add a terraform validate step before terraform plan in the pipeline. The scaffolder templates from article 2 already include this — this module was created before the golden path existed.

Step 5: The enterprise change management workflow

Here’s where it gets real. In a startup, steps 3 and 4 are enough. But in an enterprise — a bank, an energy company, a healthcare provider — you don’t just deploy because the plan looks good. You need a Change Request.

The workflow depends on the organization. Some companies have three tiers. Some have five. Some require a CAB meeting for everything. Others have pre-approved standard changes that flow automatically. AI adapts to whatever your organization needs.

Here’s the model we use:

| Change type | Risk | What happens |
| --- | --- | --- |
| Standard | Low — tags, descriptions, scaling within limits | Auto-approved. AI creates the CR, attaches evidence, closes it. No human in the loop. |
| Normal | Medium — config changes, new resources, security group rules | AI prepares the full CR package. Team lead approves in Backstage. No CAB meeting needed. |
| Emergency | Unplanned — incident fix, hotfix | Deploy first, document after. AI creates the post-hoc CR with all evidence. |
| CAB-required | High — networking, IAM, destroy/recreate, cross-client | AI prepares the CR + risk assessment + rollback plan. Goes to CAB queue. CAB reviews in Backstage, not in a meeting room. |

The forge.io/change-policy annotation in the catalog tells AI the default policy for each module. But AI can override it — if a “standard” module has a plan that destroys resources, AI escalates it to “cab-required” automatically.

// plugins/change-management-backend/src/services/ChangeRequestService.ts
import { CatalogClient } from '@backstage/catalog-client';
import { PlanSummary } from '../types';

interface ChangeRequest {
  id: string;
  module: string;
  client: string;
  type: 'standard' | 'normal' | 'emergency' | 'cab-required';
  status: 'draft' | 'pending-approval' | 'approved' | 'rejected' | 'implemented' | 'closed';
  summary: string;
  riskAssessment: RiskAssessment;
  evidence: Evidence;
  rollbackPlan: string;
  createdBy: string;
  approvedBy?: string;
  implementedAt?: string;
}

interface RiskAssessment {
  level: string;
  factors: string[];
  blastRadius: string;
  affectedServices: string[];
}

interface Evidence {
  planSummary: PlanSummary;
  prUrl: string;
  prApprovers: string[];
  securityScan: string;
  testResults: string;
  pipelineRunUrl: string;
}

export class ChangeRequestService {
  constructor(
    private readonly aiBaseUrl: string,
    private readonly catalogClient: CatalogClient,
    private readonly db: any,
  ) {}

  async createFromPipeline(input: {
    entityRef: string;
    planSummary: PlanSummary;
    prUrl: string;
    prApprovers: string[];
    pipelineRunUrl: string;
    triggeredBy: string;
  }): Promise<ChangeRequest> {
    const entity = await this.catalogClient.getEntityByRef(input.entityRef);
    const defaultPolicy = entity?.metadata.annotations?.['forge.io/change-policy'] || 'normal';

    // AI decides the actual change type based on the plan
    const changeType = this.determineChangeType(defaultPolicy, input.planSummary);

    // AI generates the risk assessment
    const riskAssessment = await this.generateRiskAssessment(
      entity, input.planSummary,
    );

    // AI generates the rollback plan
    const rollbackPlan = await this.generateRollbackPlan(
      entity, input.planSummary,
    );

    const cr: ChangeRequest = {
      id: `CR-${Date.now()}`,
      module: entity?.metadata.name || 'unknown',
      client: entity?.metadata.tags
        ?.find(t => t.startsWith('client-'))
        ?.replace('client-', '') || 'unknown',
      type: changeType,
      status: changeType === 'standard' ? 'approved' : 'pending-approval',
      summary: input.planSummary.humanReadable,
      riskAssessment,
      evidence: {
        planSummary: input.planSummary,
        prUrl: input.prUrl,
        prApprovers: input.prApprovers,
        securityScan: 'passed', // from pipeline
        testResults: 'passed',  // from pipeline
        pipelineRunUrl: input.pipelineRunUrl,
      },
      rollbackPlan,
      createdBy: input.triggeredBy,
      approvedBy: changeType === 'standard' ? 'auto-approved' : undefined,
    };

    await this.db.changeRequests.insert(cr);

    return cr;
  }

  private determineChangeType(
    defaultPolicy: string,
    plan: PlanSummary,
  ): ChangeRequest['type'] {
    // AI can escalate but never downgrade
    if (plan.riskLevel === 'critical' || plan.resourcesDestroyed > 0) {
      return 'cab-required';
    }
    if (plan.riskLevel === 'high') {
      return defaultPolicy === 'cab-required' ? 'cab-required' : 'normal';
    }
    if (plan.riskLevel === 'low' && defaultPolicy === 'standard') {
      return 'standard';
    }
    return defaultPolicy as ChangeRequest['type'];
  }

  private async generateRiskAssessment(
    entity: any,
    plan: PlanSummary,
  ): Promise<RiskAssessment> {
    const response = await fetch(`${this.aiBaseUrl}/api/pipeline/risk-assessment`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        module: entity?.metadata.name,
        client: entity?.metadata.tags?.find((t: string) => t.startsWith('client-')),
        plan,
        prompt: `Assess the risk of this infrastructure change.

Consider:
- Blast radius: how many services or users are affected if this goes wrong?
- Reversibility: can we undo this change quickly?
- Timing: is this a high-traffic period?
- Dependencies: do other modules depend on this one?

Return: risk level, risk factors (list), blast radius (sentence), affected services (list).
Be honest. If it's low risk, say so. Don't inflate risk to look thorough.`,
      }),
    });

    return response.json();
  }

  private async generateRollbackPlan(
    entity: any,
    plan: PlanSummary,
  ): Promise<string> {
    const response = await fetch(`${this.aiBaseUrl}/api/pipeline/rollback-plan`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        module: entity?.metadata.name,
        plan,
        prompt: `Write a rollback plan for this Terraform change.

Be specific. Include:
1. Exact steps to rollback (terraform commands, git commands)
2. Expected time to rollback
3. What to verify after rollback
4. Any data that cannot be recovered (if resources are destroyed)

Keep it short. An engineer at 2am should be able to follow this.`,
      }),
    });

    const data = await response.json();
    return data.rollbackPlan;
  }
}
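One detail the service above leaves implicit: the approve/reject endpoints (used by the UI in Step 6) should enforce legal status transitions, so a rejected CR can never slide into "implemented". A sketch under our own transition rules — adjust the map to your organization's workflow:

```typescript
// Hypothetical guard shared by the approve/reject/implement endpoints.
// Statuses match the ChangeRequest interface above.
type CrStatus =
  | 'draft' | 'pending-approval' | 'approved'
  | 'rejected' | 'implemented' | 'closed';

const LEGAL: Record<CrStatus, CrStatus[]> = {
  'draft': ['pending-approval'],
  'pending-approval': ['approved', 'rejected'],
  'approved': ['implemented'],
  'rejected': [],           // terminal: re-submit as a new CR
  'implemented': ['closed'],
  'closed': [],
};

export function transitionCr(current: CrStatus, next: CrStatus): CrStatus {
  if (!LEGAL[current].includes(next)) {
    throw new Error(`Illegal CR transition: ${current} -> ${next}`);
  }
  return next;
}
```

Auditors tend to ask exactly this question — "could anyone have deployed a rejected change?" — and a guard like this lets you answer with code instead of policy documents.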

Step 6: The CAB review UI in Backstage

The CAB doesn’t need to open ServiceNow. They don’t need a meeting room. They review changes in Backstage, where all the context lives:

// plugins/change-management/src/components/ChangeRequestReview.tsx
import React, { useEffect, useState } from 'react';
import {
  Page, Header, Content, InfoCard,
  Table, TableColumn, StatusOK, StatusError,
  StatusPending, StatusWarning,
} from '@backstage/core-components';
import { Button, Chip, Typography } from '@material-ui/core';
import { useApi, discoveryApiRef, fetchApiRef, identityApiRef } from '@backstage/core-plugin-api';

interface ChangeRequest {
  id: string;
  module: string;
  client: string;
  type: string;
  status: string;
  summary: string;
  riskAssessment: {
    level: string;
    factors: string[];
    blastRadius: string;
  };
  evidence: {
    prUrl: string;
    prApprovers: string[];
    securityScan: string;
    testResults: string;
    pipelineRunUrl: string;
  };
  rollbackPlan: string;
  createdBy: string;
}

const RiskChip = ({ level }: { level: string }) => {
  const colors: Record<string, 'default' | 'primary' | 'secondary'> = {
    low: 'default',
    medium: 'primary',
    high: 'secondary',
    critical: 'secondary',
  };
  return <Chip label={level.toUpperCase()} color={colors[level] || 'default'} size="small" />;
};

export const ChangeRequestReview = () => {
  const discoveryApi = useApi(discoveryApiRef);
  const fetchApi = useApi(fetchApiRef);
  const identityApi = useApi(identityApiRef);
  const [requests, setRequests] = useState<ChangeRequest[]>([]);

  useEffect(() => {
    const load = async () => {
      const baseUrl = await discoveryApi.getBaseUrl('change-management');
      const res = await fetchApi.fetch(`${baseUrl}/requests?status=pending-approval`);
      const data = await res.json();
      setRequests(data.requests);
    };
    load();
  }, [discoveryApi, fetchApi]);

  const handleApprove = async (crId: string) => {
    const { userEntityRef } = await identityApi.getBackstageIdentity();
    const baseUrl = await discoveryApi.getBaseUrl('change-management');
    await fetchApi.fetch(`${baseUrl}/requests/${crId}/approve`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ approvedBy: userEntityRef }),
    });
    setRequests(prev => prev.filter(r => r.id !== crId));
  };

  const handleReject = async (crId: string, reason: string) => {
    const baseUrl = await discoveryApi.getBaseUrl('change-management');
    await fetchApi.fetch(`${baseUrl}/requests/${crId}/reject`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ reason }),
    });
    setRequests(prev => prev.filter(r => r.id !== crId));
  };

  return (
    <Page themeId="tool">
      <Header
        title="Change Advisory Board"
        subtitle={`${requests.length} changes pending review`}
      />
      <Content>
        {requests.map(cr => (
          <InfoCard
            key={cr.id}
            title={`${cr.id} — ${cr.module}`}
            subheader={`Client: ${cr.client} | Type: ${cr.type} | By: ${cr.createdBy}`}
          >
            <Typography variant="body1" paragraph>
              {cr.summary}
            </Typography>

            <Typography variant="h6">Risk Assessment</Typography>
            <RiskChip level={cr.riskAssessment.level} />
            <Typography variant="body2">
              Blast radius: {cr.riskAssessment.blastRadius}
            </Typography>
            <ul>
              {cr.riskAssessment.factors.map((f, i) => (
                <li key={i}>{f}</li>
              ))}
            </ul>

            <Typography variant="h6">Evidence</Typography>
            <ul>
              <li>PR: <a href={cr.evidence.prUrl}>View PR</a> (approved by {cr.evidence.prApprovers.join(', ')})</li>
              <li>Security scan: {cr.evidence.securityScan}</li>
              <li>Tests: {cr.evidence.testResults}</li>
              <li>Pipeline: <a href={cr.evidence.pipelineRunUrl}>View run</a></li>
            </ul>

            <Typography variant="h6">Rollback Plan</Typography>
            <pre style={{ background: '#f5f5f5', padding: '12px', borderRadius: '4px' }}>
              {cr.rollbackPlan}
            </pre>

            <div style={{ marginTop: '16px', display: 'flex', gap: '8px' }}>
              <Button
                variant="contained"
                color="primary"
                onClick={() => handleApprove(cr.id)}
              >
                Approve
              </Button>
              <Button
                variant="outlined"
                color="secondary"
                onClick={() => handleReject(cr.id, 'Needs more context')}
              >
                Reject
              </Button>
            </div>
          </InfoCard>
        ))}
      </Content>
    </Page>
  );
};

Step 7: The full pipeline flow

Here’s how it all connects. When an engineer pushes a change to a Terraform module:

1. Push to branch → PR created
2. Pipeline runs: terraform fmt → terraform validate → terraform plan
3. AI reads the plan → generates human-readable summary → posts as PR comment
4. PR approved by team → merge to main
5. Main pipeline runs: terraform plan (again, for the CR)
6. AI creates Change Request:
   - Reads plan summary (from step 3)
   - Reads module context from catalog
   - Generates risk assessment
   - Generates rollback plan
   - Attaches all evidence (PR, approvers, scan, tests, pipeline URL)
   - Determines change type (standard / normal / cab-required)
7. Route based on type:
   - Standard → auto-approved → terraform apply
   - Normal → team lead approves in Backstage → terraform apply
   - CAB-required → CAB reviews in Backstage → approve/reject → terraform apply
8. Post-implementation:
   - AI verifies the apply succeeded
   - CR status updated to "implemented"
   - If apply fails → AI diagnoses failure → CR updated with incident details
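The routing in step 7 collapses to a small decision function. A sketch — the action labels are illustrative, standing in for whatever triggers terraform apply or queues the approval in your setup:

```typescript
// Map the change type (determined in step 6) to what happens next (step 7).
type ChangeType = 'standard' | 'normal' | 'emergency' | 'cab-required';
type Route = 'apply-now' | 'await-team-lead' | 'await-cab' | 'apply-then-document';

export function routeChange(type: ChangeType): Route {
  switch (type) {
    case 'standard':
      return 'apply-now';            // pre-approved, no human in the loop
    case 'normal':
      return 'await-team-lead';      // approved in Backstage, no CAB meeting
    case 'cab-required':
      return 'await-cab';            // lands in the CAB review queue
    case 'emergency':
      return 'apply-then-document';  // deploy first, post-hoc CR after
  }
}
```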

The enterprise reality spectrum

Not every organization is the same. Here’s how the same system adapts:

Startup / small team: Skip steps 5-7. The plan summary and failure diagnosis are enough. You deploy on merge.

Mid-size company: Use “standard” and “normal” change types. Team leads approve normal changes. No CAB meetings. AI documentation gives you an audit trail for compliance without the overhead.

Regulated enterprise (bank, energy, healthcare): Full CAB workflow. But the CAB meets less often because AI prepares everything. A change that took 3 days to get through CAB now takes 3 hours — because the CR is complete, well-structured, and includes risk assessment and rollback plan. The CAB reviewer spends 2 minutes reading instead of 20 minutes asking questions.

MSP managing multiple clients: Each client can have different change policies. Client ACME wants CAB for everything. Client Globex trusts auto-approval for standard changes. The forge.io/change-policy annotation per module handles this — same Backstage, different rules.

The point is: AI doesn’t remove the process. AI removes the paperwork. The decisions stay with humans. But humans get better information, faster.

The unified dashboard

The pipeline dashboard from the platform team’s perspective now includes change request status:

// Extended PipelineRun interface
interface PipelineRun {
  module: string;
  client: string;
  platform: 'azure-devops' | 'github' | 'gitlab';
  status: 'success' | 'failed' | 'running' | 'pending';
  branch: string;
  startedAt: string;
  duration: string;
  url: string;
  // New: change management fields
  changeRequest?: {
    id: string;
    type: string;
    status: string;
    riskLevel: string;
  };
  aiDiagnosis?: string;  // populated when status === 'failed'
}

One table. All pipelines. All platforms. All clients. With change request status, risk level, and AI diagnosis for failures. No more switching between ServiceNow, Azure DevOps, GitHub, and GitLab.
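To keep that single table useful at scale, failed runs and high-risk pending changes should float to the top. A sketch of an ordering the dashboard could apply — the scoring weights are our own heuristic, not part of any plugin:

```typescript
// Trimmed PipelineRun: just the fields the ordering heuristic reads.
interface RunForSort {
  status: 'success' | 'failed' | 'running' | 'pending';
  changeRequest?: { status: string; riskLevel: string };
}

function attentionScore(run: RunForSort): number {
  let score = 0;
  if (run.status === 'failed') score += 100;                         // needs a human now
  if (run.changeRequest?.status === 'pending-approval') score += 50; // blocking a deploy
  if (['high', 'critical'].includes(run.changeRequest?.riskLevel ?? '')) score += 25;
  return score;
}

export function sortByAttention<T extends RunForSort>(runs: T[]): T[] {
  return [...runs].sort((a, b) => attentionScore(b) - attentionScore(a));
}
```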

Checklist

  • Every Terraform module has pipeline annotations in catalog-info.yaml
  • forge.io/change-policy annotation set per module
  • CI/CD plugins installed for all platforms used
  • AI plan summary posts as PR comment on every terraform plan
  • AI failure diagnosis triggers automatically on pipeline failures
  • Change Request created automatically after merge to main
  • Standard changes auto-approve and deploy
  • Normal changes route to team lead for approval
  • CAB-required changes appear in the CAB review page
  • Rollback plan generated for every change request
  • Evidence package complete: PR, approvers, scan, tests, pipeline URL

Challenge

Before the next article:

  1. Add forge.io/change-policy to one of your modules
  2. Set up the AI plan summary — even without the full CR workflow, having readable plan summaries in your PRs is a quick win
  3. Think about your organization’s change types — what would be “standard” (auto-approve), “normal” (team lead), and “cab-required”?

In the next article, we build Secrets and Post-Quantum Identities — manage secrets, rotate credentials, and use AI to detect secret sprawl and expired tokens across your infrastructure. Because your API keys in Key Vault are fine today, but they won’t be forever.

The full code is on GitHub.

If this series helps you, consider buying me a coffee.

This is article 4 of the Infrastructure Hub series. Previous: Multi-tenant Infrastructure. Next: Secrets and Post-Quantum Identities — protect your infrastructure credentials for the quantum era.
