The Infrastructure Hub -- Part 9

The Infrastructure Hub Reference Architecture

#platform-engineering #backstage #architecture #terraform #kubernetes #quantumapi #reference

The Problem

You’ve built eight things. A Terraform catalog, golden path templates, multi-tenant client management, pipeline orchestration with CAB workflows, quantum-safe secrets, signed approvals, an infrastructure chat, and drift detection.

Each article showed one piece. Now you need the complete picture: how these pieces fit together, what the database looks like with everything running, how the AI service has grown, and how to deploy the full platform.

The IDP series reference architecture covered the application-focused platform: catalog enrichment, AI scaffolding, code review, TechDocs RAG, governance, and incident response. This article extends that architecture with everything from the Infrastructure Hub series.

The Solution

The Infrastructure Hub adds three layers to the existing IDP platform:

  1. Infrastructure catalog — Terraform modules as first-class entities, with golden path templates and multi-tenant scoping
  2. Infrastructure operations — Pipeline orchestration, CAB workflows with signed approvals, and drift detection
  3. Infrastructure intelligence — AI-powered secret scanning, plan summaries, failure diagnosis, and conversational access to all infrastructure data

All of this runs on the same Backstage instance, the same .NET AI service, and the same PostgreSQL database as the IDP. No new services to deploy — just new plugins, new endpoints, and new tables.

Execute

The Complete Architecture

graph TB
    subgraph "Developer / Platform Team"
        Browser[Browser]
    end

    subgraph "Backstage"
        FE[Frontend - React]
        BE[Backend - Node.js]

        subgraph "IDP Plugins (series 1)"
            P1[Catalog Enricher]
            P2[AI Scaffolder]
            P3[Code Review]
            P4[TechDocs RAG]
            P5[Governance]
            P6[Incident Response]
        end

        subgraph "Infra Hub Plugins (series 2)"
            I1[TF Module Templates]
            I2[Secret Scanner]
            I3[Pipeline Dashboard]
            I4[CAB Workflow]
            I5[Infra Chat]
            I6[Drift Dashboard]
        end
    end

    subgraph "AI Service (.NET)"
        API[".NET Minimal API :5100"]
        E_IDP["/api/enrich, /api/scaffold,\n/api/review, /api/ask,\n/api/incident/analyze"]
        E_INFRA["/api/scan-secrets,\n/api/scaffold-terraform,\n/api/pipeline/*,\n/api/cab/*,\n/api/infra/chat,\n/api/drift/*,\n/api/ssh/issue"]
    end

    subgraph "Data"
        PG[(PostgreSQL + pgvector)]
        QV[QuantumVault - Secrets]
        QS[QuantumAPI - Signing]
        QC[QuantumAPI - SSH Certs]
        AI[AI Provider]
    end

    subgraph "External"
        GH[GitHub API]
        ADO[Azure DevOps]
        GHA[GitHub Actions]
        GL[GitLab CI]
        AZ[Azure / Scaleway / AWS]
    end

    Browser --> FE --> BE
    BE --> P1 & P2 & P3 & P4 & P5 & P6
    BE --> I1 & I2 & I3 & I4 & I5 & I6
    P1 & P2 & P3 & P4 & P5 & P6 --> API
    I2 & I3 & I4 & I5 & I6 --> API
    API --> E_IDP & E_INFRA
    API --> PG
    API --> AI
    API --> QV & QS & QC
    I3 --> ADO & GHA & GL
    I6 --> AZ

New Endpoints in the AI Service

The Infrastructure Hub adds these endpoints to the existing AI service:

| Endpoint | Article | Purpose |
| --- | --- | --- |
| POST /api/scaffold-terraform | 2 | Generate Terraform module from description |
| POST /api/pipeline/summarize-plan | 4 | Human-readable Terraform plan summary |
| POST /api/pipeline/diagnose | 4 | AI diagnosis of pipeline failures |
| POST /api/pipeline/risk-assessment | 4 | Risk assessment for change requests |
| POST /api/pipeline/rollback-plan | 4 | Generate rollback plan for a change |
| POST /api/scan-secrets | 5 | Scan Terraform files for secret issues |
| POST /api/ssh/issue | 5 | Issue ML-DSA SSH certificate |
| POST /api/cab/approve | 6 | Sign CAB approval with ML-DSA |
| GET /api/cab/verify/{id} | 6 | Verify approval signature |
| POST /api/cab/seal-evidence | 6 | Seal evidence package with signature |
| GET /api/cab/report | 6 | Generate compliance report |
| POST /api/infra/chat | 7 | Multi-turn infrastructure conversation |
| POST /api/drift/analyze | 8 | Analyze Terraform plan for drift |
| GET /api/drift/results | 8 | Fetch all drift scan results |

Combined with the IDP series endpoints, the AI service now has 22 endpoints in one Program.cs. The pattern is the same for every endpoint: read config, create client, build prompt with context, call model, return structured JSON.
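The five steps can be sketched roughly as follows. The real service is a .NET Minimal API, so this TypeScript version only illustrates the shape of an endpoint. `ServiceConfig`, `ModelClient`, `PlanSummary`, `summarizePlan`, and the JSON fields are hypothetical names, not the ones in the repository; only the `AI_*` variable names come from this article.

```typescript
// Hypothetical types: none of these names come from the real Program.cs.
interface ServiceConfig {
  endpoint: string;
  apiKey: string;
  chatModel: string;
}

// Minimal stand-in for a chat-completion client.
interface ModelClient {
  complete(systemPrompt: string, userPrompt: string): Promise<string>;
}

interface PlanSummary {
  summary: string;
  risk: string;
}

async function summarizePlan(
  env: Record<string, string | undefined>,
  createClient: (config: ServiceConfig) => ModelClient,
  planText: string,
): Promise<PlanSummary> {
  // 1. Read config
  const config: ServiceConfig = {
    endpoint: env["AI_ENDPOINT"] ?? "",
    apiKey: env["AI_KEY"] ?? "",
    chatModel: env["AI_CHAT_MODEL"] ?? "",
  };
  if (config.endpoint === "" || config.apiKey === "") {
    throw new Error("AI_ENDPOINT and AI_KEY must be set");
  }

  // 2. Create client
  const client = createClient(config);

  // 3. Build prompt with context
  const systemPrompt =
    'You summarize Terraform plans. Reply with JSON: {"summary": string, "risk": string}.';
  const userPrompt = `Terraform plan:\n${planText}`;

  // 4. Call model
  const raw = await client.complete(systemPrompt, userPrompt);

  // 5. Return structured JSON (JSON.parse throws if the model replied with non-JSON)
  const parsed = JSON.parse(raw) as Partial<PlanSummary>;
  return {
    summary: String(parsed.summary ?? ""),
    risk: String(parsed.risk ?? "unknown"),
  };
}
```

Passing the environment and the client factory in as parameters keeps the sketch testable with a fake client; the actual endpoints may wire these differently.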

New Database Tables

Two tables added in this series:

-- Article 6: Signed CAB approvals
CREATE TABLE cab_approvals (
    id SERIAL PRIMARY KEY,
    change_request_id VARCHAR(100) NOT NULL UNIQUE,
    approved_by VARCHAR(255) NOT NULL,
    approved_at TIMESTAMPTZ NOT NULL,
    module VARCHAR(255) NOT NULL,
    client VARCHAR(100),
    risk_level VARCHAR(20) NOT NULL,
    plan_hash VARCHAR(64) NOT NULL,
    payload_json TEXT NOT NULL,
    signature TEXT NOT NULL,
    key_id VARCHAR(100) NOT NULL
);

CREATE INDEX idx_approvals_module ON cab_approvals(module);
CREATE INDEX idx_approvals_client ON cab_approvals(client);
CREATE INDEX idx_approvals_date ON cab_approvals(approved_at);

-- Article 8: Drift detection results
CREATE TABLE drift_results (
    id SERIAL PRIMARY KEY,
    module VARCHAR(255) NOT NULL UNIQUE,
    client VARCHAR(100),
    drift_detected BOOLEAN NOT NULL,
    resource_count INTEGER NOT NULL DEFAULT 0,
    risk VARCHAR(20) NOT NULL DEFAULT 'none',
    summary TEXT,
    analysis_json TEXT,
    detected_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_drift_module ON drift_results(module);
CREATE INDEX idx_drift_client ON drift_results(client);
CREATE INDEX idx_drift_risk ON drift_results(risk);

Combined with the IDP series tables:

| Table | Series | Purpose |
| --- | --- | --- |
| doc_chunks | IDP art. 5 | Vector embeddings for RAG |
| ai_usage_log | IDP art. 6 | Governance: usage tracking |
| ai_policies | IDP art. 6 | Governance: per-team policies |
| cab_approvals | Infra art. 6 | Signed CAB approvals |
| drift_results | Infra art. 8 | Latest drift scan per module |

Five tables. One PostgreSQL instance (with pgvector extension). Backstage uses its own tables for the catalog, and the AI service uses these five for intelligence and operations.
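Because `drift_results.module` is UNIQUE and the table holds only the latest scan per module, a writer would typically upsert rather than insert. Here is a minimal sketch of a parameterized query builder, assuming a driver that accepts a `{ text, values }` query object (node-postgres does). `DriftResult`, `SqlQuery`, and `buildDriftUpsert` are hypothetical names, and the real service may write this table differently.

```typescript
// Hypothetical shape mirroring the drift_results columns.
interface DriftResult {
  module: string;
  client: string | null;
  driftDetected: boolean;
  resourceCount: number;
  risk: string;
  summary: string | null;
  analysis: unknown; // serialized into analysis_json
}

interface SqlQuery {
  text: string;
  values: unknown[];
}

// Builds an upsert keyed on the UNIQUE module column, so each scan
// overwrites the previous row for the same module.
function buildDriftUpsert(result: DriftResult): SqlQuery {
  const text = `
    INSERT INTO drift_results
      (module, client, drift_detected, resource_count, risk, summary, analysis_json, detected_at)
    VALUES ($1, $2, $3, $4, $5, $6, $7, NOW())
    ON CONFLICT (module) DO UPDATE SET
      client = EXCLUDED.client,
      drift_detected = EXCLUDED.drift_detected,
      resource_count = EXCLUDED.resource_count,
      risk = EXCLUDED.risk,
      summary = EXCLUDED.summary,
      analysis_json = EXCLUDED.analysis_json,
      detected_at = EXCLUDED.detected_at`;
  return {
    text,
    values: [
      result.module,
      result.client,
      result.driftDetected,
      result.resourceCount,
      result.risk,
      result.summary,
      JSON.stringify(result.analysis),
    ],
  };
}
```

With node-postgres the result could be passed straight to `pool.query(buildDriftUpsert(result))`. All values travel as parameters, so nothing from the plan analysis is spliced into the SQL text.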

New Backstage Plugin Registration

Add the infrastructure plugins to packages/backend/src/index.ts:

// --- Infrastructure Hub plugins (series 2) ---

// Modules (extend existing plugins)
import { secretScannerModule } from '@internal/plugin-secret-scanner';
backend.add(secretScannerModule);                       // Article 5

// Standalone plugins (own routes)
import { aiIncidentPlugin } from '@internal/plugin-ai-incident';
backend.add(aiIncidentPlugin);                          // IDP Article 7

// Frontend-only plugins (registered in App.tsx, not here):
// - Infra Chat (/infra-chat)                           // Article 7
// - Drift Dashboard (/drift)                           // Article 8
// - CAB Review (/cab)                                  // Article 4+6
// - Governance Dashboard (/ai-governance)              // IDP Article 6

The infrastructure plugins follow the same distinction as the IDP:

  • Modules (createBackendModule): secret scanner extends the catalog
  • Standalone plugins (createBackendPlugin): pipeline dashboard, CAB workflow have their own routes
  • Frontend-only: infra chat, drift dashboard, governance dashboard read from the AI service through the proxy
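For the frontend-only case, a call through the Backstage proxy might look like the sketch below. The proxy route name `ai-service` is an assumption (it depends on the `proxy` section of your app-config.yaml), and `FetchLike`, `driftResultsUrl`, and `fetchDriftResults` are hypothetical helpers. The response is left as `unknown` because its JSON shape is not specified here.

```typescript
// Minimal fetch signature so the sketch can be exercised without a browser.
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}>;

// Assumption: the AI service is proxied under this route name.
const AI_PROXY_PATH = "/api/proxy/ai-service";

// Builds the proxied URL for GET /api/drift/results.
function driftResultsUrl(backendBaseUrl: string): string {
  const base = backendBaseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return `${base}${AI_PROXY_PATH}/api/drift/results`;
}

// Fetches drift results through the proxy and fails loudly on non-2xx responses.
async function fetchDriftResults(
  backendBaseUrl: string,
  fetchFn: FetchLike,
): Promise<unknown> {
  const response = await fetchFn(driftResultsUrl(backendBaseUrl));
  if (!response.ok) {
    throw new Error(`Drift results request failed with status ${response.status}`);
  }
  return response.json();
}
```

In a real plugin you would more likely resolve the base URL with Backstage's discovery API and send the request with its fetch API, so that the user's credentials are attached. The hard-coded path here is only for illustration.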

QuantumAPI Integration Map

QuantumAPI appears in three roles across the Infrastructure Hub:

QuantumVault (Secrets)
├── Pipeline credentials (art. 5) — ARM_CLIENT_SECRET, DB_PASSWORD, etc.
├── Terraform state encryption keys (art. 5) — ML-KEM wrapped AES keys
├── Cosign signing keys (quantum-05) — for image signing
└── Bootstrap: only QUANTUMAPI_KEY in CI/CD platforms

QuantumAPI Signing (ML-DSA)
├── CAB approval signatures (art. 6) — every approval cryptographically signed
├── Evidence package sealing (art. 6) — tamper-proof audit trail
└── Verification endpoint — auditors can verify without internal access

QuantumAPI SSH (ML-DSA Certificates)
├── Short-lived certificates (art. 5) — 8h validity, auto-expire
├── CA trust model — hosts trust QuantumAPI CA, not individual keys
└── Backstage widget — engineers request access from the catalog

QuantumAPI Local Installation

For sovereign cloud, air-gapped environments, or organizations that can’t send data to external APIs, QuantumAPI offers a local installation option.

The local install runs the same services — Vault, Signing, SSH CA, Encryption — inside your own infrastructure. The API is identical. Your code doesn’t change. You point the endpoints at https://quantumapi.internal instead of https://api.quantumapi.eu.

Configuration change in the AI service:

# Cloud (default)
QUANTUMAPI__APIKEY=qid_your_key
# Endpoints default to api.quantumapi.eu

# Local installation
QUANTUMAPI__APIKEY=qid_your_local_key
QUANTUMAPI__ENDPOINT=https://quantumapi.internal

For the qapi CLI in pipelines:

# Cloud
export QAPI_API_KEY=qid_your_key

# Local
export QAPI_API_KEY=qid_your_local_key
export QAPI_ENDPOINT=https://quantumapi.internal

Everything in this series — state encryption, signed approvals, SSH certificates, secret scanning — works the same way with a local install. The cryptographic guarantees (ML-KEM, ML-DSA, QRNG) are the same because the algorithms run locally.
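The switch between cloud and local amounts to one optional override. Here is a minimal sketch of that resolution logic, assuming the two variable names shown in the AI service configuration. `QuantumApiSettings` and `resolveQuantumApiSettings` are hypothetical names, and the real service does this in .NET configuration rather than TypeScript.

```typescript
// Default cloud endpoint, as named in this article.
const DEFAULT_QUANTUMAPI_ENDPOINT = "https://api.quantumapi.eu";

interface QuantumApiSettings {
  apiKey: string;
  endpoint: string;
  isLocal: boolean; // true when an endpoint override is present
}

// Resolves settings from an environment-like map: the key is mandatory,
// the endpoint falls back to the cloud default when no override is set.
function resolveQuantumApiSettings(
  env: Record<string, string | undefined>,
): QuantumApiSettings {
  const apiKey = (env["QUANTUMAPI__APIKEY"] ?? "").trim();
  if (apiKey === "") {
    throw new Error("QUANTUMAPI__APIKEY is not set");
  }
  const override = (env["QUANTUMAPI__ENDPOINT"] ?? "").trim();
  const endpoint = (override !== "" ? override : DEFAULT_QUANTUMAPI_ENDPOINT)
    .replace(/\/+$/, ""); // normalize trailing slashes
  return { apiKey, endpoint, isLocal: override !== "" };
}
```

Calling code only ever sees `endpoint`, which is why nothing else needs to change when moving to a local install.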

Use cases for local installation:

  • Government / defense — data cannot leave the network
  • Financial services — regulatory requirement to keep all key material on-premise
  • EU sovereign cloud — data residency requirements beyond what cloud-hosted QuantumAPI offers
  • Air-gapped environments — no internet access from the infrastructure network

The Complete Environment Variables

# === AI Provider ===
AI_PROVIDER=openai            # or "azure"
AI_ENDPOINT=https://api.scaleway.ai/v1
AI_KEY=your-key
AI_CHAT_MODEL=mistral-small-3.2-24b-instruct-2506
AI_EMBEDDING_MODEL=bge-multilingual-gemma2

# === PostgreSQL (shared) ===
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
POSTGRES_USER=forge
POSTGRES_PASSWORD=your-password

# === QuantumAPI ===
QUANTUMAPI_KEY=qid_your_key
# QUANTUMAPI_ENDPOINT=https://quantumapi.internal  # Only for local install

# === GitHub ===
GITHUB_TOKEN=ghp_your-token

# === OIDC (Backstage auth) ===
OIDC_METADATA_URL=https://auth.quantumapi.eu/.well-known/openid-configuration
OIDC_CLIENT_ID=your-client-id
OIDC_CLIENT_SECRET=your-client-secret
BACKEND_SECRET=change-this

# === Webhooks ===
AI_CODE_REVIEW_WEBHOOK_SECRET=your-webhook-secret

Same variables as the IDP series, plus QUANTUMAPI_KEY (and optionally QUANTUMAPI_ENDPOINT). The infrastructure plugins don’t need additional env vars — they read from the same AI service config.
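A startup check that fails fast on missing variables can save a debugging session later. The sketch below uses the variable names from the list above. Treating every one of them as required is an assumption (your deployment may not need the webhook secret, for example), and `REQUIRED_VARS` and `findMissingVars` are hypothetical names.

```typescript
// Variable names taken from the list above; QUANTUMAPI_ENDPOINT is
// optional (local installs only), so it is deliberately left out.
const REQUIRED_VARS = [
  "AI_PROVIDER",
  "AI_ENDPOINT",
  "AI_KEY",
  "AI_CHAT_MODEL",
  "AI_EMBEDDING_MODEL",
  "POSTGRES_HOST",
  "POSTGRES_PORT",
  "POSTGRES_USER",
  "POSTGRES_PASSWORD",
  "QUANTUMAPI_KEY",
  "GITHUB_TOKEN",
  "OIDC_METADATA_URL",
  "OIDC_CLIENT_ID",
  "OIDC_CLIENT_SECRET",
  "BACKEND_SECRET",
  "AI_CODE_REVIEW_WEBHOOK_SECRET",
] as const;

// Returns the names of required variables that are unset or blank.
function findMissingVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_VARS.filter((name) => (env[name] ?? "").trim() === "");
}
```

In a Node.js entry point this could be called as `findMissingVars(process.env)`, aborting startup if the returned list is not empty.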

The Two Series Together

| # | IDP Series | Infra Hub Series |
| --- | --- | --- |
| 1 | Why Your IDP Doesn’t Help | Your Infrastructure Has No Catalog |
| 2 | Teaching Your Catalog to Think | Golden Path Terraform Modules |
| 3 | AI-Powered Software Templates | Multi-tenant Infrastructure |
| 4 | The AI Code Review Plugin | Pipelines from Backstage |
| 5 | TechDocs RAG | Secrets & Post-Quantum Identities |
| 6 | The AI Governance Dashboard | CAB Automation |
| 7 | AI-Assisted Incident Response | Chat with Your Infrastructure |
| 8 | The Reference Architecture | Drift Detection |
| 9 | | This article |

The IDP series builds the platform for applications — services, APIs, code. The Infra Hub series extends it for infrastructure — Terraform modules, pipelines, cloud resources. Same Backstage. Same AI service. Same philosophy: AI as the engine, humans in control.

Cost with Infrastructure Plugins

Adding the infrastructure features to the cost estimate from IDP article 8:

| Feature | Frequency | Tokens/call | Monthly cost |
| --- | --- | --- | --- |
| IDP features (from series 1) | | | ~$9 |
| Terraform scaffolding | ~5 modules/month | ~2K input, ~3K output | ~$0.11 |
| Plan summaries | ~100 plans/month | ~4K input, ~1K output | ~$1.30 |
| Secret scanning | 12h cycle, ~30 modules | ~3K input, ~500 output | ~$0.95 |
| Drift analysis | Daily, ~30 modules | ~4K input, ~1K output | ~$4.50 |
| Infra chat | ~300 questions/month | ~5K input, ~1K output | ~$4.80 |
| CAB signing | ~50 approvals/month | N/A (QuantumAPI call) | ~$0 (included in tier) |

Total: ~$21/month for a 20-developer team managing ~30 Terraform modules across multiple clients. The QuantumAPI calls (signing, SSH certs, vault) are included in the business tier.
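To redo the estimate for your own provider, the arithmetic per feature is calls per month times tokens per call, priced per million tokens. Here is a minimal sketch. The prices below are placeholders, not the rates behind the figures in this article, so the result is not expected to match them. `FeatureUsage`, `TokenPricing`, and `estimateMonthlyCostUsd` are hypothetical names.

```typescript
interface FeatureUsage {
  callsPerMonth: number;
  inputTokensPerCall: number;
  outputTokensPerCall: number;
}

interface TokenPricing {
  inputPerMillionUsd: number;
  outputPerMillionUsd: number;
}

// Monthly cost = (input tokens / 1M) * input price + (output tokens / 1M) * output price.
function estimateMonthlyCostUsd(usage: FeatureUsage, pricing: TokenPricing): number {
  const inputTokens = usage.callsPerMonth * usage.inputTokensPerCall;
  const outputTokens = usage.callsPerMonth * usage.outputTokensPerCall;
  return (
    (inputTokens / 1_000_000) * pricing.inputPerMillionUsd +
    (outputTokens / 1_000_000) * pricing.outputPerMillionUsd
  );
}

// Placeholder prices: substitute your provider's actual per-million-token rates.
const placeholderPricing: TokenPricing = {
  inputPerMillionUsd: 1,
  outputPerMillionUsd: 2,
};

// Usage figures for plan summaries: ~100 plans/month, ~4K input, ~1K output.
const planSummaries: FeatureUsage = {
  callsPerMonth: 100,
  inputTokensPerCall: 4_000,
  outputTokensPerCall: 1_000,
};

const planSummaryEstimate = estimateMonthlyCostUsd(planSummaries, placeholderPricing);
```

Scheduled features scale with the number of modules: a daily scan over ~30 modules is roughly 900 calls per month, and a 12-hour cycle doubles that.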

Security Reminder

The same security gaps from IDP article 8 apply here, plus:

  • No auth on drift/chat/CAB endpoints — the AI service has no authentication. In production, add JWT validation or API key checks.
  • Terraform plan output may contain secrets — the plan text sent to the AI model can include secret values (e.g., old vs new password). Consider scrubbing plan output before sending to the AI. The PII scrubber from the AI in Production series works here too.
  • CAB signatures depend on QuantumAPI availability — if QuantumAPI is down, approvals can’t be signed. The signing endpoint returns 503 and the UI blocks the approval. This is intentional (unsigned = unapproved), but plan for QuantumAPI availability in your SLA calculations.
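For the plan-scrubbing point, a line-based redaction pass is a reasonable first step. The sketch below is a heuristic and not the PII scrubber from the other series. It assumes plan attribute lines of the form `key = value`, and the list of sensitive key names is a guess you should extend. `scrubPlan` and both patterns are hypothetical.

```typescript
// Attribute names that commonly hold secrets; extend for your own modules.
const SENSITIVE_KEY = /(password|secret|token|api_key|private_key|connection_string)/i;

// Matches a plan attribute line such as:
//   ~ admin_password = "old" -> "new"
// Groups: 1 = indent and change marker, 2 = key, 3 = equals sign, 4 = value.
const ATTRIBUTE_LINE = /^(\s*[+~-]?\s*)("?[\w.-]+"?)(\s*=\s*)(.+)$/;

// Replaces the value of any attribute whose key looks sensitive.
// Lines that are not simple `key = value` attributes pass through unchanged.
function scrubPlan(planText: string): string {
  return planText
    .split("\n")
    .map((line) => {
      const match = ATTRIBUTE_LINE.exec(line);
      if (match === null) return line;
      const [, prefix, key, equals] = match;
      if (!SENSITIVE_KEY.test(key)) return line;
      return `${prefix}${key}${equals}"[REDACTED]"`;
    })
    .join("\n");
}
```

This drops both the old and the new value on a changed attribute. It will miss secrets inside multi-line values such as heredocs or JSON-encoded strings, so treat it as a floor, not a guarantee. Terraform already masks attributes that are marked sensitive; the pass is aimed at the ones that are not.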

The Series

| Article | What it does | New Plugin / Endpoint |
| --- | --- | --- |
| 1. Your Infra Has No Catalog | Terraform modules as catalog entities | Catalog entities |
| 2. Golden Path Terraform Modules | AI-powered module scaffolding | /api/scaffold-terraform |
| 3. Multi-tenant Infrastructure | Per-client systems, teams, config | Catalog model |
| 4. Pipelines from Backstage | Unified pipeline UI + CAB workflow | /api/pipeline/* |
| 5. Secrets & PQ Identities | QuantumVault, SSH certs, secret scanner | /api/scan-secrets, /api/ssh/issue |
| 6. CAB Automation | Signed approvals, compliance reports | /api/cab/* |
| 7. Chat with Your Infra | Conversational infrastructure access | /api/infra/chat |
| 8. Drift Detection | Detect and explain infrastructure drift | /api/drift/* |
| 9. Reference Architecture | This article: everything connected | |

Troubleshooting

In addition to the IDP troubleshooting section:

  • Secret scanner finds nothing: Check that modules have spec.type: terraform-module in the catalog. The scanner filters on this type.
  • CAB signatures fail: Verify that QuantumApi:ApiKey is set in the AI service config. When it is supplied as an environment variable, .NET reads it from QUANTUMAPI__APIKEY (double underscore). On failure, the signing endpoint returns 503 with a clear error message.
  • Drift scan shows no results: The GitHub Actions workflow needs terraform init to succeed, which requires cloud credentials. Check the QuantumVault secret IDs in the workflow variables.
  • Infra chat gives empty answers: The chat gathers context from the catalog database and the cab_approvals table. If these are empty, the AI has no data to work with. Run the catalog enricher and approve at least one change first.
  • PIPESTATUS not working: PIPESTATUS is a bash feature, so it doesn’t exist if your CI runner uses sh. Request bash explicitly with shell: bash in GitHub Actions.

What’s Next

Two series complete. One Backstage instance. One AI service. 22 endpoints. 5 custom tables. The platform manages both applications and infrastructure, with AI assistance at every step and post-quantum security throughout.

What’s missing? The things we left out on purpose:

Kubernetes admission control. The Quantum-Safe Cloud series mentioned this gap: unsigned images can still be deployed manually. A Ratify admission webhook would reject pods with unsigned images at the cluster level.

Cost management dashboards. The governance dashboard tracks AI costs. But what about infrastructure costs? Cloud spend per client, per module, per environment. That’s a Backstage plugin that reads from Azure Cost Management, AWS Cost Explorer, or Scaleway billing APIs.

Policy as code. The CAB workflow is manual (with AI assistance). Open Policy Agent or Kyverno could automate policy enforcement: “no public storage accounts”, “all AKS clusters must have RBAC enabled”, “no modules without encryption blocks.”

Each of these could be a standalone article or a mini-series. If there’s interest, let me know.

The code is on GitHub: victorZKov/forge.

Victor

If this series helps you, consider buying me a coffee.

This is article 9 — the final article in the Infrastructure Hub series. Previous: Drift Detection.
