The Infrastructure Hub -- Part 9
The Infrastructure Hub Reference Architecture
The Problem
You’ve built eight things. A Terraform catalog, golden path templates, multi-tenant client management, pipeline orchestration with CAB workflows, quantum-safe secrets, signed approvals, an infrastructure chat, and drift detection.
Each article showed one piece. Now you need the complete picture: how these pieces fit together, what the database looks like with everything running, how the AI service has grown, and how to deploy the full platform.
The IDP series reference architecture covered the application-focused platform: catalog enrichment, AI scaffolding, code review, TechDocs RAG, governance, and incident response. This article extends that architecture with everything from the Infrastructure Hub series.
The Solution
The Infrastructure Hub adds three layers to the existing IDP platform:
- Infrastructure catalog — Terraform modules as first-class entities, with golden path templates and multi-tenant scoping
- Infrastructure operations — Pipeline orchestration, CAB workflows with signed approvals, and drift detection
- Infrastructure intelligence — AI-powered secret scanning, plan summaries, failure diagnosis, and conversational access to all infrastructure data
All of this runs on the same Backstage instance, the same .NET AI service, and the same PostgreSQL database as the IDP. No new services to deploy — just new plugins, new endpoints, and new tables.
Execute
The Complete Architecture
```mermaid
graph TB
    subgraph "Developer / Platform Team"
        Browser[Browser]
    end
    subgraph "Backstage"
        FE[Frontend - React]
        BE[Backend - Node.js]
        subgraph "IDP Plugins (series 1)"
            P1[Catalog Enricher]
            P2[AI Scaffolder]
            P3[Code Review]
            P4[TechDocs RAG]
            P5[Governance]
            P6[Incident Response]
        end
        subgraph "Infra Hub Plugins (series 2)"
            I1[TF Module Templates]
            I2[Secret Scanner]
            I3[Pipeline Dashboard]
            I4[CAB Workflow]
            I5[Infra Chat]
            I6[Drift Dashboard]
        end
    end
    subgraph "AI Service (.NET)"
        API[".NET Minimal API :5100"]
        E_IDP["/api/enrich, /api/scaffold,\n/api/review, /api/ask,\n/api/incident/analyze"]
        E_INFRA["/api/scan-secrets,\n/api/scaffold-terraform,\n/api/pipeline/*,\n/api/cab/*,\n/api/infra/chat,\n/api/drift/*,\n/api/ssh/issue"]
    end
    subgraph "Data"
        PG[(PostgreSQL + pgvector)]
        QV[QuantumVault - Secrets]
        QS[QuantumAPI - Signing]
        QC[QuantumAPI - SSH Certs]
        AI[AI Provider]
    end
    subgraph "External"
        GH[GitHub API]
        ADO[Azure DevOps]
        GHA[GitHub Actions]
        GL[GitLab CI]
        AZ[Azure / Scaleway / AWS]
    end
    Browser --> FE --> BE
    BE --> P1 & P2 & P3 & P4 & P5 & P6
    BE --> I1 & I2 & I3 & I4 & I5 & I6
    P1 & P2 & P3 & P4 & P5 & P6 --> API
    I2 & I3 & I4 & I5 & I6 --> API
    API --> E_IDP & E_INFRA
    API --> PG
    API --> AI
    API --> QV & QS & QC
    I3 --> ADO & GHA & GL
    I6 --> AZ
```
New Endpoints in the AI Service
The Infrastructure Hub adds these endpoints to the existing AI service:
| Endpoint | Article | Purpose |
|---|---|---|
| `POST /api/scaffold-terraform` | 2 | Generate Terraform module from description |
| `POST /api/pipeline/summarize-plan` | 4 | Human-readable Terraform plan summary |
| `POST /api/pipeline/diagnose` | 4 | AI diagnosis of pipeline failures |
| `POST /api/pipeline/risk-assessment` | 4 | Risk assessment for change requests |
| `POST /api/pipeline/rollback-plan` | 4 | Generate rollback plan for a change |
| `POST /api/scan-secrets` | 5 | Scan Terraform files for secret issues |
| `POST /api/ssh/issue` | 5 | Issue ML-DSA SSH certificate |
| `POST /api/cab/approve` | 6 | Sign CAB approval with ML-DSA |
| `GET /api/cab/verify/{id}` | 6 | Verify approval signature |
| `POST /api/cab/seal-evidence` | 6 | Seal evidence package with signature |
| `GET /api/cab/report` | 6 | Generate compliance report |
| `POST /api/infra/chat` | 7 | Multi-turn infrastructure conversation |
| `POST /api/drift/analyze` | 8 | Analyze Terraform plan for drift |
| `GET /api/drift/results` | 8 | Fetch all drift scan results |
Combined with the IDP series endpoints, the AI service now has 22 endpoints in one Program.cs. The pattern is the same for every endpoint: read config, create client, build prompt with context, call model, return structured JSON.
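To make that request/response pattern concrete, here is a hedged sketch of calling one of the new endpoints from a shell. The field names in the body (`module`, `client`, `plan`) are illustrative assumptions, not the service's documented contract:

```shell
# Build a request body for POST /api/drift/analyze.
# NOTE: the field names here are illustrative assumptions.
BODY='{"module": "network-base", "client": "acme", "plan": "...terraform plan text..."}'
echo "$BODY"

# Post it to the AI service (commented out -- requires a running instance):
# curl -s -X POST http://localhost:5100/api/drift/analyze \
#   -H 'Content-Type: application/json' \
#   -d "$BODY"
```

Every other endpoint in the table follows the same shape: a JSON body in, structured JSON out.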
New Database Tables
Three tables added in this series:
```sql
-- Article 6: Signed CAB approvals
CREATE TABLE cab_approvals (
  id SERIAL PRIMARY KEY,
  change_request_id VARCHAR(100) NOT NULL UNIQUE,
  approved_by VARCHAR(255) NOT NULL,
  approved_at TIMESTAMPTZ NOT NULL,
  module VARCHAR(255) NOT NULL,
  client VARCHAR(100),
  risk_level VARCHAR(20) NOT NULL,
  plan_hash VARCHAR(64) NOT NULL,
  payload_json TEXT NOT NULL,
  signature TEXT NOT NULL,
  key_id VARCHAR(100) NOT NULL
);

CREATE INDEX idx_approvals_module ON cab_approvals(module);
CREATE INDEX idx_approvals_client ON cab_approvals(client);
CREATE INDEX idx_approvals_date ON cab_approvals(approved_at);
```
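A note on `plan_hash`: VARCHAR(64) matches the length of a SHA-256 hex digest. Assuming the hash is computed over the raw plan text (an assumption — substitute whatever canonical form your pipeline actually signs), it can be produced like this:

```shell
# Hash the Terraform plan text. A SHA-256 hex digest is exactly 64
# characters, which matches the plan_hash VARCHAR(64) column.
printf 'example terraform plan output' > /tmp/plan.txt
HASH=$(sha256sum /tmp/plan.txt | cut -d' ' -f1)
echo "$HASH"
echo "${#HASH}"   # -> 64
```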
```sql
-- Article 8: Drift detection results
CREATE TABLE drift_results (
  id SERIAL PRIMARY KEY,
  module VARCHAR(255) NOT NULL UNIQUE,
  client VARCHAR(100),
  drift_detected BOOLEAN NOT NULL,
  resource_count INTEGER NOT NULL DEFAULT 0,
  risk VARCHAR(20) NOT NULL DEFAULT 'none',
  summary TEXT,
  analysis_json TEXT,
  detected_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_drift_module ON drift_results(module);
CREATE INDEX idx_drift_client ON drift_results(client);
CREATE INDEX idx_drift_risk ON drift_results(risk);
```
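Because `module` is UNIQUE, `drift_results` holds only the latest scan per module, which implies an upsert on write. A sketch of that statement (an assumption — the series doesn't restate the exact SQL the service runs):

```shell
# Print the upsert that keeps one row per module; the UNIQUE constraint
# makes ON CONFLICT (module) the natural write path. Values are illustrative.
cat <<'SQL'
INSERT INTO drift_results (module, client, drift_detected, resource_count, risk, summary)
VALUES ('network-base', 'acme', true, 3, 'medium', 'Tags changed outside Terraform')
ON CONFLICT (module) DO UPDATE SET
  drift_detected = EXCLUDED.drift_detected,
  resource_count = EXCLUDED.resource_count,
  risk           = EXCLUDED.risk,
  summary        = EXCLUDED.summary,
  detected_at    = NOW();
SQL
```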
Combined with the IDP series tables:
| Table | Series | Purpose |
|---|---|---|
| `doc_chunks` | IDP art. 5 | Vector embeddings for RAG |
| `ai_usage_log` | IDP art. 6 | Governance — usage tracking |
| `ai_policies` | IDP art. 6 | Governance — per-team policies |
| `cab_approvals` | Infra art. 6 | Signed CAB approvals |
| `drift_results` | Infra art. 8 | Latest drift scan per module |
Five tables. One PostgreSQL instance (with pgvector extension). Backstage uses its own tables for the catalog, and the AI service uses these five for intelligence and operations.
New Backstage Plugin Registration
Add the infrastructure plugins to packages/backend/src/index.ts:
```typescript
// --- Infrastructure Hub plugins (series 2) ---

// Modules (extend existing plugins)
import { secretScannerModule } from '@internal/plugin-secret-scanner';
backend.add(secretScannerModule); // Article 5

// Standalone plugins (own routes)
import { aiIncidentPlugin } from '@internal/plugin-ai-incident';
backend.add(aiIncidentPlugin); // IDP Article 7

// Frontend-only plugins (registered in App.tsx, not here):
// - Infra Chat (/infra-chat)               // Article 7
// - Drift Dashboard (/drift)               // Article 8
// - CAB Review (/cab)                      // Article 4+6
// - Governance Dashboard (/ai-governance)  // IDP Article 6
```
The infrastructure plugins follow the same distinction as the IDP:
- Modules (`createBackendModule`): the secret scanner extends the catalog
- Standalone plugins (`createBackendPlugin`): the pipeline dashboard and CAB workflow have their own routes
- Frontend-only: infra chat, drift dashboard, and the governance dashboard read from the AI service through the proxy
QuantumAPI Integration Map
QuantumAPI appears in three roles across the Infrastructure Hub:
```
QuantumVault (Secrets)
├── Pipeline credentials (art. 5) — ARM_CLIENT_SECRET, DB_PASSWORD, etc.
├── Terraform state encryption keys (art. 5) — ML-KEM wrapped AES keys
├── Cosign signing keys (quantum-05) — for image signing
└── Bootstrap: only QUANTUMAPI_KEY in CI/CD platforms

QuantumAPI Signing (ML-DSA)
├── CAB approval signatures (art. 6) — every approval cryptographically signed
├── Evidence package sealing (art. 6) — tamper-proof audit trail
└── Verification endpoint — auditors can verify without internal access

QuantumAPI SSH (ML-DSA Certificates)
├── Short-lived certificates (art. 5) — 8h validity, auto-expire
├── CA trust model — hosts trust QuantumAPI CA, not individual keys
└── Backstage widget — engineers request access from the catalog
```
QuantumAPI Local Installation
For sovereign cloud, air-gapped environments, or organizations that can’t send data to external APIs, QuantumAPI offers a local installation option.
The local install runs the same services — Vault, Signing, SSH CA, Encryption — inside your own infrastructure. The API is identical. Your code doesn’t change. You point the endpoints at https://quantumapi.internal instead of https://api.quantumapi.eu.
Configuration change in the AI service:
```shell
# Cloud (default)
QUANTUMAPI__APIKEY=qid_your_key
# Endpoints default to api.quantumapi.eu

# Local installation
QUANTUMAPI__APIKEY=qid_your_local_key
QUANTUMAPI__ENDPOINT=https://quantumapi.internal
```
For the qapi CLI in pipelines:
```shell
# Cloud
export QAPI_API_KEY=qid_your_key

# Local
export QAPI_API_KEY=qid_your_local_key
export QAPI_ENDPOINT=https://quantumapi.internal
```
Everything in this series — state encryption, signed approvals, SSH certificates, secret scanning — works the same way with a local install. The cryptographic guarantees (ML-KEM, ML-DSA, QRNG) are the same because the algorithms run locally.
Use cases for local installation:
- Government / defense — data cannot leave the network
- Financial services — regulatory requirement to keep all key material on-premise
- EU sovereign cloud — data residency requirements beyond what cloud-hosted QuantumAPI offers
- Air-gapped environments — no internet access from the infrastructure network
The Complete Environment Variables
```shell
# === AI Provider ===
AI_PROVIDER=openai        # or "azure"
AI_ENDPOINT=https://api.scaleway.ai/v1
AI_KEY=your-key
AI_CHAT_MODEL=mistral-small-3.2-24b-instruct-2506
AI_EMBEDDING_MODEL=bge-multilingual-gemma2

# === PostgreSQL (shared) ===
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
POSTGRES_USER=forge
POSTGRES_PASSWORD=your-password

# === QuantumAPI ===
QUANTUMAPI_KEY=qid_your_key
# QUANTUMAPI_ENDPOINT=https://quantumapi.internal  # Only for local install

# === GitHub ===
GITHUB_TOKEN=ghp_your-token

# === OIDC (Backstage auth) ===
OIDC_METADATA_URL=https://auth.quantumapi.eu/.well-known/openid-configuration
OIDC_CLIENT_ID=your-client-id
OIDC_CLIENT_SECRET=your-client-secret
BACKEND_SECRET=change-this

# === Webhooks ===
AI_CODE_REVIEW_WEBHOOK_SECRET=your-webhook-secret
```
Same variables as the IDP series, plus QUANTUMAPI_KEY (and optionally QUANTUMAPI_ENDPOINT). The infrastructure plugins don’t need additional env vars — they read from the same AI service config.
The Two Series Together
The IDP series builds the platform for applications — services, APIs, code. The Infra Hub series extends it for infrastructure — Terraform modules, pipelines, cloud resources. Same Backstage. Same AI service. Same philosophy: AI as the engine, humans in control.
Cost with Infrastructure Plugins
Adding the infrastructure features to the cost estimate from IDP article 8:
| Feature | Frequency | Tokens/call | Monthly cost |
|---|---|---|---|
| IDP features (from series 1) | — | — | ~$9 |
| Terraform scaffolding | ~5 modules/month | ~2K input, ~3K output | ~$0.11 |
| Plan summaries | ~100 plans/month | ~4K input, ~1K output | ~$1.30 |
| Secret scanning | 12h cycle, ~30 modules | ~3K input, ~500 output | ~$0.95 |
| Drift analysis | Daily, ~30 modules | ~4K input, ~1K output | ~$4.50 |
| Infra chat | ~300 questions/month | ~5K input, ~1K output | ~$4.80 |
| CAB signing | ~50 approvals/month | N/A (QuantumAPI call) | ~$0 (included in tier) |
Total: ~$21/month for a 20-developer team managing ~30 Terraform modules across multiple clients. The QuantumAPI calls (signing, SSH certs, vault) are included in the business tier.
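The total is just the sum of the line items, which you can sanity-check in a shell:

```shell
# Sum of the monthly line items from the table above.
awk 'BEGIN { printf "%.2f\n", 9 + 0.11 + 1.30 + 0.95 + 4.50 + 4.80 }'
# -> 20.66, which rounds to the ~$21/month figure
```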
Security Reminder
The same security gaps from IDP article 8 apply here, plus:
- No auth on drift/chat/CAB endpoints — the AI service has no authentication. In production, add JWT validation or API key checks.
- Terraform plan output may contain secrets — the plan text sent to the AI model can include secret values (e.g., old vs new password). Consider scrubbing plan output before sending to the AI. The PII scrubber from the AI in Production series works here too.
- CAB signatures depend on QuantumAPI availability — if QuantumAPI is down, approvals can’t be signed. The signing endpoint returns 503 and the UI blocks the approval. This is intentional (unsigned = unapproved), but plan for QuantumAPI availability in your SLA calculations.
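A minimal sketch of the plan-scrubbing idea from the second point, assuming secrets show up on attribute lines with obvious keywords. This is a naive illustration — a real scrubber (like the PII scrubber mentioned above) should be far more thorough:

```shell
# Redact values on lines whose attribute name suggests a secret.
# The keyword list is a naive assumption; extend it for real use.
PLAN='password = "hunter2"
instance_type = "t3.micro"'
echo "$PLAN" | sed -E 's/(password|secret|token|credential)([[:space:]]*=[[:space:]]*).*/\1\2"[REDACTED]"/'
# The password line becomes: password = "[REDACTED]"
```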
The Series
| Article | What it does | New Plugin / Endpoint |
|---|---|---|
| 1. Your Infra Has No Catalog | Terraform modules as catalog entities | Catalog entities |
| 2. Golden Path Terraform Modules | AI-powered module scaffolding | /api/scaffold-terraform |
| 3. Multi-tenant Infrastructure | Per-client systems, teams, config | Catalog model |
| 4. Pipelines from Backstage | Unified pipeline UI + CAB workflow | /api/pipeline/* |
| 5. Secrets & PQ Identities | QuantumVault, SSH certs, secret scanner | /api/scan-secrets, /api/ssh/issue |
| 6. CAB Automation | Signed approvals, compliance reports | /api/cab/* |
| 7. Chat with Your Infra | Conversational infrastructure access | /api/infra/chat |
| 8. Drift Detection | Detect and explain infrastructure drift | /api/drift/* |
| 9. Reference Architecture | This article — everything connected | — |
Troubleshooting
In addition to the IDP troubleshooting section:
- Secret scanner finds nothing — check that modules have `spec.type: terraform-module` in the catalog. The scanner filters on this type.
- CAB signatures fail — verify `QuantumApi:ApiKey` is set in the AI service config. The signing endpoint returns 503 with a clear error message.
- Drift scan shows no results — the GitHub Actions workflow needs `terraform init` to succeed, which requires cloud credentials. Check the QuantumVault secret IDs in the workflow variables.
- Infra chat gives empty answers — the chat gathers context from the catalog database and the `cab_approvals` table. If these are empty, the AI has no data to work with. Run the catalog enricher and approve at least one change first.
- `PIPESTATUS` not working — if your CI runner uses `sh` instead of `bash`, `PIPESTATUS` doesn't exist. Use `bash` explicitly: `shell: bash` in GitHub Actions.
What’s Next
Two series complete. One Backstage instance. One AI service. 22 endpoints. 5 custom tables. The platform manages both applications and infrastructure, with AI assistance at every step and post-quantum security throughout.
What’s missing? The things we left out on purpose:
Kubernetes admission control. The Quantum-Safe Cloud series mentioned this gap: unsigned images can still be deployed manually. A Ratify admission webhook would reject pods with unsigned images at the cluster level.
Cost management dashboards. The governance dashboard tracks AI costs. But what about infrastructure costs? Cloud spend per client, per module, per environment. That’s a Backstage plugin that reads from Azure Cost Management, AWS Cost Explorer, or Scaleway billing APIs.
Policy as code. The CAB workflow is manual (with AI assistance). Open Policy Agent or Kyverno could automate policy enforcement: “no public storage accounts”, “all AKS clusters must have RBAC enabled”, “no modules without encryption blocks.”
Each of these could be a standalone article or a mini-series. If there’s interest, let me know.
The code is on GitHub: victorZKov/forge.
Victor
If this series helps you, consider buying me a coffee.
This is article 9 — the final article in the Infrastructure Hub series. Previous: Drift Detection.