The Infrastructure Hub -- Part 2

Golden Path Terraform Modules

#platform-engineering #backstage #terraform #golden-path #scaffolder

The Problem

You ask three engineers to create a Terraform module. You get three different things.

Engineer A creates a single main.tf with everything in one file. No variables file, no outputs, no README. It works, but nobody else can use it without reading the code line by line.

Engineer B follows the HashiCorp structure — main.tf, variables.tf, outputs.tf, versions.tf. Has a README. But no tests, no examples, no CI pipeline. The module works today. In six months, someone updates the Azure provider and it breaks silently.

Engineer C creates a module with tests, examples, documentation, and a CI pipeline. It takes them a week. The next module they create? Also a week. Because they start from scratch every time.

The problem is not that engineers don’t know how to structure a module. The problem is that there’s no golden path — no standard template that gives you the right structure, the right tests, the right CI, and the right documentation from the start.

And if you manage infrastructure for multiple clients (like an MSP), the problem is worse. Client A uses Azure. Client B uses Scaleway. Client C uses AWS. Each cloud has different provider patterns, different resource naming, different testing approaches. Without templates, every module is a snowflake.

The Solution

Backstage Scaffolder templates that generate Terraform modules with the right structure for each cloud provider. But not just the structure — the actual resources too.

Here’s the key idea: the parameters you give the template are concrete. Cloud provider, resource type, required features. There’s nothing ambiguous. So we can ask AI to generate the main.tf with real resources, real data sources, real outputs — because the input is deterministic. We’re not asking “build me something cool.” We’re saying “create an Azure storage account module with private endpoints and lifecycle policies using azurerm 4.x.” The AI follows the latest HashiCorp patterns and the provider’s documentation.

One click, you get:

  • Standard folder structure (main.tf, variables.tf, outputs.tf, versions.tf)
  • Pre-configured provider block with the correct version constraints
  • AI-generated resources in main.tf based on your description — not a TODO placeholder
  • A README.md with inputs/outputs table and usage example
  • A catalog-info.yaml already filled with the right metadata
  • TechDocs configuration (mkdocs.yml + docs/ folder)
  • A basic test structure (using Terratest or terraform validate)
  • A CI pipeline template (GitHub Actions, Azure DevOps, or GitLab CI)

The engineer picks a cloud, describes what the module should create, and the scaffolder generates everything — structure, code, docs, CI, catalog entry. The engineer reviews the generated code, adjusts if needed, and pushes. The 90% that is standard boilerplate is done in seconds.

For MSPs, you add a “client” parameter. The module gets tagged with the client name, registered under the right system in the catalog, and the CI pipeline deploys to the client’s subscription/project.

Execute

The Multi-Cloud Template

This is a single Backstage template that handles Azure, Scaleway, AWS, and GCP. The cloud selection drives which provider block, which examples, and which CI template gets generated.

# templates/terraform-module-golden-path/template.yaml
apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
  name: terraform-module-golden-path
  title: Golden Path Terraform Module
  description: Create a new Terraform module with standard structure, docs, tests, and CI
  tags:
    - terraform
    - infrastructure
    - golden-path
spec:
  owner: team-platform
  type: terraform-module

  parameters:
    - title: Module Details
      required:
        - name
        - description
        - cloud
      properties:
        name:
          title: Module Name
          type: string
          description: "kebab-case name (e.g., vnet, storage-account, k8s-cluster)"
          pattern: '^[a-z][a-z0-9-]*$'
        description:
          title: Description
          type: string
          description: "What does this module create?"
        cloud:
          title: Cloud Provider
          type: string
          enum: ['azure', 'scaleway', 'aws', 'gcp']
          enumNames: ['Azure', 'Scaleway', 'AWS', 'GCP']
        lifecycle:
          title: Lifecycle
          type: string
          enum: ['experimental', 'production', 'deprecated']
          default: experimental
        owner:
          title: Owner
          type: string
          description: "Team that owns this module"
          default: team-platform
        client:
          title: Client (MSP only)
          type: string
          description: "Leave empty for internal modules"

    - title: CI/CD
      properties:
        ciProvider:
          title: CI Provider
          type: string
          enum: ['github-actions', 'azure-devops', 'gitlab-ci']
          enumNames: ['GitHub Actions', 'Azure DevOps', 'GitLab CI']
          default: github-actions
        includeTests:
          title: Include Terratest
          type: boolean
          default: true

    - title: Repository
      required:
        - repoUrl
      properties:
        repoUrl:
          title: Repository Location
          type: string
          ui:field: RepoUrlPicker
          ui:options:
            allowedHosts:
              - github.com

  steps:
    - id: fetch-skeleton
      name: Generate module skeleton
      action: fetch:template
      input:
        url: ./skeleton
        values:
          name: ${{ parameters.name }}
          description: ${{ parameters.description }}
          cloud: ${{ parameters.cloud }}
          owner: ${{ parameters.owner }}
          client: ${{ parameters.client }}
          lifecycle: ${{ parameters.lifecycle }}
          ciProvider: ${{ parameters.ciProvider }}
          includeTests: ${{ parameters.includeTests }}
          destination: ${{ parameters.repoUrl | parseRepoUrl }}

    - id: ai-generate
      name: Generate Terraform resources with AI
      action: forge:ai-scaffold-terraform
      input:
        cloud: ${{ parameters.cloud }}
        name: ${{ parameters.name }}
        description: ${{ parameters.description }}
        workspacePath: ${{ steps['fetch-skeleton'].output.workspacePath }}

    - id: publish
      name: Publish to GitHub
      action: publish:github
      input:
        allowedHosts: ['github.com']
        repoUrl: ${{ parameters.repoUrl }}
        description: "Terraform module: ${{ parameters.description }}"
        defaultBranch: main

    - id: register
      name: Register in Catalog
      action: catalog:register
      input:
        repoContentsUrl: ${{ steps.publish.output.repoContentsUrl }}
        catalogInfoPath: /catalog-info.yaml

  output:
    links:
      - title: Repository
        url: ${{ steps.publish.output.remoteUrl }}
      - title: Catalog Entry
        icon: catalog
        entityRef: ${{ steps.register.output.entityRef }}

The Skeleton

The skeleton uses Nunjucks templates. The cloud parameter drives the provider configuration:

# skeleton/versions.tf
terraform {
  required_version = ">= 1.8"

  required_providers {
{%- if values.cloud == 'azure' %}
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 4.0"
    }
{%- elif values.cloud == 'scaleway' %}
    scaleway = {
      source  = "scaleway/scaleway"
      version = "~> 2.0"
    }
{%- elif values.cloud == 'aws' %}
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
{%- elif values.cloud == 'gcp' %}
    google = {
      source  = "hashicorp/google"
      version = "~> 5.0"
    }
{%- endif %}
  }
}
# skeleton/variables.tf
variable "name" {
  type        = string
  description = "Resource name"
}

{% if values.cloud == 'azure' -%}
variable "location" {
  type        = string
  default     = "westeurope"
  description = "Azure region"
}

variable "resource_group_name" {
  type        = string
  description = "Resource group to deploy into"
}
{%- elif values.cloud == 'scaleway' -%}
variable "zone" {
  type        = string
  default     = "fr-par-1"
  description = "Scaleway zone"
}

variable "project_id" {
  type        = string
  description = "Scaleway project ID"
}
{%- elif values.cloud == 'aws' -%}
variable "region" {
  type        = string
  default     = "eu-west-1"
  description = "AWS region"
}
{%- elif values.cloud == 'gcp' -%}
variable "region" {
  type        = string
  default     = "europe-west1"
  description = "GCP region"
}

variable "project" {
  type        = string
  description = "GCP project ID"
}
{%- endif %}

variable "tags" {
  type        = map(string)
  default     = {}
  description = "Resource tags"
}

The skeleton’s main.tf starts empty — it’s a placeholder that the AI step will overwrite:

# skeleton/main.tf
# This file will be replaced by AI-generated resources
# skeleton/outputs.tf
# This file will be replaced by AI-generated outputs

The AI Endpoint

The AI service from the IDP series gets a new endpoint: /api/scaffold-terraform. It receives the cloud, module name, and description, and returns the complete main.tf, variables.tf, and outputs.tf.

The prompt is specific and constrained — we’re not asking the AI to be creative. We tell it exactly what provider to use, what version, and what the module should create. The result is standard Terraform code that follows HashiCorp’s module structure.

// In the AI service — POST /api/scaffold-terraform
app.MapPost("/api/scaffold-terraform", async (
    ScaffoldTerraformRequest request,
    OpenAIClient client,
    IConfiguration config) =>
{
    var chatClient = client.GetChatClient(
        config["AI:ChatModel"] ?? "mistral-small-3.2-24b-instruct-2506");

    var providerDocs = request.Cloud switch
    {
        "azure" => "HashiCorp azurerm provider 4.x. Use azurerm_* resources.",
        "scaleway" => "Scaleway provider 2.x. Use scaleway_* resources.",
        "aws" => "HashiCorp aws provider 5.x. Use aws_* resources.",
        "gcp" => "HashiCorp google provider 5.x. Use google_* resources.",
        _ => throw new ArgumentException($"Unknown cloud: {request.Cloud}")
    };

    var prompt = $"""
    Generate a Terraform module for {request.Cloud}.
    Module name: {request.Name}
    Description: {request.Description}
    Provider: {providerDocs}

    Return a JSON object with three keys:
    - "main": the main.tf content with all resources
    - "variables": the variables.tf content (include name, tags, and cloud-specific variables)
    - "outputs": the outputs.tf content with all useful outputs

    Rules:
    - Use the latest resource syntax for the provider
    - Include descriptions for all variables and outputs
    - Add sensible defaults where appropriate
    - Use variable references, not hardcoded values
    - Follow HashiCorp naming conventions
    - Do not include provider blocks or terraform blocks (they are in versions.tf)
    - Do not guess features not mentioned in the description
    """;

    var completion = await chatClient.CompleteChatAsync(prompt);
    var content = completion.Value.Content[0].Text;

    var json = ExtractJson(content);

    // The model returns lowercase keys ("main", "variables", "outputs") while
    // the record properties are PascalCase, so deserialize case-insensitively.
    var result = JsonSerializer.Deserialize<ScaffoldTerraformResult>(
        json,
        new JsonSerializerOptions { PropertyNameCaseInsensitive = true });

    return Results.Ok(result);
});

// Strips the markdown fences or surrounding prose the model sometimes wraps
// around its JSON payload by taking the outermost brace-delimited span.
static string ExtractJson(string content)
{
    var start = content.IndexOf('{');
    var end = content.LastIndexOf('}');
    if (start < 0 || end <= start)
        throw new InvalidOperationException("No JSON object found in AI response");
    return content[start..(end + 1)];
}

record ScaffoldTerraformRequest(string Cloud, string Name, string Description);
record ScaffoldTerraformResult(string Main, string Variables, string Outputs);

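On the Backstage side it is worth validating the AI payload before any files are written. A minimal sketch, assuming the service returns the lowercase JSON keys described above; the `parseScaffoldResult` helper is mine, not part of the original action:

```typescript
// Shape the AI service is expected to return (lowercase JSON keys).
interface ScaffoldTerraformResult {
  main: string;
  variables: string;
  outputs: string;
}

// Throw early if any of the three files is missing or empty, so the
// scaffolder run fails instead of overwriting the skeleton with junk.
function parseScaffoldResult(body: unknown): ScaffoldTerraformResult {
  const obj = (body ?? {}) as Record<string, unknown>;
  for (const key of ['main', 'variables', 'outputs'] as const) {
    const value = obj[key];
    if (typeof value !== 'string' || value.trim() === '') {
      throw new Error(`AI response is missing a usable "${key}" field`);
    }
  }
  return obj as unknown as ScaffoldTerraformResult;
}
```

A custom action could call this right after `response.json()`, so a malformed reply fails the run instead of producing empty .tf files.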
And the Backstage custom action that calls it:

// plugins/ai-scaffolder/src/actions/aiScaffoldTerraform.ts
import { createTemplateAction } from '@backstage/plugin-scaffolder-node';
import { z } from 'zod';
import fs from 'fs';
import path from 'path';

export function createAiScaffoldTerraformAction(options: { aiServiceUrl: string }) {
  return createTemplateAction({
    id: 'forge:ai-scaffold-terraform',
    description: 'Generate Terraform resources using AI',
    schema: {
      input: z.object({
        cloud: z.string().describe('Cloud provider: azure, scaleway, aws, gcp'),
        name: z.string().describe('Module name'),
        description: z.string().describe('What the module should create'),
        workspacePath: z.string().describe('Path to the workspace'),
      }),
    },
    async handler(ctx) {
      ctx.logger.info(`Generating Terraform for ${ctx.input.cloud}: ${ctx.input.description}`);

      const response = await fetch(`${options.aiServiceUrl}/api/scaffold-terraform`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          cloud: ctx.input.cloud,
          name: ctx.input.name,
          description: ctx.input.description,
        }),
      });

      if (!response.ok) {
        throw new Error(`AI service returned ${response.status}`);
      }

      const result = await response.json();
      const ws = ctx.input.workspacePath || ctx.workspacePath;

      // Overwrite the placeholder files with AI-generated content
      fs.writeFileSync(path.join(ws, 'main.tf'), result.main);
      fs.writeFileSync(path.join(ws, 'variables.tf'), result.variables);
      fs.writeFileSync(path.join(ws, 'outputs.tf'), result.outputs);

      ctx.logger.info('Terraform files generated by AI');
    },
  });
}

This is the same pattern from article 3 of the IDP series. The AI generates code based on concrete parameters. The engineer reviews the result — the AI proposes, the human approves.

The Catalog Entry

The catalog-info.yaml is pre-filled with cloud metadata:

# skeleton/catalog-info.yaml
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: tf-${{ values.cloud }}-${{ values.name }}
  title: "${{ values.description }}"
  description: "${{ values.description }}"
  tags:
    - terraform
    - ${{ values.cloud }}
{%- if values.client %}
    - client-${{ values.client }}
{%- endif %}
  annotations:
    github.com/project-slug: ${{ values.destination.owner }}/${{ values.destination.repo }}
    backstage.io/techdocs-ref: dir:.
spec:
  type: terraform-module
  lifecycle: ${{ values.lifecycle }}
  owner: ${{ values.owner }}
{%- if values.client %}
  system: client-${{ values.client }}-infrastructure
{%- else %}
  system: infrastructure
{%- endif %}

The README

Generated with the right sections for a Terraform module:

# skeleton/README.md
# tf-${{ values.cloud }}-${{ values.name }}

${{ values.description }}

## Cloud Provider

${{ values.cloud | capitalize }}

## Usage

```hcl
module "${{ values.name | replace('-', '_') }}" {
  source = "github.com/YOUR_ORG/tf-${{ values.cloud }}-${{ values.name }}"
{% if values.cloud == 'azure' %}
  name                = "my-resource"
  location            = "westeurope"
  resource_group_name = "rg-my-project"
{% elif values.cloud == 'scaleway' %}
  name       = "my-resource"
  zone       = "fr-par-1"
  project_id = "your-project-id"
{% elif values.cloud == 'aws' %}
  name   = "my-resource"
  region = "eu-west-1"
{% elif values.cloud == 'gcp' %}
  name    = "my-resource"
  region  = "europe-west1"
  project = "your-project-id"
{% endif %}
  tags = {
    environment = "production"
    managed-by  = "terraform"
  }
}
```

## Inputs

| Name | Type | Default | Description |
|------|------|---------|-------------|
| name | string | | Resource name |
{%- if values.cloud == 'azure' %}
| location | string | westeurope | Azure region |
| resource_group_name | string | | Resource group |
{%- elif values.cloud == 'scaleway' %}
| zone | string | fr-par-1 | Scaleway zone |
| project_id | string | | Scaleway project ID |
{%- elif values.cloud == 'aws' %}
| region | string | eu-west-1 | AWS region |
{%- elif values.cloud == 'gcp' %}
| region | string | europe-west1 | GCP region |
| project | string | | GCP project ID |
{%- endif %}
| tags | map(string) | {} | Resource tags |

## Outputs

| Name | Description |
|------|-------------|
| id | Resource ID |

The CI Pipeline

For GitHub Actions:

# skeleton/.github/workflows/terraform.yml (only if ciProvider == 'github-actions')
name: Terraform

on:
  pull_request:
    branches: [main]
  push:
    branches: [main]

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3

      - name: Terraform Format
        run: terraform fmt -check -recursive

      - name: Terraform Init
        run: terraform init -backend=false

      - name: Terraform Validate
        run: terraform validate

{% if values.includeTests %}
  test:
    needs: validate
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version: '1.22'

      - name: Run Terratest
        working-directory: test
        run: go test -v -timeout 30m
{% endif %}
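The ciProvider parameter also offers Azure DevOps and GitLab CI, and those templates follow the same validate-first shape. A sketch of the Azure DevOps variant — the filename and steps are assumptions, not shown in the original:

```yaml
# skeleton/azure-pipelines.yml (only if ciProvider == 'azure-devops') — sketch
trigger:
  branches:
    include: [main]

pool:
  vmImage: ubuntu-latest

steps:
  - script: terraform fmt -check -recursive
    displayName: Terraform Format
  - script: terraform init -backend=false
    displayName: Terraform Init
  - script: terraform validate
    displayName: Terraform Validate
```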

TechDocs

Every module gets documentation that Backstage can render:

# skeleton/mkdocs.yml
site_name: tf-${{ values.cloud }}-${{ values.name }}
docs_dir: docs
plugins:
  - techdocs-core
# skeleton/docs/index.md
# tf-${{ values.cloud }}-${{ values.name }}

${{ values.description }}

## Cloud Provider

**${{ values.cloud | capitalize }}**

## Getting Started

See the README for usage examples and input/output documentation.

## Owner

${{ values.owner }}

{% if values.client -%}
## Client

${{ values.client }}
{%- endif %}

What It Looks Like

An engineer opens Backstage, clicks “Create”, selects “Golden Path Terraform Module”:

  1. Module Details: name = storage-account, cloud = Azure, description = “Creates a storage account with private endpoints”, owner = team-platform
  2. CI/CD: GitHub Actions, include Terratest = yes
  3. Repository: github.com/my-org/tf-azurerm-storage-account

Clicks “Create”. In 15 seconds:

  • New repo tf-azurerm-storage-account on GitHub
  • Standard structure: main.tf, variables.tf, outputs.tf, versions.tf with azurerm ~> 4.0
  • main.tf already has the azurerm_storage_account resource, azurerm_private_endpoint, lifecycle policy — generated by AI based on the description
  • variables.tf has all the inputs the resources need — not just the cloud defaults, but storage-specific variables like account_tier, replication_type, allowed_subnet_ids
  • outputs.tf exports the storage account ID, primary endpoint, private endpoint IP — all useful outputs for downstream modules
  • README with usage example and input/output tables
  • GitHub Actions workflow for terraform validate + Terratest
  • mkdocs.yml + docs ready for TechDocs
  • catalog-info.yaml registered in Backstage

The engineer reviews the AI-generated code, adjusts if needed, and pushes. The module is production-ready from birth — not after a week of boilerplate work.

For the MSP scenario: the same template, but with client = acme-corp. The module gets tagged client-acme-corp, registered under client-acme-corp-infrastructure system, and visible only to the team working on that client.

Checklist

  • Template registered in Backstage (/create page)
  • Azure, Scaleway, AWS, and GCP provider blocks generated correctly
  • AI generates main.tf with real resources matching the description
  • AI generates variables.tf and outputs.tf specific to the resources
  • catalog-info.yaml includes cloud tag and client tag (if MSP)
  • README has usage example with cloud-specific variables
  • GitHub Actions workflow runs terraform validate
  • TechDocs configuration generates docs in Backstage
  • Module appears in catalog after scaffolding
  • Generated code passes terraform validate

Challenge

Before the next article:

  1. Create a module for each cloud you use (Azure, Scaleway, AWS, or GCP)
  2. Check the catalog — can you filter by cloud provider?
  3. Open the TechDocs for one of them — does it render?

In the next article, we build Multi-tenant Infrastructure — how the same Backstage instance serves both internal DevOps teams and managed services clients. Different catalogs, different templates, different approval workflows, one platform.

The full code is on GitHub.

If this series helps you, consider buying me a coffee.

This is article 2 of the Infrastructure Hub series. Previous: Your Infrastructure Has No Catalog. Next: Multi-tenant Infrastructure — one platform, many clients.
