The Infrastructure Hub -- Part 3

Multi-tenant Infrastructure — One Platform, Many Clients

#platform-engineering #backstage #multi-tenant #msp #infrastructure

The Problem

You work at a managed services provider. Or maybe you’re the DevOps team inside a company that serves multiple business units. Either way, you manage infrastructure for more than one client.

Client A runs on Azure. Client B is on Scaleway. Client C has a mix of AWS and Azure. Each client has their own subscriptions, their own naming conventions, their own security requirements, and their own approval process for changes.

Today you track all of this in:

  • A spreadsheet that’s always outdated
  • A shared OneNote/Confluence page that nobody maintains
  • Tickets in ServiceNow or Jira that mix client context with task details
  • Your head (and your colleagues’ heads)

When a new engineer joins the team, they spend two weeks learning which client uses what. When you need to answer “how many AKS clusters do we manage across all clients?”, you open 15 Azure portals and count manually.

This doesn’t scale, and it’s dangerous: a single terraform apply run against the wrong subscription changes another client’s infrastructure.

The Solution

Backstage already supports multi-tenancy through its catalog model. The key concepts:

  • Systems represent client environments (e.g., client-acme-infrastructure)
  • Components are the Terraform modules deployed for each client
  • Groups represent teams responsible for each client
  • Templates can be scoped by client — the golden path from article 2 already has a client field

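The same Backstage instance serves everyone. Each team works from a view filtered to its own clients and modules (the catalog filters by owner and system), while the platform team sees everything. Ownership by itself only filters the view; if a team must not be able to see other clients’ entities at all, you also need a policy in Backstage’s permission framework.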

For an internal DevOps team (not an MSP), the same model works: replace “client” with “business unit” or “product team”. The architecture is the same.

Execute

Step 1: Define client systems

Each client gets a System entity in the catalog. This groups all their infrastructure together:

# catalog/systems/client-acme.yaml
apiVersion: backstage.io/v1alpha1
kind: System
metadata:
  name: client-acme-infrastructure
  title: "ACME Corp — Infrastructure"
  description: "All infrastructure managed for ACME Corp"
  tags:
    - client
    - acme
    - azure
  annotations:
    backstage.io/techdocs-ref: dir:.
  links:
    - url: https://portal.azure.com
      title: Azure Portal (ACME subscription)
spec:
  owner: team-acme
  domain: managed-services
---
apiVersion: backstage.io/v1alpha1
kind: System
metadata:
  name: client-globex-infrastructure
  title: "Globex — Infrastructure"
  description: "All infrastructure managed for Globex"
  tags:
    - client
    - globex
    - scaleway
spec:
  owner: team-globex
  domain: managed-services
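
With 15 clients, writing each System by hand gets repetitive. Here is a minimal sketch of a builder that produces the same shape as the YAML above. The function name buildClientSystem and its input fields are hypothetical (they are not part of Backstage), and the title format is simplified:

```typescript
// Hypothetical helper: a plain object builder, not a Backstage API.
interface ClientSystemInput {
  client: string;      // short client id, e.g. "acme"
  displayName: string; // e.g. "ACME Corp"
  clouds: string[];    // e.g. ["azure"]
  owner: string;       // Group name, e.g. "team-acme"
}

interface SystemEntity {
  apiVersion: 'backstage.io/v1alpha1';
  kind: 'System';
  metadata: { name: string; title: string; description: string; tags: string[] };
  spec: { owner: string; domain: string };
}

function buildClientSystem(input: ClientSystemInput): SystemEntity {
  const id = input.client.trim().toLowerCase();
  // Keep ids to lowercase letters, digits and single dashes so the
  // resulting entity name and tags stay predictable.
  if (!/^[a-z0-9]+(-[a-z0-9]+)*$/.test(id)) {
    throw new Error(`Invalid client id: "${input.client}"`);
  }
  return {
    apiVersion: 'backstage.io/v1alpha1',
    kind: 'System',
    metadata: {
      name: `client-${id}-infrastructure`,
      title: `${input.displayName} Infrastructure`,
      description: `All infrastructure managed for ${input.displayName}`,
      tags: ['client', id, ...input.clouds],
    },
    spec: { owner: input.owner, domain: 'managed-services' },
  };
}
```

Serialize the result to YAML with whatever library you already use and commit it next to the hand-written files.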

Step 2: Define client teams

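Each client has an assigned team. The team owns the client’s entities, which drives the catalog’s owner filter and, if you enable the permission framework, can back visibility rules. The names under members must match User entities in the catalog: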

# catalog/groups/teams.yaml
apiVersion: backstage.io/v1alpha1
kind: Group
metadata:
  name: team-acme
  title: "Team ACME"
  description: "Engineers assigned to ACME Corp"
spec:
  type: team
  children: []
  members:
    - victor.zaragoza
    - sarah.chen
---
apiVersion: backstage.io/v1alpha1
kind: Group
metadata:
  name: team-globex
  title: "Team Globex"
  description: "Engineers assigned to Globex"
spec:
  type: team
  children: []
  members:
    - victor.zaragoza
    - james.wilson
---
apiVersion: backstage.io/v1alpha1
kind: Group
metadata:
  name: team-platform
  title: "Platform Team"
  description: "Manages the platform itself — sees all clients"
spec:
  type: team
  children: []
  members:
    - victor.zaragoza

Step 3: Client-scoped modules

When you scaffold a module for a client (using the Golden Path template), the client field drives the system and tags:

# This is what the scaffolder generates for client = acme
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: tf-azurerm-vnet
  title: "Azure Virtual Network for ACME"
  description: "Hub VNet with 3 subnets, peering to spoke VNets"
  tags:
    - terraform
    - azure
    - client-acme
  annotations:
    github.com/project-slug: victorZKov/tf-azurerm-vnet
spec:
  type: terraform-module
  lifecycle: production
  owner: team-acme
  system: client-acme-infrastructure

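The system: client-acme-infrastructure field links this module to the ACME system. In the catalog, you can filter by system to see all of ACME’s infrastructure in one place. One caveat: entity names must be unique per kind and namespace, so if a second client gets its own copy of tf-azurerm-vnet, give each copy a distinct name (for example, a client prefix) or put each client in its own namespace.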
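
Since the system, the client tag and the owner all derive from the same client field, they should always agree. Here is a small sketch of a consistency check. It assumes the naming used in this article (client-<id>-infrastructure, client-<id>, team-<id>); clientMismatches is a hypothetical helper, not something the scaffolder provides:

```typescript
interface ComponentLike {
  metadata: { name: string; tags?: string[] };
  spec: { owner: string; system?: string };
}

// Returns a list of inconsistencies between system, tags and owner.
function clientMismatches(c: ComponentLike): string[] {
  const name = c.metadata.name;
  const match = /^client-(.+)-infrastructure$/.exec(c.spec.system ?? '');
  if (!match) {
    return [`${name}: system "${c.spec.system ?? ''}" is not a client system`];
  }
  const client = match[1];
  const problems: string[] = [];
  if (!(c.metadata.tags ?? []).includes(`client-${client}`)) {
    problems.push(`${name}: missing tag client-${client}`);
  }
  if (c.spec.owner !== `team-${client}`) {
    problems.push(`${name}: owner ${c.spec.owner} does not match team-${client}`);
  }
  return problems;
}
```

Run against the ACME VNet module above, it returns an empty list.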

Step 4: Client-specific configuration

Each client has different defaults — Azure subscriptions, naming conventions, allowed regions. Store these as a config entity in the catalog:

# catalog/clients/client-acme-config.yaml
apiVersion: backstage.io/v1alpha1
kind: Resource
metadata:
  name: client-acme-config
  title: "ACME — Configuration"
  description: "Default configuration for ACME infrastructure"
  tags:
    - config
    - client-acme
spec:
  type: client-config
  owner: team-acme
  system: client-acme-infrastructure
  dependsOn: []
  profile:
    cloud: azure
    subscription: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
    defaultRegion: westeurope
    namingPrefix: "acme"
    namingConvention: "{prefix}-{env}-{resource}-{instance}"
    environments:
      - name: production
        approvalRequired: true
        approvers: ["ciso@acme.com", "cto@acme.com"]
      - name: staging
        approvalRequired: false
      - name: development
        approvalRequired: false
    allowedRegions:
      - westeurope
      - northeurope
    tags:
      managed-by: "victorz-msp"
      client: "acme"

This configuration feeds into:

  • The scaffolder — when creating modules for ACME, it pre-fills subscription, region, and naming
  • The AI enricher — when analyzing ACME’s code, it knows the naming conventions
  • The CAB workflow (article 6) — production changes need CISO + CTO approval
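
To make the three consumers above concrete, here is a sketch of how code could read the profile. The types mirror the profile block in the YAML; the function names renderResourceName and approversFor are hypothetical, and how the profile reaches your code (catalog API, scaffolder action) is left open:

```typescript
interface EnvironmentPolicy {
  name: string;
  approvalRequired: boolean;
  approvers?: string[];
}

interface ClientProfile {
  namingPrefix: string;
  namingConvention: string; // e.g. "{prefix}-{env}-{resource}-{instance}"
  allowedRegions: string[];
  environments: EnvironmentPolicy[];
}

// Fills the {placeholders} of the naming convention.
function renderResourceName(
  profile: ClientProfile,
  parts: { env: string; resource: string; instance: string },
): string {
  const values: Record<string, string> = { prefix: profile.namingPrefix, ...parts };
  return profile.namingConvention.replace(/\{(\w+)\}/g, (_match, key: string) => {
    const value = values[key];
    if (value === undefined) {
      throw new Error(`Unknown placeholder {${key}}`);
    }
    return value;
  });
}

// Returns who must approve a change; an empty list means no approval needed.
function approversFor(profile: ClientProfile, env: string, region: string): string[] {
  if (!profile.allowedRegions.includes(region)) {
    throw new Error(`Region ${region} is not allowed for this client`);
  }
  const policy = profile.environments.find(e => e.name === env);
  if (!policy) {
    throw new Error(`Unknown environment ${env}`);
  }
  return policy.approvalRequired ? policy.approvers ?? [] : [];
}
```

With the ACME profile, renderResourceName with env "prod", resource "vnet" and instance "01" gives acme-prod-vnet-01, and a production change in westeurope returns the two approvers.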

Step 5: The platform team dashboard

The platform team needs a global view. We extend the Governance Dashboard to show client-level metrics:

// plugins/ai-governance/src/components/ClientOverview.tsx
import React, { useEffect, useState } from 'react';
import {
  Table, TableColumn, InfoCard,
} from '@backstage/core-components';
import { useApi, discoveryApiRef, fetchApiRef } from '@backstage/core-plugin-api';

interface ClientSummary {
  client: string;
  cloud: string;
  moduleCount: number;
  lastChange: string;
  driftStatus: string;
}

export const ClientOverview = () => {
  const discoveryApi = useApi(discoveryApiRef);
  const fetchApi = useApi(fetchApiRef);
  const [clients, setClients] = useState<ClientSummary[]>([]);

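  useEffect(() => {
    const load = async () => {
      const catalogUrl = await discoveryApi.getBaseUrl('catalog');
      // One System per client, all in the managed-services domain
      const res = await fetchApi.fetch(
        `${catalogUrl}/entities?filter=kind=system,spec.domain=managed-services`,
      );
      if (!res.ok) {
        throw new Error(`Catalog request failed: ${res.status} ${res.statusText}`);
      }
      const systems = await res.json();

      const summaries: ClientSummary[] = systems.map((s: any) => ({
        // client-acme-infrastructure -> acme
        client: s.metadata.name.replace(/^client-/, '').replace(/-infrastructure$/, ''),
        // The first cloud tag wins; a multi-cloud client shows only one
        cloud: s.metadata.tags?.find((t: string) =>
          ['azure', 'aws', 'scaleway', 'gcp'].includes(t)) || 'unknown',
        // Placeholders until module counts, change history and drift data are wired in
        moduleCount: 0,
        lastChange: 'N/A',
        driftStatus: 'OK',
      }));

      setClients(summaries);
    };
    load().catch(e => console.error('Failed to load client systems', e));
  }, [discoveryApi, fetchApi]);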

  const columns: TableColumn<ClientSummary>[] = [
    { title: 'Client', field: 'client' },
    { title: 'Cloud', field: 'cloud' },
    { title: 'Modules', field: 'moduleCount', type: 'numeric' },
    { title: 'Last Change', field: 'lastChange' },
    { title: 'Drift', field: 'driftStatus' },
  ];

  return (
    <InfoCard title="Client Overview">
      <Table
        columns={columns}
        data={clients}
        options={{ paging: false, search: true }}
      />
    </InfoCard>
  );
};
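
The component above leaves moduleCount at 0. One way to fill it, sketched here under the assumption that you run a second catalog query for filter=kind=component,spec.type=terraform-module, is to count components per system. countModulesBySystem is a hypothetical helper:

```typescript
interface ComponentEntity {
  metadata: { name: string };
  spec?: { type?: string; system?: string };
}

// Counts Terraform modules per system name; components without a system are skipped.
function countModulesBySystem(components: ComponentEntity[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const c of components) {
    const type = c.spec?.type;
    const system = c.spec?.system;
    if (type !== 'terraform-module' || !system) {
      continue;
    }
    counts.set(system, (counts.get(system) ?? 0) + 1);
  }
  return counts;
}
```

Inside the map over systems, moduleCount would then become counts.get(s.metadata.name) ?? 0.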

What It Looks Like

As an engineer on Team ACME:

You open Backstage. The catalog shows ACME’s infrastructure: 4 Terraform modules (VNet, AKS, SQL, Storage), all under client-acme-infrastructure. You click “Create”, select “Golden Path Terraform Module”, and the client field is pre-filled with “acme”. The generated module lands in the right system, with the right tags, naming conventions, and subscription.

As the platform team:

You open the Client Overview dashboard. You see all 15 clients in a table: which cloud each one uses, how many modules they have, when the last change was, and whether there’s drift. You click on “acme” and see their full infrastructure catalog. You click on “globex” and see theirs.

As the MSP manager:

You need to answer “which clients use AKS clusters?”. You search the catalog for type:terraform-module + tag kubernetes. Three clients. You need to plan an AKS upgrade across all of them. You know exactly which modules to update and who owns each one.
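
The manager’s question can also be answered in code once the catalog entities are loaded. This is a sketch only: upgradeTargets is a hypothetical helper, and it assumes the kubernetes tag and the client-<id>-infrastructure naming from this article:

```typescript
interface ModuleEntity {
  metadata: { name: string; tags?: string[] };
  spec: { owner: string; system?: string; type?: string };
}

interface UpgradeTarget {
  client: string;
  module: string;
  owner: string;
}

// Lists every Terraform module carrying the tag, with its client and owner.
function upgradeTargets(modules: ModuleEntity[], tag: string): UpgradeTarget[] {
  return modules
    .filter(m => m.spec.type === 'terraform-module' && (m.metadata.tags ?? []).includes(tag))
    .map(m => ({
      client: (m.spec.system ?? 'unassigned')
        .replace(/^client-/, '')
        .replace(/-infrastructure$/, ''),
      module: m.metadata.name,
      owner: m.spec.owner,
    }))
    .sort((a, b) => a.client.localeCompare(b.client));
}
```

Calling upgradeTargets(modules, 'kubernetes') gives one row per module to update, already grouped by client.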

The Catalog Filter Pattern

Backstage’s catalog API supports filtering. Useful queries for MSPs:

# All modules for a specific client
/api/catalog/entities?filter=spec.system=client-acme-infrastructure

# All Azure modules across all clients
/api/catalog/entities?filter=spec.type=terraform-module,metadata.tags=azure

# All production modules (any client)
/api/catalog/entities?filter=spec.type=terraform-module,spec.lifecycle=production

# All modules owned by a specific team
/api/catalog/entities?filter=spec.type=terraform-module,spec.owner=team-acme

The Backstage UI offers the same kind of filtering: the catalog page has filter dropdowns for kind, type, owner, and tags. Within a single filter parameter, comma-separated conditions are combined with AND.
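
If you build these URLs in code, a small helper keeps the encoding in one place. In this sketch each object becomes one filter parameter (its keys are ANDed), and several objects become repeated filter parameters, which the catalog API treats as OR. catalogEntitiesUrl is a hypothetical name, and values containing commas are not handled:

```typescript
type FilterGroup = Record<string, string>;

// Builds an /entities URL from one or more filter groups.
function catalogEntitiesUrl(baseUrl: string, groups: FilterGroup[]): string {
  const params = groups.map(group => {
    const conditions = Object.keys(group)
      .map(key => `${key}=${group[key]}`)
      .join(',');
    return `filter=${encodeURIComponent(conditions)}`;
  });
  return params.length > 0
    ? `${baseUrl}/entities?${params.join('&')}`
    : `${baseUrl}/entities`;
}

// All Azure modules across all clients
const azureModulesUrl = catalogEntitiesUrl('/api/catalog', [
  { 'spec.type': 'terraform-module', 'metadata.tags': 'azure' },
]);
```

Passing two groups, for example one per client system, returns the modules of either client in a single request.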

Checklist

  • Client systems defined in catalog (kind: System)
  • Teams assigned per client (kind: Group)
  • Scaffolded modules land in the right system
  • Client config entity has cloud, subscription, naming conventions
  • Platform team can filter across all clients
  • Catalog search returns correct results per client
  • TechDocs render per client module

Challenge

Before the next article:

  1. Create two client systems in your catalog
  2. Scaffold a module for each client — verify they land in the right system
  3. Use the catalog filters to answer: “which modules does client X have?”

In the next article, we build Pipelines from Backstage — create and manage CI/CD pipelines for Azure DevOps, GitHub Actions, and GitLab CI from one place. No more switching between three different pipeline UIs.

The full code is on GitHub.

This is article 3 of the Infrastructure Hub series. Previous: Golden Path Terraform Modules. Next: Pipelines from Backstage — one UI for all your CI/CD.
