Quantum-Safe Cloud -- Part 6

Zero Trust in AKS: mTLS with Quantum-Safe Certificates

#security #post-quantum #zero-trust #kubernetes #mtls #quantumapi #aks

The Problem

You’ve secured the perimeter. HTTPS everywhere, JWT tokens via QuantumID, secrets in QuantumVault, signed images. An attacker getting in from outside faces multiple layers.

But inside the cluster, the story is different.

In a default Kubernetes setup, service-to-service communication is unencrypted and unauthenticated. A users-api pod can call a notifications pod without proving who it is. If one pod is compromised (a container escape, a dependency with a backdoor, a misconfigured RBAC rule), the attacker has a foothold inside your cluster and can reach everything else.

Zero trust means: verify everything, trust nothing, even inside the cluster. Every connection requires authentication. Every service proves its identity on every request. There’s no “trusted inside” and “untrusted outside” — it’s all outside.

The standard mechanism for service-to-service identity is mTLS (mutual TLS). Both sides of a connection present certificates. Both sides verify the other’s certificate before exchanging data. A compromised pod that doesn’t have a valid certificate can’t connect.

The problem: certificate management at scale is hard. Every service needs a certificate. Certificates expire. They need rotation. You need a CA to issue them and a way to distribute the CA public key to every service.

The bigger problem: standard mTLS uses RSA or ECDSA certificates. We’re back to classical crypto, which a sufficiently large quantum computer running Shor’s algorithm could break.

QuantumAPI solves this: it acts as a certificate authority that issues X.509 certificates signed with ML-DSA. Your services get certificates that prove identity today and stay secure against quantum attacks.

The Solution

mTLS with QuantumAPI as the CA:

  1. QuantumAPI issues a root CA certificate (ML-DSA-65)
  2. Each service gets a leaf certificate issued by that CA, with its service identity in the Subject Alternative Name
  3. Services present their certificate on every outgoing connection
  4. Services verify the peer’s certificate against the CA before accepting any connection
  5. Certificates rotate automatically via the QuantumAPI SDK before expiry

For AKS, there are two approaches:

  • Service mesh (Linkerd, Istio): handles mTLS transparently at the sidecar level. You configure the mesh to use QuantumAPI as the CA. No application code changes.
  • Manual: each service fetches its own certificate from QuantumAPI and configures Kestrel to require client certificates.

We’ll use the manual approach — it’s more portable and makes the security model explicit in your code. If you prefer a service mesh, the certificate issuance step is the same.

Execute

ATLAS for this task

[A] ARCHITECT
  Enforce mTLS between users-api and any downstream services it calls.
  QuantumAPI acts as the CA. Certificates are ML-DSA-65.
  Certificate validity: 24 hours. Renewal starts when fewer than 4 hours remain (QuantumAPI handles re-issuance).
  Out of scope: service mesh sidecar injection, ingress mTLS (handled at gateway level).

[T] TRACE
  Service starts → requests certificate from QuantumAPI CA
  → stores cert + key in memory (not on disk)
  → Kestrel configured with ClientCertificateMode.RequireCertificate
  → Outgoing calls: HttpClient configured with the service certificate
  → Peer verifies certificate against CA public key
  → Background: CertificateRotationService checks expiry every hour
  → If < 4 hours left → request new certificate → hot-reload Kestrel

[L] LINK
  Service → QuantumAPI CA: HTTPS, X-Api-Key, POST /api/v1/certificates
  Service A → Service B: mTLS, both sides present ML-DSA cert
  Service → QuantumAPI (JWKS): for token validation (article 4)

[A] ASSEMBLE
  1. IServiceCertificateProvider: fetches + caches cert from QuantumAPI
  2. Kestrel configuration: require client cert on all endpoints
  3. HttpClient factory: attach service cert to all outgoing calls
  4. CertificateRotationService: background rotation before expiry
  5. K8s NetworkPolicy: restrict ingress to known service identities only

[S] STRESS-TEST
  QuantumAPI CA unreachable at startup?
    → Retry 5 times with backoff → if still failing, fail fast
  Certificate expired mid-deployment?
    → CertificateRotationService catches it before expiry
    → If not, new pod fails readiness check → old pods stay up
  New service added without a certificate?
    → Connection rejected by existing services → explicit failure (not silent)
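
The stress-test notes call for five attempts with backoff at startup, while the provider code in Step 1 issues the certificate in a single attempt. A minimal sketch of how that retry could look is below. `StartupRetry` and `WithBackoffAsync` are hypothetical names, not part of the QuantumAPI SDK, and the doubling delay is an assumption since the article does not specify the backoff schedule.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: not part of the QuantumAPI SDK.
public static class StartupRetry
{
    // Runs `operation` up to `maxAttempts` times, doubling the delay after
    // each failure. The final failure is rethrown so the process fails fast.
    public static async Task<T> WithBackoffAsync<T>(
        Func<Task<T>> operation,
        int maxAttempts = 5,
        TimeSpan? initialDelay = null,
        CancellationToken cancellationToken = default)
    {
        var delay = initialDelay ?? TimeSpan.FromSeconds(1);

        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return await operation();
            }
            // Retry only while attempts remain; never swallow cancellation.
            catch (Exception ex) when (attempt < maxAttempts
                                       && ex is not OperationCanceledException)
            {
                await Task.Delay(delay, cancellationToken);
                delay = TimeSpan.FromTicks(delay.Ticks * 2);
            }
        }
    }
}
```

At startup this could wrap the first certificate request, for example `await StartupRetry.WithBackoffAsync(() => certProvider.GetCertificateAsync());`. If the fifth attempt fails, the exception propagates and the pod never becomes ready.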

Step 1: Issue a service certificate

The QuantumAPI SDK exposes a certificate endpoint:

// ServiceCertificateProvider.cs
using System.Security.Cryptography.X509Certificates;

public class ServiceCertificateProvider : IServiceCertificateProvider
{
    private readonly QuantumApiClient _client;
    private X509Certificate2? _certificate;
    private readonly SemaphoreSlim _lock = new(1, 1);

    public ServiceCertificateProvider(QuantumApiClient client)
    {
        _client = client;
    }

    public async Task<X509Certificate2> GetCertificateAsync()
    {
        if (_certificate is not null && !IsExpiringSoon(_certificate))
            return _certificate;

        await _lock.WaitAsync();
        try
        {
            // Double-check after acquiring lock
            if (_certificate is not null && !IsExpiringSoon(_certificate))
                return _certificate;

            _certificate = await IssueCertificateAsync();
            return _certificate;
        }
        finally
        {
            _lock.Release();
        }
    }

    private async Task<X509Certificate2> IssueCertificateAsync()
    {
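        // Fail loudly rather than issue a certificate for a placeholder identity
        var serviceName = Environment.GetEnvironmentVariable("SERVICE_NAME")
            ?? throw new InvalidOperationException("SERVICE_NAME is not set.");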

        var result = await _client.Certificates.IssueAsync(new IssueCertificateRequest
        {
            Subject = $"CN={serviceName},O=users-api,C=EU",
            SubjectAlternativeNames = new[]
            {
                $"DNS:{serviceName}.users-api.svc.cluster.local",
                $"DNS:{serviceName}"
            },
            Algorithm = "ML-DSA-65",
            ValidityHours = 24,
            // ML-DSA is a signature-only algorithm, so only DigitalSignature applies
            KeyUsage = new[] { "DigitalSignature" },
            ExtendedKeyUsage = new[] { "ClientAuthentication", "ServerAuthentication" }
        });

        // Load the PEM certificate + private key into an X509Certificate2.
        // This needs a .NET runtime and OS crypto library that can load ML-DSA keys.
        return X509Certificate2.CreateFromPem(result.Certificate, result.PrivateKey);
    }

    // NotAfter is reported in local time, so convert it before comparing with UTC
    private static bool IsExpiringSoon(X509Certificate2 cert)
        => cert.NotAfter.ToUniversalTime() < DateTime.UtcNow.AddHours(4);
}
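
The class above implements IServiceCertificateProvider, which the article uses but never lists. The sketch below shows one shape that would fit the calls made in this article, plus the renewal decision pulled out into a pure function so it can be checked with fixed timestamps. The interface shape is an assumption, and `RenewalWindow` is a hypothetical helper name.

```csharp
using System;
using System.Security.Cryptography.X509Certificates;
using System.Threading.Tasks;

// Assumed shape: inferred from how the article calls the provider.
public interface IServiceCertificateProvider
{
    Task<X509Certificate2> GetCertificateAsync();
}

// Hypothetical helper that isolates the "renew when < 4 hours left" rule.
public static class RenewalWindow
{
    // True when less than `window` remains before expiry.
    // Both timestamps must be expressed in UTC.
    public static bool ShouldRenew(DateTime notAfterUtc, DateTime nowUtc, TimeSpan window)
        => notAfterUtc - nowUtc < window;

    // Convenience overload: X509Certificate2.NotAfter is local time,
    // so it is converted to UTC before the comparison.
    public static bool ShouldRenew(X509Certificate2 cert, DateTime nowUtc, TimeSpan window)
        => ShouldRenew(cert.NotAfter.ToUniversalTime(), nowUtc, window);
}
```

With a 4-hour window and an hourly check, a failed renewal still gets a few more attempts before the 24-hour certificate actually expires.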

Step 2: Configure Kestrel to require client certificates

// Program.cs
using System.Security.Cryptography.X509Certificates;
using Microsoft.AspNetCore.Server.Kestrel.Https;

builder.Services.AddSingleton<IServiceCertificateProvider, ServiceCertificateProvider>();
builder.Services.AddQuantumApiClient(options =>
{
    options.ApiKey = builder.Configuration["QuantumApi:ApiKey"]!;
});

// Configure Kestrel with the service certificate + client cert requirement.
// ConfigureKestrel takes a synchronous callback, so the async provider is
// called with GetAwaiter().GetResult().
builder.WebHost.ConfigureKestrel((context, options) =>
{
    var certProvider = options.ApplicationServices
        .GetRequiredService<IServiceCertificateProvider>();

    // Initial load at startup: throws if no certificate can be issued
    certProvider.GetCertificateAsync().GetAwaiter().GetResult();

    options.ConfigureHttpsDefaults(httpsOptions =>
    {
        // The selector runs on each TLS handshake, so a rotated certificate
        // is picked up without restarting Kestrel
        httpsOptions.ServerCertificateSelector = (_, _) =>
            certProvider.GetCertificateAsync().GetAwaiter().GetResult();
        httpsOptions.ClientCertificateMode = ClientCertificateMode.RequireCertificate;
        httpsOptions.ClientCertificateValidation = (cert, chain, errors) =>
        {
            // Verify the client cert is issued by QuantumAPI CA
            return ValidateAgainstQuantumApiCa(cert);
        };
    });
});

// CA validation helper (a local function, because Program.cs uses top-level statements)
static bool ValidateAgainstQuantumApiCa(X509Certificate2 clientCert)
{
    // The CA cert is fetched once at startup and cached
    // In production: fetch from QuantumAPI /api/v1/ca/certificate endpoint
    var caCert = CaStore.GetCaCertificate();

    // Trust only the QuantumAPI CA, not the machine's root store
    using var chain = new X509Chain();
    chain.ChainPolicy.TrustMode = X509ChainTrustMode.CustomRootTrust;
    chain.ChainPolicy.CustomTrustStore.Add(caCert);
    chain.ChainPolicy.RevocationMode = X509RevocationMode.NoCheck;

    return chain.Build(clientCert);
}
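
The chain check above accepts any certificate the CA has issued. Since each certificate carries its service identity in the Subject Alternative Name, a server could also narrow callers to a list of expected DNS names. The following is a minimal sketch of that idea, not something the article itself implements: `PeerIdentity` is a hypothetical helper, and it assumes .NET 7 or later for `X509SubjectAlternativeNameExtension`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography.X509Certificates;

// Hypothetical helper: checks the peer's SAN DNS names against an allow-list.
public static class PeerIdentity
{
    private const string SubjectAltNameOid = "2.5.29.17";

    // Returns the DNS names from the Subject Alternative Name extension,
    // or an empty list when the certificate has no such extension.
    public static IReadOnlyList<string> GetDnsNames(X509Certificate2 cert)
    {
        var raw = cert.Extensions[SubjectAltNameOid];
        if (raw is null)
            return Array.Empty<string>();

        var san = new X509SubjectAlternativeNameExtension(raw.RawData, raw.Critical);
        return san.EnumerateDnsNames().ToList();
    }

    // True when at least one SAN DNS name is in the allow-list.
    // Pass a set built with StringComparer.OrdinalIgnoreCase,
    // because DNS names are case-insensitive.
    public static bool IsAllowed(X509Certificate2 cert, IReadOnlySet<string> allowedDnsNames)
        => GetDnsNames(cert).Any(allowedDnsNames.Contains);
}
```

One way to use it would be to return `ValidateAgainstQuantumApiCa(cert) && PeerIdentity.IsAllowed(cert, allowedCallers)` from the validation callback, where `allowedCallers` is a set you define per service.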

Step 3: Attach the service certificate to outgoing calls

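// Program.cs (continued)
// Also needs: using System.Net.Security;
builder.Services.AddHttpClient("internal")
    .ConfigurePrimaryHttpMessageHandler(serviceProvider =>
    {
        var certProvider = serviceProvider
            .GetRequiredService<IServiceCertificateProvider>();

        return new SocketsHttpHandler
        {
            SslOptions = new SslClientAuthenticationOptions
            {
                // Resolved on each TLS handshake, so new connections present
                // the current (possibly renewed) certificate
                LocalCertificateSelectionCallback = (_, _, _, _, _) =>
                    certProvider.GetCertificateAsync().GetAwaiter().GetResult(),
                RemoteCertificateValidationCallback = (_, cert, _, errors) =>
                    cert is not null
                    // Reject a CA-issued cert that was issued for a different host name
                    && !errors.HasFlag(SslPolicyErrors.RemoteCertificateNameMismatch)
                    && ValidateAgainstQuantumApiCa(new X509Certificate2(cert))
            }
        };
    });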

Use the "internal" HttpClient for all service-to-service calls:

// NotificationServiceClient.cs
public class NotificationServiceClient
{
    private readonly HttpClient _http;

    public NotificationServiceClient(IHttpClientFactory factory)
    {
        _http = factory.CreateClient("internal");
    }

    public async Task SendAsync(NotificationRequest request)
    {
        var response = await _http.PostAsJsonAsync(
            "https://notifications.users-api.svc.cluster.local/api/v1/send",
            request);

        response.EnsureSuccessStatusCode();
    }
}

Step 4: Background certificate rotation

// CertificateRotationService.cs
public class CertificateRotationService : BackgroundService
{
    private readonly IServiceCertificateProvider _certProvider;
    private readonly ILogger<CertificateRotationService> _logger;

    public CertificateRotationService(
        IServiceCertificateProvider certProvider,
        ILogger<CertificateRotationService> logger)
    {
        _certProvider = certProvider;
        _logger = logger;
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        while (!stoppingToken.IsCancellationRequested)
        {
            try
            {
                // GetCertificateAsync handles renewal internally (IsExpiringSoon check)
                var cert = await _certProvider.GetCertificateAsync();
                // NotAfter is local time; convert it before subtracting UTC
                var hoursLeft = (cert.NotAfter.ToUniversalTime() - DateTime.UtcNow).TotalHours;
                _logger.LogInformation(
                    "Certificate valid. Expires in {Hours:F1} hours", hoursLeft);
            }
            catch (Exception ex)
            {
                _logger.LogError(ex, "Certificate rotation check failed");
            }

            await Task.Delay(TimeSpan.FromHours(1), stoppingToken);
        }
    }
}

Register it:

builder.Services.AddHostedService<CertificateRotationService>();

Step 5: Kubernetes NetworkPolicy

mTLS handles authentication. NetworkPolicy handles which pods can even attempt a connection:

# networkpolicy-users-api.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: users-api-ingress
  namespace: users-api
spec:
  podSelector:
    matchLabels:
      app: users-api
  policyTypes:
    - Ingress
  ingress:
    # Allow ingress from API gateway only
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: api-gateway
          podSelector:
            matchLabels:
              app: gateway
      ports:
        - protocol: TCP
          port: 8080
    # Allow ingress from notification service
    - from:
        - podSelector:
            matchLabels:
              app: notifications
      ports:
        - protocol: TCP
          port: 8080

The NetworkPolicy is the outer layer: it blocks TCP connections from pods that aren’t in the allowed list. mTLS is the inner layer: even if a pod passes the NetworkPolicy, it still needs a valid ML-DSA certificate. Keep in mind that a NetworkPolicy only takes effect when the cluster runs a network policy engine, which on AKS has to be enabled for the cluster.

Two layers, independent failures. A NetworkPolicy misconfiguration doesn’t help an attacker as long as mTLS is still required. A stolen certificate doesn’t help as long as NetworkPolicy still blocks the TCP connection.

What we adjusted

1. ConfigureKestrel is synchronous, certificate issuance is not. The AI used ConfigureKestrel with a synchronous lambda, but getting the certificate requires an async call to QuantumAPI. ConfigureKestrel only accepts a synchronous callback (an async lambda would return before the certificate is set), so we block on the call with GetAwaiter().GetResult() for the initial cert load (only at startup, so acceptable).

2. Certificate in HttpClient factory. The AI created a single X509Certificate2 at startup and used it forever. If the certificate rotates, the old HttpClient still has the old cert. We changed to resolving the certificate from IServiceCertificateProvider instead of holding it in a startup field, so new connections get the cached valid cert (or a renewed one if close to expiry). Connections that are already open keep the certificate they negotiated with.

3. CA cert bootstrap. The AI assumed the CA cert was already in a file. In practice, you fetch it from QuantumAPI at startup: GET /api/v1/ca/certificate returns the current CA public certificate in PEM format. Cache it in memory. If the CA rotates (rare), the next TLS handshake will fail and you’ll see it in logs immediately.
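
The validation helper in Step 2 calls CaStore.GetCaCertificate(), but CaStore itself is not shown. Below is a minimal sketch of what it could look like, assuming the CA endpoint returns the certificate as a raw PEM body. The `InitializeAsync` and `ParsePem` members are hypothetical, and the caller supplies the full URL and any authentication headers on the HttpClient, since the article does not state them for this endpoint.

```csharp
using System;
using System.Net.Http;
using System.Security.Cryptography.X509Certificates;
using System.Threading.Tasks;

// Sketch of the CaStore used by the validation helper. Only
// GetCaCertificate() appears in the article; the rest is assumed.
public static class CaStore
{
    private static X509Certificate2? _caCertificate;

    // Call once at startup, before Kestrel starts accepting connections.
    // `caCertificateUrl` is the full URL of the CA certificate endpoint.
    public static async Task InitializeAsync(HttpClient http, Uri caCertificateUrl)
    {
        var pem = await http.GetStringAsync(caCertificateUrl);
        _caCertificate = ParsePem(pem);
    }

    // Parses a PEM-encoded certificate (public part only, no private key).
    public static X509Certificate2 ParsePem(string pem)
        => X509Certificate2.CreateFromPem(pem);

    // Returns the cached CA certificate, or throws if startup skipped the fetch.
    public static X509Certificate2 GetCaCertificate()
        => _caCertificate
           ?? throw new InvalidOperationException(
               "CA certificate not loaded. Call InitializeAsync at startup.");
}
```

Because the certificate is cached for the lifetime of the process, a CA rotation would need a pod restart (or a second call to `InitializeAsync`) before handshakes succeed again.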

Template

=== ZERO TRUST mTLS CHECKLIST ===

CERTIFICATE AUTHORITY
[ ] QuantumAPI CA: fetch root certificate at startup, cache in memory
[ ] Algorithm: ML-DSA-65 (default) — verify this in the issued cert

SERVICE CERTIFICATES
[ ] Each service: unique SAN (DNS:{service}.{namespace}.svc.cluster.local)
[ ] Validity: 24 hours (short-lived → rotation is automatic, not manual)
[ ] Storage: in-memory only, never on disk, never in K8s Secrets

KESTREL (server)
[ ] ClientCertificateMode.RequireCertificate on all internal endpoints
[ ] ClientCertificateValidation: verify against QuantumAPI CA
[ ] External-facing endpoints (ingress): separate — no client cert required from users

HTTPCLIENT (client)
[ ] Named client "internal" for all service-to-service calls
[ ] SslClientAuthenticationOptions with service cert + CA validation
[ ] Do NOT use "internal" for external API calls (wrong cert context)

ROTATION
[ ] BackgroundService checks cert expiry every hour
[ ] Renew when < 4 hours left (24-hour cert → renew at 20 hours)
[ ] Log expiry time on every check (visibility into rotation health)

KUBERNETES
[ ] NetworkPolicy restricts ingress to known pods/namespaces
[ ] mTLS + NetworkPolicy = two independent layers
[ ] Test: try connecting from a pod not in the allowed list → expect the TCP connection to be blocked (usually a timeout)
[ ] Test: try connecting with an expired cert → expect TLS handshake failure

Challenge

You now have the full picture of a quantum-safe application:

  • Secrets in QuantumVault ✓
  • Data encrypted with ML-KEM + AES-256-GCM ✓
  • Authentication via QuantumID (ML-DSA tokens) ✓
  • Pipeline secrets from QuantumVault, images signed with ML-DSA ✓
  • Service-to-service mTLS with ML-DSA certificates ✓

Article 7 puts it all together: the complete reference architecture. One diagram, one ATLAS document, one GOTCHA prompt. The thing you bring to your team and say “this is how we build secure systems.”

If this series helps you, consider buying me a coffee.
