The AI-Native IDP -- Part 6
The AI Governance Dashboard
The Problem
You’ve built four AI features into the platform. The catalog enriches itself. The scaffolder generates projects from plain English. The code review plugin reads GOTCHA.md before reviewing PRs. The RAG system answers questions from TechDocs.
Now your CTO asks: “How much does this cost?” And you don’t know.
Then a team lead asks: “Can we disable the AI code review for our team? It’s too noisy for our frontend services.” And you can’t do that without changing code.
Then someone from security asks: “Which services have AI-generated catalog metadata? Can we audit that?” And there’s no log to check.
Every AI feature you add creates three needs: visibility (what’s happening), control (who can do what), and audit (what happened). Without governance, AI features become a black box that only the platform team understands — and even they lose track after a few weeks.
The Solution
A Backstage plugin with two parts:
- Backend: Logs every AI action (enrichment, scaffold, review, RAG query), tracks costs, and enforces policies
- Frontend: A dashboard that shows usage and costs, and lets the platform team configure policies per team and action
Every AI endpoint already exists. We don’t need to change their logic — we add a middleware layer that logs and controls access.
The backend uses PostgreSQL for storage and Npgsql for data access. The frontend uses Material UI — the same component library that Backstage uses.
Execute
The Usage Log Table
CREATE TABLE ai_usage_log (
id SERIAL PRIMARY KEY,
timestamp TIMESTAMP DEFAULT NOW(),
action VARCHAR(50) NOT NULL,
entity_ref VARCHAR(255),
team VARCHAR(100),
user_ref VARCHAR(255),
input_tokens INTEGER DEFAULT 0,
output_tokens INTEGER DEFAULT 0,
model VARCHAR(100),
duration_ms INTEGER DEFAULT 0,
status VARCHAR(20) DEFAULT 'success',
metadata JSONB DEFAULT '{}'
);
CREATE INDEX idx_usage_action ON ai_usage_log(action);
CREATE INDEX idx_usage_team ON ai_usage_log(team);
CREATE INDEX idx_usage_timestamp ON ai_usage_log(timestamp);
CREATE TABLE ai_policies (
id SERIAL PRIMARY KEY,
team VARCHAR(100) NOT NULL, -- team name, or '*' for every team
action VARCHAR(50) NOT NULL,
enabled BOOLEAN DEFAULT true,
max_daily_calls INTEGER,
updated_at TIMESTAMP DEFAULT NOW(),
UNIQUE (team, action)
);
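The logger that follows reads ai_policies with a simple precedence rule: a row for the exact team wins over a '*' wildcard row, and no row at all means the action is allowed. A minimal TypeScript sketch of that rule, using an in-memory array in place of the table (the PolicyRow type, the function names, and the sample rows are illustrative assumptions, not part of the service):

```typescript
// Hypothetical in-memory mirror of an ai_policies row (fields follow the table columns).
interface PolicyRow {
  team: string; // team name, or '*' for every team
  action: string;
  enabled: boolean;
  maxDailyCalls: number | null; // null means no daily limit
}

// Pick the row that applies: an exact team match beats the '*' wildcard.
function resolvePolicy(
  rows: PolicyRow[],
  action: string,
  team: string,
): PolicyRow | undefined {
  const candidates = rows.filter(
    r => r.action === action && (r.team === team || r.team === '*'),
  );
  return candidates.find(r => r.team === team) ?? candidates[0];
}

// No matching row means the action is allowed by default.
function isActionAllowed(rows: PolicyRow[], action: string, team: string): boolean {
  const policy = resolvePolicy(rows, action, team);
  return policy === undefined || policy.enabled;
}

// Illustrative data: review is on for everyone except team-frontend.
const samplePolicies: PolicyRow[] = [
  { team: '*', action: 'review', enabled: true, maxDailyCalls: null },
  { team: 'team-frontend', action: 'review', enabled: false, maxDailyCalls: null },
];

console.log(isActionAllowed(samplePolicies, 'review', 'team-frontend')); // false
console.log(isActionAllowed(samplePolicies, 'review', 'team-payments')); // true
console.log(isActionAllowed(samplePolicies, 'scaffold', 'team-frontend')); // true (no row)
```

The UNIQUE (team, action) constraint is what makes "the exact row" unambiguous: there can be at most one row per team and action, plus at most one wildcard row per action.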
The Logging Middleware
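In the .NET AI service, a small wrapper class sits around every AI call. It checks the team's policy and daily limit first, then runs the operation and records the outcome: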
// Middleware/AiUsageMiddleware.cs
public class AiUsageLogger
{
private readonly NpgsqlDataSource _db;
public AiUsageLogger(NpgsqlDataSource db) => _db = db;
public async Task<T> Track<T>(
string action,
string? entityRef,
string? team,
string? userRef,
Func<Task<(T result, int inputTokens, int outputTokens)>> operation)
{
var sw = System.Diagnostics.Stopwatch.StartNew();
var status = "success";
try
{
// Check policy first
if (!await IsAllowed(action, team))
{
status = "blocked";
throw new InvalidOperationException(
$"Action '{action}' is disabled for team '{team}'");
}
// Check daily limit
if (!await WithinDailyLimit(action, team))
{
status = "rate_limited";
throw new InvalidOperationException(
$"Daily limit reached for '{action}' (team: {team})");
}
var (result, inputTokens, outputTokens) = await operation();
sw.Stop();
await LogUsage(action, entityRef, team, userRef,
inputTokens, outputTokens, sw.ElapsedMilliseconds, status);
return result;
}
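catch (Exception)
{
// Blocked and rate-limited calls keep their status;
// any other exception is recorded as a failed operation.
if (status == "success") status = "error";
sw.Stop();
await LogUsage(action, entityRef, team, userRef,
0, 0, sw.ElapsedMilliseconds, status);
throw;
}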
}
private async Task LogUsage(
string action, string? entityRef, string? team,
string? userRef, int inputTokens, int outputTokens,
long durationMs, string status)
{
await using var cmd = _db.CreateCommand();
cmd.CommandText = """
INSERT INTO ai_usage_log
(action, entity_ref, team, user_ref,
input_tokens, output_tokens, duration_ms, status)
VALUES ($1, $2, $3, $4, $5, $6, $7, $8)
""";
cmd.Parameters.AddWithValue(action);
cmd.Parameters.AddWithValue(entityRef ?? (object)DBNull.Value);
cmd.Parameters.AddWithValue(team ?? (object)DBNull.Value);
cmd.Parameters.AddWithValue(userRef ?? (object)DBNull.Value);
cmd.Parameters.AddWithValue(inputTokens);
cmd.Parameters.AddWithValue(outputTokens);
cmd.Parameters.AddWithValue((int)durationMs);
cmd.Parameters.AddWithValue(status);
await cmd.ExecuteNonQueryAsync();
}
private async Task<bool> IsAllowed(string action, string? team)
{
if (team == null) return true;
await using var cmd = _db.CreateCommand();
cmd.CommandText = """
SELECT enabled FROM ai_policies
WHERE action = $1 AND (team = $2 OR team = '*')
ORDER BY CASE WHEN team = $2 THEN 0 ELSE 1 END
LIMIT 1
""";
cmd.Parameters.AddWithValue(action);
cmd.Parameters.AddWithValue(team);
var result = await cmd.ExecuteScalarAsync();
return result == null || (bool)result;
}
private async Task<bool> WithinDailyLimit(string action, string? team)
{
if (team == null) return true;
await using var cmd = _db.CreateCommand();
cmd.CommandText = """
SELECT p.max_daily_calls, COUNT(l.id) as today_calls
FROM ai_policies p
LEFT JOIN ai_usage_log l ON l.action = p.action
AND l.team = p.team
AND l.timestamp >= CURRENT_DATE
AND l.status = 'success'
WHERE p.action = $1 AND p.team = $2
GROUP BY p.max_daily_calls
""";
cmd.Parameters.AddWithValue(action);
cmd.Parameters.AddWithValue(team);
await using var reader = await cmd.ExecuteReaderAsync();
if (!await reader.ReadAsync()) return true;
var limit = reader.IsDBNull(0) ? int.MaxValue : reader.GetInt32(0);
var calls = reader.GetInt64(1);
return calls < limit;
}
}
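The call-site shape of Track is easier to see in a compact form. Below is a rough TypeScript sketch of the same wrapper pattern, with an in-memory array standing in for ai_usage_log and a hard-coded set standing in for the policy lookup. Every name here (track, usageLog, disabledPairs, runDemo) is a hypothetical stand-in, and recording failed operations with an 'error' status is an assumption of this sketch.

```typescript
type CallStatus = 'success' | 'blocked' | 'error';

interface UsageEntry {
  action: string;
  team: string | null;
  inputTokens: number;
  outputTokens: number;
  durationMs: number;
  status: CallStatus;
}

// In-memory stand-in for the ai_usage_log table.
const usageLog: UsageEntry[] = [];

// Hypothetical policy check; the real one queries ai_policies.
const disabledPairs = new Set(['review:team-frontend']);
const isEnabledFor = (action: string, team: string | null): boolean =>
  team === null || !disabledPairs.has(`${action}:${team}`);

async function track<T>(
  action: string,
  team: string | null,
  operation: () => Promise<{ result: T; inputTokens: number; outputTokens: number }>,
): Promise<T> {
  const started = Date.now();
  const log = (status: CallStatus, inputTokens = 0, outputTokens = 0) =>
    usageLog.push({
      action, team, inputTokens, outputTokens,
      durationMs: Date.now() - started, status,
    });

  // Policy check happens before any tokens are spent.
  if (!isEnabledFor(action, team)) {
    log('blocked');
    throw new Error(`Action '${action}' is disabled for team '${team}'`);
  }
  try {
    const { result, inputTokens, outputTokens } = await operation();
    log('success', inputTokens, outputTokens);
    return result;
  } catch (err) {
    log('error'); // assumption: failures are audited too
    throw err;
  }
}

// Three calls: one succeeds, one is blocked by policy, one fails inside the model call.
async function runDemo(): Promise<{ answer: string; statuses: CallStatus[] }> {
  const answer = await track<string>('ask', 'team-payments', async () => ({
    result: 'Deploy with the release pipeline.',
    inputTokens: 420,
    outputTokens: 96,
  }));
  await track<string>('review', 'team-frontend', async () => ({
    result: 'unused', inputTokens: 0, outputTokens: 0,
  })).catch(() => undefined);
  await track<string>('enrich', 'team-payments', async () => {
    throw new Error('model timeout');
  }).catch(() => undefined);
  return { answer, statuses: usageLog.map(e => e.status) };
}
```

The operation callback returns the token counts alongside the result, so the wrapper never needs to know which model client produced them.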
The Usage Endpoints
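The AI service exposes endpoints for the dashboard. To keep the handlers self-contained, each one creates its own NpgsqlDataSource. In a real service, register a single data source at startup and inject it, because each data source owns its own connection pool: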
// Usage summary
app.MapGet("/api/governance/usage", async (string? action, string? team, int? days, IConfiguration config) =>
{
var connStr = config["Rag:PostgresConnection"];
if (string.IsNullOrEmpty(connStr))
return Results.Json(new { error = "Governance not configured." }, statusCode: 503);
var daysValue = days ?? 30;
await using var dataSource = NpgsqlDataSource.Create(connStr);
await using var cmd = dataSource.CreateCommand();
cmd.CommandText = """
SELECT action, team, status,
COUNT(*) as call_count,
SUM(input_tokens) as total_input_tokens,
SUM(output_tokens) as total_output_tokens,
AVG(duration_ms) as avg_duration_ms
FROM ai_usage_log
WHERE timestamp >= NOW() - INTERVAL '1 day' * $1
AND ($2 = '' OR action = $2)
AND ($3 = '' OR team = $3)
GROUP BY action, team, status
ORDER BY call_count DESC
""";
cmd.Parameters.AddWithValue(daysValue > 0 ? daysValue : 30);
cmd.Parameters.AddWithValue(action ?? "");
cmd.Parameters.AddWithValue(team ?? "");
var results = new List<object>();
await using var reader = await cmd.ExecuteReaderAsync();
while (await reader.ReadAsync())
{
results.Add(new
{
Action = reader.GetString(0),
Team = reader.IsDBNull(1) ? "unknown" : reader.GetString(1),
Status = reader.GetString(2),
CallCount = reader.GetInt64(3),
TotalInputTokens = reader.GetInt64(4),
TotalOutputTokens = reader.GetInt64(5),
AvgDurationMs = reader.GetDouble(6)
});
}
return Results.Ok(results);
});
// Cost estimation
app.MapGet("/api/governance/costs", async (int? days, IConfiguration config) =>
{
var connStr = config["Rag:PostgresConnection"];
if (string.IsNullOrEmpty(connStr))
return Results.Json(new { error = "Governance not configured." }, statusCode: 503);
var daysValue = days ?? 30;
await using var dataSource = NpgsqlDataSource.Create(connStr);
await using var cmd = dataSource.CreateCommand();
cmd.CommandText = """
SELECT DATE(timestamp) as day,
SUM(input_tokens) as input_tokens,
SUM(output_tokens) as output_tokens
FROM ai_usage_log
WHERE timestamp >= NOW() - INTERVAL '1 day' * $1
AND status = 'success'
GROUP BY DATE(timestamp)
ORDER BY day
""";
cmd.Parameters.AddWithValue(daysValue > 0 ? daysValue : 30);
var results = new List<object>();
await using var reader = await cmd.ExecuteReaderAsync();
while (await reader.ReadAsync())
{
var inputTokens = reader.GetInt64(1);
var outputTokens = reader.GetInt64(2);
// Adjust pricing per your provider and model
var cost = (inputTokens * 2.0 / 1_000_000) +
(outputTokens * 6.0 / 1_000_000);
results.Add(new
{
Day = reader.GetDateTime(0).ToString("yyyy-MM-dd"),
InputTokens = inputTokens,
OutputTokens = outputTokens,
EstimatedCostUsd = Math.Round(cost, 4)
});
}
return Results.Ok(results);
});
// Policies CRUD
app.MapGet("/api/governance/policies", async (IConfiguration config) =>
{
var connStr = config["Rag:PostgresConnection"];
if (string.IsNullOrEmpty(connStr))
return Results.Json(new { error = "Governance not configured." }, statusCode: 503);
await using var dataSource = NpgsqlDataSource.Create(connStr);
await using var cmd = dataSource.CreateCommand();
cmd.CommandText = "SELECT id, team, action, enabled, max_daily_calls FROM ai_policies ORDER BY team, action";
var results = new List<object>();
await using var reader = await cmd.ExecuteReaderAsync();
while (await reader.ReadAsync())
{
results.Add(new
{
Id = reader.GetInt32(0),
Team = reader.GetString(1),
Action = reader.GetString(2),
Enabled = reader.GetBoolean(3),
MaxDailyCalls = reader.IsDBNull(4) ? (int?)null : reader.GetInt32(4)
});
}
return Results.Ok(results);
});
app.MapPut("/api/governance/policies", async (PolicyUpdate update, IConfiguration config) =>
{
var connStr = config["Rag:PostgresConnection"];
if (string.IsNullOrEmpty(connStr))
return Results.Json(new { error = "Governance not configured." }, statusCode: 503);
await using var dataSource = NpgsqlDataSource.Create(connStr);
await using var cmd = dataSource.CreateCommand();
cmd.CommandText = """
INSERT INTO ai_policies (team, action, enabled, max_daily_calls)
VALUES ($1, $2, $3, $4)
ON CONFLICT (team, action)
DO UPDATE SET enabled = EXCLUDED.enabled,
max_daily_calls = EXCLUDED.max_daily_calls,
updated_at = NOW()
""";
cmd.Parameters.AddWithValue(update.Team);
cmd.Parameters.AddWithValue(update.Action);
cmd.Parameters.AddWithValue(update.Enabled);
cmd.Parameters.AddWithValue(update.MaxDailyCalls.HasValue
? update.MaxDailyCalls.Value : DBNull.Value);
await cmd.ExecuteNonQueryAsync();
return Results.Ok();
});
record PolicyUpdate(string Team, string Action, bool Enabled, int? MaxDailyCalls);
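The cost endpoint applies a flat placeholder rate of $2 per million input tokens and $6 per million output tokens. As a worked example of that arithmetic, here is a small TypeScript sketch. The TokenPricing type and function name are illustrative assumptions, and the rates are the article's placeholders rather than any provider's price list.

```typescript
// Price in USD per one million tokens.
interface TokenPricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

// Placeholder rates copied from the cost endpoint; adjust per provider and model.
const placeholderPricing: TokenPricing = { inputPerMillion: 2.0, outputPerMillion: 6.0 };

function estimateCostUsd(
  inputTokens: number,
  outputTokens: number,
  pricing: TokenPricing = placeholderPricing,
): number {
  const cost =
    (inputTokens * pricing.inputPerMillion) / 1000000 +
    (outputTokens * pricing.outputPerMillion) / 1000000;
  // Round to 4 decimal places, similar to Math.Round(cost, 4) in the handler.
  return Math.round(cost * 10000) / 10000;
}

// 1.5M input tokens cost $3.00 and 250k output tokens cost $1.50.
console.log(estimateCostUsd(1500000, 250000)); // 4.5
```

Passing the pricing as a parameter keeps the door open for per-model rates later, since the usage log already has a model column.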
The Dashboard Component
// plugins/ai-governance/src/components/GovernanceDashboard.tsx
import React, { useEffect, useState } from 'react';
import {
Card,
CardContent,
CardHeader,
Grid,
Typography,
Table,
TableBody,
TableCell,
TableHead,
TableRow,
Switch,
} from '@material-ui/core';
import { useApi, fetchApiRef, discoveryApiRef } from '@backstage/core-plugin-api';
interface UsageSummary {
action: string;
team: string;
status: string;
callCount: number;
totalInputTokens: number;
totalOutputTokens: number;
avgDurationMs: number;
}
interface CostEntry {
day: string;
inputTokens: number;
outputTokens: number;
estimatedCostUsd: number;
}
interface Policy {
id: number;
team: string;
action: string;
enabled: boolean;
maxDailyCalls: number | null;
}
export const GovernanceDashboard = () => {
const [usage, setUsage] = useState<UsageSummary[]>([]);
const [costs, setCosts] = useState<CostEntry[]>([]);
const [policies, setPolicies] = useState<Policy[]>([]);
const fetchApi = useApi(fetchApiRef);
const discoveryApi = useApi(discoveryApiRef);
const [days, setDays] = useState(30);
useEffect(() => {
fetchData();
}, [days]);
const fetchData = async () => {
const proxyUrl = await discoveryApi.getBaseUrl('proxy');
const [usageRes, costsRes, policiesRes] = await Promise.all([
fetchApi.fetch(`${proxyUrl}/ai-service/api/governance/usage?days=${days}`),
fetchApi.fetch(`${proxyUrl}/ai-service/api/governance/costs?days=${days}`),
fetchApi.fetch(`${proxyUrl}/ai-service/api/governance/policies`),
]);
if (usageRes.ok) setUsage(await usageRes.json());
if (costsRes.ok) setCosts(await costsRes.json());
if (policiesRes.ok) setPolicies(await policiesRes.json());
};
const totalCost = costs.reduce((sum, c) => sum + c.estimatedCostUsd, 0);
const totalCalls = usage
.filter(u => u.status === 'success')
.reduce((sum, u) => sum + u.callCount, 0);
const togglePolicy = async (policy: Policy) => {
const proxyUrl = await discoveryApi.getBaseUrl('proxy');
await fetchApi.fetch(`${proxyUrl}/ai-service/api/governance/policies`, {
method: 'PUT',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
team: policy.team,
action: policy.action,
enabled: !policy.enabled,
maxDailyCalls: policy.maxDailyCalls,
}),
});
fetchData();
};
return (
<Grid container spacing={3}>
{/* Summary cards */}
<Grid item md={4}>
<Card>
<CardContent>
<Typography variant="h4">{totalCalls}</Typography>
<Typography color="textSecondary">
AI calls (last {days} days)
</Typography>
</CardContent>
</Card>
</Grid>
<Grid item md={4}>
<Card>
<CardContent>
<Typography variant="h4">
${totalCost.toFixed(2)}
</Typography>
<Typography color="textSecondary">
Estimated cost (last {days} days)
</Typography>
</CardContent>
</Card>
</Grid>
<Grid item md={4}>
<Card>
<CardContent>
<Typography variant="h4">
{usage.filter(u => u.status === 'blocked').reduce((sum, u) => sum + u.callCount, 0)}
</Typography>
<Typography color="textSecondary">
Blocked by policy
</Typography>
</CardContent>
</Card>
</Grid>
{/* Usage by action */}
<Grid item md={12}>
<Card>
<CardHeader title="Usage by Action" />
<CardContent>
<Table size="small">
<TableHead>
<TableRow>
<TableCell>Action</TableCell>
<TableCell>Team</TableCell>
<TableCell align="right">Calls</TableCell>
<TableCell align="right">Input Tokens</TableCell>
<TableCell align="right">Output Tokens</TableCell>
<TableCell align="right">Avg Duration</TableCell>
<TableCell>Status</TableCell>
</TableRow>
</TableHead>
<TableBody>
{usage.map((row, i) => (
<TableRow key={i}>
<TableCell>{row.action}</TableCell>
<TableCell>{row.team}</TableCell>
<TableCell align="right">{row.callCount}</TableCell>
<TableCell align="right">
{row.totalInputTokens.toLocaleString()}
</TableCell>
<TableCell align="right">
{row.totalOutputTokens.toLocaleString()}
</TableCell>
<TableCell align="right">
{Math.round(row.avgDurationMs)}ms
</TableCell>
<TableCell>{row.status}</TableCell>
</TableRow>
))}
</TableBody>
</Table>
</CardContent>
</Card>
</Grid>
{/* Policies */}
<Grid item md={12}>
<Card>
<CardHeader title="Policies" />
<CardContent>
<Table size="small">
<TableHead>
<TableRow>
<TableCell>Team</TableCell>
<TableCell>Action</TableCell>
<TableCell>Enabled</TableCell>
<TableCell>Daily Limit</TableCell>
</TableRow>
</TableHead>
<TableBody>
{policies.map(policy => (
<TableRow key={policy.id}>
<TableCell>{policy.team}</TableCell>
<TableCell>{policy.action}</TableCell>
<TableCell>
<Switch
checked={policy.enabled}
onChange={() => togglePolicy(policy)}
/>
</TableCell>
<TableCell>
{policy.maxDailyCalls ?? 'unlimited'}
</TableCell>
</TableRow>
))}
</TableBody>
</Table>
</CardContent>
</Card>
</Grid>
{/* Daily costs */}
<Grid item md={12}>
<Card>
<CardHeader title="Daily Cost Breakdown" />
<CardContent>
<Table size="small">
<TableHead>
<TableRow>
<TableCell>Date</TableCell>
<TableCell align="right">Input Tokens</TableCell>
<TableCell align="right">Output Tokens</TableCell>
<TableCell align="right">Est. Cost (USD)</TableCell>
</TableRow>
</TableHead>
<TableBody>
{costs.map((row, i) => (
<TableRow key={i}>
<TableCell>{row.day}</TableCell>
<TableCell align="right">
{row.inputTokens.toLocaleString()}
</TableCell>
<TableCell align="right">
{row.outputTokens.toLocaleString()}
</TableCell>
<TableCell align="right">
${row.estimatedCostUsd.toFixed(4)}
</TableCell>
</TableRow>
))}
</TableBody>
</Table>
</CardContent>
</Card>
</Grid>
</Grid>
);
};
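One detail worth checking in the summary cards: the usage endpoint returns one row per (action, team, status) group, so a card total is a sum of callCount over matching rows, not the number of rows. A minimal sketch of that aggregation as a pure function (the UsageRow type is trimmed to the fields needed, and the sample rows are made up for illustration):

```typescript
interface UsageRow {
  action: string;
  team: string;
  status: string;
  callCount: number;
}

// Each row is one (action, team, status) group, so totals are sums of callCount.
function sumCalls(rows: UsageRow[], status: string): number {
  return rows
    .filter(r => r.status === status)
    .reduce((sum, r) => sum + r.callCount, 0);
}

// Illustrative rows, shaped like the /api/governance/usage response.
const sampleUsage: UsageRow[] = [
  { action: 'review', team: 'team-payments', status: 'success', callCount: 612 },
  { action: 'ask', team: 'team-payments', status: 'success', callCount: 187 },
  { action: 'scaffold', team: 'team-frontend', status: 'blocked', callCount: 3 },
];

console.log(sumCalls(sampleUsage, 'success')); // 799
console.log(sumCalls(sampleUsage, 'blocked')); // 3
```

Keeping the aggregation outside the component also makes it easy to unit test without rendering anything.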
Adding the Dashboard Page to Backstage
// plugins/ai-governance/src/plugin.ts (frontend)
import {
createPlugin,
createRouteRef,
createRoutableExtension,
} from '@backstage/core-plugin-api';
const rootRouteRef = createRouteRef({ id: 'ai-governance' });
export const aiGovernancePlugin = createPlugin({
id: 'ai-governance',
routes: {
root: rootRouteRef,
},
});
export const GovernancePage = aiGovernancePlugin.provide(
createRoutableExtension({
name: 'GovernancePage',
component: () =>
import('./components/GovernanceDashboard').then(
m => m.GovernanceDashboard,
),
mountPoint: rootRouteRef,
}),
);
In packages/app/src/App.tsx:
import { GovernancePage } from '@internal/plugin-ai-governance';
// Inside <FlatRoutes>:
<Route path="/ai-governance" element={<GovernancePage />} />
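And in the sidebar (packages/app/src/components/Root/Root.tsx in a default Backstage app), with DashboardIcon imported from @material-ui/icons/Dashboard: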
<SidebarItem icon={DashboardIcon} to="ai-governance" text="AI Governance" />
What the Dashboard Shows
The platform team opens /ai-governance and sees:
- Total AI calls: 1,247 in the last 30 days
- Estimated cost: $8.34
- Blocked calls: 3 (team-frontend tried to use the scaffolder, which is disabled for them)
The usage table shows:
- enrich: 420 calls, mostly overnight (the 24h scheduler)
- scaffold: 28 calls, spread across 4 teams
- review: 612 calls, triggered by every PR
- ask: 187 calls, developers searching docs
The policies table lets them:
- Disable AI code review for team-frontend (too noisy for CSS changes)
- Set a daily limit of 10 scaffold calls per team (prevent abuse)
- Keep enrichment and RAG enabled for everyone
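As a sketch of what those choices look like as data, the payloads below have the shape of the PolicyUpdate record accepted by PUT /api/governance/policies. The team names and the buildPolicies helper are hypothetical. Because the daily-limit query matches the team name exactly, this sketch creates one scaffold row per team rather than a single wildcard row.

```typescript
// Mirrors the PolicyUpdate record on the .NET side.
interface PolicyUpdate {
  team: string;
  action: string;
  enabled: boolean;
  maxDailyCalls: number | null;
}

function buildPolicies(teamsWithScaffoldLimit: string[]): PolicyUpdate[] {
  // One limit row per team: 10 scaffold calls per day.
  const scaffoldLimits = teamsWithScaffoldLimit.map(
    (team): PolicyUpdate => ({
      team,
      action: 'scaffold',
      enabled: true,
      maxDailyCalls: 10,
    }),
  );
  return [
    // Turn AI code review off for the frontend team.
    { team: 'team-frontend', action: 'review', enabled: false, maxDailyCalls: null },
    ...scaffoldLimits,
    // Enrichment and RAG need no rows: an action without a policy is allowed.
  ];
}

// Hypothetical team names.
const policyPayloads = buildPolicies(['team-payments', 'team-platform']);
console.log(JSON.stringify(policyPayloads, null, 2));
```

Each object can be sent as the JSON body of one PUT request, the same way togglePolicy does in the dashboard.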
Checklist
- ai_usage_log table created with proper indexes
- ai_policies table created with team + action uniqueness
- AiUsageLogger middleware wraps all AI service endpoints
- Usage, costs, and policies endpoints return correct data
- Dashboard shows summary cards, usage table, policies, and cost breakdown
- Policy toggle (enable/disable) works from the dashboard
- Blocked calls logged with status = 'blocked'
- Dashboard accessible from Backstage sidebar
Before the Next Article
The platform team can now see what’s happening, how much it costs, and control who uses what. Every AI feature is logged, every team has configurable policies, and the costs are visible.
But all of this runs when things are normal. What happens when things break? When the invoice-api returns 500 errors at 3am? The on-call engineer opens the incident page and sees… logs. Thousands of lines of logs. No context about what this service does, what it depends on, or what changed recently.
What if the incident page could read the catalog, check recent deployments, search the logs, and suggest what went wrong — before the engineer finishes their coffee?
That’s article 7: AI-Assisted Incident Response.
The full code is on GitHub.
Troubleshooting
Proxy returns empty responses
Make sure allowedMethods includes all HTTP methods used by the dashboard. In app-config.yaml:
proxy:
endpoints:
/ai-service:
target: http://localhost:5100
allowedHeaders: ['Content-Type']
allowedMethods: ['GET', 'POST', 'PUT']
better-sqlite3 fails to build on Node 24+
Backstage’s default database is better-sqlite3, which requires native compilation. If it fails, switch to PostgreSQL in app-config.yaml:
backend:
database:
client: pg
connection:
host: localhost
port: 5432
user: postgres
password: your-password
Dashboard shows all zeros
The governance endpoints need the Rag:PostgresConnection config in the AI service. Make sure you start the AI service with the connection string:
Rag__PostgresConnection="Host=localhost;Database=forge;Username=postgres;Password=forge-dev" dotnet run
This is article 6 of the AI-Native IDP series. Previous: TechDocs RAG. Next: AI-Assisted Incident Response.