The AI Code Review Plugin

The Problem

A developer opens a PR. The reviewer looks at the diff and sees 400 lines of code. They check: does it compile? Does it follow the style guide? Are there obvious bugs?

But they do not check: does this service use the repository pattern? Is it acceptable to call Service Bus synchronously here? Does this endpoint need authentication? They do not check because they do not know. They are reviewing the code in isolation, without the context of what this service is, which rules it follows, or how it fits into the system.

This happens on every team. Reviewer A knows the invoice service well but gets assigned a PR for the notification service. They spend 20 minutes understanding the context before they can give useful feedback. And even then, things slip through. Architectural patterns that are obvious to the team that built the service are invisible to an outside reviewer.

In article 2 we taught the catalog to understand services. In article 3 we generated projects with GOTCHA prompts. Now we use both, the catalog metadata and the GOTCHA prompt, to run code reviews with context.

The Solution

A Backstage plugin that hooks into GitHub pull requests and runs an AI review. Unlike generic AI review tools, this one reads the service catalog first.

The flow:

A webhook fires when a PR is opened or updated
The plugin looks up the repository in the Backstage catalog
It reads the service metadata: description, dependencies, tags, owner
It reads the GOTCHA.md file from the repo (generated in article 3)
It fetches the PR diff from GitHub
It sends everything to the AI model: the catalog context, the GOTCHA heuristics, and the diff
The AI reviews the code with full architectural context
The plugin posts the review as a comment on the GitHub PR

The reviewer still reviews. But now they have an AI assistant that already knows the service, the rules, and the patterns.

Execute

The AI Review Endpoint

We add a new endpoint to the .NET AI service:

app.MapPost("/api/review", async (ReviewRequest request, IConfiguration config) =>
{
    if (string.IsNullOrWhiteSpace(request.Diff))
        return Results.BadRequest(new { error = "Diff is required." });

    var endpoint = config["AI:Endpoint"];
    var apiKey = config["AI:Key"];
    var model = config["AI:ChatModel"] ?? "mistral-small-3.2-24b-instruct-2506";
    var provider = config["AI:Provider"] ?? "openai";

    ChatClient chatClient = provider.ToLowerInvariant() switch
    {
        "azure" => new AzureOpenAIClient(
            new Uri(endpoint!), new ApiKeyCredential(apiKey!))
            .GetChatClient(model),
        _ => new OpenAIClient(
            new ApiKeyCredential(apiKey!),
            new OpenAIClientOptions { Endpoint = new Uri(endpoint!) })
            .GetChatClient(model),
    };

    var systemPrompt = $"""
        You are a senior code reviewer for the {request.ServiceName} service.

        SERVICE CONTEXT (from the Software Catalog):
        Description: {request.ServiceDescription}
        Tags: {string.Join(", ", request.Tags)}
        Dependencies: {string.Join(", ", request.Dependencies)}

        ARCHITECTURAL RULES (from GOTCHA.md):
        {request.GotchaHeuristics}

        Review the following pull request diff. Focus on:
        1. Violations of the architectural rules listed above
        2. Security issues (authentication, input validation, secrets)
        3. Patterns that contradict the service's documented purpose
        4. Missing error handling for the specific dependencies this service uses

        Do NOT comment on:
        - Code style (formatting, naming conventions) — the linter handles that
        - Generic best practices that don't relate to this specific service

        Format your review as a list of findings. For each finding:
        - File and line reference
        - What the issue is
        - Why it matters for THIS service specifically
        - Suggested fix

        If the code looks good, say so. Don't invent problems.
        """;

    try
    {
        var completion = await chatClient.CompleteChatAsync(
        [
            new SystemChatMessage(systemPrompt),
            new UserChatMessage($"PR: {request.PrTitle}\n\nDiff:\n{request.Diff}"),
        ]);

        var review = completion.Value.Content[0].Text.Trim();
        return Results.Ok(new { review });
    }
    catch (ClientResultException ex) when (ex.Status == 401)
    {
        return Results.Json(new { error = "AI provider authentication failed." }, statusCode: 503);
    }
    catch (Exception ex)
    {
        return Results.Json(new { error = $"AI provider error: {ex.Message}" }, statusCode: 502);
    }
});

record ReviewRequest(
    string ServiceName,
    string ServiceDescription,
    string[] Tags,
    string[] Dependencies,
    string GotchaHeuristics,
    string PrTitle,
    string Diff);

The system prompt is what makes the difference. It is not “review this code.” It is “review this code knowing that this service uses PostgreSQL, publishes to Service Bus, and must never call Service Bus synchronously in the request pipeline.”

The Backstage Backend Plugin

The plugin listens for GitHub webhooks and triggers the review:

// plugins/ai-code-review/src/module.ts
import {
  coreServices,
  createBackendPlugin,
} from '@backstage/backend-plugin-api';
import { catalogServiceRef } from '@backstage/plugin-catalog-node';
import { createRouter } from './router';

export const aiCodeReviewPlugin = createBackendPlugin({
  pluginId: 'ai-code-review',
  register(env) {
    env.registerInit({
      deps: {
        logger: coreServices.logger,
        httpRouter: coreServices.httpRouter,
        config: coreServices.rootConfig,
        catalog: catalogServiceRef,
        auth: coreServices.auth,
      },
      async init({ logger, httpRouter, config, catalog, auth }) {
        const aiServiceUrl = config.getString('forge.aiServiceUrl');

        const router = await createRouter({
          logger,
          catalog,
          auth,
          aiServiceUrl,
        });

        httpRouter.use(router);
        httpRouter.addAuthPolicy({
          path: '/webhook/github',
          allow: 'unauthenticated',
        });
        logger.info('AI Code Review plugin initialized');
      },
    });
  },
});

This is a createBackendPlugin, not a module, because code review has its own HTTP routes. Modules extend existing plugins; plugins get their own route namespace (/api/ai-code-review/). The addAuthPolicy call lets the webhook endpoint accept unauthenticated requests from GitHub.

The Webhook Router

The router handles GitHub webhook events:

// plugins/ai-code-review/src/router.ts
import { Router, json } from 'express';
import type { LoggerService, AuthService } from '@backstage/backend-plugin-api';
import type { CatalogService } from '@backstage/plugin-catalog-node';
import { reviewPullRequest } from './review';

interface RouterOptions {
  logger: LoggerService;
  catalog: CatalogService;
  auth: AuthService;
  aiServiceUrl: string;
}

export async function createRouter(options: RouterOptions): Promise<Router> {
  const { logger, catalog, auth, aiServiceUrl } = options;
  const router = Router();
  router.use(json());

  router.post('/webhook/github', async (req, res) => {
    const event = req.headers['x-github-event'];
    const payload = req.body;

    if (event !== 'pull_request') {
      res.status(200).json({ ignored: true });
      return;
    }

    const action = payload.action;
    if (action !== 'opened' && action !== 'synchronize') {
      res.status(200).json({ ignored: true });
      return;
    }

    const repoFullName = payload.repository.full_name;
    const prNumber = payload.pull_request.number;
    const prTitle = payload.pull_request.title;

    logger.info(
      `PR ${action}: ${repoFullName}#${prNumber} — ${prTitle}`,
    );

    // Look up the service in the catalog
    const credentials = await auth.getOwnServiceCredentials();
    const entities = await catalog.getEntities(
      {
        filter: {
          kind: 'Component',
          'metadata.annotations.github.com/project-slug': repoFullName,
        },
      },
      { credentials },
    );

    if (entities.items.length === 0) {
      logger.info(
        `No catalog entity for ${repoFullName}, skipping review`,
      );
      res.status(200).json({ skipped: 'not in catalog' });
      return;
    }

    const entity = entities.items[0];

    // Run the review in the background
    reviewPullRequest({
      entity,
      repoFullName,
      prNumber,
      prTitle,
      aiServiceUrl,
      logger,
    }).catch(err =>
      logger.error(`Review failed for ${repoFullName}#${prNumber}: ${err}`),
    );

    res.status(202).json({ accepted: true });
  });

  return router;
}

Two things changed from the simple version: we added the json() middleware (the webhook route handles its own body parsing), and we use auth.getOwnServiceCredentials() to authenticate with the catalog. This is how standalone plugins talk to other plugins in the new Backstage backend system.

The key decision: when a PR arrives, the plugin looks up the repo in the catalog. If the repo is not registered in Backstage, we skip the review. This is not a generic review tool; it only reviews services that are part of the platform.

The Review Logic

This is where the catalog context and the GOTCHA prompt come together:

// plugins/ai-code-review/src/review.ts
import { Entity } from '@backstage/catalog-model';
import { Octokit } from '@octokit/rest';

interface ReviewOptions {
  entity: Entity;
  repoFullName: string;
  prNumber: number;
  prTitle: string;
  aiServiceUrl: string;
  logger: { info: (msg: string) => void };
}

export async function reviewPullRequest(
  options: ReviewOptions,
): Promise<void> {
  const { entity, repoFullName, prNumber, prTitle, aiServiceUrl, logger } =
    options;
  const [owner, repo] = repoFullName.split('/');
  const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });

  // 1. Fetch PR diff
  const { data: diff } = await octokit.pulls.get({
    owner,
    repo,
    pull_number: prNumber,
    mediaType: { format: 'diff' },
  });

  const diffText = diff as unknown as string;

  // Limit diff size to avoid exceeding token limits
  const maxDiffLength = 15000;
  const truncatedDiff =
    diffText.length > maxDiffLength
      ? diffText.slice(0, maxDiffLength) + '\n[diff truncated]'
      : diffText;

  // 2. Read GOTCHA.md from the repo (if it exists)
  let gotchaHeuristics = 'No GOTCHA.md found in this repository.';
  try {
    const { data: gotchaFile } = await octokit.repos.getContent({
      owner,
      repo,
      path: 'GOTCHA.md',
      mediaType: { format: 'raw' },
    });
    const gotchaContent = gotchaFile as unknown as string;

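    // Extract just the HEURISTICS section: everything up to the next
    // "## " heading, a "---" rule, or the end of the file. The end anchor
    // must be an unescaped $; "\$" would only match a literal dollar sign.
    const heuristicsMatch = gotchaContent.match(
      /## HEURISTICS\s*\n([\s\S]*?)(?=\n## [A-Z]|\n---|$)/,
    );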
    if (heuristicsMatch) {
      gotchaHeuristics = heuristicsMatch[1].trim();
    }
  } catch {
    // No GOTCHA.md — use catalog metadata only
  }

  // 3. Build context from catalog entity
  const serviceName = entity.metadata.name;
  const serviceDescription = entity.metadata.description ?? 'No description';
  const tags = (entity.metadata.tags as string[]) ?? [];
  // Use tags as a proxy for dependencies — in production,
  // read from catalog relations (dependsOn)
  const dependencies = tags;

  logger.info(`Reviewing ${repoFullName}#${prNumber} with catalog context`);

  // 4. Call AI service
  const res = await fetch(`${aiServiceUrl}/api/review`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      serviceName,
      serviceDescription,
      tags,
      dependencies,
      gotchaHeuristics,
      prTitle,
      diff: truncatedDiff,
    }),
  });

  if (!res.ok) {
    throw new Error(`AI review service returned ${res.status}`);
  }

  const { review } = (await res.json()) as { review: string };

  // 5. Post review as PR comment
  await octokit.issues.createComment({
    owner,
    repo,
    issue_number: prNumber,
    body: formatReviewComment(review, serviceName),
  });

  logger.info(`Review posted for ${repoFullName}#${prNumber}`);
}

function formatReviewComment(
  review: string,
  serviceName: string,
): string {
  return [
    `## Forge Code Review — ${serviceName}`,
    '',
    '*Reviewed with context from the Software Catalog and GOTCHA.md*',
    '',
    review,
    '',
    '---',
    '*Generated by [Forge](https://github.com/victorZKov/forge) AI Code Review Plugin*',
  ].join('\n');
}
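
The listing above uses tags as a stand-in for dependencies and notes that a production version would read catalog relations instead. Here is a minimal sketch of that step. It assumes the entity returned by the catalog carries a relations array whose entries have a type and a targetRef of the form kind:namespace/name. The EntityLike and EntityRelationLike shapes and the dependenciesFromEntity name are hypothetical, declared locally so the sketch is self-contained; in the plugin you would more likely use the Entity type from @backstage/catalog-model.

```typescript
// Hypothetical minimal shapes so the sketch runs on its own.
interface EntityRelationLike {
  type: string;
  targetRef: string; // e.g. "resource:default/invoices-db"
}

interface EntityLike {
  metadata: { name: string; tags?: string[] };
  relations?: EntityRelationLike[];
}

// Collects the targets of dependsOn relations as "kind:name" strings.
// Falls back to tags (the proxy used in the listing) when none exist.
function dependenciesFromEntity(entity: EntityLike): string[] {
  const fromRelations = (entity.relations ?? [])
    .filter(relation => relation.type === 'dependsOn')
    .map(relation => {
      // Split "kind:namespace/name"; the namespace part is optional here.
      const match = /^([^:]+):(?:([^/]+)\/)?(.+)$/.exec(relation.targetRef);
      return match ? `${match[1]}:${match[3]}` : relation.targetRef;
    });

  return fromRelations.length > 0
    ? fromRelations
    : entity.metadata.tags ?? [];
}
```

With something like this in place, the dependencies field sent to /api/review could name concrete resources and components rather than repeating the tags, which gives the fourth review focus (error handling for the dependencies this service uses) more to work with.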

What a Review Looks Like

A developer opens a PR for invoice-api. The PR adds a new endpoint that creates an invoice and sends a message to Service Bus. The AI review responds:

## Forge Code Review — invoice-api

*Reviewed with context from the Software Catalog and GOTCHA.md*

### Finding 1: Synchronous Service Bus call in request pipeline
**File:** `Endpoints/CreateInvoice.cs`, line 34
**Issue:** `await serviceBusClient.SendMessageAsync(message)` is called inside
the HTTP request handler before returning the response.
**Why it matters:** The GOTCHA heuristics for this service say
"No synchronous Service Bus sends in request pipeline." If Service Bus
is slow or unavailable, the HTTP request blocks.
**Fix:** Move the send to a background task or use the Outbox pattern —
write the event to the database in the same transaction as the invoice,
then process it asynchronously.

### Finding 2: Entity exposed directly in response
**File:** `Endpoints/CreateInvoice.cs`, line 42
**Issue:** The endpoint returns `Results.Ok(invoice)` where `invoice` is
the EF Core entity.
**Why it matters:** The GOTCHA heuristics say "Return DTOs, not entities."
Returning the entity exposes the database schema (including `DeletedAt`,
internal IDs) to the API consumer.
**Fix:** Create a `CreateInvoiceResponse` record with only the fields
the client needs.

### Overall
The endpoint logic is correct and the validation looks good.
The two findings above are architectural — fixing them aligns the code
with the patterns documented for this service.

---
*Generated by Forge AI Code Review Plugin*

This is not a generic “you should add error handling” review. It references the rules specific to this particular service. The reviewer who picks up this PR now has architectural context without having to read GOTCHA.md themselves.
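
The rules quoted in those findings reach the model through the HEURISTICS section of GOTCHA.md, so it helps to check the extraction step in isolation. The sketch below pulls that step into a standalone function, under stated assumptions: the extractHeuristics name and the sample file content are hypothetical, and the regex uses an unescaped $ so that a HEURISTICS section at the very end of the file is still matched.

```typescript
// Returns the body of the "## HEURISTICS" section, or undefined if the
// section is missing. The section ends at the next "## " heading, a "---"
// rule, or the end of the file.
function extractHeuristics(gotchaContent: string): string | undefined {
  const match = gotchaContent.match(
    /## HEURISTICS\s*\n([\s\S]*?)(?=\n## [A-Z]|\n---|$)/,
  );
  return match ? match[1].trim() : undefined;
}

// Hypothetical GOTCHA.md content, for illustration only.
const sampleGotcha = [
  '# GOTCHA',
  '',
  '## HEURISTICS',
  '- No synchronous Service Bus sends in request pipeline',
  '- Return DTOs, not entities',
  '',
  '## NEXT SECTION',
  '- not part of the heuristics',
].join('\n');

const sampleHeuristics = extractHeuristics(sampleGotcha);
```

Returning undefined instead of throwing lets the caller keep its default text ("No GOTCHA.md found in this repository.") when the file exists but has no such section.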

The GitHub Webhook

Configure the webhook in GitHub (in the repository settings or at the organization level):

Payload URL: https://your-backstage/api/ai-code-review/webhook/github
Content type: application/json
Secret: Use a webhook secret and validate it in the router (omitted here for simplicity)
Events: Pull requests
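
Since the route is open to unauthenticated requests, the secret check is what ties a delivery back to GitHub. GitHub signs each delivery with HMAC-SHA256 over the raw request body and sends the result in the X-Hub-Signature-256 header as sha256= followed by the hex digest. Here is a minimal sketch of that validation using Node's crypto module; the isValidGitHubSignature name is hypothetical, and how the secret is loaded (for example from app-config.yaml) is left as an assumption.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Checks the X-Hub-Signature-256 header against the raw request body.
// `secret` is the value entered in the webhook's Secret field.
function isValidGitHubSignature(
  rawBody: string | Buffer,
  signatureHeader: string | undefined,
  secret: string,
): boolean {
  if (!signatureHeader || !signatureHeader.startsWith('sha256=')) {
    return false;
  }

  const expected = Buffer.from(
    'sha256=' + createHmac('sha256', secret).update(rawBody).digest('hex'),
    'utf8',
  );
  const received = Buffer.from(signatureHeader, 'utf8');

  // timingSafeEqual throws when lengths differ, so compare lengths first.
  return (
    expected.length === received.length && timingSafeEqual(expected, received)
  );
}
```

The signature covers the exact bytes GitHub sent, so re-serializing req.body is not a reliable input. One option is to pass a verify callback to json() that stores the raw buffer on the request, then reject the delivery with a 401 before any catalog lookup when the check fails.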

Registering the Plugin

In packages/backend/src/index.ts:

import { aiCodeReviewPlugin } from '@internal/plugin-ai-code-review';

backend.add(aiCodeReviewPlugin);

The plugin reads forge.aiServiceUrl from app-config.yaml (the same config used by the scaffolder and the enricher).

When to Skip the Review

Not every PR needs an AI review. The plugin skips reviews when:

The repo is not registered in the Backstage catalog
The PR only changes documentation files (.md, .txt)
The diff is empty (merge commits, reverts)

Add this check to the router:

// Skip PRs with no changed files (the docs-only filter follows below)
const changedFiles = payload.pull_request.changed_files;
if (changedFiles === 0) {
  res.status(200).json({ skipped: 'no changes' });
  return;
}

For a more complete filter, fetch the list of files from GitHub and check the extensions:

const { data: files } = await octokit.pulls.listFiles({
  owner,
  repo,
  pull_number: prNumber,
});

const codeFiles = files.filter(
  f => !f.filename.match(/\.(md|txt|png|jpg|svg)$/),
);

if (codeFiles.length === 0) {
  logger.info(`PR #${prNumber}: only docs/assets changed, skipping`);
  res.status(200).json({ skipped: 'docs only' });
  return;
}
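
The skip decision itself does not depend on Express or Octokit, so it can be pulled into a pure function that is easy to unit test. This is a sketch rather than the plugin's actual code: skipReasonFor and SkipReason are hypothetical names, and the extension list is the same one used in the snippet above, matched case-insensitively here.

```typescript
// Extensions treated as documentation or static assets.
const NON_CODE_FILE = /\.(md|txt|png|jpg|svg)$/i;

type SkipReason = 'no changes' | 'docs only';

// Returns why a PR should be skipped, or undefined when it should be reviewed.
function skipReasonFor(filenames: string[]): SkipReason | undefined {
  if (filenames.length === 0) {
    return 'no changes';
  }
  const hasCode = filenames.some(name => !NON_CODE_FILE.test(name));
  return hasCode ? undefined : 'docs only';
}
```

The router would then call skipReasonFor(files.map(f => f.filename)) and return early when it gets a reason back. Note that the router listing shown earlier does not create an Octokit client or split repoFullName, so octokit, owner, and repo in the snippet above are assumed to be set up first. GitHub's file listing is also paginated (30 entries per page by default), so a PR touching more files than one page would need every page collected before the docs-only decision is trustworthy.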

Checklist

The AI review endpoint (/api/review) accepts catalog context plus a diff and returns a structured review
Backstage plugin registered and listening for GitHub webhooks
The plugin looks up the catalog entity by the github.com/project-slug annotation
The GOTCHA.md heuristics are extracted and included in the review prompt
The review is posted as a comment on the GitHub PR with the service name and the context source
Docs-only PRs are skipped
The diff is truncated on large PRs to stay within the token limit

Before the Next Article

You now have a code review assistant that knows what it is reviewing. It reads the catalog, reads the GOTCHA heuristics, and gives feedback that references the rules specific to the service at hand.

But the AI service only uses the chat model. When a developer asks “how does the invoice service handle authentication?” or “what is the retry policy for Service Bus?”, the AI can only answer with what is in the GOTCHA prompt.

What if the AI could search the actual documentation? What if each service's TechDocs were indexed in a vector store, and the AI could retrieve relevant documents before answering?

That is article 5: TechDocs RAG, Retrieval-Augmented Generation for your platform documentation.


This is article 4 of the AI-Native IDP series. Next: TechDocs RAG, giving the AI access to your platform documentation.