Backstage as an MCP client: calling any AI tool from a scaffolder template

Building scaffolder-backend-module-mcp — the inverse of mcp-actions-backend. One generic action, any MCP server.

Jun 3, 2026 · Backstage AI plugins, part 2

backstage · mcp · model-context-protocol · scaffolder · ai · typescript

Backstage shipped a plugin called mcp-actions-backend a while back. It exposes Backstage’s actions as MCP tools, so external AI agents (Claude Desktop, Copilot, anything that speaks Model Context Protocol) can discover and invoke them. Backstage as MCP server.

The inverse is missing: Backstage as MCP client. Templates can already call github:repo:create, gitlab:repo:push, and dozens of other hand-rolled actions — but there’s no clean way to call the rich and growing ecosystem of MCP servers from inside a template. Filesystem operations, web fetches, GitHub repo-search tools, kubectl wrappers, internal company MCP servers — all of them are inaccessible to scaffolder templates unless someone writes a Backstage-specific wrapper.

So I built scaffolder-backend-module-mcp, published for now as @theplatformlog/scaffolder-backend-module-mcp. It provides one scaffolder action, mcp:call, and a small registry that lazily spawns the MCP servers configured in app-config.yaml and reuses each connection across calls.

This post walks through the design, the code, and the rough edges.

What it looks like to use

Declare MCP servers in app-config.yaml:

scaffolder:
  mcpServers:
    fs:
      command: npx
      args: ['-y', '@modelcontextprotocol/server-filesystem', '/workspace']
    fetch:
      command: uvx
      args: ['mcp-server-fetch']
      timeoutMs: 30000
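
For orientation, here is a minimal sketch of how a config block like this could be normalised into typed entries. It is not the module's actual loader: the McpServerConfig shape, the parseMcpServers name, and the 60-second default timeout are all assumptions made for illustration.

```typescript
// Hypothetical shape for one entry under scaffolder.mcpServers.
interface McpServerConfig {
  id: string;
  command: string;
  args: string[];
  timeoutMs: number;
}

// Assumed default; the post does not say what the module falls back to.
const DEFAULT_TIMEOUT_MS = 60_000;

// Validates a plain object (e.g. the parsed YAML subtree) and returns one
// typed entry per server ID.
function parseMcpServers(raw: unknown): Map<string, McpServerConfig> {
  const servers = new Map<string, McpServerConfig>();
  if (raw === undefined || raw === null) return servers;
  if (typeof raw !== 'object' || Array.isArray(raw)) {
    throw new Error('scaffolder.mcpServers must map server IDs to config');
  }
  for (const [id, value] of Object.entries(raw as Record<string, unknown>)) {
    const entry = (value ?? {}) as Record<string, unknown>;
    const command = entry.command;
    const args = entry.args ?? [];
    const timeoutMs = entry.timeoutMs ?? DEFAULT_TIMEOUT_MS;
    if (typeof command !== 'string' || command === '') {
      throw new Error(`mcpServers.${id}.command must be a non-empty string`);
    }
    if (!Array.isArray(args) || !args.every(a => typeof a === 'string')) {
      throw new Error(`mcpServers.${id}.args must be a list of strings`);
    }
    if (typeof timeoutMs !== 'number' || !(timeoutMs > 0)) {
      throw new Error(`mcpServers.${id}.timeoutMs must be a positive number`);
    }
    servers.set(id, { id, command, args: args as string[], timeoutMs });
  }
  return servers;
}
```

Failing fast on a malformed entry at startup keeps the error close to the config file rather than surfacing it in the middle of a template run.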

Use them from a template:

steps:
  - id: read-config
    name: Read repo config via MCP filesystem server
    action: mcp:call
    input:
      server: fs
      tool: read_file
      arguments:
        path: /workspace/template.yaml

  - id: fetch-spec
    name: Fetch OpenAPI spec via MCP fetch server
    action: mcp:call
    input:
      server: fetch
      tool: fetch
      arguments:
        url: https://example.com/openapi.yaml

The action returns the raw MCP tool response under steps.<id>.output.result, typically an object of the shape { content: [{ type: 'text', text: '...' }] }. Templates can pass that into subsequent steps with the usual expression syntax, e.g. ${{ steps['read-config'].output.result }} (bracket notation, because the step ID contains a hyphen).
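
Because the result is the raw MCP envelope, downstream code usually wants just the text. A minimal sketch of a helper that flattens it, assuming the { content: [...] } shape shown above plus the optional isError flag that MCP tool results can carry; extractText and the two interfaces are hypothetical names, not part of the module.

```typescript
// Assumed minimal view of an MCP tool result; real results may carry
// other block types (images, resources) that this helper ignores.
interface McpContentBlock {
  type: string;
  text?: string;
}

interface McpToolResult {
  content?: McpContentBlock[];
  isError?: boolean;
}

// Joins all text blocks with newlines; throws if the tool flagged an error.
function extractText(result: unknown): string {
  const { content, isError } = (result ?? {}) as McpToolResult;
  const blocks = Array.isArray(content) ? content : [];
  const text = blocks
    .filter(block => block?.type === 'text' && typeof block.text === 'string')
    .map(block => block.text as string)
    .join('\n');
  if (isError) {
    throw new Error(`MCP tool reported an error: ${text || '(no message)'}`);
  }
  return text;
}
```

A helper like this could live in a small custom action or template filter, so templates work with plain strings instead of indexing into content themselves.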

Architecture

Three pieces, ~200 LOC total:

[Figure: architecture diagram]

The registry holds per-server config and a Map<serverId, Promise<Client>>. On the first call for a given server, the registry spawns the process and opens an MCP StdioClientTransport. The Promise sits in the map; the second call for the same server awaits the same Promise and reuses the connection.

// services/McpServerRegistry.ts (excerpt)
async callTool(
  serverId: string,
  toolName: string,
  args: Record<string, unknown>,
): Promise<unknown> {
  const server = this.servers.get(serverId);
  if (!server) {
    throw new NotFoundError(
      `MCP server '${serverId}' is not configured. ` +
      `Configured servers: ${this.list().join(', ') || '(none)'}`,
    );
  }
  const client = await this.connect(server);

  let timer: NodeJS.Timeout | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(
        `MCP tool '${toolName}' on server '${serverId}' ` +
        `timed out after ${server.timeoutMs}ms`,
      )),
      server.timeoutMs,
    );
  });

  try {
    return await Promise.race([
      client.callTool({ name: toolName, arguments: args }),
      timeout,
    ]);
  } finally {
    if (timer) clearTimeout(timer);
  }
}

One detail worth flagging: a failed connection is never left in the cache. If the first spawn fails (binary not on PATH, crash on startup, whatever), the rejected promise is evicted so the next call tries again instead of seeing a permanently rejected promise:

private connect(server: McpServerConfig) {
  let pending = this.clients.get(server.id);
  if (!pending) {
    pending = this.clientFactory(server).catch(e => {
      // Failed connections must not be cached, so retries can recover.
      this.clients.delete(server.id);
      throw e;
    });
    this.clients.set(server.id, pending);
  }
  return pending;
}
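
The same pattern can be tried in isolation. The sketch below is a standalone, simplified stand-in for the registry's cache (LazyConnectionCache and demo are illustrative names, and the factory is a fake): the first spawn fails and is evicted, then two concurrent calls share a single retry.

```typescript
// Caches the in-flight promise per ID so concurrent callers share one
// connection attempt, and evicts it on failure so a later call can retry.
class LazyConnectionCache<C> {
  private readonly pending = new Map<string, Promise<C>>();
  private readonly factory: (id: string) => Promise<C>;

  constructor(factory: (id: string) => Promise<C>) {
    this.factory = factory;
  }

  get(id: string): Promise<C> {
    let p = this.pending.get(id);
    if (!p) {
      p = this.factory(id).catch(e => {
        this.pending.delete(id); // do not keep a permanently rejected promise
        throw e;
      });
      this.pending.set(id, p);
    }
    return p;
  }
}

async function demo(): Promise<{ spawns: number; same: boolean }> {
  let spawns = 0;
  // Fake factory: fails on the first spawn, succeeds afterwards.
  const cache = new LazyConnectionCache(async (id: string) => {
    spawns += 1;
    if (spawns === 1) throw new Error(`spawn failed for ${id}`);
    return { id };
  });

  await cache.get('fs').catch(() => undefined); // fails, entry is evicted
  const [a, b] = await Promise.all([cache.get('fs'), cache.get('fs')]);
  return { spawns, same: a === b }; // one retry, shared by both callers
}
```

Caching the promise rather than the resolved client is what lets two calls that arrive during startup share one spawn.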

The module registers a shutdown hook via coreServices.lifecycle so MCP processes die when Backstage shuts down — important for stdio transport, where the child process lives as long as you keep the pipe open.
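
A minimal sketch of what such a shutdown routine could do, assuming each cached client exposes a close() method that returns a promise. The Closable interface and the closeAll name are hypothetical; in the real module this logic would be called from the lifecycle shutdown hook.

```typescript
// Assumed minimal client surface; only close() matters for shutdown.
interface Closable {
  close(): Promise<void>;
}

// Closes every cached client and returns the IDs that failed to connect
// or close, so one bad server cannot block the rest from shutting down.
async function closeAll(
  clients: Map<string, Promise<Closable>>,
): Promise<string[]> {
  const entries = Array.from(clients.entries());
  clients.clear(); // nothing should reuse a client that is being closed

  const outcomes = await Promise.all(
    entries.map(([id, pending]) =>
      pending
        .then(client => client.close())
        .then(
          () => undefined,
          () => id, // connect or close failed; record it and carry on
        ),
    ),
  );
  return outcomes.filter((id): id is string => id !== undefined);
}
```

Returning the failed IDs instead of throwing leaves it to the caller to decide whether a stubborn child process deserves a log line or something stronger.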

The action: deliberately generic

The first version of mcp:call is one action with three inputs (server, tool, arguments):

return createTemplateAction({
  id: 'mcp:call',
  schema: {
    input: {
      server: z => z.string({ description: '...' }),
      tool: z => z.string({ description: '...' }),
      arguments: z => z.record(z.unknown()).optional(),
    },
    output: {
      result: z => z.unknown().describe('...'),
    },
  },
  async handler(ctx) {
    const { server, tool, arguments: args } = ctx.input;
    const result = await registry.callTool(
      server, tool, (args ?? {}) as Record<string, unknown>,
    );
    ctx.output('result', result as any);
  },
});

The alternative would be to dynamically register one scaffolder action per MCP tool at startup — so templates write action: mcp.fs:read_file instead of the generic mcp:call. That’s nicer ergonomically, but it forces every MCP server to be connected at backend startup just to enumerate its tools. With the generic action, connections are lazy and the cost of declaring an MCP server in config is zero until something actually calls it. The per-tool sugar can come later as a second action that wraps the first.
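
One way the per-tool sugar could stay lazy is to list tool names statically instead of asking each server for them at startup. The sketch below assumes exactly that: the tools map, the PerToolAction shape, and perToolActions are hypothetical, and callTool stands in for the registry method shown earlier.

```typescript
type CallTool = (
  server: string,
  tool: string,
  args: Record<string, unknown>,
) => Promise<unknown>;

// Simplified stand-in for a scaffolder action: just an ID and a handler.
interface PerToolAction {
  id: string;
  handler(args?: Record<string, unknown>): Promise<unknown>;
}

// Builds one thin action per declared tool. Every handler delegates to the
// generic call, so no server is contacted until a handler actually runs.
function perToolActions(
  callTool: CallTool,
  tools: Record<string, string[]>, // server ID -> tool names (assumed config)
): PerToolAction[] {
  const actions: PerToolAction[] = [];
  for (const [server, toolNames] of Object.entries(tools)) {
    for (const tool of toolNames) {
      actions.push({
        id: `mcp.${server}:${tool}`,
        handler: args => callTool(server, tool, args ?? {}),
      });
    }
  }
  return actions;
}
```

The trade-off is that a statically declared tool list can drift from what the server really offers, which is the price of not connecting up front.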

What surprised me

Three things, in order of severity:

1. Backstage plugins need .eslintrc.js. Without it, ESLint treats your .ts files as plain JavaScript and you get Parsing error: The keyword 'import' is reserved on every import statement. The fix is a one-line re-export:
   ```
   module.exports = require('@backstage/cli/config/eslint-factory')(__dirname);
   ```
   I’d written all the code, written the tests, and was ready to commit before realising why lint was rejecting half of my files.
2. The full Apache 2.0 header is enforced by lint. Not the short version ending at “limitations under the License” but the full version including “WITHOUT WARRANTIES OR CONDITIONS.” Catch this once and you’ll never skip it again.
3. Jest fake timers and unsettled promises don’t mix. My first timeout test used jest.useFakeTimers() against a callTool mock that returned new Promise(() => {}) (never resolves). Advancing the timers fired the setTimeout, but the test still timed out because the microtask queue never flushed cleanly. I switched to a 5ms real timer and a real timeout assertion. Worked first try.
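
The real-timer approach from the last point can be sketched without any test framework. This is not the module's actual test: withTimeout is a simplified stand-in for the race inside callTool, and timeoutTest is an illustrative name.

```typescript
// Races the work against a real timer and always clears the timer afterwards.
async function withTimeout<T>(
  work: Promise<T>,
  ms: number,
  label: string,
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms}ms`)),
      ms,
    );
  });
  try {
    return await Promise.race([work, timeout]);
  } finally {
    if (timer) clearTimeout(timer);
  }
}

// Uses a promise that never settles and a real 5ms timer, so no fake-timer
// bookkeeping is involved. Returns the error message for inspection.
async function timeoutTest(): Promise<string> {
  const neverSettles = new Promise<never>(() => {});
  try {
    await withTimeout(neverSettles, 5, 'read_file');
    return 'resolved';
  } catch (e) {
    return (e as Error).message;
  }
}
```

In a Jest suite the same check would typically be written as await expect(...).rejects.toThrow(/timed out/); the cost is a few milliseconds of real waiting per test.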

Limitations and what’s next

stdio only. HTTP and SSE transports aren’t wired up. That’s the next feature — most production MCP servers people want to talk to live as HTTP endpoints behind auth.
Single generic action. Per-tool actions (mcp.fs:read_file) would be nicer in templates; I’d build them on top of mcp:call rather than replacing it.
No per-call auth pass-through. When HTTP transport lands, forwarding the calling user’s token to the MCP server will matter.
Catalog integration. Once RFC #32062 (MCP servers as catalog API entities) is fully shipped, the scaffolder client should resolve server IDs against the catalog instead of (or in addition to) app-config. That’s the path to org-level discovery: a template references entityRef: api:default/payments-mcp and the scaffolder resolves the rest.

Install

yarn --cwd packages/backend add @theplatformlog/scaffolder-backend-module-mcp

// packages/backend/src/index.ts
backend.add(import('@theplatformlog/scaffolder-backend-module-mcp'));

The upstream Backstage PR is #34490. Once that merges it’ll publish as @backstage/plugin-scaffolder-backend-module-mcp — swap the package name then if you want to track upstream.

Code

Branch lives at Naga15/backstage feat/scaffolder-backend-module-mcp. 13 unit tests, lint clean, builds clean. Upstream draft PR #34490 is open for maintainer signal.

Next post in the series: catalog-assistant-backend, a plugin that answers natural-language questions about your Backstage catalog using an LLM grounded on real catalog entities. Same Vercel-AI-SDK pattern, very different use case.