Devlin again, writing from Norway with the late-spring sun doing that thing where it pretends it's 4 p.m. when it's actually 9 p.m. This post is about a small, boring API surface that solves a very large, very expensive problem: how do platform teams ship cross-cutting concerns (observability, registration, graceful drain, policy) without making every route file import the infra layer?
Two callbacks. That's the answer. Two callbacks and the discipline to put the platform code in plugins instead of inside routes.
The shape of the problem
routes/users.ts (the bad version)
# The pattern every large-team backend eventually grows:
#
# routes/users.ts
# import { metrics } from "../platform/metrics.js"; // ← infra import
# import { registry } from "../platform/registry.js"; // ← infra import
# import { tracer } from "../platform/tracing.js"; // ← infra import
# import { drainSignal } from "../platform/drain.js"; // ← infra import
#
# handler(ctx) {
# metrics.inc("users.read");
# const span = tracer.startSpan("users.read");
# if (drainSignal.shouldRefuse()) return { status: 503, ... };
# ...
# }
#
# Multiply by 200 route files. Every "small" platform change is now a
# 200-file pull request. Every junior dev learns to copy-paste the
# preamble. Every audit finds three routes that forgot one of them.
#
# The fix is not "discipline". The fix is to give the platform team a
# place to hang cross-cutting concerns that's NOT inside the route
# files. That place is the plugin lifecycle.
four infra imports · multiplied by 200 route files · this is what burnout looks likeTS · UTF-8 · LF
Every team I've worked with eventually wrote this preamble in every route. Every team then wrote a wiki page telling new joiners to remember the preamble. Every team then had an audit finding three months later because three routes had forgotten part of the preamble. The framework should make the wiki page unnecessary.
The whole API, on one screen
// Two events, that's the whole API:
//
// app.onPluginInstalled(listener)
// ↑ fires once per app.register(...) call, AFTER the plugin's
// register() (sync or async) finishes. Gets { name?, prefix }.
//
// app.onShutdown(listener)
// ↑ fires when app.shutdown() starts, BEFORE in-flight requests
// drain. Gets { reason?, timeoutMs }.
//
// Plus the two you already knew about:
//
// app.onClose(hook) // AFTER drain — close pools, etc.
// await app.ready() // waits for async plugins
//
// That's the entire surface for cross-cutting platform plugins.
// Everything below is built on those four primitives.
two new events · two you already knew · that's itTS · UTF-8 · LF
onPluginInstalledFires: once per app.register(), after register() resolves
Receives { name?, prefix } where prefix is the effective mount path after parent and group prefixes are applied. Async plugin? The listener fires when its promise settles. Anonymous plugin? name is undefined.
onShutdownFires: when app.shutdown() begins, BEFORE drain
Receives { reason?, timeoutMs }. The right place to push metrics, deregister from service discovery, and flush span buffers — everything that needs the network to still work.
onCloseFires: AFTER in-flight requests drain
Use for releasing resources you paid for at boot: pool.end(), redis disconnect, queue consumer stop. Errors are caught and logged so one bad cleanup doesn't take down the rest.
await app.ready()Fires: resolves when all async plugins finish
Sync plugins also push observer promises here, so it's always safe to call. Standard pattern: register → ready → serve.
What a register() call actually looks like
// What a Fastify-style register() call looks like in DaloyJS.
// The plugin is just a function that receives a SCOPED child App.
import { App } from "@daloyjs/core";
import type { Hooks } from "@daloyjs/core";
const app = new App();
app.register(
{
name: "users",
register: (child) => {
child.route({
method: "GET",
path: "/me",
operationId: "getMe",
responses: { 200: { description: "ok" } },
handler: async () => ({ status: 200, body: {} }),
});
},
},
{
prefix: "/v1/users",
tags: ["users"],
hooks: {} satisfies Hooks, // group-scoped middleware
},
);
// Every register() fires onPluginInstalled exactly once, with:
// { name: "users", prefix: "/v1/users" }
//
// Anonymous plugins (passed as a bare function) fire too, with name=undefined.
Fastify-style register · scoped child App · prefix + tags + group hooksTS · UTF-8 · LF
A service-registration plugin in 40 lines
platform/registry-plugin.ts
// platform/registry-plugin.ts — a service registration plugin.
// Run by the platform team. Routes are unaware. Operations gets a
// real-time inventory of what each pod actually serves.
import type { App } from "@daloyjs/core";
import { fetch } from "undici";
export function registrationPlugin(consul: { url: string; token: string }) {
const installed: Array<{ name?: string; prefix: string }> = [];
let serviceId: string | null = null;
return (app: App) => {
// Collect every mounted plugin without touching route code.
app.onPluginInstalled((info) => {
installed.push(info);
});
// Register on boot (after ready), deregister on shutdown.
app.onShutdown(async ({ reason }) => {
if (!serviceId) return;
await fetch(`${consul.url}/v1/agent/service/deregister/${serviceId}`, {
method: "PUT",
headers: { "x-consul-token": consul.token },
});
app.log.info({ event: "deregistered", serviceId, reason });
});
// app.onClose runs AFTER drain — perfect for releasing
// the registration lock if your service mesh holds one.
app.onClose(async () => {
// ...release any final platform resource
});
// The boot side: call this from your main() after await app.ready().
return async function register(serviceName: string, address: string) {
serviceId = `${serviceName}-${crypto.randomUUID()}`;
const meta = Object.fromEntries(
installed.map((p, i) => [`plugin_${i}`, `${p.name ?? "anon"}@${p.prefix}`]),
);
await fetch(`${consul.url}/v1/agent/service/register`, {
method: "PUT",
headers: { "x-consul-token": consul.token, "content-type": "application/json" },
body: JSON.stringify({ ID: serviceId, Name: serviceName, Address: address, Meta: meta }),
});
app.log.info({ event: "registered", serviceId, plugins: installed.length });
};
};
}
register at boot · deregister on shutdown · expose plugin inventory as service metadataTS · UTF-8 · LF
Read it twice. The whole "deregister cleanly before the load balancer realizes we're going away" behavior is a single onShutdownhandler. Operations gets the deregistration in their logs at the exact moment SIGTERM arrives, and no in-flight request gets a connection-reset because we're still answering until drain completes.
A metrics plugin that flushes before drain
platform/metrics-plugin.ts
// platform/metrics-plugin.ts — Prometheus-style metrics, flushed on shutdown.
// One platform-team file. Routes never import metrics directly.
import type { App, Hooks } from "@daloyjs/core";
import { register, Counter, Histogram } from "prom-client";
const httpRequests = new Counter({
name: "http_requests_total",
help: "Count of HTTP responses by status class",
labelNames: ["method", "status_class", "plugin"],
});
const httpLatency = new Histogram({
name: "http_request_duration_seconds",
help: "Latency by plugin",
labelNames: ["method", "plugin"],
buckets: [0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5],
});
export function metricsPlugin(opts: { pushGatewayUrl?: string; job: string }) {
return (app: App) => {
// Per-plugin label is captured at install time — no global mutable state
// racing inside hot handlers.
const labelByPrefix = new Map<string, string>();
app.onPluginInstalled(({ name, prefix }) => {
labelByPrefix.set(prefix, name ?? "anon");
});
const hooks: Hooks = {
beforeHandle(ctx) {
(ctx.state as Record<string, unknown>).__metricsStart = performance.now();
},
onSend(res, ctx) {
if (!ctx) return;
const start = (ctx.state as Record<string, unknown>).__metricsStart as number;
const url = new URL(ctx.request.url);
const plugin = nearestPrefix(labelByPrefix, url.pathname);
const sec = (performance.now() - start) / 1000;
const statusClass = `${Math.floor(res.status / 100)}xx`;
httpRequests.inc({ method: ctx.request.method, status_class: statusClass, plugin });
httpLatency.observe({ method: ctx.request.method, plugin }, sec);
},
};
app.use(hooks);
// The whole point: flush metrics on graceful shutdown, BEFORE drain.
// We've still got network and CPU; once we start refusing requests we
// also lose the ability to make outbound calls reliably.
app.onShutdown(async ({ reason }) => {
if (!opts.pushGatewayUrl) return;
const body = await register.metrics();
await fetch(`${opts.pushGatewayUrl}/metrics/job/${opts.job}`, {
method: "POST",
body,
});
app.log.info({ event: "metrics_pushed", reason });
});
};
}
function nearestPrefix(map: Map<string, string>, path: string) {
let best: { prefix: string; label: string } = { prefix: "/", label: "root" };
for (const [prefix, label] of map) {
if (path.startsWith(prefix) && prefix.length > best.prefix.length) {
best = { prefix, label };
}
}
return best.label;
}
Prometheus counter/histogram · plugin label captured at install · push-gateway flush on shutdownTS · UTF-8 · LF
The clever bit: onPluginInstalled lets us capture the plugin label at install time instead of looking it up on the hot path. The metric label is correct (and stable), and the request handler does a hash-map lookup, not a router replay.
The full shutdown sequence
# What happens when app.shutdown() is called:
#
# T+0 app.shutdown(10_000, "SIGTERM") starts.
# - this.draining = true → every NEW request gets
# 503 Service Unavailable + Retry-After: 5.
# - onShutdown listeners fire IN ORDER:
# 1) metrics → push to gateway (last chance to send)
# 2) service registry → deregister (so the LB stops
# sending us new traffic; the 503 above is a
# safety net for traffic already in flight)
# 3) tracing → flush span buffer
# 4) feature flags → snapshot decisions for debugging
#
# T+t Drain loop polls inflight count every 25ms.
# Waits up to timeoutMs for in-flight requests to settle.
#
# T+drained onClose hooks fire:
# - close DB pool
# - close redis
# - close queue consumers
#
# T+done Single log line: "DaloyJS shutdown complete".
#
# Both Node and Bun adapters call app.shutdown() automatically on
# SIGINT and SIGTERM. Other runtimes (Workers, Lambda) call it from
# your handler when the lifecycle event fires.
onShutdown → 503-new-requests → drain → onClose · in that order, every timeTS · UTF-8 · LF
That ordering is non-negotiable on purpose. If you push metrics after drain, half the time the metrics never get pushed because your container runtime SIGKILLs you mid-flush. If you deregister afterdrain, you get a 30-second window where the load balancer is still routing fresh traffic to a server that's already saying 503. onShutdown exists exactly to give you the early window.
A policy plugin that fails boot
platform/policy-plugin.ts
// platform/policy-plugin.ts — central tagging / enforcement.
// Use onPluginInstalled to AUDIT every plugin's mount, and fail boot
// if anything violates the platform-team policy.
import type { App } from "@daloyjs/core";
interface Policy {
/** Plugin name → required prefix pattern */
prefixRules: RegExp;
/** Plugin names that must be present in every build */
required: string[];
/** Plugin names that are forbidden in production */
forbiddenInProd?: string[];
}
export function policyPlugin(policy: Policy) {
const env = process.env.NODE_ENV ?? "development";
const seen = new Set<string>();
const violations: string[] = [];
return (app: App) => {
app.onPluginInstalled(({ name, prefix }) => {
if (name) seen.add(name);
if (!policy.prefixRules.test(prefix)) {
violations.push(`plugin ${name ?? "anon"} mounted at non-conforming prefix "${prefix}"`);
}
if (env === "production" && name && policy.forbiddenInProd?.includes(name)) {
violations.push(`plugin ${name} is forbidden in production`);
}
});
// Call this AFTER all app.register() calls and AFTER await app.ready().
// Bonus: most platforms put this inside a tiny "verifyBoot()" helper
// that the main() entry calls before serve().
(app as App & { verifyPolicy?: () => void }).verifyPolicy = () => {
for (const required of policy.required) {
if (!seen.has(required)) violations.push(`missing required plugin: ${required}`);
}
if (violations.length > 0) {
throw new Error("Policy violations:\n - " + violations.join("\n - "));
}
app.log.info({ event: "policy_ok", plugins: [...seen] });
};
};
}
audit every install · required plugins · prefix conventions · forbidden in prodTS · UTF-8 · LF
Boot-time policy is the cheapest possible enforcement: zero runtime cost, zero false positives, fails fast at CI time. Every platform team I've seen graduate from "wiki page everyone forgets" to "real platform" does some version of this.
Composing it all in main()
// src/server.ts — what main() looks like in a real platform-team app.
// Notice: NOT ONE infra import in routes/*.ts. They just declare routes.
import { App, requestId, secureHeaders, timing } from "@daloyjs/core";
import { trace } from "@opentelemetry/api";
import { otelTracing } from "@daloyjs/core";
import { serve } from "@daloyjs/core/node";
import { registrationPlugin } from "./platform/registry-plugin.js";
import { metricsPlugin } from "./platform/metrics-plugin.js";
import { policyPlugin } from "./platform/policy-plugin.js";
import { usersPlugin } from "./routes/users.js";
import { ordersPlugin } from "./routes/orders.js";
import { adminPlugin } from "./routes/admin.js";
const app = new App({ hooks: otelTracing({ tracer: trace.getTracer("api") }) });
app.use(requestId());
app.use(secureHeaders());
app.use(timing());
// Platform plugins go FIRST so their onPluginInstalled listeners are
// in place when the route plugins below get installed.
const registerService = registrationPlugin({ url: process.env.CONSUL_URL!, token: process.env.CONSUL_TOKEN! });
app.register({ name: "platform.registry", register: registerService });
app.register({ name: "platform.metrics", register: metricsPlugin({ pushGatewayUrl: process.env.PROM_PUSH_URL, job: "api" }) });
const policy = policyPlugin({
prefixRules: /^\/v1\//,
required: ["users", "orders", "platform.metrics", "platform.registry"],
forbiddenInProd: ["admin.debug"],
});
app.register({ name: "platform.policy", register: policy });
// Application plugins. Routes only know about THEIR own concerns.
app.register({ name: "users", register: usersPlugin }, { prefix: "/v1/users", tags: ["users"] });
app.register({ name: "orders", register: ordersPlugin }, { prefix: "/v1/orders", tags: ["orders"] });
if (process.env.NODE_ENV !== "production") {
app.register({ name: "admin.debug", register: adminPlugin }, { prefix: "/__admin", tags: ["debug"] });
}
await app.ready(); // wait for async plugins
(app as App & { verifyPolicy: () => void }).verifyPolicy(); // fail boot if violated
serve(app, { port: Number(process.env.PORT ?? 3000) });
platform plugins FIRST · then app plugins · ready → verifyPolicy → serveTS · UTF-8 · LF
Note carefully: routes/users.ts, routes/orders.ts, routes/admin.ts have zero imports from platform/*. The metrics show up, the registration happens, the policy fires — all without a single line in the route files knowing any of that exists. That is the entire point.
What the logs say
# What boot looks like in the logs of a single replica:
#
# 08:00:00.001 level=info msg="DaloyJS booting"
# 08:00:00.043 event=registered serviceId=api-2cf... plugins=5
# 08:00:00.044 event=policy_ok plugins=["users","orders","platform.metrics","platform.registry","platform.policy"]
# 08:00:00.051 msg="DaloyJS listening on :3000"
#
# What graceful shutdown looks like:
#
# 08:42:11.001 msg="SIGTERM received"
# 08:42:11.002 event=metrics_pushed reason=SIGTERM
# 08:42:11.004 event=deregistered serviceId=api-2cf... reason=SIGTERM
# 08:42:11.005 msg="draining" inflight=12
# 08:42:11.731 msg="closed db pool"
# 08:42:11.732 msg="closed redis"
# 08:42:11.733 inflight=0 msg="DaloyJS shutdown complete"
#
# The whole sequence is composable. Add a plugin → its listener slots
# into onPluginInstalled and onShutdown alongside everyone else's.
boot · steady state · graceful shutdown · 8 lines tell the whole storyTS · UTF-8 · LF
Testing platform plugins
tests/platform-plugins.test.ts
// tests/platform-plugins.test.ts — verify the wiring without a network.
import { test } from "node:test";
import assert from "node:assert/strict";
import { App } from "@daloyjs/core";
test("onPluginInstalled fires once per register, in order", async () => {
const app = new App({ logger: false });
const events: Array<{ name?: string; prefix: string }> = [];
app.onPluginInstalled((e) => { events.push(e); });
app.register({ name: "users", register: () => {} }, { prefix: "/v1/users" });
app.register({ name: "orders", register: () => {} }, { prefix: "/v1/orders" });
app.register(() => {}, { prefix: "/anon" }); // anonymous plugin
await app.ready();
assert.deepEqual(events, [
{ name: "users", prefix: "/v1/users" },
{ name: "orders", prefix: "/v1/orders" },
{ name: undefined, prefix: "/anon" },
]);
});
test("onShutdown runs BEFORE onClose; both run on app.shutdown()", async () => {
const app = new App({ logger: false });
const order: string[] = [];
app.onShutdown(async () => { order.push("shutdown"); });
app.onClose(async () => { order.push("close"); });
await app.shutdown(50, "test");
assert.deepEqual(order, ["shutdown", "close"]);
});
node:test · no network · verify ordering + argumentsTS · UTF-8 · LF
The pre-flight checklist
# Platform plugin pre-flight checklist.
#
# 1) Name every platform plugin. Anonymous plugins are fine for
# application code, but platform-team plugins always have names
# so policy / audit can enforce presence.
#
# 2) Register platform plugins BEFORE application ones. onPluginInstalled
# listeners installed later don't retroactively fire for earlier
# register() calls.
#
# 3) await app.ready() between register() and serve(). Otherwise async
# plugins (database pool, feature-flag fetch) may not be initialized
# when the first request arrives.
#
# 4) Pick a side: onShutdown for "do something while we're still
# healthy" (push metrics, deregister, flush spans). onClose for
# "release resources we already paid for" (pool.end(), file
# handles).
#
# 5) Keep listeners idempotent. SIGTERM can arrive twice in container
# shutdown sequences; both onShutdown and onClose are guarded
# internally but YOUR listeners should be too.
#
# 6) Don't make app code import platform code. The whole point is that
# routes/users.ts knows only about "users". If a route is reaching
# for the metrics registry directly, you've leaked the platform
# boundary; pull the concern into a plugin.
#
# 7) Log listener errors. The framework catches and logs them already;
# your listeners should add structured fields so operations can
# tell metrics-flush failures apart from registry-deregister failures
# in a noisy postmortem.
seven items · pin to the team wiki next to your deploy runbookTS · UTF-8 · LF
Wrapping up
The number of large-team backend codebases I've seen with cross-cutting infrastructure imports leaking into route files is… depressing. The fix is structural, not motivational. Give the platform team a place to put their concerns. Make that place inert from the perspective of application code. DaloyJS does that with two callbacks (onPluginInstalled, onShutdown), two cleanup hooks (onClose, app.ready()), and a Fastify-shaped register() that scopes everything.
Closest neighbors: the middleware lifecycle post for the per-request hooks the metrics plugin uses, the observability post for the tracing piece that fits into the same shutdown sequence, and the five-runtimes post for why the platform plugins above run identically on Node, Bun, Workers, Vercel Edge, and Lambda.
— Devlin