Response caching

A hot read endpoint often renders the same response over and over while nothing has changed. Re-running the handler (and its database or upstream calls) each time is pure waste. The responseCache() middleware stores rendered response bodies and replays them for matching requests, so the handler is not invoked at all while a cached representation is fresh.

It complements (and does not overlap with) the two caching-adjacent helpers DaloyJS already ships. etag() answers conditional GETs with 304 Not Modified but still runs the handler to produce the body it hashes; compression() shrinks the bytes on the wire but caches nothing. responseCache() is the missing third piece: it caches the body.

It is built in, dependency-free, and written against the Web-standard Request/Response, so it runs unchanged on Node, Bun, Deno, and Cloudflare Workers.

Quick start

Mount responseCache() ahead of the read routes whose rendered bodies are safe to reuse for a short window. By default only GET / HEAD responses with status 200 are cached.

import { App, responseCache } from "@daloyjs/core";
import { z } from "zod";

const app = new App();

// Reuse rendered bodies for 30 seconds.
app.use(responseCache({ ttlSeconds: 30 }));

app.get(
  "/products",
  {
    operationId: "listProducts",
    responses: {
      200: { description: "ok", body: z.array(z.object({ id: z.string() })) },
    },
  },
  async () => {
    // `db` stands in for your own data-access layer; it is not part of @daloyjs/core.
    const products = await db.listProducts(); // skipped on a fresh cache hit
    return { status: 200 as const, body: products };
  },
);

Each response the cache handles carries an X-Cache marker (HIT, MISS, or STALE), plus an Age header on a hit, so caches and clients can observe the outcome. A request that bypasses the cache entirely (a non-GET/HEAD method, an Authorization header, or a request Cache-Control: no-store) passes through unmarked.
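
To observe the marker from a client or a test, it is enough to read the two headers off the response. The following is a minimal sketch, assuming the default x-cache header name. readCacheOutcome, HeaderReader, and the "BYPASS" label are illustrative names, not part of DaloyJS.

```typescript
// Anything with a Headers-like get(), e.g. `response.headers` from fetch().
type HeaderReader = { get(name: string): string | null };

type CacheOutcome = "HIT" | "MISS" | "STALE" | "BYPASS";

// Hypothetical helper: interprets the X-Cache and Age headers described above.
function readCacheOutcome(headers: HeaderReader): {
  outcome: CacheOutcome;
  ageSeconds: number | null;
} {
  const marker = headers.get("x-cache")?.toUpperCase() ?? null;
  // Requests that bypass the cache carry no marker at all.
  const outcome: CacheOutcome =
    marker === "HIT" || marker === "MISS" || marker === "STALE" ? marker : "BYPASS";
  // Age is only expected on a hit; anything non-numeric is reported as null.
  const age = headers.get("age");
  const ageSeconds = age !== null && /^\d+$/.test(age) ? Number(age) : null;
  return { outcome, ageSeconds };
}
```

Pass response.headers from any fetch-style call. Note that a missing marker can also mean the status header was disabled with statusHeaderName: null, so "BYPASS" here only means "no marker seen".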

How it works

For an eligible request the middleware derives a cache key from method + path + query (plus any varyHeaders) and then takes one of three outcomes:

Fresh hit: the stored response is served and the handler does not run (X-Cache: HIT).
Stale hit within the SWR window (requires revalidate): the stale response is served immediately (X-Cache: STALE) while a single, de-duplicated background refresh repopulates the cache.
Miss: the handler runs and a cacheable response is stored (X-Cache: MISS).
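
The three outcomes above amount to a small decision over the stored entry's timestamps. This sketch shows one way to express it. The EntryTimes shape and the exact boundary comparisons are assumptions made for illustration; only staleUntil is a name that appears in this document.

```typescript
// Hypothetical entry shape, times in epoch milliseconds.
interface EntryTimes {
  freshUntil: number; // the entry is fresh before this instant
  staleUntil: number; // the entry may still be served stale before this instant
}

type Outcome = "HIT" | "STALE" | "MISS";

function classify(entry: EntryTimes | null, now: number, hasRevalidate: boolean): Outcome {
  if (entry === null) return "MISS";
  if (now < entry.freshUntil) return "HIT";
  // STALE needs both the SWR window and a revalidate callback.
  if (hasRevalidate && now < entry.staleUntil) return "STALE";
  // Expired, or stale with no way to refresh: run the handler again.
  return "MISS";
}
```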

Cache-Control orchestration

Freshness is derived from the response’s own Cache-Control when present (s-maxage wins over max-age), falling back to the configured ttlSeconds. Responses are never cached when they:

carry Cache-Control: no-store, private, or no-cache;
include a Set-Cookie header (per-user / credentialed responses must not be shared);
fail cacheableStatus (default: only 200); or
exceed maxBodyBytes (1 MiB by default).
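
The freshness and storability rules above can be sketched as two small functions. This is an illustration of the documented rules, not the middleware's actual code: parseCacheControl, freshnessSeconds, isStorable, and the Candidate shape are hypothetical names, and the handling of malformed directive values is an assumption.

```typescript
// Parse a Cache-Control value into lowercase directive -> value ("" when bare).
function parseCacheControl(value: string | null): Map<string, string> {
  const directives = new Map<string, string>();
  for (const part of (value ?? "").split(",")) {
    const [name, ...rest] = part.trim().split("=");
    if (name) directives.set(name.toLowerCase(), rest.join("=").replace(/^"|"$/g, ""));
  }
  return directives;
}

// Freshness lifetime in seconds: s-maxage wins over max-age, else the configured TTL.
function freshnessSeconds(cacheControl: string | null, ttlSeconds: number): number {
  const cc = parseCacheControl(cacheControl);
  for (const name of ["s-maxage", "max-age"]) {
    const raw = cc.get(name);
    if (raw !== undefined && /^\d+$/.test(raw)) return Number(raw);
  }
  return ttlSeconds;
}

// Hypothetical summary of a response that is a candidate for storage.
interface Candidate {
  status: number;
  cacheControl: string | null;
  hasSetCookie: boolean;
  bodyBytes: number;
}

function isStorable(
  c: Candidate,
  maxBodyBytes: number = 1_048_576,
  cacheableStatus: (status: number) => boolean = (status) => status === 200,
): boolean {
  const cc = parseCacheControl(c.cacheControl);
  if (cc.has("no-store") || cc.has("private") || cc.has("no-cache")) return false;
  if (c.hasSetCookie) return false; // per-user responses must not be shared
  if (!cacheableStatus(c.status)) return false;
  return c.bodyBytes <= maxBodyBytes;
}
```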

On the request side:

Cache-Control: no-store bypasses the cache entirely (no read, no write).
Cache-Control: no-cache bypasses the read but still refreshes the stored entry. This is exactly what the background stale-while-revalidate refresh uses, which makes revalidation recursion-safe.
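
The two request directives map onto three modes of operation. A minimal sketch of that mapping follows; requestMode and the mode labels are hypothetical names chosen for this example.

```typescript
type RequestMode = "bypass" | "refresh" | "normal";

// Decide how the cache treats a request, given its Cache-Control header value.
function requestMode(cacheControl: string | null): RequestMode {
  const directives = (cacheControl ?? "")
    .split(",")
    .map((directive) => directive.trim().toLowerCase());
  if (directives.includes("no-store")) return "bypass"; // no read, no write
  if (directives.includes("no-cache")) return "refresh"; // skip the read, still store the result
  return "normal"; // read from the cache, store on a miss
}
```

A client that wants to force a refresh of one entry would therefore send Cache-Control: no-cache on an otherwise ordinary GET.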

stale-while-revalidate

With staleWhileRevalidateSeconds plus a revalidate callback (typically wired to app.fetch), a stale-but-recent entry is served immediately while a single background refresh runs. The refresh request carries Cache-Control: no-cache so it bypasses the cached read and repopulates the entry without recursing.

const app = new App();

app.use(
  responseCache({
    ttlSeconds: 30,             // serve fresh for 30s
    staleWhileRevalidateSeconds: 300, // then serve stale up to 5 min while refreshing
    revalidate: (req) => app.fetch(req),
  }),
);
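
The "single background refresh" guarantee is usually achieved by tracking in-flight refreshes per cache key. The sketch below shows that general technique under stated assumptions. refreshOnce and inFlight are hypothetical names, and swallowing refresh errors is a choice made for this example; the document does not describe how the middleware handles a failed refresh.

```typescript
// Key -> refresh currently in flight. While a key is present, later stale
// hits reuse the same promise instead of starting another refresh.
const inFlight = new Map<string, Promise<void>>();

function refreshOnce(key: string, refresh: () => Promise<void>): Promise<void> {
  const existing = inFlight.get(key);
  if (existing) return existing;
  const run = refresh()
    .catch(() => {
      // Assumption of this sketch: a failed refresh leaves the stale entry in place.
    })
    .finally(() => {
      inFlight.delete(key); // allow the next stale hit to refresh again
    });
  inFlight.set(key, run);
  return run;
}
```

In the middleware's setting, the refresh callback would be the one that sends the Cache-Control: no-cache request through revalidate, so concurrent stale hits for the same key trigger only one handler run.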

Options

app.use(
  responseCache({
    // Freshness lifetime when the response has no s-maxage/max-age. Default: 60.
    ttlSeconds: 60,
    // Extra seconds a stale entry may be served while refreshing. Default: 0.
    staleWhileRevalidateSeconds: 0,
    // Background refresh callback; required to enable SWR.
    revalidate: (req) => app.fetch(req),
    // Methods eligible for caching. Default: GET, HEAD.
    methods: ["GET", "HEAD"],
    // Which response statuses are cacheable. Default: status === 200.
    cacheableStatus: (status) => status === 200,
    // Request headers whose values partition the cache (e.g. localization).
    varyHeaders: ["accept-language"],
    // Cache Authorization-bearing requests only when responses are shareable.
    cacheAuthenticatedRequests: false,
    // Custom cache key; return null to skip caching this request.
    // (This example keys on the path alone, so every query string shares one entry.)
    keyGenerator: (ctx) => new URL(ctx.request.url).pathname,
    // Largest response body buffered + stored. Default: 1 MiB.
    maxBodyBytes: 1_048_576,
    // Response header marking the outcome. Set to null to disable. Default: "x-cache".
    statusHeaderName: "x-cache",
    // Share one in-memory store across mounts with the same id.
    groupId: "catalog",
  }),
);

Pluggable stores

The default MemoryResponseCacheStore is process-local, perfect for tests and single-instance deployments. For a multi-instance or serverless fleet, supply a shared backend by implementing ResponseCacheStore. The contract mirrors SessionStore and the rate-limit store; entries whose staleUntil is in the past should be treated as missing.

import type { ResponseCacheStore, CachedResponse } from "@daloyjs/core";

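// `redis` is assumed to be an already-connected ioredis-style client, whose
// set() accepts "PX", <milliseconds> as positional arguments.
const redisResponseCacheStore: ResponseCacheStore = {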
  async get(key) {
    const raw = await redis.get(key);
    return raw ? (JSON.parse(raw) as CachedResponse) : null;
  },
  async set(key, entry, ttlMs) {
    await redis.set(key, JSON.stringify(entry), "PX", ttlMs);
  },
  async delete(key) {
    await redis.del(key);
  },
};

app.use(responseCache({ store: redisResponseCacheStore }));
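
For backends without native expiry, the "treat entries past staleUntil as missing" rule has to be enforced on read. The following sketch illustrates that rule with a plain Map. SketchEntry and MapStoreSketch are hypothetical stand-ins: the real CachedResponse and ResponseCacheStore types come from @daloyjs/core, and their exact fields are not shown in this document.

```typescript
// Hypothetical stand-in for CachedResponse; only staleUntil is named in the text.
interface SketchEntry {
  body: string;
  staleUntil: number; // epoch milliseconds
}

class MapStoreSketch {
  private entries = new Map<string, SketchEntry>();
  private now: () => number;

  // The clock is injectable so expiry can be exercised in tests.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  async get(key: string): Promise<SketchEntry | null> {
    const entry = this.entries.get(key);
    if (!entry) return null;
    if (entry.staleUntil <= this.now()) {
      this.entries.delete(key); // past staleUntil: treat as missing
      return null;
    }
    return entry;
  }

  // ttlMs is unused here; a networked backend would hand it to native expiry,
  // as the Redis example does with "PX".
  async set(key: string, entry: SketchEntry, _ttlMs: number): Promise<void> {
    this.entries.set(key, entry);
  }

  async delete(key: string): Promise<void> {
    this.entries.delete(key);
  }
}
```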

Security notes

Credentialed and per-user responses are never shared by default: anything carrying Set-Cookie or Cache-Control: private | no-store | no-cache is skipped, the same skip posture as etag().
Requests carrying an Authorization header bypass the cache entirely (CWE-524, RFC 9111 §3.5). A shared cache keyed on method + path + query does not include the credential, so caching an authenticated response would serve one user's private data to the next caller of the same resource. Set cacheAuthenticatedRequests: true only for content that is genuinely shareable across principals, and pair it with varyHeaders: ["authorization"] (or a custom keyGenerator) so distinct callers cannot collide.
Only 200 OK is cached unless you widen cacheableStatus, so error pages do not poison the cache.
Stored bodies are capped by maxBodyBytes to bound memory growth from large replies.
Use varyHeaders (or a custom keyGenerator) to partition the cache whenever the response depends on a request header such as Accept-Language.
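
To make the partitioning concrete, the sketch below builds a key from method + path + query and appends one segment per vary header. The key layout, the "|" separator, and the names cacheKey and HeaderSource are assumptions for illustration; the middleware's real key format is not specified in this document.

```typescript
// Anything with a Headers-like get(), e.g. `request.headers`.
type HeaderSource = { get(name: string): string | null };

// Assumed layout: METHOD | path?query | header=value ... (one segment per vary header).
function cacheKey(
  method: string,
  url: string,
  headers: HeaderSource,
  varyHeaders: string[] = [],
): string {
  const { pathname, search } = new URL(url);
  const vary = varyHeaders
    .map((name) => name.toLowerCase())
    .sort() // stable order regardless of how the option was written
    .map((name) => `${name}=${headers.get(name) ?? ""}`);
  return [method.toUpperCase(), pathname + search, ...vary].join("|");
}
```

Without varyHeaders, an English and a French request for the same URL produce the same key and would share one stored body; listing accept-language (or authorization, when cacheAuthenticatedRequests is enabled) gives each value its own entry.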
