Vercel AI SDK

The Vercel AI SDK (v7) is built on web standards: the result of streamText() and friends converts to a standard Response whose body is a ReadableStream. DaloyJS is a web-standard core that already streams and already hands you the raw Request, so there is no adapter to install. You host the AI SDK the same way you host any other route, and DaloyJS's guardrails apply to it like they do to every other route.

This page covers the four things worth knowing: a streaming chat endpoint, structured output validated against your contract, tool calls behind fetchGuard(), and the secure-by-default layer that wraps all of it.

Install

The AI SDK and your model provider are your dependencies. @daloyjs/core stays at zero runtime dependencies; it does not bundle or re-export the AI SDK.

bash
# The AI SDK is YOUR dependency, not part of @daloyjs/core.
# @daloyjs/core stays at zero runtime dependencies; you add the
# model provider you actually use.
pnpm add ai @ai-sdk/openai

The request path

Every AI request flows through the same guardrails as the rest of your API before it ever reaches the model, and the model's output streams back through DaloyJS unbuffered.

An AI request through DaloyJS
  1. Edge (guardrails): rate limit, auth, body cap, secureHeaders
  2. Contract (request schema): validated before the handler runs
  3. Model (AI SDK call): streamText / generateObject
  4. Tools (fetchGuard): tool fetches are SSRF-safe
  5. Client (streamed Response): ReadableStream, passed through

The deployment-time layer (auth, limits, SSRF defense) holds even when the model is prompt-injected or hallucinating. That is the point: the guardrails do not depend on the model behaving.

Streaming chat

The AI SDK's result.toUIMessageStreamResponse() returns a web-standard Response, and a DaloyJS handler can return a raw Response directly. The stream passes through, backpressure and all, and the framework still finalizes it: it adds the request id, applies secureHeaders() and CORS, runs your onSend hooks, and strips fingerprint headers, exactly as it does for a structured result. This endpoint works with the AI SDK's useChat() hook on the client with no extra wiring.

ts
// A streaming chat endpoint, compatible with the AI SDK's
// useChat() hook on the client. The AI SDK produces a web-standard
// Response, and a DaloyJS handler can return a raw Response directly:
// the framework still finalizes it (request id, secureHeaders, CORS,
// fingerprint stripping) like any other response.
import { z } from "zod";
import { App } from "@daloyjs/core";
import { streamText, convertToModelMessages } from "ai";
import { openai } from "@ai-sdk/openai";

export const app = new App();

app.route({
  method: "POST",
  path: "/api/chat",
  operationId: "chat",
  request: {
    // You still validate the request. A message count cap plus the
    // default 1 MiB body limit are your first abuse guard, even on a
    // streaming route. Tighten z.unknown() to a UIMessage schema if
    // you want a stricter contract.
    body: z
      .object({ messages: z.array(z.unknown()).min(1).max(50) })
      .strict(),
  },
  responses: {
    // Streaming routes do not carry a response-body schema; OpenAPI
    // documents them as a stream. That is the one honest trade-off.
    200: { description: "UI message stream (text/event-stream)" },
  },
  handler: async ({ body, request }) => {
    const result = streamText({
      model: openai("gpt-5.1"),
      messages: convertToModelMessages(body.messages as never),
      // Cancel the upstream model call if the client disconnects.
      abortSignal: request.signal,
    });

    // Return the Response as-is. No mapping, no adapter.
    return result.toUIMessageStreamResponse();
  },
});
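
To make "a web-standard Response whose body is a stream" concrete, here is a minimal sketch that uses only platform globals (Response, ReadableStream, TextEncoder). The helper names sseFrame and streamResponse are hypothetical, and the payloads are placeholders: this is not the AI SDK's wire format, only the general shape of a text/event-stream body that a handler could return as-is.

```typescript
// Illustrative only: a hand-rolled event stream, not the AI SDK's protocol.
const encoder = new TextEncoder();

// Format one Server-Sent Events frame: a "data:" line, then a blank line.
function sseFrame(payload: unknown): string {
  return `data: ${JSON.stringify(payload)}\n\n`;
}

// Build a web-standard Response whose body is produced lazily.
function streamResponse(chunks: readonly unknown[]): Response {
  let i = 0;
  const body = new ReadableStream<Uint8Array>({
    // pull() runs only when the consumer has room for more data,
    // which is how backpressure reaches the producer.
    pull(controller) {
      if (i < chunks.length) {
        controller.enqueue(encoder.encode(sseFrame(chunks[i++])));
      } else {
        controller.close();
      }
    },
  });
  return new Response(body, {
    headers: { "content-type": "text/event-stream" },
  });
}
```

Because the body is pulled rather than pushed, a slow client slows the producer down instead of filling server memory.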

Structured output, validated by your contract

This is the pattern that is meaningfully better on a contract-first framework. With generateObject(), the model is constrained to a Zod schema. Reuse that same schema as the route's response schema and it becomes your OpenAPI shape and your typed client too: one source of truth. The model output is then validated twice, once by the AI SDK and once by DaloyJS at the HTTP boundary, so a drifting model or a schema mismatch becomes a controlled error instead of a malformed body on the wire.

ts
// This is the pattern that is genuinely better on a contract-first
// framework: the SAME Zod schema is the model's output schema, the
// route's response schema, the OpenAPI shape, and the typed client.
// One source of truth, validated twice (once by the SDK, once at the
// HTTP boundary), so a drifting model can never leak a malformed body.
import { z } from "zod";
import { App } from "@daloyjs/core";
import { generateObject } from "ai";
import { openai } from "@ai-sdk/openai";

export const app = new App();

const Analysis = z
  .object({
    sentiment: z.enum(["positive", "neutral", "negative"]),
    summary: z.string().max(280),
    topics: z.array(z.string()).max(8),
  })
  .strict();

app.route({
  method: "POST",
  path: "/api/analyze",
  operationId: "analyzeText",
  request: {
    body: z.object({ text: z.string().min(1).max(10_000) }).strict(),
  },
  responses: {
    // Reuse the exact schema the model is constrained to.
    200: { description: "analysis", schema: Analysis },
  },
  handler: async ({ body }) => {
    const { object } = await generateObject({
      model: openai("gpt-5.1"),
      schema: Analysis,
      prompt: body.text,
    });

    // 'object' was validated by the AI SDK. DaloyJS validates it
    // AGAIN against the response schema before it leaves, so even an
    // SDK or schema mismatch becomes a controlled 500, never a leak.
    return { status: 200 as const, body: object };
  },
});
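
The second validation, at the HTTP boundary, can be pictured with a dependency-free sketch. Everything here is an assumption for illustration: the Schema interface, the hand-written analysisSchema, and the finalize helper are hypothetical stand-ins, not DaloyJS internals and not Zod's API. The point is only the behavior: a body that fails the response schema turns into a generic 500 instead of reaching the client.

```typescript
// Hypothetical stand-ins for illustration; not the DaloyJS or Zod API.
type ParseResult<T> =
  | { success: true; data: T }
  | { success: false; error: string };

interface Schema<T> {
  safeParse(input: unknown): ParseResult<T>;
}

type Analysis = {
  sentiment: "positive" | "neutral" | "negative";
  summary: string;
  topics: string[];
};

const SENTIMENTS: readonly string[] = ["positive", "neutral", "negative"];
const KEYS: readonly string[] = ["sentiment", "summary", "topics"];

// Hand-written equivalent of the strict Analysis schema shown earlier.
const analysisSchema: Schema<Analysis> = {
  safeParse(input) {
    if (typeof input !== "object" || input === null || Array.isArray(input)) {
      return { success: false, error: "expected an object" };
    }
    const o = input as Record<string, unknown>;
    if (Object.keys(o).some((k) => !KEYS.includes(k))) {
      return { success: false, error: "unknown key" };
    }
    const sentiment = o["sentiment"];
    const summary = o["summary"];
    const topics = o["topics"];
    if (typeof sentiment !== "string" || !SENTIMENTS.includes(sentiment)) {
      return { success: false, error: "bad sentiment" };
    }
    if (typeof summary !== "string" || summary.length > 280) {
      return { success: false, error: "bad summary" };
    }
    if (
      !Array.isArray(topics) ||
      topics.length > 8 ||
      !topics.every((t) => typeof t === "string")
    ) {
      return { success: false, error: "bad topics" };
    }
    return {
      success: true,
      data: {
        sentiment: sentiment as Analysis["sentiment"],
        summary,
        topics: topics as string[],
      },
    };
  },
};

// Boundary check: a valid body goes out as 200, anything else becomes a
// generic 500 so the malformed body never reaches the wire.
function finalize<T>(
  schema: Schema<T>,
  body: unknown,
): { status: 200; body: T } | { status: 500; body: { error: string } } {
  const parsed = schema.safeParse(body);
  if (parsed.success) return { status: 200, body: parsed.data };
  // parsed.error is for server-side logs; the client sees no detail.
  return { status: 500, body: { error: "Internal Server Error" } };
}
```

A model that drifts to `sentiment: "angry"`, or adds a field the schema does not know, fails this check.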

Tool calling behind fetchGuard

AI SDK 7's tool loop is where prompt injection becomes a server-side request forgery (SSRF) problem: the model asks a tool to fetch a URL, and a poisoned prompt aims it at 169.254.169.254 (the cloud metadata endpoint) or your internal network. Run every tool fetch through fetchGuard() and that whole class of attack is denied by default, including redirect bounces. Bound the step count so a runaway agent cannot loop forever.

ts
// Tool calls are where prompt injection turns into SSRF: the model
// asks your tool to fetch a URL, and a poisoned prompt points it at
// 169.254.169.254 (cloud metadata) or your internal network.
// Route every tool fetch through fetchGuard() and that whole class
// of attack is default-denied, including redirects.
import { z } from "zod";
import { App, fetchGuard } from "@daloyjs/core";
import { streamText, convertToModelMessages, tool, stepCountIs } from "ai";
import { openai } from "@ai-sdk/openai";

export const app = new App();

// A guarded fetch: loopback, RFC1918, link-local, and cloud-metadata
// IPs are refused; only the hosts you allow get through.
const safeFetch = fetchGuard({ allow: ["https://api.weather.example"] });

const getWeather = tool({
  description: "Get the current weather for a city.",
  inputSchema: z.object({ city: z.string().min(1).max(80) }),
  execute: async ({ city }) => {
    const r = await safeFetch(
      `https://api.weather.example/v1?city=${encodeURIComponent(city)}`,
    );
    return (await r.json()) as { tempC: number; summary: string };
  },
});

app.route({
  method: "POST",
  path: "/api/agent",
  operationId: "agent",
  request: {
    body: z.object({ messages: z.array(z.unknown()).min(1).max(50) }).strict(),
  },
  responses: { 200: { description: "UI message stream" } },
  handler: async ({ body, request }) => {
    const result = streamText({
      model: openai("gpt-5.1"),
      messages: convertToModelMessages(body.messages as never),
      tools: { getWeather },
      // AI SDK 7 multi-step tool loop, bounded so a runaway agent
      // cannot loop forever on your dime.
      stopWhen: stepCountIs(5),
      abortSignal: request.signal,
    });

    return result.toUIMessageStreamResponse();
  },
});
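
For intuition about what "default-denied" means for a tool fetch, here is a rough sketch of the kind of check an SSRF guard performs before any request leaves the process. It is not fetchGuard()'s implementation: the names parseIPv4, isBlockedIPv4, and checkToolUrl are hypothetical, and it covers only IPv4 literals. A real guard also has to handle IPv6, check the addresses a hostname resolves to, and re-check every redirect hop.

```typescript
// Illustrative only; not how fetchGuard() is implemented.

// Parse a dotted-quad IPv4 literal, or return null if it is not one.
function parseIPv4(host: string): [number, number, number, number] | null {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (m === null) return null;
  const o = m.slice(1).map(Number);
  if (o.length !== 4 || o.some((n) => n > 255)) return null;
  return [o[0]!, o[1]!, o[2]!, o[3]!];
}

// True for addresses a tool should never reach.
function isBlockedIPv4(host: string): boolean {
  const ip = parseIPv4(host);
  if (ip === null) return false;
  const [a, b] = ip;
  if (a === 0) return true; // 0.0.0.0/8 "this network"
  if (a === 127) return true; // 127.0.0.0/8 loopback
  if (a === 10) return true; // 10.0.0.0/8 (RFC 1918)
  if (a === 172 && b >= 16 && b <= 31) return true; // 172.16.0.0/12 (RFC 1918)
  if (a === 192 && b === 168) return true; // 192.168.0.0/16 (RFC 1918)
  if (a === 169 && b === 254) return true; // 169.254.0.0/16 link-local, incl. cloud metadata
  return false;
}

// Deny by default: only https URLs whose origin is explicitly allowed pass.
function checkToolUrl(raw: string, allowOrigins: readonly string[]): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== "https:") return false;
  if (isBlockedIPv4(url.hostname)) return false;
  return allowOrigins.includes(url.origin);
}
```

The allow-list does most of the work. The IP checks are a second layer for the case where an allowed origin is misconfigured or an attacker finds a way to smuggle in an address literal.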

The secure-by-default layer

None of the above is the interesting part. The interesting part is what DaloyJS does around your AI endpoint without you asking: a 1 MiB body cap that limits prompt size, a request timeout so a stuck model call cannot hang a worker, production-mode error redaction so an upstream provider error never leaks internals, structured logs that redact provider API keys, and the rate limiting and auth you add in one line each. The model is a caller, not a user. Treat it like one.

ts
// The deployment-time layer the model never gets a vote in.
// All of this ships in @daloyjs/core with zero runtime dependencies.
import {
  App,
  secureHeaders,
  requestId,
  structuredLogger,
  rateLimit,
  loadShedding,
  bearerAuth,
} from "@daloyjs/core";

export const app = new App({
  // bodyLimitBytes: 1 << 20    // 1 MiB default: caps prompt size
  // requestTimeoutMs: 30_000   // default: a stuck model call cannot hang forever
  // production auto-detected   // prod-mode 5xx redaction by default
});

app.use(secureHeaders());
app.use(requestId());
app.use(structuredLogger()); // provider keys are redacted by default
app.use(rateLimit({ windowMs: 60_000, max: 60 }));
app.use(loadShedding({ maxQueueDepth: 100, maxEventLoopDelayMs: 50 }));

// Every AI endpoint authenticates. The model is a caller, not a user.
// `sessions` is your application's own session store; it is not
// exported by @daloyjs/core.
app.use("/api/*", bearerAuth({ verify: (token) => sessions.verify(token) }));

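To show what a setting like windowMs: 60_000, max: 60 means in practice, here is a minimal fixed-window limiter. It is a sketch under assumptions: the class name FixedWindowLimiter is hypothetical, and this page does not say whether rateLimit() uses fixed windows, sliding windows, or a token bucket, so treat this as one plausible reading rather than a description of the middleware.

```typescript
// Illustrative only; rateLimit() in @daloyjs/core may work differently.
class FixedWindowLimiter {
  private readonly windowMs: number;
  private readonly max: number;
  private readonly hits = new Map<string, { windowStart: number; count: number }>();

  constructor(windowMs: number, max: number) {
    this.windowMs = windowMs;
    this.max = max;
  }

  // Returns true if the caller identified by `key` may proceed.
  // `now` is injectable so the behavior can be tested without real time.
  allow(key: string, now: number = Date.now()): boolean {
    let entry = this.hits.get(key);
    // Start a fresh window on first sight or once the old one has expired.
    if (entry === undefined || now - entry.windowStart >= this.windowMs) {
      entry = { windowStart: now, count: 0 };
      this.hits.set(key, entry);
    }
    if (entry.count >= this.max) return false;
    entry.count += 1;
    return true;
  }
}

// Usage: 60 requests per minute per key, where the key would typically be
// an authenticated user id or a client IP.
const limiter = new FixedWindowLimiter(60_000, 60);
const allowed = limiter.allow("user-123");
```

On an AI endpoint the key matters as much as the numbers: limiting per authenticated caller bounds what any single account can spend on model calls.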
For the full argument behind this (why the deployment-time layer must hold when the model fails), see the blog post on the International AI Safety Report, and the secure-by-default guide.

What about OpenAPI and the typed client?

Be honest with yourself about the trade-off. A pure streaming endpoint cannot carry a meaningful response-body schema, so OpenAPI documents it as a stream and the typed client treats it as such. Endpoints that return structured output (the generateObject() pattern above) get the full treatment: response schema, OpenAPI shape, typed client, and response validation, all from the one Zod schema. Mix the two freely. Stream the chat, contract the structured calls.

Next steps