Search docs

Jump between documentation pages.

Browse docs

Use DuckDB with DaloyJS

DuckDB is an in-process OLAP database. Use it when your DaloyJS API needs to query local analytical data, Parquet files, CSV exports, or a small embedded reporting database without running a separate database server. It is not a replacement for Postgres or MySQL as the primary transactional database behind a multi-writer API.

Where DuckDB fits
daloy routehandlervalidated HTTP request
good fitEmbedded analyticsParquet, CSV, local .duckdb files, reporting queries
bad fitShared transactional storemany app replicas writing the same primary data
DuckDB runs inside the process. That makes it excellent for local analytics and data APIs, but not the right default for shared OLTP state across many API instances.

1. Install

Use the modern Node client. The older duckdb package is not the client this guide targets.

ts
pnpm add @duckdb/node-api

2. Create a DuckDB plugin

Create one instance per process and connect during app startup. Use an in-memory database for transient analytics, or point DuckDB at a persisted file when the platform gives you durable disk.

ts
// src/db/duckdb.ts
import { DuckDBInstance, type DuckDBConnection } from "@duckdb/node-api";
import type { App } from "@daloyjs/core";

export type DuckDbState = {
  duckdb: DuckDBConnection;
};

export const duckDbPlugin = {
  name: "duckdb",
  async register(app: App) {
    const path = process.env.DUCKDB_PATH ?? ":memory:";
    const instance = await DuckDBInstance.fromCache(path);
    const connection = await instance.connect();

    app.decorate("duckdb", connection);
    app.onClose(() => {
      connection.closeSync();
    });
  },
};

3. Augment app state

ts
// src/types/state.d.ts
import type { DuckDBConnection } from "@duckdb/node-api";

declare module "@daloyjs/core" {
  interface AppState {
    duckdb: DuckDBConnection;
  }
}

4. Query from a route

Keep SQL structure owned by the server and pass request values as parameters. Convert results to JSON-safe objects before returning them through a response schema.

ts
import { z } from "zod";
import { App, secureHeaders } from "@daloyjs/core";
import { duckDbPlugin } from "./db/duckdb";

const app = new App();
app.use(secureHeaders());
app.register(duckDbPlugin);

const SalesSummary = z.object({
  region: z.string(),
  revenue: z.number(),
});

app.route({
  method: "GET",
  path: "/analytics/sales",
  operationId: "salesSummary",
  query: z.object({ region: z.string().min(1).optional() }),
  responses: {
    200: { description: "ok", body: z.array(SalesSummary) },
  },
  handler: async ({ query, state }) => {
    const reader = await state.duckdb.runAndReadAll(
      `select region, sum(revenue)::double as revenue
       from read_parquet('data/sales/*.parquet')
       where $region is null or region = $region
       group by region
       order by revenue desc`,
      { region: query.region ?? null },
    );

    return {
      status: 200,
      body: SalesSummary.array().parse(reader.getRowObjectsJson()),
    };
  },
});

Runtime support

RuntimeFitWhy
Node.jsRecommended@duckdb/node-api is a native Node client.
Bun / DenoLimitedPrefer Node unless you have tested the native package and deploy target yourself.
Cloudflare WorkersNoWorkers cannot load the native Node package.
VercelLimitedNode functions may work for in-memory or read-only analytical data. Edge functions cannot load native Node binaries, and local disk is not durable.
Cloudflare Workers / Vercel EdgeNoEdge runtimes cannot load the native Node package.
AWS Lambda / containersYesWorks when the native package matches the deployment platform and any persisted files live on durable storage.

Use the right storage mode

Storage choice
  1. 01ephemeral:memory:scratch reports, tests, temp imports
  2. 02file.duckdbsingle-process durable local database
  3. 03Parquet / CSVdata lake, exports, read-heavy analytics
  4. 04Postgres / MySQLprimary transactional app state
Use DuckDB for analytical reads and local processing. Keep primary multi-user writes in a transactional database unless you have a carefully controlled single-writer design.

Security notes

  • Do not execute SQL text from users. DuckDB SQL can read files, access networks through extensions, and consume significant CPU or memory.
  • Use parameterized values when request data belongs in a query, as in the route above.
  • Disable external access for routes that only query in-memory tables or controlled data:
ts
await state.duckdb.run("set enable_external_access = false");
await state.duckdb.run("set allow_community_extensions = false");
await state.duckdb.run("set lock_configuration = true");
  • If you need file reads, restrict the directories and never pass a user supplied path directly into read_csv, read_parquet, COPY, or ATTACH.
  • For user-authored SQL, run DuckDB out of process with OS/container sandboxing and strict timeouts. A DaloyJS route should treat user-authored SQL like code execution, not like a search filter.

When to choose DuckDB

  • You need analytics over Parquet, CSV, JSON, or local snapshots.
  • You want an embedded reporting endpoint in a Node service without operating a warehouse.
  • You are building import/export, admin analytics, billing summaries, or offline data tools.

When not to choose it

  • You need many API replicas to write the same primary application data.
  • You deploy only to Cloudflare Workers or Vercel Edge. Use Turso, Cloudflare D1, or Neon instead.
  • You want row-level authorization enforced inside the database. Keep that in Postgres/Supabase or enforce it explicitly in application code.

See also the database hosting overview, Node adapter, and DuckDB Node.js client docs.