Bot / User-Agent management
As of 0.37.0 DaloyJS ships botGuard() — the in-app equivalent of the bot rules Nginx, Cloudflare, and other WAFs run at the edge, but inside the app where the framework already owns request parsing and client-IP resolution. It does three opt-in jobs:
- Block empty / missing User-Agent: a common signature of crude scrapers and vulnerability scanners (on by default).
- Block known-abusive User-Agent strings: your own substrings or RegExps.
- Verify declared crawlers: when a request claims to be Googlebot or Bingbot, confirm it via reverse-DNS + forward-confirm (the method Google and Bing themselves document) so a spoofed User-Agent can't impersonate a trusted crawler.
Every check is opt-in and allowlist-friendly, and the middleware is dependency-free and runtime-portable.
Quick start
Mount it with app.use() so it runs in beforeHandle before your handlers. A blocked request is rejected with a 403 Forbidden response whose body is RFC 9457 problem+json.
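To make the rejection format concrete, here is a minimal sketch of a 403 problem+json payload built from the standard RFC 9457 members. The ProblemResponse type, the forbiddenProblem helper, and the detail text are hypothetical names for illustration; the exact fields DaloyJS emits may differ.

```typescript
// Hypothetical response shape for illustration; not DaloyJS's actual type.
interface ProblemResponse {
  status: number;
  headers: Record<string, string>;
  body: string;
}

// Build a 403 response whose body uses the standard RFC 9457 members.
function forbiddenProblem(detail: string): ProblemResponse {
  const problem = {
    type: "about:blank", // RFC 9457 default when no specific problem type URI is defined
    title: "Forbidden",
    status: 403,
    detail,
  };
  return {
    status: 403,
    headers: { "content-type": "application/problem+json" },
    body: JSON.stringify(problem),
  };
}

const example = forbiddenProblem("Request blocked by bot rules.");
```

A client can branch on the application/problem+json content type and read status and detail from the parsed body instead of scraping an HTML error page.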
Blocking empty & abusive User-Agents
blockEmptyUserAgent defaults to true. A plain string in blockedUserAgents matches case-insensitively as a substring; a RegExp is tested as-is.
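The matching rule described above can be sketched as a small standalone function. This is an illustration of the semantics, not the middleware's source; matchesAnyRule and UserAgentRule are hypothetical names.

```typescript
type UserAgentRule = string | RegExp;

// Strings match as case-insensitive substrings; RegExps are tested unchanged.
function matchesAnyRule(
  userAgent: string,
  rules: readonly UserAgentRule[],
): boolean {
  const lowered = userAgent.toLowerCase();
  return rules.some((rule) =>
    typeof rule === "string"
      ? lowered.includes(rule.toLowerCase())
      : rule.test(userAgent),
  );
}

const sampleRules: UserAgentRule[] = ["BadBot", /^curl\//];
const blocked = matchesAnyRule("Mozilla/5.0 (compatible; badbot/1.0)", sampleRules);
```

Because a RegExp is tested as-is, case sensitivity is whatever the pattern declares: /^curl\// does not match "CURL/8.4.0" unless you add the i flag. Avoid the g and y flags on patterns used this way, since RegExp.prototype.test then keeps state in lastIndex between calls.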
Allowlist wins
allowUserAgents is consulted first and bypasses every other rule (including empty-UA blocking and crawler verification), which is handy for your own monitoring agents or a partner's integration.
Verifying declared crawlers
Spoofing User-Agent: Googlebot is trivial. The only reliable check is the one Google and Bing publish: reverse-DNS the client IP, make sure the PTR hostname is on an official domain, then forward-resolve that hostname back to the same IP. botGuard() ships GOOGLEBOT and BINGBOT rules (bundled as WELL_KNOWN_BOTS) and you can add your own:
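The reverse-DNS plus forward-confirm procedure can be sketched as follows. This is a simplified illustration under stated assumptions: the DnsLookups interface, verifyCrawlerIp, and hostnameOnDomain are hypothetical names rather than DaloyJS exports, and the lookups are injected so the logic can run without network access. On Node.js they could be backed by reverse, resolve4, and resolve6 from node:dns/promises.

```typescript
// Hypothetical lookup interface for this sketch; DaloyJS's BotResolver may differ.
interface DnsLookups {
  reverse(ip: string): Promise<string[]>; // PTR lookup: IP -> hostnames
  resolve(hostname: string): Promise<string[]>; // A/AAAA lookup: hostname -> IPs
}

type Verdict = "verified" | "spoofed";

// True when hostname equals an official domain or is a subdomain of one.
function hostnameOnDomain(hostname: string, domains: readonly string[]): boolean {
  const host = hostname.toLowerCase().replace(/\.$/, "");
  return domains.some((d) => host === d || host.endsWith("." + d));
}

async function verifyCrawlerIp(
  ip: string,
  domains: readonly string[],
  dns: DnsLookups,
): Promise<Verdict> {
  // Step 1: reverse-resolve the client IP to its PTR hostnames.
  const hostnames = await dns.reverse(ip);
  for (const hostname of hostnames) {
    // Step 2: the PTR hostname must sit on an official crawler domain.
    if (!hostnameOnDomain(hostname, domains)) continue;
    // Step 3: forward-resolve the hostname and require the original IP.
    const addresses = await dns.resolve(hostname);
    if (addresses.includes(ip)) return "verified";
  }
  return "spoofed";
}
```

The suffix check compares against "." + domain so that a hostname such as fakegooglebot.com does not pass for googlebot.com. DNS errors are left to propagate so the caller can treat them as unverifiable rather than spoofed. The plain string comparison in step 3 assumes both sides use the same textual form of the address, which matters mostly for IPv6.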
Because verifiedBots needs the client IP, the middleware refuses to construct unless you supply resolveIp or set trustProxyHeaders. A request that claims to be a crawler but can't be verified — no client IP, or a DNS failure — is blocked by default (blockUnverifiableBots, the secure-by-default posture). Set it to false to fail open. Verification results are cached per IP (default 1 h via cacheTtlMs) so DNS stays off the hot path.
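Per-IP caching with a time-to-live can be sketched as below. TtlCache is a hypothetical class written for illustration, not the middleware's internal cache; the injectable clock exists only to make expiry testable.

```typescript
// Minimal TTL cache keyed by client IP. Illustrative only.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Return the cached value, or undefined when absent or expired.
  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // drop stale entries lazily on read
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// One hour, matching the documented default for cacheTtlMs.
const verdicts = new TtlCache<boolean>(60 * 60 * 1000);
verdicts.set("66.249.66.1", true);
```

This sketch never evicts entries that are not read again, so a production cache would also want a size bound.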
Monitor mode & callbacks
Roll it out safely with mode: "log" — nothing is blocked, but every match fires onBlock so you can measure impact before enforcing.
The reason is one of "empty-user-agent", "blocked-user-agent", "spoofed-bot", or "unverifiable-bot".
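The interaction between monitor mode and the callback can be sketched as below. Only the four reason strings and the "log" mode value come from this page; the "enforce" mode name, the settle function, and the assumption that the callback also fires when enforcing are hypothetical.

```typescript
type BlockReason =
  | "empty-user-agent"
  | "blocked-user-agent"
  | "spoofed-bot"
  | "unverifiable-bot";

// "log" comes from the docs; "enforce" is a placeholder name for the blocking mode.
type GuardMode = "log" | "enforce";

// Decide the outcome for a request given the rule (if any) that it matched.
function settle(
  reason: BlockReason | null,
  mode: GuardMode,
  onBlock: (reason: BlockReason) => void,
): "pass" | "reject" {
  if (reason === null) return "pass"; // no rule matched
  onBlock(reason); // assumption: the callback fires in both modes
  return mode === "log" ? "pass" : "reject";
}

// Tally matches per reason to estimate impact before enforcing.
const tally = new Map<BlockReason, number>();
const count = (reason: BlockReason): void => {
  tally.set(reason, (tally.get(reason) ?? 0) + 1);
};

const outcome = settle("spoofed-bot", "log", count);
```

Running in log mode for a while and inspecting the tally shows which rules would fire most often, and whether any legitimate client needs an allowlist entry first.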
Custom DNS resolver
The default resolver lazily imports node:dns/promises. On a runtime without it (Workers, Deno without --allow-net) or in tests, supply your own BotResolver:
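One way to resolve DNS on a runtime without node:dns/promises is DNS-over-HTTPS. The sketch below targets the JSON API at cloudflare-dns.com and handles IPv4 only. SketchResolver, createDohResolver, ptrNameForIpv4, and FetchJson are hypothetical names, and the reverse/resolve method names are an assumption: check DaloyJS's BotResolver type for the shape it actually expects.

```typescript
// Minimal fetch-like signature so the sketch needs no DOM or Node typings.
type FetchJson = (url: string, headers: Record<string, string>) => Promise<unknown>;

interface DohAnswer {
  type: number; // DNS record type code: 1 = A, 12 = PTR
  data: string;
}

// Hypothetical resolver shape; the real BotResolver interface may differ.
interface SketchResolver {
  reverse(ip: string): Promise<string[]>;
  resolve(hostname: string): Promise<string[]>;
}

// "66.249.66.1" -> "1.66.249.66.in-addr.arpa"
function ptrNameForIpv4(ip: string): string {
  return ip.split(".").reverse().join(".") + ".in-addr.arpa";
}

function createDohResolver(
  fetchJson: FetchJson,
  endpoint: string = "https://cloudflare-dns.com/dns-query",
): SketchResolver {
  async function query(name: string, type: "PTR" | "A", code: number): Promise<string[]> {
    const url = `${endpoint}?name=${encodeURIComponent(name)}&type=${type}`;
    const body = (await fetchJson(url, { accept: "application/dns-json" })) as {
      Answer?: DohAnswer[];
    };
    return (body.Answer ?? [])
      .filter((answer) => answer.type === code) // skip CNAMEs and other record types
      .map((answer) => answer.data.replace(/\.$/, "")); // strip the trailing root dot
  }
  return {
    reverse: (ip) => query(ptrNameForIpv4(ip), "PTR", 12),
    resolve: (hostname) => query(hostname, "A", 1),
  };
}
```

On a runtime with a global fetch, the first argument could be (url, headers) => fetch(url, { headers }).then((r) => r.json()). In tests, pass a function that returns canned answers so no network is involved.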