Why this matters
Most alerting is “a threshold was crossed.” You set it once, it pages you 12 times in week one, you tune the threshold up, then you stop trusting it. Classic alert fatigue.
The signal that actually matters to oncall is “a genuinely new failure mode appeared.” Not “errors went up 20%” — those happen on every load spike. “Yesterday this fingerprint did not exist; today it does” is what wakes you up.
Building this with traditional tools means cron + a queryable error DB + diff logic + webhook code. Agentry’s Routines do it in one prompt: the agent runs on a schedule, has the full MCP toolbox, and can post anywhere you can hit with HTTP.
What you get
- A scheduled cloud agent that runs hourly (or whatever cadence you pick)
- Per run: snapshot of current fingerprints, diff against last run, post only deltas
- Webhook payloads to Slack/Discord/PagerDuty/etc.
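The snapshot-and-diff step in the list above amounts to a set difference between two runs. A minimal sketch, assuming each run stores its fingerprints as plain strings; the `Snapshot` shape and the `newFingerprints` helper are hypothetical names for illustration, not part of Agentry:

```typescript
// Hypothetical snapshot shape; field names are assumptions, not Agentry's schema.
interface Snapshot {
  takenAt: string;        // ISO timestamp of the run
  fingerprints: string[]; // every fingerprint observed in that run
}

/** Returns fingerprints present in `current` but absent from `previous`. */
function newFingerprints(previous: Snapshot, current: Snapshot): string[] {
  const seen = new Set(previous.fingerprints);
  // Deduplicate the current run so a repeated fingerprint is reported once.
  return Array.from(new Set(current.fingerprints)).filter((fp) => !seen.has(fp));
}

const lastRun: Snapshot = {
  takenAt: "2026-05-16T14:00:00Z",
  fingerprints: ["fp_a", "fp_b"],
};
const thisRun: Snapshot = {
  takenAt: "2026-05-16T15:00:00Z",
  fingerprints: ["fp_b", "fp_c", "fp_c"],
};

// Only the delta gets posted; an empty array would mean a silent run.
const deltas = newFingerprints(lastRun, thisRun);
console.log(deltas);
```

Fingerprints that disappear between runs (`fp_a` here) are deliberately ignored, since the watcher only cares about failure modes that are new.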
Walk through it
Set up a Routine: every hour, check staging for new error fingerprints that weren’t there yesterday. If any appear, summarize each one and post to my Slack webhook at https://hooks.slack.com/services/T0000/B0000/XXXX. Don’t page me for fingerprints I’ve already triaged.
Got it. Let me register the webhook first so we have a stable target, then schedule the Routine.
agentry_register_webhook {
  project_id: "default",
  url: "https://hooks.slack.com/services/T0000/B0000/XXXX",
  description: "Staging new-fingerprint watcher",
  events: ["case.created"]
}
{
  "webhook_id": "wh_8f12",
  "signing_secret": "agws_…(shown once)…",
  "active": true
}
Webhook registered. Now scheduling the Routine.
/schedule "0 * * * *" "List staging cases that are new since the last hour (status=open, environment=staging, first_seen_at within last 1h). For each, fetch agentry_get_case for a one-paragraph summary, then POST a Slack-formatted message to webhook wh_8f12 with the fingerprint, summary, and a link to agentry_get_case."
{ "routine_id": "rt_7a3c", "next_run_at": "2026-05-16T15:00:00Z" }
Done. Routine rt_7a3c runs every hour at :00. Per run it:
- Pulls staging cases with first_seen_at > now - 1h
- For each one, fetches the case + drafts a one-line summary
- Posts a Slack message: “🐛 New staging error: TypeError: … in auth.ts:42. 3 events / 1 user. View: agentry.sh/case/c_xxx”
If no new fingerprints, no post (silent run).
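The filter the Routine applies on each run can be sketched as a plain predicate over case records. This is an illustration only: the `CaseRecord` shape is an assumption built from the fields named in the prompt (status, environment, first_seen_at), and `casesNewSince` is a hypothetical helper, not an Agentry API.

```typescript
// Hypothetical case record; only status, environment and first_seen_at
// are named in the Routine prompt. Everything else is assumed.
interface CaseRecord {
  id: string;
  status: string;
  environment: string;
  first_seen_at: string; // ISO 8601 timestamp
}

const ONE_HOUR_MS = 60 * 60 * 1000;

/** Keeps open staging cases first seen within `windowMs` before `now`. */
function casesNewSince(
  cases: CaseRecord[],
  now: Date,
  windowMs: number = ONE_HOUR_MS,
): CaseRecord[] {
  const cutoff = now.getTime() - windowMs;
  return cases.filter(
    (c) =>
      c.status === "open" &&
      c.environment === "staging" &&
      Date.parse(c.first_seen_at) > cutoff,
  );
}

const now = new Date("2026-05-16T15:00:00Z");
const fresh = casesNewSince(
  [
    { id: "c_1", status: "open", environment: "staging", first_seen_at: "2026-05-16T14:23:00Z" },
    { id: "c_2", status: "open", environment: "staging", first_seen_at: "2026-05-16T13:10:00Z" },
    { id: "c_3", status: "open", environment: "production", first_seen_at: "2026-05-16T14:40:00Z" },
  ],
  now,
);
console.log(fresh.map((c) => c.id)); // c_2 is too old, c_3 is not staging
```

Passing `now` in explicitly, rather than calling `Date.now()` inside the function, keeps the window logic easy to test against fixed timestamps.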
The output
A Slack message in #eng-oncall that looks like:
🐛 New staging error fingerprint
TypeError: Cannot read properties of undefined (reading 'plan')
src/middleware/auth.ts:42 · 3 events · 1 user
First seen: 14:23 (12 min ago) · After deploy a3f29b1
[Investigate ↗] [Suppress] [Mark spurious]
The buttons link to MCP tool URIs that, when clicked in a Claude Code session, prefill the next prompt.
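For reference, the text portion of a message like the one above can be produced with Slack's simplest incoming-webhook payload, a JSON object with a `text` field. The sketch below is an assumption about how such a message might be assembled, not what the Routine actually emits; `NewFingerprint`, `formatSlackMessage`, and `postToSlack` are hypothetical names, and the buttons are left out.

```typescript
// Hypothetical input shape for one new fingerprint.
interface NewFingerprint {
  title: string;    // error message
  location: string; // file:line
  events: number;
  users: number;
  caseUrl: string;
}

/** Builds a minimal Slack incoming-webhook payload ({ text }). */
function formatSlackMessage(fp: NewFingerprint): { text: string } {
  const plural = (n: number, word: string): string =>
    `${n} ${word}${n === 1 ? "" : "s"}`;
  return {
    text: [
      "🐛 New staging error fingerprint",
      fp.title,
      `${fp.location} · ${plural(fp.events, "event")} · ${plural(fp.users, "user")}`,
      `View: ${fp.caseUrl}`,
    ].join("\n"),
  };
}

/** POSTs the payload to a Slack incoming-webhook URL. Not called in this sketch. */
async function postToSlack(webhookUrl: string, payload: { text: string }): Promise<void> {
  const res = await fetch(webhookUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });
  if (!res.ok) throw new Error(`Slack webhook returned ${res.status}`);
}

const message = formatSlackMessage({
  title: "TypeError: Cannot read properties of undefined (reading 'plan')",
  location: "src/middleware/auth.ts:42",
  events: 3,
  users: 1,
  caseUrl: "agentry.sh/case/c_xxx", // placeholder from the walkthrough
});
console.log(message.text);
```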
Setting it up
Two pieces of plumbing, plus an optional third:
1. Tag errors with environment. Otherwise the watcher can’t distinguish staging from prod. Agentry has no SDK — just raw fetch to /v1/logs/:
await fetch(`https://api.agentry.sh/v1/logs/${PROJECT_ID}/`, {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.AGENTRY_DSN}`,
    "Content-Type": "application/json",
    "User-Agent": "myapp/1.0", // REQUIRED: Cloudflare 403s default UAs
  },
  body: JSON.stringify({
    message: err.message,
    stack: err.stack,
    environment: process.env.NODE_ENV, // "staging" | "production"
    user: { id: currentUser?.email },
  }),
});
2. Create the Slack incoming webhook in Slack → Apps → Incoming Webhooks. Copy the URL into the prompt above. No code on your side beyond Slack’s standard payload format.
3. (Optional) Verify the outgoing webhook's signature. Agentry signs all webhooks with HMAC-SHA256. To verify in your Slack-side proxy, use Web Crypto, which is available in Workers, Deno, Node 20+, and browsers (the process.env lookup below is Node-style; adapt it to your runtime):
async function verifyAgentryWebhook(req: Request): Promise<boolean> {
  // The header is treated as a hex-encoded SHA-256 MAC (64 hex characters).
  const sig = req.headers.get("x-agentry-signature") ?? "";
  if (!/^[0-9a-f]{64}$/i.test(sig)) return false;
  const sigBytes = Uint8Array.from(sig.match(/../g)!, (h) => parseInt(h, 16));
  // Reading the body consumes it; parse JSON from this string afterwards.
  const body = await req.text();
  const key = await crypto.subtle.importKey(
    "raw",
    new TextEncoder().encode(process.env.AGENTRY_WEBHOOK_SECRET!),
    { name: "HMAC", hash: "SHA-256" },
    false,
    ["verify"],
  );
  // verify() recomputes the MAC and compares it inside the crypto library,
  // which avoids the timing leak of comparing hex strings with ===.
  return crypto.subtle.verify("HMAC", key, sigBytes, new TextEncoder().encode(body));
}
// In your handler
if (!(await verifyAgentryWebhook(req))) {
  return new Response("bad sig", { status: 401 });
}
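To exercise the verifier locally you need a request that carries a valid signature. The sketch below produces one the same way the check above expects it: HMAC-SHA256 over the raw body, hex-encoded. `signBody` is a hypothetical test helper, and the `{ event: "case.created" }` body is an assumed payload shape, not Agentry's documented format.

```typescript
/** Hex-encoded HMAC-SHA256 of `body` under `secret`, computed with Web Crypto. */
async function signBody(secret: string, body: string): Promise<string> {
  const enc = new TextEncoder();
  const key = await crypto.subtle.importKey(
    "raw",
    enc.encode(secret),
    { name: "HMAC", hash: "SHA-256" },
    false,
    ["sign"],
  );
  const mac = await crypto.subtle.sign("HMAC", key, enc.encode(body));
  return Array.from(new Uint8Array(mac))
    .map((b) => b.toString(16).padStart(2, "0"))
    .join("");
}

// Build a signed test request. The payload shape here is an assumption.
const testBody = JSON.stringify({ event: "case.created" });
signBody("test-secret", testBody).then((sig) => {
  const req = new Request("https://example.com/hook", {
    method: "POST",
    headers: { "x-agentry-signature": sig },
    body: testBody,
  });
  // A SHA-256 MAC is 32 bytes, so the hex signature is 64 characters long.
  console.log(req.headers.get("x-agentry-signature")?.length);
});
```

Set `AGENTRY_WEBHOOK_SECRET` to the same test secret before passing the request to the verifier; changing a single character of the body or the signature should make it return false.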
Variations
- “Same setup but for production. Only alert if the new fingerprint hits 5+ users within the hour.”
- “Run every 5 min instead of hourly — I want faster signal during a launch.”
- “Watch for fingerprints that re-appear after being marked resolved (regressions).”
- “Post to different channels depending on severity: 1-user errors → #eng-noise, 5+ user errors → #eng-oncall, 50+ → @here in #incidents.”
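The severity-routing variation boils down to a small threshold table. A rough sketch of the rule the prompt describes, with one loudly flagged assumption: the prompt does not say where 2 to 4 user errors go, so they fall through to #eng-noise here. `routeBySeverity` and `Route` are hypothetical names.

```typescript
interface Route {
  channel: string;
  mention?: string; // Slack message syntax for a mention, if any
}

/** Maps the number of affected users to a destination channel. */
function routeBySeverity(affectedUsers: number): Route {
  // "<!here>" is how Slack message text encodes an @here mention.
  if (affectedUsers >= 50) return { channel: "#incidents", mention: "<!here>" };
  if (affectedUsers >= 5) return { channel: "#eng-oncall" };
  // Assumption: anything below 5 users (not only 1-user errors) is noise.
  return { channel: "#eng-noise" };
}

for (const users of [1, 5, 50]) {
  const route = routeBySeverity(users);
  console.log(`${users} user(s) -> ${route.channel} ${route.mention ?? ""}`.trim());
}
```

Because a Slack incoming webhook is usually tied to one channel, routing like this generally means registering one webhook per destination and picking the URL from the returned channel.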