Why this matters
A customer who hits their API quota experiences it as their integration silently breaking: the dashboard goes blank, cron jobs fail, alerts misfire. They open Postman, see 429s, and file a support ticket. By then they have already spent a day doubting your product.
The proactive version: when usage crosses 80% of their cap, you ping their dedicated Slack channel before the cliff. Their account manager reaches out with “we noticed you’re at 82% — want to upgrade or want us to bump the soft cap?” That same friction becomes an expansion conversation.
The Routine pattern makes this trivial: a scheduled check every 15 min, no infrastructure beyond a webhook map.
What you get
- A Routine running every 15 minutes
- Per-customer Slack message when usage crosses 80%, 90%, 100% (one each)
- Idempotency — same customer doesn’t get re-pinged within the same threshold band
Walk through it
Set up a Routine that checks every 15 minutes: any customer at 80%+ of their monthly API quota? If yes, post to that customer’s dedicated Slack channel. Don’t double-page if I already pinged them in the last 24h.
Let me see what’s already registered, then add the per-customer webhooks.
agentry_list_webhooks { project_id: "default" }
{
  "webhooks": [
    { "id": "wh_general", "url": "https://hooks.slack.com/services/T0000/B_general/X", "events": ["case.*"] }
  ]
}
Only the general one. Let me register a webhook per top customer. (You can also just have one webhook and let the Routine include the customer-channel URL in its POST payload — depends on your Slack architecture.)
agentry_register_webhook {
  project_id: "default",
  url: "https://hooks.slack.com/services/T0000/B_acme/X",
  description: "acme_corp quota alerts",
  events: ["routine.quota_alert.acme_corp"]
}
{ "webhook_id": "wh_acme", "signing_secret": "agws_…", "active": true }
agentry_register_webhook {
  project_id: "default",
  url: "https://hooks.slack.com/services/T0000/B_wayne/X",
  description: "wayne_ent quota alerts",
  events: ["routine.quota_alert.wayne_ent"]
}
{ "webhook_id": "wh_wayne", "signing_secret": "agws_…", "active": true }
Now scheduling the Routine. The prompt below runs every 15 min: it computes per-customer monthly usage, compares it to the plan cap, fires the right webhook when a customer is over a threshold, and dedupes via agentry_recent_changes against the last 24h.
/schedule "*/15 * * * *" "
1. Run agentry_analytics_query:
     SELECT properties.customer_id, properties.plan, count() as calls_this_month
     FROM events WHERE event = 'api_request' AND timestamp > toStartOfMonth(now())
     GROUP BY properties.customer_id, properties.plan
2. For each row, compute % of monthly quota using this plan→cap map:
     { starter: 500_000, pro: 5_000_000, enterprise: 50_000_000 }
3. For any customer ≥ 80%: check agentry_recent_changes for the action
   'quota_alert.<customer_id>.<band>' in the last 24h. If not present:
     a) POST a Slack message to the customer's webhook with:
        headline, usage %, calls remaining, upgrade CTA
     b) Record the alert via agentry_remember so next run skips it
4. Bands: 80% = soft warning, 90% = strong, 100% = critical
"
{ "routine_id": "rt_quota", "next_run_at": "2026-05-16T13:45:00Z" }
Done. Routine rt_quota runs every 15 minutes. Per run:
- Pulls per-customer monthly usage
- Determines threshold band (80/90/100%)
- Skips customers already alerted in this band in the last 24h
- Posts to the customer’s dedicated Slack channel for new alerts
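The band and cooldown logic in those steps can be sketched in plain JavaScript. This is only an illustration of the decision the Routine makes, not how agentry implements it: bandFor, shouldAlert, and the in-memory alertLog map are hypothetical names, and in the real Routine the alert history lives behind the agentry tools named in the prompt.

```javascript
// Plan caps copied from the Routine prompt.
const PLAN_CAPS = { starter: 500_000, pro: 5_000_000, enterprise: 50_000_000 };

// Highest band first, so a customer at 95% lands in band 90, not 80.
const BANDS = [100, 90, 80];

// Returns the highest band the customer has reached, or null when under 80%.
function bandFor(callsThisMonth, plan) {
  const cap = PLAN_CAPS[plan];
  if (!cap) return null; // unknown plan: skip rather than guess a cap
  // Integer comparison avoids floating-point surprises right at a threshold.
  return BANDS.find((band) => callsThisMonth * 100 >= band * cap) ?? null;
}

// Hypothetical stand-in for the remembered alerts: a Map from
// "quota_alert.<customer_id>.<band>" to the time of the last alert (ms).
// Returns true and records the alert when the cooldown has elapsed.
function shouldAlert(alertLog, customerId, band, nowMs, cooldownMs = 24 * 60 * 60 * 1000) {
  const key = `quota_alert.${customerId}.${band}`;
  const last = alertLog.get(key);
  if (last !== undefined && nowMs - last < cooldownMs) return false;
  alertLog.set(key, nowMs);
  return true;
}
```

Because the key includes the band, a customer who moves from 80% to 90% inside the same 24h window still gets the stronger alert, while a second run at 82% stays quiet.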
First check fires at :45. Slack message format:
⚠ acme_corp at 82% of monthly quota
4.10M of 5.00M calls used · 9 days left in month
Projected end-of-month: 5.20M (104%)
Plan: pro · Cap: 5,000,000
[Upgrade to enterprise] [Bump soft cap] [Snooze 24h]
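For reference, here is a rough sketch of how the text part of that message could be assembled for a Slack incoming webhook, which accepts a JSON body with a text field. buildQuotaMessage and its input shape are assumptions for illustration. The projection line and the action buttons are left out, since buttons need a richer Slack message layout than plain text.

```javascript
// Format a count as millions with two decimals, e.g. 4100000 -> "4.10M".
function millions(n) {
  return `${(n / 1_000_000).toFixed(2)}M`;
}

// Builds the JSON body for a Slack incoming webhook (plain-text variant).
// The input field names are hypothetical, not an agentry schema.
function buildQuotaMessage({ customerId, plan, used, cap, daysLeft }) {
  // Integer math so 4.1M of 5M reports exactly 82.
  const pct = Math.floor((used * 100) / cap);
  const lines = [
    `⚠ ${customerId} at ${pct}% of monthly quota`,
    `${millions(used)} of ${millions(cap)} calls used · ${daysLeft} days left in month`,
    `Plan: ${plan} · Cap: ${cap.toLocaleString("en-US")}`,
  ];
  return { text: lines.join("\n") };
}
```

The returned object can be passed through JSON.stringify and POSTed to the customer's webhook URL.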
The output
quota_alert Routine (rt_quota)
✓ Schedule: every 15 minutes
✓ Webhooks registered:
    acme_corp → wh_acme (channel: #acct-acme)
    wayne_ent → wh_wayne (channel: #acct-wayne)
    stark_ind → wh_stark (channel: #acct-stark)
✓ Threshold bands:
    80% warning (soft Slack ping)
    90% strong (loud Slack ping + AM alert)
    100% critical (Slack + AM + page on-call)
✓ De-dupe: 24h cooldown per (customer × band)
✓ First run: 2026-05-16T13:45:00Z
Plan caps applied:
    starter 500,000 /mo
    pro 5,000,000 /mo
    enterprise 50,000,000 /mo
After 9am on day 22 of the month, a Slack message like this arrives:
⚠ acme_corp crossed 90% threshold
4.59M of 5.00M calls used · 8 days left in month
At current rate: 5.60M (112% — will hit cap May 25)
Plan: pro
AM contact: @sasha
[Upgrade to enterprise] [Soft-bump cap by 1M]
Setting it up
The Routine needs three ingredients:
1. Per-request api_request events with customer_id and plan (see api-usage-by-customer recipe for the snippet).
2. A webhook per customer (or one webhook + a dispatch-by-customer-channel proxy). Register them upfront:
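// Bulk-register from your customer table.
// PROJECT_ID and premiumCustomers come from your own config and customer store.
for (const customer of premiumCustomers) {
  const res = await fetch(`https://api.agentry.sh/v1/projects/${PROJECT_ID}/webhooks`, {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${process.env.AGENTRY_API_KEY}`, // agk_ — owner-side
      "Content-Type": "application/json",
      "User-Agent": "quota-setup/1.0", // REQUIRED — Cloudflare 403s default UAs
    },
    body: JSON.stringify({
      url: customer.slack_webhook_url,
      description: `${customer.id} quota alerts`,
      events: [`routine.quota_alert.${customer.id}`],
    }),
  });
  // Fail loudly so no customer is silently left without an alert webhook
  if (!res.ok) {
    throw new Error(`Webhook registration failed for ${customer.id}: HTTP ${res.status}`);
  }
}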
3. The Routine itself — /schedule with the prompt above. The 24h cooldown via agentry_remember/agentry_recall prevents flapping when usage hovers right around the threshold.
Tuning tips:
- Higher cooldown (48-72h) if your account managers complain about ping volume
- Lower cooldown (4h) at 90/100% bands since those are urgent
- Add a “projected end-of-month” calculation (calls used so far + current daily rate × days remaining); that projection, not the current %, is usually the upsell anchor
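A minimal sketch of two of these tips, assuming a simple linear projection: the calls already used plus the average daily rate so far applied to the days that remain. projectEndOfMonth and COOLDOWN_HOURS are hypothetical names, and the cooldown values are just the ones suggested in the tips.

```javascript
// Linear projection: calls used so far, plus the average daily rate
// so far applied to the days remaining in the month.
function projectEndOfMonth(used, dayOfMonth, daysInMonth) {
  const dailyRate = used / dayOfMonth;
  const daysRemaining = daysInMonth - dayOfMonth;
  return Math.round(used + dailyRate * daysRemaining);
}

// Per-band cooldown in hours: longer for the soft warning,
// shorter for the urgent bands. Values are illustrative.
const COOLDOWN_HOURS = { 80: 24, 90: 4, 100: 4 };
```

For example, 3,000,000 calls by day 15 of a 30-day month projects to 6,000,000. A linear rate is the simplest choice and will mislead for customers with strongly weekly or bursty traffic.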
Variations
- “Same Routine but per-endpoint quotas: api_request events with endpoint=critical-write cost 10x more against the quota.”
- “Add a ‘no usage today’ alert: if a normally-active customer goes silent, that’s a churn signal, not a quota signal.”
- “For free-tier customers crossing 100%, post to a sales channel with a ‘they’re now an upsell lead’ message.”
- “Cross-reference quota alerts with recent deploys — if a customer suddenly 10x’d their usage after our last release, they may have a runaway loop.”
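As an illustration of the per-endpoint variation, weighted usage could be computed along these lines. The weight table and the row shape (one row per endpoint for a single customer) are assumptions. The only detail taken from the prompt is the 10x cost for critical-write.

```javascript
// Hypothetical weight table: a critical-write call counts 10x, all others 1x.
const ENDPOINT_WEIGHTS = { "critical-write": 10 };

// rows: [{ endpoint, calls }] for one customer, e.g. the analytics query
// grouped by endpoint. Returns the quota-weighted call count.
function weightedUsage(rows, weights = ENDPOINT_WEIGHTS) {
  return rows.reduce(
    (sum, { endpoint, calls }) => sum + calls * (weights[endpoint] ?? 1),
    0
  );
}
```

The weighted total then takes the place of the raw call count when comparing against the plan cap.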