Rank errors by revenue impact, not raw count

Multiply users-affected by their plan ARR to surface the bugs costing real money. Stop triaging by event volume.

Difficulty: intermediate · Time to value: 2 minutes · Tools used: 3

Just say this

Show me open cases ranked by dollars-at-risk. For each, multiply users_affected by their plan ARR. Top 5.

Why this matters

Count-based triage is a trap. The loudest fingerprint in your error tracker is usually a cron loop hitting a stale row 5,000 times an hour — annoying, low-impact, fixable Tuesday.

Meanwhile, the bug that three Pro accounts hit during checkout (twice each, 6 events total) accounted for 80% of the revenue risk in that window. Sentry sorts it 80th on the list. You won’t see it until those three accounts churn.

The fix is to weight every error by the affected user’s plan value. Datadog won’t do this. Sentry won’t do this. Agentry keeps error data and analytics data in one place, so the agent can join them with HogQL and rank by dollars-at-risk instead of raw event count.
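The weighting itself is simple arithmetic. Here is a minimal sketch, assuming you already have each case's affected ids and a plan per id. The function name `rankByArrAtRisk` and the price table are illustrative, not part of any Agentry API; the prices are the ones assumed in the walkthrough below.

```javascript
// Hypothetical per-plan ARR table; substitute your own pricing.
const ARR_BY_PLAN = { free: 0, starter: 300, scale: 3000, enterprise: 30000 };

// cases:    [{ id, event_count, distinct_ids: [...] }]
// planById: { [distinct_id]: plan }
function rankByArrAtRisk(cases, planById, arrByPlan = ARR_BY_PLAN) {
  return cases
    .map((c) => {
      // Count each affected user once, however many events they produced.
      const uniqueIds = [...new Set(c.distinct_ids)];
      // Unknown ids or unknown plans contribute $0 rather than failing.
      const arrAtRisk = uniqueIds.reduce(
        (sum, id) => sum + (arrByPlan[planById[id]] ?? 0),
        0
      );
      return { id: c.id, event_count: c.event_count, arrAtRisk };
    })
    .sort((a, b) => b.arrAtRisk - a.arrAtRisk); // highest dollars-at-risk first
}

const ranked = rankByArrAtRisk(
  [
    { id: "noisy", event_count: 5000, distinct_ids: ["u1"] },
    { id: "checkout", event_count: 6, distinct_ids: ["u2", "u3", "u4"] },
  ],
  { u1: "free", u2: "enterprise", u3: "scale", u4: "enterprise" }
);
console.log(ranked.map((r) => `${r.id}: $${r.arrAtRisk}`));
```

In this toy input the 6-event case outranks the 5,000-event case because the sort key is dollars, not event volume.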

What you get

  • Top 5 open cases ranked by estimated ARR-at-risk (users-affected × per-user ARR)
  • For the #1 case: the specific accounts involved, with their plan tiers
  • A suggestion for who at your company should call those customers today

Walk through it

You

Show me open cases ranked by dollars-at-risk. For each, multiply users_affected by their plan ARR. Top 5.

Agent

Let me pull open cases with their affected user lists, then join against the latest plan value per user.

agentry_list_cases {
  project_id: "default",
  status: "open",
  since: "14d",
  include_affected_distinct_ids: true,
  limit: 25
}
{
  "cases": [
    { "id": "c_f72a", "message": "TypeError on user.plan", "users_affected": 14,
      "event_count": 89, "distinct_ids": ["[email protected]", "[email protected]", "..."] },
    { "id": "c_1cd9", "message": "fetch failed: invoice.pdf", "users_affected": 9,
      "event_count": 11, "distinct_ids": ["..."] },
    { "id": "c_2d8e", "message": "checkout 500: stripe webhook timeout", "users_affected": 3,
      "event_count": 6, "distinct_ids": ["[email protected]", "[email protected]", "..."] }
  ]
}

Agent

Now let me join the affected distinct_ids against current plan tier.

agentry_analytics_query {
  project_id: "default",
  hogql: "SELECT distinct_id, argMax(properties.plan, timestamp) AS plan FROM events WHERE distinct_id IN ({ids}) GROUP BY distinct_id",
  params: { ids: ["[email protected]", "[email protected]", "[email protected]", "[email protected]", "..."] }
}
{
  "rows": [
    { "distinct_id": "[email protected]", "plan": "free" },
    { "distinct_id": "[email protected]", "plan": "starter" },
    { "distinct_id": "[email protected]", "plan": "enterprise" },
    { "distinct_id": "[email protected]", "plan": "scale" }
  ]
}

Agent

Ranked by ARR-at-risk (assuming starter=$300, scale=$3,000, enterprise=$30,000):

  1. c_2d8e · checkout 500 on Stripe webhook timeout · $63,000 ARR at risk. Only 6 events, but it’s hitting 2 enterprise + 1 scale account. This is your #1.
  2. c_f72a · TypeError on user.plan · $4,200 ARR. 14 users, but 11 are on free tier (the bug literally fires on the free-tier path).
  3. c_1cd9 · invoice PDF fetch · $2,700 ARR. 9 users, mostly starter.

The numbers flip completely. c_2d8e is 15x more important than c_f72a by revenue, but raw count says the opposite. Want me to pull the case detail on c_2d8e and draft a Slack message to whoever owns Stripe?
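The hand-off between the two tool calls above amounts to flattening every case's distinct_ids into one de-duplicated list for the `ids` parameter. Here is a sketch of that step, assuming the response shape shown in the walkthrough. `collectDistinctIds` is a hypothetical helper, not an Agentry function.

```javascript
// Flatten and de-duplicate affected ids across cases so each user's plan
// is looked up once, even if they appear in several cases.
function collectDistinctIds(cases) {
  const ids = new Set();
  for (const c of cases) {
    // Tolerate a case that came back without an id list.
    for (const id of c.distinct_ids ?? []) {
      ids.add(id);
    }
  }
  return [...ids]; // insertion order is preserved
}

const idsParam = collectDistinctIds([
  { id: "case_a", distinct_ids: ["u1", "u2"] },
  { id: "case_b", distinct_ids: ["u2", "u3"] },
  { id: "case_c" },
]);
console.log(idsParam);
```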

The output

Top 5 cases by ARR-at-risk (14d, open)

$63,000   c_2d8e   checkout 500: stripe webhook timeout
                   3 users  · 6 events
                   Affected: [email protected] (enterprise $30k),
                             [email protected] (scale $3k),
                             [email protected] (enterprise $30k)
                   ↗ Owner: @stripe-oncall — investigate webhook latency

 $4,200   c_f72a   TypeError on user.plan (free-tier path)
                   14 users · 89 events · mostly free tier
                   ↗ Low ARR, but fixable in 4 lines

 $2,700   c_1cd9   invoice.pdf fetch failed
                   9 users · 11 events · mixed plans
                   ↗ Probably CORS regression on invoice-service

   $600   c_a3e1   phone validation NANP-only
                   7 users · 14 events · all starter
                   ↗ Low priority unless we want international growth

   $300   c_4f01   pagination edge case
                   2 users · 3 events · starter
                   ↗ Easy fix, ship next sprint

Total ARR at risk: $70,800

Setting it up

The recipe joins two things: cases (your error data) and events (your analytics data, where plan lives). For the join to work, both must use the same distinct_id.

// Error reporting — distinct_id (or user.id) MUST match analytics
await fetch(`https://api.agentry.sh/v1/logs/${PROJECT_ID}/`, {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.AGENTRY_DSN}`,
    "Content-Type": "application/json",
    "User-Agent": "myapp/1.0",  // REQUIRED — Cloudflare 403s default UAs
  },
  body: JSON.stringify({
    message: err.message,
    stack: err.stack,
    user: { id: currentUser?.email },  // ← same id as analytics
  }),
});

// Analytics — every event carries `plan` so we can JOIN later
await fetch(`https://api.agentry.sh/v1/analytics/${PROJECT_ID}/`, {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.AGENTRY_DSN}`,
    "Content-Type": "application/json",
    "User-Agent": "myapp/1.0",
  },
  body: JSON.stringify({
    event: "feature_used",
    distinct_id: currentUser.email,  // ← same id as errors
    properties: { plan: currentUser.plan, mrr: currentUser.mrr },
  }),
});
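Because the join depends on both payloads carrying an identical id, one option is to derive that id in a single place. Here is a sketch, assuming email is your identifier as in the snippets above. `distinctIdFor`, `errorPayload`, and `analyticsPayload` are hypothetical helpers. The trimming and lowercasing are an assumption that ids are matched exactly; nothing here confirms how Agentry compares them.

```javascript
// Single source of truth for the id shared by error reports and analytics.
// Assumption: ids are compared exactly, so we normalize whitespace and case.
function distinctIdFor(user) {
  if (!user || !user.email) return null; // anonymous user: nothing to join on
  return user.email.trim().toLowerCase();
}

// Body for the error-reporting POST (same fields as the snippet above).
function errorPayload(err, user) {
  return {
    message: err.message,
    stack: err.stack,
    user: { id: distinctIdFor(user) },
  };
}

// Body for the analytics POST; `plan` is always attached, and
// caller-supplied properties are merged after it.
function analyticsPayload(event, user, properties = {}) {
  return {
    event,
    distinct_id: distinctIdFor(user),
    properties: { plan: user?.plan, ...properties },
  };
}
```

Pass the returned objects to `JSON.stringify` as the `body` of the two `fetch` calls. If your existing analytics data already uses un-normalized ids, drop the normalization so that new events still match old ones.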

If you don’t track plan on every event, an alternative is to fire a one-shot plan_changed event whenever it changes — the HogQL argMax(properties.plan, timestamp) pattern picks up the latest value either way.
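Here is a sketch of that alternative: build a `plan_changed` event only when the plan actually differs. The payload shape mirrors the analytics call above. `planChangedEvent` and the `previous_plan` property are hypothetical names.

```javascript
// Returns an analytics payload for a plan change, or null when nothing changed.
function planChangedEvent(distinctId, previousPlan, nextPlan) {
  if (previousPlan === nextPlan) return null; // no change, no event
  return {
    event: "plan_changed",
    distinct_id: distinctId, // must match the id used for error reports
    properties: { plan: nextPlan, previous_plan: previousPlan },
  };
}

const upgrade = planChangedEvent("user-123", "starter", "scale");
console.log(upgrade);
```

Send a non-null result as the JSON body of the analytics POST. If your other events omit `plan`, you may also need to restrict the query to events where `plan` is set; whether that is needed depends on how missing properties are treated, which this recipe does not specify.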

Variations

  • “Same ranking, but only count enterprise customers (plan = ‘enterprise’).”
  • “Show errors ranked by ARR-at-risk over the last 24 hours, not 14 days. Daily oncall summary.”
  • “For the top case, draft a per-customer outreach email apologizing and offering credit.”
  • “Run this every weekday at 9am and post the top-3 to #cs-oncall with the affected emails.” (uses a Routine)

Try this recipe in your own agent.

Paste the prompt above into your agent. It'll set up Agentry against your data, then run the recipe and show you the real output.

Install https://agentry.sh/agentry.md and set it up