Why this matters
Crash-free user rate is the single number that mobile teams live and die by. A widely used target is above 99.5%. Drop under 99% and you start losing users to uninstalls, and store visibility can suffer too: Google Play, for one, reduces discoverability for apps that exceed its bad-behavior crash threshold.
But the rate that matters is per app version, not aggregate. Aggregate looks fine because 80% of your users are on the stable old version. The 20% who upgraded to v4.2 yesterday are hitting a crash on first launch — and your aggregate dashboard hides it because their misery is averaged out.
Agentry computes per-version crash-free rate as a single query. Pin it to a Routine that runs every hour during a rollout, and you’ll catch a bad version before it reaches the 50% rollout step.
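To see how averaging hides a bad release, here is a small arithmetic sketch using the per-version figures from the walkthrough further down. The function and variable names are illustrative only. Summing per-version DAU treats a user seen on two versions as two users, which is a simplification.

```python
# Per-version (dau, crashed_users) pairs copied from the walkthrough's query result.
versions = {
    "4.2.0": (18420, 442),
    "4.1.3": (62100, 92),
    "4.1.2": (41880, 61),
    "4.1.1": (12340, 18),
    "4.0.5": (3420, 5),
}


def crash_free(dau, crashed):
    """Crash-free user rate: share of active users who did not crash."""
    return 1 - crashed / dau


# Aggregate view: pool every version together (simplification: a user seen on
# two versions in the window is counted twice).
total_dau = sum(dau for dau, _ in versions.values())
total_crashed = sum(crashed for _, crashed in versions.values())
aggregate = crash_free(total_dau, total_crashed)

# Per-version view: the same formula, applied to each version separately.
per_version = {
    ver: crash_free(dau, crashed) for ver, (dau, crashed) in versions.items()
}

print(f"aggregate: {aggregate:.2%}")
for ver, rate in per_version.items():
    print(f"{ver}: {rate:.2%}")
```

With these numbers the aggregate comes out at 99.55%, just above a 99.5% target, while v4.2.0 on its own sits at 97.60%.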
What you get
- Crash-free user rate per app version, sorted by version (newest first)
- DAU per version (so you can ignore versions with no traffic)
- A clear delta — which versions are below your target (typically 99.5%)
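If you post-process the query result yourself, the three items above can be sketched as follows. This is a minimal illustration, not Agentry's implementation. It assumes rows shaped like the walkthrough's result (`ver`, `dau`, `crash_free_rate`) and purely numeric dotted versions. `version_key` and `summarize` are hypothetical names.

```python
def version_key(ver):
    """Turn '4.10.0' into (4, 10, 0) so versions sort numerically, not as text."""
    return tuple(int(part) for part in ver.split("."))


def summarize(rows, target=0.995, min_dau=50):
    """Drop low-traffic versions, sort newest first, and flag target misses."""
    kept = [r for r in rows if r["dau"] > min_dau]
    kept.sort(key=lambda r: version_key(r["ver"]), reverse=True)
    return [
        {
            **r,
            "below_target": r["crash_free_rate"] < target,
            # Gap to target in percentage points (negative means a miss).
            "gap_pp": round((r["crash_free_rate"] - target) * 100, 2),
        }
        for r in kept
    ]


# Rows copied from the walkthrough's query result.
rows = [
    {"ver": "4.1.3", "dau": 62100, "crash_free_rate": 0.9985},
    {"ver": "4.2.0", "dau": 18420, "crash_free_rate": 0.9760},
    {"ver": "4.0.5", "dau": 3420, "crash_free_rate": 0.9985},
]

for row in summarize(rows):
    status = "FAIL" if row["below_target"] else "PASS"
    print(row["ver"], row["dau"], f"{row['crash_free_rate']:.2%}", status)
```

Sorting on a parsed tuple matters once a minor version reaches two digits: as plain strings, "4.9.0" sorts after "4.10.0".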
Walk through it
What’s our crash-free rate by app version? Distinct users who hit any crash ÷ DAU. Flag any version that’s below 99% over the last 24h.
One HogQL pass: distinct users who had a crash, distinct users who used the app, grouped by version.
agentry_analytics_query {
project_id: "default",
query: "
with active as (
select properties.app_version as ver,
count(distinct distinct_id) as dau
from events
where event = 'app_open'
and timestamp > now() - interval 24 hour
group by ver
),
crashed as (
select properties.app_version as ver,
count(distinct distinct_id) as crashed_users
from events
where event = '$exception'
and timestamp > now() - interval 24 hour
group by ver
)
select
a.ver,
a.dau,
coalesce(c.crashed_users, 0) as crashed,
1 - (coalesce(c.crashed_users, 0) / a.dau) as crash_free_rate
from active a
left join crashed c on c.ver = a.ver
where a.dau > 50
order by ver desc
"
}
{
"rows": [
{ "ver": "4.2.0", "dau": 18420, "crashed": 442, "crash_free_rate": 0.9760 },
{ "ver": "4.1.3", "dau": 62100, "crashed": 92, "crash_free_rate": 0.9985 },
{ "ver": "4.1.2", "dau": 41880, "crashed": 61, "crash_free_rate": 0.9985 },
{ "ver": "4.1.1", "dau": 12340, "crashed": 18, "crash_free_rate": 0.9985 },
{ "ver": "4.0.5", "dau": 3420, "crashed": 5, "crash_free_rate": 0.9985 }
]
}
Found it. v4.2.0 is at 97.6% crash-free — that’s a regression from your normal 99.85%. 442 users crashed in 24h out of 18,420 active. Let me look at what’s crashing.
agentry_list_cases {
project_id: "default",
matching: "",
filter: { property: "app_version", value: "4.2.0" },
window_hours: 24
}
{
"cases": [
{
"case_id": "c_mob_1",
"fingerprint": "f_42_a",
"error_type": "NullPointerException",
"message": "Attempt to invoke virtual method 'getString' on null PushTokenManager",
"events_count": 380,
"users_affected": 380,
"introduced_in_deploy": "4.2.0"
},
{
"case_id": "c_mob_2",
"fingerprint": "f_42_b",
"error_type": "OutOfMemoryError",
"message": "Failed to allocate 64MB during ImageDecoder init",
"events_count": 62,
"users_affected": 62,
"introduced_in_deploy": "4.2.0"
}
]
}
Two issues, both introduced in 4.2.0:
- NullPointerException in PushTokenManager (380 users, accounts for 86% of the crash budget). Looks like push registration runs before Application.onCreate finishes. Easy fix.
- OutOfMemoryError in ImageDecoder (62 users). 64MB allocation: bigger images, or you forgot a downsample. Affects low-RAM Androids.
Recommendation: halt the v4.2.0 rollout now. It already has 18,420 DAU; at the same 2.4% crash rate, a full rollout to the roughly 138k daily users in this table would mean about 3,300 users crashing per day. Push 4.2.1 with these two fixes.
Want me to set up a Routine that auto-pages oncall if any new version drops below 99%?
The output
Crash-free user rate by app version (24h)
Version   DAU       Crashed users   Crash-free   Target (99.5%)
───────────────────────────────────────────────────────────────
4.2.0     18,420    442             97.60%       FAIL ★★★
4.1.3     62,100    92              99.85%       PASS
4.1.2     41,880    61              99.85%       PASS
4.1.1     12,340    18              99.85%       PASS
4.0.5     3,420     5               99.85%       PASS
★ v4.2.0 regression: -2.25 pp vs baseline
Crash budget for v4.2.0 (24h):
Total crashes: 442
Top contributor (86%): NullPointerException in PushTokenManager
Second (14%): OutOfMemoryError in ImageDecoder
Recommended actions (urgent):
1. HALT v4.2.0 rollout in App Store Connect / Play Console
2. Patch the two fingerprints
3. Ship 4.2.1 once verified in staging
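The crash-budget split in the report is each case's share of affected users. Here is a minimal sketch of that arithmetic, assuming case records shaped like the `agentry_list_cases` result in the walkthrough. `crash_shares` is a hypothetical helper name.

```python
# Trimmed-down case records from the walkthrough (only the fields used here).
cases = [
    {"error_type": "OutOfMemoryError", "users_affected": 62},
    {"error_type": "NullPointerException", "users_affected": 380},
]


def crash_shares(cases):
    """Rank cases by affected users and return (error_type, share_of_total)."""
    total = sum(c["users_affected"] for c in cases)
    ranked = sorted(cases, key=lambda c: c["users_affected"], reverse=True)
    return [(c["error_type"], c["users_affected"] / total) for c in ranked]


for error_type, share in crash_shares(cases):
    print(f"{error_type}: {share:.0%}")
```

This prints 86% for the NullPointerException and 14% for the OutOfMemoryError, matching the report. If one user can hit several cases, the shares are of case-level user counts, not of distinct crashed users.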
Setting it up
Mobile crashes flow to /v1/logs/. The two properties you need are app_version (so the query can group) and user.id (so distinct-user counts work):
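// Android global crash handler (needs: import org.json.JSONObject)
// Capture the previously installed handler first so the crash still reaches the system.
val defaultHandler = Thread.getDefaultUncaughtExceptionHandler()
Thread.setDefaultUncaughtExceptionHandler { thread, err ->
    val payload = JSONObject().apply {
        put("message", err.message ?: "") // message can be null
        put("stack", err.stackTraceToString())
        put("error_type", err.javaClass.simpleName)
        put("environment", BuildConfig.BUILD_TYPE)
        put("app_version", BuildConfig.VERSION_NAME)
        put("user", JSONObject().put("id", currentUserId()))
    }
    // Fire-and-forget POST to /v1/logs/ (see agentry.md for the helper)
    postToAgentry("/v1/logs/${PROJECT_ID}/", payload)
    // Hand off to the original handler so the process terminates as usual.
    defaultHandler?.uncaughtException(thread, err)
}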
// iOS: pair with NSSetUncaughtExceptionHandler + a signal handler
// (needs: import Foundation; projectId, dsn, and currentUserId() are defined elsewhere in your app)
func reportCrash(_ exception: NSException) {
    // CFBundleShortVersionString is the user-facing release version from Info.plist.
    let appVersion = Bundle.main.infoDictionary?["CFBundleShortVersionString"] as? String ?? "unknown"
    var req = URLRequest(url: URL(string: "https://api.agentry.sh/v1/logs/\(projectId)/")!)
    req.httpMethod = "POST"
    req.setValue("Bearer \(dsn)", forHTTPHeaderField: "Authorization")
    req.setValue("application/json", forHTTPHeaderField: "Content-Type")
    req.setValue("myapp-ios/\(appVersion)", forHTTPHeaderField: "User-Agent") // REQUIRED
    let body: [String: Any] = [
        "message": exception.reason ?? "",
        "stack": exception.callStackSymbols.joined(separator: "\n"),
        "error_type": exception.name.rawValue,
        "app_version": appVersion,
        "user": ["id": currentUserId()],
    ]
    req.httpBody = try? JSONSerialization.data(withJSONObject: body)
    // The request is asynchronous, so the process may exit before it completes.
    // Persisting the payload and sending it on next launch is more reliable.
    URLSession.shared.dataTask(with: req).resume()
}
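Fire app_open on cold start with the same user ID so DAU computes correctly. If you send session_start instead, change the event name in the query’s active CTE to match:
// Application.onCreate
// postAnalytics is your own helper (not shown). It must attach the same user ID
// that the crash payload sends as user.id, or the two counts will not line up.
postAnalytics("app_open", mapOf(
    "app_version" to BuildConfig.VERSION_NAME,
    "platform" to "android",
    "os_version" to Build.VERSION.RELEASE
))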
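Before wiring up the mobile handlers, it can help to send one hand-built event from a workstation. The sketch below only builds the request, using the URL, headers, and body fields shown in the iOS snippet above. The project ID, DSN, user ID, and `build_crash_request` name are placeholders, and the response format is not covered here.

```python
import json
import urllib.request


def build_crash_request(project_id, dsn, app_version, user_id,
                        message, stack, error_type):
    """Build (but do not send) a POST to /v1/logs/ with the documented fields."""
    body = {
        "message": message,
        "stack": stack,
        "error_type": error_type,
        "app_version": app_version,  # lets the query group by version
        "user": {"id": user_id},     # lets distinct-user counts work
    }
    return urllib.request.Request(
        url=f"https://api.agentry.sh/v1/logs/{project_id}/",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {dsn}",
            "Content-Type": "application/json",
            # The iOS snippet marks User-Agent as required.
            "User-Agent": f"myapp-smoketest/{app_version}",
        },
    )


# Placeholder values: substitute your own project ID and DSN.
req = build_crash_request(
    project_id="my-project",
    dsn="dsn-placeholder",
    app_version="4.2.1",
    user_id="test-user-1",
    message="smoke test",
    stack="",
    error_type="SmokeTest",
)
print(req.get_method(), req.full_url)
```

To actually send it, pass `req` to `urllib.request.urlopen(req, timeout=10)` and check that the event shows up under the expected app_version.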
Variations
- “Same metric but per OS version — is the crash hitting only Android 14, or all versions?”
- “Crash-free SESSIONS instead of users, to compare against tools that report a per-session rate.”
- “Set up a Routine: every hour, recompute, page #mobile-oncall if any active-rollout version drops below 99%.”
- “Compare crash-free rates for users who came from organic vs paid installs — sometimes ad networks send bot traffic that crashes immediately.”
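The crash-free sessions variation can give a noticeably different number from the user-based rate. Here is a toy sketch of the difference, using made-up session records (the tuple layout and function names are hypothetical, not an Agentry schema):

```python
# Hypothetical session records: (user_id, session_id, crashed)
sessions = [
    ("u1", "s1", False),
    ("u1", "s2", True),   # u1 crashes once across three sessions
    ("u1", "s3", False),
    ("u2", "s4", False),
    ("u3", "s5", False),
    ("u3", "s6", False),
]


def crash_free_users(sessions):
    """Share of distinct users with no crashed session."""
    users = {user for user, _, _ in sessions}
    crashed = {user for user, _, did_crash in sessions if did_crash}
    return 1 - len(crashed) / len(users)


def crash_free_sessions(sessions):
    """Share of sessions that did not crash."""
    crashed = sum(1 for _, _, did_crash in sessions if did_crash)
    return 1 - crashed / len(sessions)


print(f"crash-free users:    {crash_free_users(sessions):.1%}")
print(f"crash-free sessions: {crash_free_sessions(sessions):.1%}")
```

One crash marks the whole user as affected but only one of their sessions, so heavy users pull the session-based rate up relative to the user-based one.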