Build a roll-your-own status.yourdomain.com page

Why this matters

Most status pages are theater. A vendor SaaS (Statuspage.io, BetterUptime, etc.) gives you a green-bar template that whoever is on call has to flip to yellow by hand when an incident hits. That almost never happens, because:

- Mid-incident is the worst time to also update a status page
- The status page becomes a lie (“all systems operational” while customers are getting 500s)
- Engineers stop trusting it, customers stop trusting it, and it ends up purely decorative

The honest version: a data-driven status page that reads directly from your real metrics. No human flips a switch — the page shows actual ingest rates, actual incident counts, actual MTTR. When it lies, you have a real bug to fix; when it shows yellow, customers can see why.

Agentry gives you live rows for each widget. Serve those rows from your own backend so the browser never receives AGENTRY_API_KEY.

What you get

- A public status page with three sections:
  - Now: ingest health (events received in last 5 min vs baseline)
  - Recent incidents: open cases in last 24h with ≥ 10 users affected
  - MTTR (7d): median time from case.created to case.resolved
- A server-side JSON/view pattern for status.yourdomain.com
- No vendor SaaS, no monthly fee, no green-by-default lie

Walk through it

You

Compose a public status dashboard: current ingest health (logs / analytics / deploys), recent incidents (cases in last 24h with users_affected ≥ 10), and MTTR for the last 7 days. Return rows I can serve from status.yourdomain.com.

Agent

Three server-side widget reads. Let me build each, then hand back the status-page data contract.

POST /v1/projects/:project_id/analytics/query {
  project_id: "default",
  query: "
    with recent as (
      select count() as n_5min
      from events
      where timestamp > now() - interval 5 minute
    ),
    baseline as (
      select count() / (7 * 24 * 12) as baseline_per_5min
      from events
      where timestamp > now() - interval 7 day
    )
    select
      r.n_5min,
      b.baseline_per_5min,
      r.n_5min / b.baseline_per_5min as ratio,
      case
        when r.n_5min / b.baseline_per_5min between 0.5 and 2.0 then 'operational'
        when r.n_5min / b.baseline_per_5min < 0.5 then 'degraded'
        else 'elevated_volume'
      end as status
    from recent r cross join baseline b
  "
}

{ "rows": [{ "n_5min": 812, "baseline_per_5min": 790, "ratio": 1.03, "status": "operational" }] }
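The threshold logic in the query's CASE expression can also be mirrored in the server route, which makes it easy to unit-test and to reuse for per-service or per-region rows. The sketch below is one possible version; the function name `classifyIngest` and the `"unknown"` fallback for a zero baseline are assumptions, not part of the Agentry response.

```javascript
// Mirrors the CASE expression in the query above: 0.5x to 2.0x of baseline
// counts as operational, below 0.5x as degraded, anything above as elevated.
function classifyIngest(recentCount, baselinePer5Min) {
  // Assumption: with no usable baseline (for example a brand-new project),
  // report "unknown" rather than dividing by zero. The SQL does not cover this.
  if (!(baselinePer5Min > 0)) {
    return { ratio: null, status: "unknown" };
  }
  const ratio = recentCount / baselinePer5Min;
  let status;
  if (ratio >= 0.5 && ratio <= 2.0) status = "operational";
  else if (ratio < 0.5) status = "degraded";
  else status = "elevated_volume";
  return { ratio, status };
}
```

With the sample row above, `classifyIngest(812, 790)` yields a ratio of about 1.03 and the status `operational`.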

GET /v1/projects/:project_id/cases?status=open&limit=10

{ "cases": [/* filter server-side for recent high-impact incidents */] }
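The server-side filter for recent high-impact incidents might look like the sketch below. The field names `created_at` (an ISO 8601 string) and `users_affected` (a number) are assumptions about the cases response, as is the helper name `recentHighImpact`; adjust them to the payload you actually receive.

```javascript
// Keeps cases created in the last 24h that affected at least 10 users.
// Hypothetical fields: created_at (ISO 8601 string), users_affected (number).
function recentHighImpact(cases, now = Date.now(), opts = {}) {
  const windowMs = opts.windowMs ?? 24 * 60 * 60 * 1000; // look-back window
  const minUsers = opts.minUsers ?? 10;                  // impact threshold
  return cases
    .filter((c) => {
      const createdMs = Date.parse(c.created_at);
      if (Number.isNaN(createdMs)) return false; // skip rows with bad timestamps
      const ageMs = now - createdMs;
      return ageMs >= 0 && ageMs <= windowMs && (c.users_affected ?? 0) >= minUsers;
    })
    // Newest first, so the most recent incident leads the list.
    .sort((a, b) => Date.parse(b.created_at) - Date.parse(a.created_at));
}
```

Passing `now` as a parameter keeps the function deterministic in tests.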

GET /v1/projects/:project_id/cases?status=resolved&limit=100

{ "cases": [/* compute MTTR in your server route from resolved case timestamps */] }
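One way to compute MTTR in the server route is sketched below. It assumes each resolved case carries `created_at` and `resolved_at` ISO 8601 timestamps (hypothetical field names) and uses the nearest-rank percentile method, so for an even number of cases the "median" is the lower of the two middle values.

```javascript
// Nearest-rank percentile over an ascending-sorted array of numbers.
function percentile(sorted, p) {
  if (sorted.length === 0) return null;
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Summarizes time-to-resolve (in minutes) for cases resolved in the last 7 days.
// Hypothetical fields: created_at and resolved_at (ISO 8601 strings).
function mttrSummary(cases, now = Date.now(), windowMs = 7 * 24 * 60 * 60 * 1000) {
  const minutes = cases
    .map((c) => ({ created: Date.parse(c.created_at), resolved: Date.parse(c.resolved_at) }))
    .filter((t) => !Number.isNaN(t.created) && !Number.isNaN(t.resolved))
    .filter((t) => t.resolved >= t.created && t.resolved <= now && now - t.resolved <= windowMs)
    .map((t) => (t.resolved - t.created) / 60000)
    .sort((a, b) => a - b);
  return {
    resolvedCount: minutes.length,
    medianMinutes: percentile(minutes, 50),
    p90Minutes: percentile(minutes, 90),
  };
}
```

When no cases were resolved in the window, both percentiles come back as `null`, which the page can render as "no data" rather than as zero minutes.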

Agent

Three data sources mapped. Here’s the server-rendered page shape you can drop on status.yourdomain.com:

<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>System status — yourdomain</title>
  <style>
    body { font-family: ui-sans-serif, system-ui, sans-serif; max-width: 720px; margin: 2rem auto; padding: 0 1rem; color: #1c1c1c; }
    section { border: 1px solid #e5e5e5; border-radius: 8px; padding: 1rem; margin-bottom: 1rem; }
    h1, h2 { letter-spacing: -0.01em; }
    .grid-3 { display: grid; gap: 0.75rem; grid-template-columns: repeat(3, 1fr); margin-bottom: 1.5rem; }
    .status-good { color: #15803d; } .status-bad { color: #b91c1c; }
  </style>
</head>
<body>
  <h1>System status</h1>

  <section id="ingest-health">
    <h2>Right now</h2>
    <!-- Render rows from your server-side /status-data route. -->
  </section>

  <section id="active-incidents">
    <h2>Active incidents (24h)</h2>
  </section>

  <section id="mttr">
    <h2>7-day MTTR</h2>
  </section>

  <p style="color: #6b7280; font-size: 0.85rem;">
    Driven by real production telemetry, refreshed every minute.
    No human flips this dashboard between green and red — if it shows yellow,
    something is actually misbehaving.
  </p>
</body>
</html>

One detail: cache your /status-data route for 60 seconds so a traffic spike to the status page does not re-run every query per visitor.
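A minimal in-memory version of that 60-second cache could look like the sketch below. The helper name `cached` and its injectable `clock` parameter are illustrative assumptions; the idea is simply that the loader runs at most once per TTL and that concurrent visitors share one in-flight load.

```javascript
// Wraps an async loader so it runs at most once per ttlMs. Concurrent callers
// share the in-flight promise instead of each re-running the queries.
function cached(loader, ttlMs = 60_000, clock = Date.now) {
  let value;
  let expiresAt = 0;
  let inFlight = null;
  return async function get() {
    if (clock() < expiresAt) return value; // still fresh
    if (!inFlight) {
      inFlight = (async () => {
        try {
          value = await loader();
          expiresAt = clock() + ttlMs;
          return value;
        } finally {
          inFlight = null; // allow a retry after success or failure
        }
      })();
    }
    return inFlight;
  };
}
```

This cache lives in one process, so on serverless platforms each instance keeps its own copy. Setting a `Cache-Control: public, max-age=60` header and letting a CDN cache the response is a common alternative.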

The output

Server-rendered status page components

  ingest-health       "Now" — events in last 5min vs 7d baseline
  active-incidents    "Incidents" — 24h cases with ≥10 users
  mttr                "MTTR" — median + p90 over 7d

Sample rendered output:

  System status — yourdomain
  ─────────────────────────────────────
  Right now
    Events/5min:    12,400 (1.04× baseline)    ✓ operational

  Active incidents (24h)
    [none]

  7-day MTTR
    median: 42 minutes
    p90:   2h 14m
    resolved this week: 14

  Driven by real telemetry, refreshed every minute.

Setting it up

1. Subdomain. Point status.yourdomain.com to an app route or serverless function that can read AGENTRY_API_KEY server-side.

2. Caching. Cache the server-side response for 60 seconds, so that a burst of visitors during an incident triggers at most one set of queries per minute.

3. Key-safe. Never put AGENTRY_API_KEY in the browser. Pull JSON server-side and render HTML:

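// Astro / Next / your framework (server-side only)
// projectId and ingestHealthQuery are defined elsewhere in your route.
const res = await fetch(
  `https://api.agentry.sh/v1/projects/${projectId}/analytics/query`,
  {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${process.env.AGENTRY_API_KEY}`,
      "Content-Type": "application/json",
      "User-Agent": "status-page/1.0",
    },
    body: JSON.stringify({ query: ingestHealthQuery }),
  }
);
// Fail loudly on a non-2xx response instead of parsing an error body as data.
if (!res.ok) {
  throw new Error(`Agentry query failed: HTTP ${res.status}`);
}
const { rows } = await res.json();
// An empty result means the state is unknown; do not default to green.
const status = rows?.[0]?.status ?? "unknown";  // 'operational' | 'degraded' | 'elevated_volume' | 'unknown'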

4. Branding. Render the rows into your own HTML and CSS.

5. Honest copy. Resist the urge to add manual “we’re investigating” banners. The whole point of this page is that it never lies, and a human-curated banner re-introduces the failure mode of vendor status pages.
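Step 4 can be sketched as a small render function that turns the fetched rows into the sections of the page. The input shape (`ingest` with `n_5min`, `ratio`, `status`, plus an `incidents` array with a `title` field) and the function names are assumptions for illustration; the section ids and CSS classes match the page shape shown earlier. Escaping matters here because case titles can contain user-controlled text.

```javascript
// Escape text before interpolating it into HTML.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => ({
    "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;",
  }[ch]));
}

// Hypothetical input shape:
//   { ingest: { n_5min, ratio, status }, incidents: [{ title }] }
function renderStatus(data) {
  const ok = data.ingest.status === "operational";
  const incidents = data.incidents.length === 0
    ? "<p>[none]</p>"
    : `<ul>${data.incidents.map((i) => `<li>${escapeHtml(i.title)}</li>`).join("")}</ul>`;
  const ratio = Number(data.ingest.ratio).toFixed(2);
  return [
    `<section id="ingest-health"><h2>Right now</h2>`,
    `<p class="${ok ? "status-good" : "status-bad"}">`,
    `${escapeHtml(data.ingest.n_5min)} events/5min (${ratio}× baseline): ${escapeHtml(data.ingest.status)}`,
    `</p></section>`,
    `<section id="active-incidents"><h2>Active incidents (24h)</h2>${incidents}</section>`,
  ].join("\n");
}
```

A framework template (Astro, JSX) usually handles escaping for you; the manual helper is only needed when building HTML strings by hand.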

Variations

“Add a 4th component: per-service breakdown. ingest/auth/checkout/api as separate rows.”
“Make the incident list show only the agent_summary (not the raw error message) — friendlier for customers.”
“Per-region status: filter ingest health by properties.region so EU/US/APAC are separately displayed.”
“Add a ‘historical uptime %’ computed as (1 - hours_with_active_incidents / 720) over the last 30 days.”
“Embed in a Notion page using their /embed block instead of a standalone subdomain.”
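For the historical-uptime variation, the formula (1 - hours_with_active_incidents / 720) needs a count of distinct hours that had at least one active incident. The sketch below shows one way to get it; the `created_at` and `resolved_at` field names and the rule that an unresolved incident lasts until now are assumptions.

```javascript
// Counts the distinct hour buckets within the last windowHours that overlap
// at least one incident, then applies 1 - badHours / windowHours.
// Hypothetical fields: created_at and resolved_at (ISO 8601 strings).
function uptimePercent(incidents, now = Date.now(), windowHours = 720) {
  const HOUR = 60 * 60 * 1000;
  const windowStart = now - windowHours * HOUR;
  const badHours = new Set();
  for (const inc of incidents) {
    const start = Math.max(Date.parse(inc.created_at), windowStart);
    // Assumption: an incident with no resolved_at is still open.
    const end = Math.min(inc.resolved_at ? Date.parse(inc.resolved_at) : now, now);
    if (Number.isNaN(start) || Number.isNaN(end) || end <= start) continue;
    const first = Math.floor((start - windowStart) / HOUR);
    const last = Math.min(Math.floor((end - 1 - windowStart) / HOUR), windowHours - 1);
    for (let h = first; h <= last; h++) badHours.add(h); // overlapping incidents count once
  }
  return (1 - badHours.size / windowHours) * 100;
}
```

Because the buckets are whole hours, a two-minute incident costs a full hour of uptime. That is a conservative choice; a finer bucket size would give a more forgiving number.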

