Auto-document your captured event taxonomy

Every event captured in the last 30 days, with count, first-seen, last-seen, and sample property keys, rendered as a Markdown table for Notion: an event taxonomy doc that never goes stale.

Published

difficulty beginner · time to value 1 minute

Start from this

List every event we've captured in the last 30 days with its count, first-seen, last-seen, and sample property keys. Format as a Markdown table for review or publication.

Why this matters

Event taxonomies always drift. The team agrees on signup_completed in week 1. Someone on the mobile team ships signup_complete (no d) in month 3. The web team renames to user_signed_up in month 6. By month 12 you have three names for the same conceptual event, four schemas of properties, and the only person who knows which one to query has left.

The fix: a living event-taxonomy document. The trouble: nobody writes one because it goes stale by next quarter. The fix to the fix: auto-generate it from the actual event stream.

Agentry exposes every captured event plus count, first-seen, last-seen, and sample properties. A starter request produces a clean Markdown table ready to save, publish, or adapt for Notion/Linear. Re-run it weekly and the doc stays current.

What you get

  • A Markdown table of every event name with count, first-seen, last-seen, and sample properties
  • Sorted by volume (so you see the heavy hitters first)
  • A “deprecation candidates” section flagging events with zero activity in the last 7 days
  • A quick way to spot typos (signup_completed vs signup_complete)
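
The deprecation and typo flags in the list above can be approximated with two small checks. This is a minimal sketch, not Agentry's actual logic: the helper names (`editDistance`, `deprecationCandidates`, `typoCandidates`) and the thresholds are hypothetical, and the field names (`name`, `count`, `last_seen`) are assumed to match the sample inventory response shown in the walkthrough.

```javascript
// Hypothetical helpers: names and thresholds are illustrative assumptions.

// Levenshtein edit distance, computed with a single rolling row.
function editDistance(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diag = row[0]; // distance for (i-1, j-1)
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // distance for (i-1, j)
      row[j] = Math.min(
        above + 1,                              // delete from a
        row[j - 1] + 1,                         // insert into a
        diag + (a[i - 1] === b[j - 1] ? 0 : 1)  // substitute
      );
      diag = above;
    }
  }
  return row[b.length];
}

// Names of events whose last_seen is more than `days` days before `today`.
function deprecationCandidates(events, today, days = 7) {
  const cutoff = new Date(today).getTime() - days * 24 * 60 * 60 * 1000;
  return events
    .filter((e) => new Date(e.last_seen).getTime() < cutoff)
    .map((e) => e.name);
}

// Pairs of event names within `maxDistance` edits of each other.
// The lower-volume name is reported as the suspected typo.
function typoCandidates(events, maxDistance = 2) {
  const pairs = [];
  for (let i = 0; i < events.length; i++) {
    for (let j = i + 1; j < events.length; j++) {
      const a = events[i];
      const b = events[j];
      if (editDistance(a.name, b.name) <= maxDistance) {
        const [suspect, canonical] = a.count <= b.count ? [a, b] : [b, a];
        pairs.push({ suspect: suspect.name, canonical: canonical.name });
      }
    }
  }
  return pairs;
}
```

Edit distance only catches look-alike names such as signup_complete vs signup_completed. A rename like user_signed_up would still need a human (or the agent) to recognize it as the same conceptual event.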

Walk through it

You

List every event we’ve captured in the last 30 days with its count, first-seen, last-seen, and sample property keys. Format as a Markdown table I can paste into Notion.

Agent

Two passes — GET /v1/projects/:project_id/event-names for the inventory, then per-event property keys via HogQL.

GET /v1/projects/:project_id/event-names { project_id: "default" }
{
  "server_emitted": [
    { "name": "case.created" }, { "name": "case.resolved" }, { "name": "deploy.recorded" }
  ],
  "analytics_events": [
    { "name": "page_view",          "count": 1240000, "first_seen": "2024-08-12", "last_seen": "2026-05-16" },
    { "name": "signup_completed",   "count":    8400, "first_seen": "2024-08-15", "last_seen": "2026-05-16" },
    { "name": "checkout_started",   "count":   42000, "first_seen": "2024-08-20", "last_seen": "2026-05-16" },
    { "name": "order_completed",    "count":   18400, "first_seen": "2024-08-20", "last_seen": "2026-05-16" },
    { "name": "product_viewed",     "count":  410000, "first_seen": "2024-08-20", "last_seen": "2026-05-16" },
    { "name": "signup_complete",    "count":      62, "first_seen": "2025-12-01", "last_seen": "2026-05-15" },
    { "name": "ab_assigned",        "count":  118000, "first_seen": "2025-04-12", "last_seen": "2026-05-16" },
    { "name": "video_uploaded_v2",  "count":     180, "first_seen": "2024-11-01", "last_seen": "2025-09-12" }
  ]
}

Agent

Note: I spotted signup_complete (no d) with only 62 events — looks like a typo’d version of signup_completed (8,400). Also video_uploaded_v2 hasn’t fired since September — deprecation candidate.

Let me fetch sample property keys per event.

POST /v1/projects/:project_id/analytics/query {
  project_id: "default",
  query: "
    select
      event,
      arrayDistinct(arrayFlatten(groupArray(JSONExtractKeys(properties)))) as sample_keys
    from events
    where timestamp > now() - interval 30 day
    group by event
  "
}
{
  "rows": [
    { "event": "page_view",        "sample_keys": ["path", "referrer", "user_agent"] },
    { "event": "signup_completed", "sample_keys": ["source", "plan", "country"] },
    { "event": "checkout_started", "sample_keys": ["cart_value", "item_count", "shipping_method"] },
    { "event": "order_completed",  "sample_keys": ["order_id", "total", "items", "coupon_code"] },
    { "event": "product_viewed",   "sample_keys": ["sku", "category", "price"] },
    { "event": "signup_complete",  "sample_keys": ["source", "plan"] },
    { "event": "ab_assigned",      "sample_keys": ["flag_key", "variant"] }
  ]
}

Agent

Here’s the Markdown — ready to save or publish in your wiki of choice.

# Event taxonomy (auto-generated 2026-05-16)

| Event | 30d count | First seen | Last seen | Sample properties |
|---|---:|---|---|---|
| `page_view` | 1,240,000 | 2024-08-12 | 2026-05-16 | path, referrer, user_agent |
| `product_viewed` | 410,000 | 2024-08-20 | 2026-05-16 | sku, category, price |
| `ab_assigned` | 118,000 | 2025-04-12 | 2026-05-16 | flag_key, variant |
| `checkout_started` | 42,000 | 2024-08-20 | 2026-05-16 | cart_value, item_count, shipping_method |
| `order_completed` | 18,400 | 2024-08-20 | 2026-05-16 | order_id, total, items, coupon_code |
| `signup_completed` | 8,400 | 2024-08-15 | 2026-05-16 | source, plan, country |
| `signup_complete` ⚠ | 62 | 2025-12-01 | 2026-05-15 | source, plan |

## Deprecation candidates (no activity in the last 7 days)

- `video_uploaded_v2` — last seen 2025-09-12 (180 lifetime events)

## Typo / duplicate candidates

- `signup_complete` (62 events) vs `signup_completed` (8,400 events)
  → recommended fix: migrate the `signup_complete` callsite to `signup_completed` so every caller uses one name

Want me to set up a Routine that regenerates this every Monday and posts it to a Notion page?
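
The merge step in the walkthrough, joining the inventory response with the per-event property keys and emitting Markdown, can be sketched roughly as follows. The function names `formatCount` and `renderTaxonomyTable` are hypothetical, and the input shapes are assumed to match the two sample responses above. This illustrates the idea; it is not the agent's actual implementation.

```javascript
// Hypothetical helpers: a sketch of the merge-and-render step.

// Adds thousands separators, e.g. 1240000 becomes "1,240,000".
function formatCount(n) {
  return String(n).replace(/\B(?=(\d{3})+(?!\d))/g, ",");
}

// inventory: [{ name, count, first_seen, last_seen }]
// keyRows:   [{ event, sample_keys: [...] }]
function renderTaxonomyTable(inventory, keyRows) {
  // Index sample keys by event name so each row is a simple lookup.
  const keysByEvent = new Map(keyRows.map((r) => [r.event, r.sample_keys]));

  const rows = [...inventory]
    .sort((a, b) => b.count - a.count) // highest-volume events first
    .map((e) => {
      const keys = (keysByEvent.get(e.name) || []).join(", ");
      return `| \`${e.name}\` | ${formatCount(e.count)} | ${e.first_seen} | ${e.last_seen} | ${keys} |`;
    });

  return [
    "| Event | 30d count | First seen | Last seen | Sample properties |",
    "|---|---:|---|---|---|",
    ...rows,
  ].join("\n");
}
```

Events with no row in the property-keys result (for example, ones that did not fire inside the query window) simply get an empty "Sample properties" cell.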

The output

Event taxonomy auto-doc — 2026-05-16

Top events (by 30d volume):
  page_view              1,240,000     props: path, referrer, user_agent
  product_viewed           410,000     props: sku, category, price
  ab_assigned              118,000     props: flag_key, variant
  checkout_started          42,000     props: cart_value, item_count, shipping_method
  order_completed           18,400     props: order_id, total, items, coupon_code
  signup_completed           8,400     props: source, plan, country
  signup_complete ⚠             62     props: source, plan         ← TYPO

Deprecation candidates (no activity in 7d):
  video_uploaded_v2   last: 2025-09-12   (180 lifetime)

Typo / duplicate candidates:
  signup_complete (62)    vs signup_completed (8,400)
    → kill the misnamed callsite

Output saved as Markdown — publish or save to:
  Notion: agentry/events
  Linear doc: agentry-event-taxonomy
  Slack: #data

Setting it up

No setup at all — this is a pure read against the events your app is already sending. If you’re getting “no events found,” it just means analytics isn’t flowing yet. Send a few test events to confirm:

await fetch(`https://api.agentry.sh/v1/analytics/`, {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.AGENTRY_DSN}`,
    "Content-Type": "application/json",
    "User-Agent": "myapp/1.0",  // REQUIRED — Cloudflare 403s default UAs
  },
  body: JSON.stringify({
    event: "signup_completed",
    distinct_id: "user@example.com",  // example identifier
    properties: { source: "homepage", plan: "free" },
  }),
});

Once events flow, re-run the prompt. If you want this to be a permanent artifact, there are two options:

Option A: Weekly Routine that posts to Slack. This follows the same pattern as weekly-agent-digest and regenerates the table every Monday.

Option B: Embed as a public dashboard. Publish via POST /v1/projects/:project_id/public-queries and iframe-embed it in your Notion/wiki, so it stays current without re-pasting.

Variations

  • “Same table but include sample event values (top 5 values per property), not just keys. Helps spot bad data.”
  • “Just the typo / duplicate candidates — I want a clean cleanup list.”
  • “For each event, show me how many unique distinct_ids fire it — events with low cardinality are often background-job spam.”
  • “Weekly Routine: regenerate, diff vs last week, post deltas to #data (new events created, events that died, count changes).”
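
The week-over-week diff in the last variation comes down to comparing two inventory snapshots. A minimal sketch, assuming each snapshot is an array of `{ name, count }` objects: `diffSnapshots` is a hypothetical name, and "died" is interpreted here as "present last week, absent this week", which is only one possible definition.

```javascript
// Hypothetical helper: compares last week's snapshot with this week's.
function diffSnapshots(previous, current) {
  const prev = new Map(previous.map((e) => [e.name, e.count]));
  const curr = new Map(current.map((e) => [e.name, e.count]));

  // Events that appear only in the current snapshot.
  const created = [...curr.keys()].filter((name) => !prev.has(name));

  // Events that appeared last week but are missing now.
  const died = [...prev.keys()].filter((name) => !curr.has(name));

  // Events present in both snapshots whose counts differ.
  const changed = [...curr.entries()]
    .filter(([name, count]) => prev.has(name) && prev.get(name) !== count)
    .map(([name, count]) => ({ name, delta: count - prev.get(name) }));

  return { created, died, changed };
}
```

A Routine could store each week's snapshot and post only the non-empty parts of this result, which keeps the #data message short on quiet weeks.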

Adapt this playbook in your own agent.

Ask your agent to map the starter prompt to your saved signal map, live events, and the current HTTP API before answering.

Install agentry.sh/install.md for me
Agent will onboard itself and then your app