engineering

The funnel that reported 461% conversion

June 10, 2026

The funnel that reported 461% conversion

One of our analytics dashboards started showing a conversion funnel where the second step converted at 461.7%, with a drop rate of -361.7%. More people "opened the app" than had ever installed it, and the funnel got wider as it went down. A funnel is supposed to be a triangle. Ours was an hourglass with the bell on top.

The problem

The funnel endpoint takes an ordered list of event names — app_install, app_open, core_action, paywall_view, subscribe_start — and is supposed to report, for each step, how many users reached it and what fraction of the starting cohort that represents. The contract everyone assumes: the first step is 100%, every later step is some number at or below it, and the drop rate between steps is non-negative.

What we actually returned for a normal week looked like this:

app_install      1,000   100.0%   drop   0.0%
app_open         4,617   461.7%   drop -361.7%
core_action      2,140   214.0%   drop  53.6%
paywall_view       180    18.0%   drop  91.6%
subscribe_start     74     7.4%   drop  58.9%

app_open at 461.7% is not a rounding glitch. It is the funnel telling you that for every install, the average user opened the app between four and five times — which is true, and exactly the problem.

Why it happened

The original query counted event rows, not people:

const eventCounts = await prisma.analyticsEvent.groupBy({
  by: ['eventName'],
  where: whereClause,
  _count: { _all: true },
});

app_install fires roughly once per device. app_open fires every single time someone launches the app — many times per device per week. So the raw row count for app_open happily sailed past the install count, and dividing it by installs gave a rate over 100%. The negative drop rate was the same bug wearing a different hat: step two had more rows than step one, so "previous minus current" went negative.

There was a second, quieter bug hiding underneath. Each step's count was computed independently from its own bucket. Even if you fixed the unique-vs-raw issue, nothing guaranteed that the people who hit core_action were a subset of the people who hit app_open. A funnel that doesn't intersect cohorts step by step isn't a funnel — it's five unrelated counts stacked in a column and labeled with arrows.

What we tried first

The tempting one-line fix is to clamp: rate = Math.min(rate, 1). It makes the dashboard stop embarrassing you and changes nothing real. The 461.7% becomes 100%, the -361.7% drop becomes 0%, and the underlying number — "we are counting launches, not users" — is still wrong, just no longer visible. We didn't ship that. A funnel you can't trust below the fold is worse than one that's obviously broken above it.

The fix

Count distinct devices per step, then make each step a subset of the one before it. First, ask the database for unique (event, device) pairs instead of row totals:

// A funnel must count UNIQUE devices per step, not raw event rows.
const stepDevicePairs = await prisma.analyticsEvent.groupBy({
  by: ['eventName', 'deviceId'],
  where: whereClause,
});

Then do the cohort math in a pure function — no database, no request object, trivially testable:

let cohort = new Set<string>();
const funnel = funnelSteps.map((step, index) => {
  const actors = devicesByStep.get(step) ?? new Set<string>();
  if (index === 0) {
    cohort = actors;                 // first step defines 100%
  } else {
    const next = new Set<string>();
    // intersect the running cohort with this step's devices
    const [small, large] = actors.size <= cohort.size ? [actors, cohort] : [cohort, actors];
    for (const id of small) if (large.has(id)) next.add(id);
    cohort = next;                   // only devices that did THIS step and all prior steps
  }
  // ...rate and drop_rate computed off the first-step size
});

Two properties fall out of this for free. The cohort only ever shrinks, so the funnel is monotonically decreasing by construction. And because every later count is a subset of the first, rate is always in [0, 1] and drop_rate is never negative — no clamping required. (Small detail that matters at scale: we iterate the smaller of the two sets when intersecting, so each step is O(min(a, b)) rather than O(a).)

The endpoint's response shape didn't change — same fields, same nesting — so the admin frontend needed zero edits. The only thing that changed was that the numbers became possible.

Before and after

                 BEFORE (event rows)        AFTER (unique devices)
app_open         461.7%  drop -361.7%       63.0%   drop 37.0%
core_action      214.0%  drop   53.6%       41.0%   drop 34.9%
overall_conv      7.4% (off a wrong base)    7.4% (off a real base)

The overall conversion number barely moved — the last step was already low-frequency, roughly one row per device — which is exactly why nobody caught this for so long. The bug lived entirely in the high-frequency middle of the funnel, where the steps people actually look at to decide what to fix.

We also added a unit-test file that encodes the symptom directly, so it can't come back:

it('is monotonically decreasing and never exceeds 100%', () => {
  // 3 unique devices, 50 app_open rows — the production symptom
  const opens = Array.from({ length: 50 }, (_, i) => `d${i % 3}`);
  // ...expect every count <= the previous, every rate in [0,1]
});

What we learned

"Count" almost always means "count people," and the database almost always gives you rows. The gap between those two is where most analytics bugs live. Name the unit in the query, not just in the dashboard label.
A funnel is a sequence of nested cohorts, not a column of independent totals. If your steps aren't intersected, monotonicity is a coincidence, not a guarantee — and coincidences break on real data.
Pull the math out of the request handler. Moving cohort logic into a pure function turned an "I'll eyeball the dashboard" situation into a test that fails loudly. The bug was easy to describe; it just needed somewhere to be asserted.

What's next

This still counts a device, not a person — someone on a phone and a tablet is two devices in the funnel. Cross-device identity is a bigger project with its own privacy constraints, and we'd rather under-merge than guess. The current fix makes the funnel honest within a device; making it honest across devices is the next layer, and we're not rushing it.

Try Eodin

Eodin builds small products that solve specific problems. See what we're working on.

Website

Why did the funnel report a conversion rate over 100%?

The funnel counted raw event rows instead of unique users or devices. For example, 'app_open' events happen multiple times per device, so counting all rows inflated the numbers, resulting in conversion rates exceeding 100%.

How was the funnel calculation fixed to show accurate conversion rates?

The fix involved counting distinct devices per step rather than raw event rows and ensuring each step's cohort is a subset of the previous one. This approach guarantees the funnel is monotonically decreasing and conversion rates stay between 0% and 100% without artificial clamping.

What does it mean that a funnel should be a sequence of nested cohorts?

A funnel's steps should represent subsets of users who completed all prior steps, ensuring each step's cohort is contained within the previous one. This nesting guarantees the funnel narrows logically and conversion rates decrease or stay the same at each step.

Why is counting devices instead of people still a limitation in the funnel analysis?

Counting devices treats each device as a separate user, so one person using multiple devices appears multiple times in the funnel. Resolving cross-device identity is complex and involves privacy concerns, so the current fix focuses on accuracy within devices only.

Why is it important to separate funnel math from the request handler?

Separating funnel logic into a pure function makes it easier to test and verify correctness. This approach helps catch bugs early, ensures consistent results, and avoids relying on manual dashboard checks to detect errors.

Continue reading

engineering

The enum-case bug that labeled every free tool "paid"

Every tool in our directory told Google the same thing about price: it was paid. Free tools too. The cause was a lowercase enum comparison that never matched.

Jun 17, 2026

engineering

How we removed fingerprinting from our deep-link attribution

A walkthrough of replacing our SHA256 device-fingerprint matcher with deterministic Play Install Referrer clickIds on Android and a guarded IP fallback on iOS.

Jun 3, 2026

engineering

We shipped a 4-source outbreak ETL. Two days later, we kept one.

Within 48 hours we deleted PAHO, CDC, and ECDC scrapers and replaced them with a single WHO OData API. Here's what broke, what we shipped, and what we learned.

May 27, 2026

The funnel that reported 461% conversion

The problem

Why it happened

What we tried first

The fix

Before and after

What we learned

What's next

Try Eodin

Frequently Asked Questions

Why did the funnel report a conversion rate over 100%?

How was the funnel calculation fixed to show accurate conversion rates?

What does it mean that a funnel should be a sequence of nested cohorts?

Why is counting devices instead of people still a limitation in the funnel analysis?

Why is it important to separate funnel math from the request handler?

Continue reading

The enum-case bug that labeled every free tool "paid"

How we removed fingerprinting from our deep-link attribution

We shipped a 4-source outbreak ETL. Two days later, we kept one.