Custom Performance Beacons & RUM

Lab audits tell you what a single throttled Chrome on a CI runner measured; they cannot tell you what a mid-range Android on Fast 3G in a user's hand actually experienced. A custom Real User Monitoring (RUM) beacon system closes that gap by capturing field Core Web Vitals from production sessions and feeding their percentiles back into your budget gates. This is the field-telemetry layer of the Lighthouse CI & WebPageTest Integration reference: it turns synthetic ceilings into thresholds calibrated against the device and network distribution your users really have.

A beacon system has four coupled responsibilities — measure the vitals in the browser, transmit them reliably without harming responsiveness, aggregate raw events into percentiles, and enforce those percentiles as a gate. Get the measurement layer wrong and every downstream percentile inherits the error; get the transport wrong and you silently drop the slowest sessions, biasing your P99 optimistically. This page is the authoritative spec for all four stages and links to the deeper guides for each.

Architecture Overview

The pipeline moves a metric from the web-vitals library in the browser, through a sendBeacon transmission to a collection endpoint, into an aggregation store that computes percentiles, and finally into a dashboard and the CI gate. The diagram below shows where each responsibility lives and where the budget check reads from.

[Figure: RUM beacon data flow from browser to budget gate. Browser (web-vitals lib; LCP / INP / CLS) → Endpoint (sendBeacon; validate + sample) → Aggregation (t-digest rollup; P75 / P99) → Dashboard (trends + alerts) and CI budget gate (P75 vs ceiling).]
The browser measures vitals, the endpoint validates and samples, the aggregation store computes percentiles, and both the dashboard and the CI gate read those percentiles.

Prerequisites & Environment

You need three pieces in place before instrumenting: a way to ship a small script to every page, a collection endpoint that accepts POST bodies, and a store that can compute percentiles over a time window.

  • web-vitals ≥ 4 — Google's attribution-capable library. It reports the final LCP, INP, and CLS values on page visibility change, which is exactly when you want to beacon. Pin the major version; metric definitions occasionally shift.
  • A collection endpoint — any HTTP handler (an edge function, a small Node service, or a managed collector) that accepts the beacon body and appends it to durable storage. It must respond 204 quickly and never block.
  • An aggregation store — a columnar or time-series store (ClickHouse, BigQuery, or Prometheus-style histograms) where you compute the percentiles described in Building P75/P99 Aggregation Pipelines.

Map the endpoint and any write credentials through environment variables so nothing is hardcoded in the page script or CI:

  • RUM_INGEST_URL — the collection endpoint the beacon posts to.
  • RUM_QUERY_URL — the read endpoint the CI gate queries for percentiles.
  • RUM_SAMPLE_RATE — the head-based sample rate, tuned per the methodology in RUM Sampling Strategies for High-Traffic Sites.
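As an illustration of the mapping above, the following is a minimal sketch of how a build step or server process might read these variables and fail fast on a bad value. The helper name loadRumConfig, the default of sampling everything, and the choice to treat RUM_SAMPLE_RATE as a fraction between 0 and 1 are assumptions made for this example, not part of any library.

```javascript
// Hypothetical helper: reads the three RUM variables from an env-like object
// (for example process.env) and validates them before anything else runs.
// Assumption: RUM_SAMPLE_RATE is a fraction in [0, 1], not a percentage.
function loadRumConfig(env) {
  const ingestUrl = env.RUM_INGEST_URL;
  const queryUrl = env.RUM_QUERY_URL;
  if (!ingestUrl || !queryUrl) {
    throw new Error("RUM_INGEST_URL and RUM_QUERY_URL must both be set");
  }
  // Assumption: an unset or empty rate means "sample every session".
  const raw = env.RUM_SAMPLE_RATE;
  const sampleRate = raw === undefined || raw === "" ? 1 : Number(raw);
  if (!Number.isFinite(sampleRate) || sampleRate < 0 || sampleRate > 1) {
    throw new Error("RUM_SAMPLE_RATE must be a number between 0 and 1");
  }
  return { ingestUrl, queryUrl, sampleRate };
}
```

Validating once at startup keeps a mistyped rate from silently sampling nothing, which would look like a traffic drop rather than a configuration error.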

Configuration Reference

The browser side is a single module loaded as early as possible. The annotated block below is the authoritative client spec: it registers web-vitals callbacks, assembles a compact payload, and transmits once per metric using sendBeacon. Payload design is expanded in Designing Efficient RUM Beacon Payloads.

// rum-beacon.js — loaded in <head>, after a deterministic sampling decision
import { onLCP, onINP, onCLS, onTTFB } from "web-vitals";

const ENDPOINT = "/rum/ingest";
const session = crypto.randomUUID();

// Compact, fixed-key payload keeps each beacon well under 1 KB.
function send(metric) {
  const body = JSON.stringify({
    s: session,                       // session id
    n: metric.name,                   // LCP | INP | CLS | TTFB
    v: Math.round(metric.name === "CLS" ? metric.value * 1000 : metric.value), // ms, or CLS * 1000
    r: metric.rating,                 // good | needs-improvement | poor
    d: navigator.deviceMemory || 0,   // device-memory hint for class split
    c: navigator.connection?.effectiveType || "", // 4g | 3g | slow-2g
    u: location.pathname,             // route, no query string
  });
  // sendBeacon survives page unload; web-vitals fires on visibility change.
  navigator.sendBeacon(ENDPOINT, body);
}

onLCP(send);   // largest contentful paint, final value
onINP(send);   // interaction to next paint, final value
onCLS(send);   // cumulative layout shift, final value
onTTFB(send);  // time to first byte, for backend correlation

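By default (without reportAllChanges), web-vitals reports a metric only once its value is final: onLCP on the first user input or when the page is hidden, and onINP and onCLS when the page is hidden. You therefore never beacon a provisional reading, but a callback can fire more than once per page (for example when a tab is backgrounded, returns, and the value changes), so keep the latest value per session and metric when aggregating. sendBeacon queues the request in the browser's network stack and returns immediately, surviving the unload that would abort an ordinary fetch; that reliability is what makes it the correct transport for terminal metrics.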
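One detail the client block leaves implicit is that navigator.sendBeacon returns false when the browser declines to queue the request. The sketch below shows one possible way to handle that, falling back to fetch with the keepalive option. The function name transmit and its injectable nav and fetchFn parameters are illustrative choices made so the logic can be exercised outside a browser; they are not part of the client spec above.

```javascript
// Sketch: try sendBeacon first, then fall back to fetch with keepalive.
// `transmit`, `nav`, and `fetchFn` are hypothetical names for this example.
function transmit(url, body, nav = globalThis.navigator, fetchFn = globalThis.fetch) {
  // sendBeacon returns true only when the request was successfully queued.
  if (nav && typeof nav.sendBeacon === "function" && nav.sendBeacon(url, body)) {
    return "beacon";
  }
  if (typeof fetchFn === "function") {
    try {
      // keepalive asks the browser to let the request outlive the page.
      // Rejections are swallowed: telemetry failures must not reach the user.
      Promise.resolve(fetchFn(url, { method: "POST", body, keepalive: true })).catch(() => {});
      return "fetch";
    } catch {
      return "dropped";
    }
  }
  return "dropped";
}
```

Returning a label instead of a boolean makes it easy to count how often the fallback path is taken, which is a rough signal that payloads are approaching the browser's queue limit.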

Step-by-Step Implementation

  1. Install the library, and make the sampling decision before loading the beacon module so excluded sessions never pay for the library. Hash the session id and compare against the rate.

    npm install web-vitals@^4

    Expected output: web-vitals added to dependencies in package.json.

  2. Wire the collection endpoint to validate the body, drop malformed payloads, and append raw events. A minimal edge handler returns 204 and writes asynchronously.

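    // ingest.js: edge/serverless handler
    // appendEvent is your own storage writer (not shown here).
    export async function POST(req) {
      let e;
      try {
        e = await req.json();
      } catch {
        return new Response(null, { status: 400 }); // body was not valid JSON
      }
      if (!e || !e.s || !e.n || typeof e.v !== "number") {
        return new Response(null, { status: 400 }); // missing or mistyped fields
      }
      // Fire-and-forget: do not await, so storage latency never delays the 204.
      // On edge runtimes, pass this promise to the platform's waitUntil if it has one.
      appendEvent({ ...e, t: Date.now() }).catch(console.error);
      return new Response(null, { status: 204 });
    }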

    Expected output: a 204 with no body, observed in the browser Network panel under the ping/beacon type.

  3. Verify end to end by loading a page, backgrounding the tab, and confirming one row per metric lands in the store. Then build the percentile rollup and point the CI gate at it.
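The sampling decision in step 1 ("hash the session id and compare against the rate") can be sketched as follows. This is one possible approach, using a 32-bit FNV-1a hash chosen here for brevity; the function names fnv1a and isSampled are illustrative, and the rate is assumed to be a fraction between 0 and 1.

```javascript
// 32-bit FNV-1a hash of a string, returned as an unsigned integer.
// Note: this hashes UTF-16 code units, which matches byte-wise FNV-1a
// for ASCII input such as UUIDs.
function fnv1a(str) {
  let h = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193); // multiply by the FNV prime, modulo 2^32
  }
  return h >>> 0;
}

// True when the session falls inside the sampled fraction.
// The same session id always yields the same decision.
function isSampled(sessionId, rate) {
  return fnv1a(sessionId) / 0x100000000 < rate;
}
```

Because the decision depends only on the session id, it stays stable across page views, so a session is either fully included or fully excluded rather than contributing a partial set of metrics.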

Threshold Calibration

Do not reuse lab ceilings for field gates — field P75 is the metric Google's Core Web Vitals program scores, and it sits well above any single lab run. Pull the P75 from your own RUM data per device class and connection, hold P99 as a tail-regression tripwire, and only then set the gate. The matrix below is a representative starting point; calibrate against Percentile-Based Threshold Tuning.

Device class     | Connection profile | LCP (field P75) | INP (field P75) | CLS (field P75) | P99 tripwire (LCP)
Desktop          | Cable / Fiber      | 2000 ms         | 150 ms          | 0.05            | 3500 ms
High-end mobile  | 4G / LTE           | 2500 ms         | 200 ms          | 0.10            | 4500 ms
Mid-range mobile | Fast 3G            | 3500 ms         | 350 ms          | 0.10            | 6000 ms

Gate on P75 because it is the percentile at which Core Web Vitals are assessed against the published "good" thresholds, and it is stable enough to act on; watch P99 because tail regressions (a slow region, a broken cache, a bloated third party) show there first while P75 still looks healthy.
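To make "pull the P75 per device class and connection" concrete, here is a minimal in-memory sketch using the nearest-rank percentile definition. It is an exact computation over raw values, not the t-digest rollup a production store would use, and the function names percentile and rollup are illustrative. It assumes events carry the compact payload keys from the client block (n for metric name, c for connection type, v for value).

```javascript
// Nearest-rank percentile: the smallest value with at least p% of the
// samples at or below it. Returns NaN for an empty input.
function percentile(values, p) {
  if (values.length === 0) return NaN;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

// Groups events by metric name and connection type, then computes
// the sample count, P75, and P99 for each group.
function rollup(events) {
  const groups = new Map();
  for (const e of events) {
    const key = `${e.n}|${e.c}`;
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(e.v);
  }
  const out = {};
  for (const [key, vals] of groups) {
    out[key] = {
      count: vals.length,
      p75: percentile(vals, 75),
      p99: percentile(vals, 99),
    };
  }
  return out;
}
```

Keeping the count next to each percentile matters: a P99 computed from a few dozen samples is effectively the maximum, so a gate or alert can skip groups whose count is too low to be meaningful.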

CI Enforcement Snippet

This GitHub Actions job queries the aggregation store for the current 28-day rolling P75 and fails the build when field data has drifted past the budget — gating real-user regressions, not just lab ones. It pairs naturally with the lab gate in Lighthouse CI Configuration & Storage.

name: RUM Budget Gate
on:
  schedule:
    - cron: "0 6 * * *"   # daily, after overnight traffic settles
  workflow_dispatch:

jobs:
  rum-gate:
    runs-on: ubuntu-latest
    timeout-minutes: 10
    steps:
      - uses: actions/checkout@v4
      - name: Query field P75 and assert budgets
        env:
          RUM_QUERY_URL: ${{ secrets.RUM_QUERY_URL }}
        run: |
          curl -fsSL "$RUM_QUERY_URL?window=28d&pct=75" -o p75.json
          node -e '
            const b = require("./p75.json");
            const budgets = { LCP: 2500, INP: 200, CLS: 100 };
            let failed = false;
            for (const [m, max] of Object.entries(budgets)) {
              const v = b[m];
              const ok = v <= max;
              console.log(`[RUM] ${m} P75=${v} budget=${max} ${ok ? "PASS" : "FAIL"}`);
              if (!ok) failed = true;
            }
            process.exit(failed ? 1 : 0);
          '

Run this on a schedule rather than per-PR, because field percentiles change with traffic, not with a single commit; surface its result on the dashboard described in Visualizing Budget Trends with Grafana so a drift is visible before the gate trips.

Troubleshooting & Edge Cases

  • P99 is missing or wildly noisy → your sample size in the tail is too small; raise the sample rate so fewer sessions are excluded, or widen the window, per RUM Sampling Strategies for High-Traffic Sites.
  • Beacons dropped on iOS Safari → sendBeacon bodies over ~64 KB are silently discarded; keep payloads compact and never batch hundreds of events into one beacon.
  • INP never reported → the user never interacted, or you read the metric too early; rely on web-vitals onINP, which reports the final value on visibility change.
  • Field P75 far above lab → expected and correct; a lab run measures one device profile on one simulated network, while field data includes the slow upper tail of real sessions. Calibrate the gate to field, not lab.
  • CLS values look like 0.10 but you stored 100 → you multiplied CLS by 1000 for integer transport; divide on read or the budget comparison will be off by three orders of magnitude.
  • Endpoint latency climbs under load → the handler is writing synchronously; make the durable write fire-and-forget and return 204 first.
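The CLS scaling pitfall above can be guarded against by decoding values in one place on the read path. The sketch below assumes CLS was stored as an integer scaled by 1000 and all other metrics in milliseconds, as in the client payload; decodeValue and withinBudget are hypothetical helper names, and budgets here are expressed in natural units (0.1 for CLS) rather than the scaled units a gate might otherwise use.

```javascript
// Assumption: CLS is stored as round(CLS * 1000); other metrics are milliseconds.
const CLS_SCALE = 1000;

// Converts a stored integer back to the metric's natural unit.
function decodeValue(name, stored) {
  return name === "CLS" ? stored / CLS_SCALE : stored;
}

// Compares a stored value against a budget given in natural units.
function withinBudget(name, stored, budget) {
  return decodeValue(name, stored) <= budget;
}
```

Whichever convention you pick, scaled or natural, the important part is that the budget and the stored value are converted to the same unit by a single shared function.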

Frequently Asked Questions

Why use a custom RUM beacon instead of a managed provider?

A custom beacon gives you the raw event stream, so you can compute the exact percentiles your gate needs, split by your own device classes and routes, and join against your CI baselines. Managed providers often expose only pre-aggregated P75 and rate-limit historical queries. If you already gate on lab data, owning the field pipeline lets you assert both with one budget vocabulary. See Building P75/P99 Aggregation Pipelines.

Should the CI gate read field P75 or P99?

Gate on field P75 because Core Web Vitals are assessed at the 75th percentile against the published "good" thresholds, and it is stable enough to fail a build on without flakiness. Track P99 as a tripwire: alert on it, but do not block merges on a single tail spike, which is usually a transient regional or third-party issue. Calibrate both against Percentile-Based Threshold Tuning.

Will the beacon script slow down the page it measures?

Negligibly, if you make the sampling decision before loading the library, use sendBeacon rather than a blocking request, and keep the payload under 1 KB. The web-vitals library relies on PerformanceObserver callbacks rather than polling, and INP and CLS (and often LCP) are transmitted only when the page is hidden, so it adds very little work to the load path. Payload discipline is covered in Designing Efficient RUM Beacon Payloads.