Custom Performance Beacons & RUM
Lab audits tell you what a single throttled Chrome on a CI runner measured; they cannot tell you what a mid-range Android on Fast 3G in a user's hand actually experienced. A custom Real User Monitoring (RUM) beacon system closes that gap by capturing field Core Web Vitals from production sessions and feeding their percentiles back into your budget gates. This is the field-telemetry layer of the Lighthouse CI & WebPageTest Integration reference: it turns synthetic ceilings into thresholds calibrated against the device and network distribution your users really have.
A beacon system has four coupled responsibilities — measure the vitals in the browser, transmit them reliably without harming responsiveness, aggregate raw events into percentiles, and enforce those percentiles as a gate. Get the measurement layer wrong and every downstream percentile inherits the error; get the transport wrong and you silently drop the slowest sessions, biasing your P99 optimistically. This page is the authoritative spec for all four stages and links to the deeper guides for each.
Architecture Overview
The pipeline moves a metric from the web-vitals library in the browser, through a sendBeacon transmission to a collection endpoint, into an aggregation store that computes percentiles, and finally into a dashboard and the CI gate. The diagram below shows where each responsibility lives and where the budget check reads from.
Prerequisites & Environment
You need three pieces in place before instrumenting: a way to ship a small script to every page, a collection endpoint that accepts POST bodies, and a store that can compute percentiles over a time window.
- web-vitals ≥ 4: Google's attribution-capable library. It reports the final LCP, INP, and CLS values on page visibility change, which is exactly when you want to beacon. Pin the major version; metric definitions occasionally shift.
- A collection endpoint: any HTTP handler (an edge function, a small Node service, or a managed collector) that accepts the beacon body and appends it to durable storage. It must respond 204 quickly and never block.
- An aggregation store: a columnar or time-series store (ClickHouse, BigQuery, or Prometheus-style histograms) where you compute the percentiles described in Building P75/P99 Aggregation Pipelines.
Map the endpoint and any write credentials through environment variables so nothing is hardcoded in the page script or CI:
- RUM_INGEST_URL: the collection endpoint the beacon posts to.
- RUM_QUERY_URL: the read endpoint the CI gate queries for percentiles.
- RUM_SAMPLE_RATE: the head-based sample rate, tuned per the methodology in RUM Sampling Strategies for High-Traffic Sites.
Configuration Reference
The browser side is a single module loaded as early as possible. The annotated block below is the authoritative client spec: it registers web-vitals callbacks, assembles a compact payload, and transmits once per metric using sendBeacon. Payload design is expanded in Designing Efficient RUM Beacon Payloads.
// rum-beacon.js: loaded in <head>, after a deterministic sampling decision
import { onLCP, onINP, onCLS, onTTFB } from "web-vitals";

const ENDPOINT = "/rum/ingest";
const session = crypto.randomUUID();

// Compact, fixed-key payload keeps each beacon well under 1 KB.
function send(metric) {
  // CLS is a unitless score (e.g. 0.10), so scale it by 1000 before rounding;
  // rounding the raw score would collapse almost every value to 0.
  const value = metric.name === "CLS" ? metric.value * 1000 : metric.value;
  const body = JSON.stringify({
    s: session,                                   // session id
    n: metric.name,                               // LCP | INP | CLS | TTFB
    v: Math.round(value),                         // milliseconds (or CLS * 1000)
    r: metric.rating,                             // good | needs-improvement | poor
    d: navigator.deviceMemory || 0,               // device-memory hint for class split
    c: navigator.connection?.effectiveType || "", // 4g | 3g | slow-2g
    u: location.pathname,                         // route, no query string
  });
  // sendBeacon survives page unload; web-vitals fires on visibility change.
  navigator.sendBeacon(ENDPOINT, body);
}

onLCP(send);  // largest contentful paint, final value
onINP(send);  // interaction to next paint, final value
onCLS(send);  // cumulative layout shift, final value
onTTFB(send); // time to first byte, for backend correlation
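With the default options, onLCP, onINP, and onCLS report a metric only once its value is ready, so you never beacon a provisional reading: CLS and INP are reported when the page is backgrounded or unloaded, and LCP at the first user interaction or when the page is hidden, whichever comes first. A callback can fire again if the value changes after the tab returns to the foreground or the page is restored from the back/forward cache, so keep the latest value per session and metric when you aggregate. sendBeacon queues the request in the browser's network stack and returns immediately, surviving the unload that would abort an ordinary fetch. That reliability property makes it the correct transport for terminal metrics.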
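sendBeacon returns false when the browser declines to queue the payload, and the client above ignores that return value. The sketch below shows one way to react to it. The helper name transmit and the injected nav and fetchFn parameters are assumptions made for illustration (they let the function run outside a browser); they are not part of web-vitals or the Beacon API.

```javascript
// Hypothetical helper: try sendBeacon first, fall back to fetch with keepalive.
// `nav` is a navigator-like object and `fetchFn` a fetch-like function.
function transmit(url, body, nav, fetchFn) {
  // sendBeacon returns false when the browser refuses to queue the payload.
  if (nav && typeof nav.sendBeacon === "function" && nav.sendBeacon(url, body)) {
    return "beacon";
  }
  if (typeof fetchFn === "function") {
    // keepalive asks the browser to let the request outlive the page.
    fetchFn(url, { method: "POST", body, keepalive: true }).catch(() => {});
    return "fetch";
  }
  return "dropped";
}
```

In the browser this would be called as transmit(ENDPOINT, body, navigator, fetch). Keepalive requests are subject to a similar size quota, so the fallback is aimed at transient queue refusals rather than oversize bodies.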
Step-by-Step Implementation
- Make the sampling decision first, before loading the beacon module, so excluded sessions never pay for the library. Hash the session id and compare against the rate. Install the library:
  npm install web-vitals@^4
  Expected output: web-vitals added to dependencies in package.json.
- Wire the collection endpoint to validate the body, drop malformed payloads, and append raw events. A minimal edge handler returns 204 and writes asynchronously.
  // ingest.js: edge/serverless handler
  export async function POST(req) {
    let e;
    try {
      e = await req.json();
    } catch {
      return new Response(null, { status: 400 }); // body was not valid JSON
    }
    if (!e || !e.s || !e.n || typeof e.v !== "number") {
      return new Response(null, { status: 400 });
    }
    // Fire-and-forget: the write is not awaited, so storage never delays the 204.
    // Some runtimes need this promise handed to a waitUntil-style hook to finish.
    appendEvent({ ...e, t: Date.now() }).catch(console.error);
    return new Response(null, { status: 204 });
  }
  Expected output: a 204 with no body, observed in the browser Network panel under the ping/beacon type.
- Verify end to end by loading a page, backgrounding the tab, and confirming one row per metric lands in the store. Then build the percentile rollup and point the CI gate at it.
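The sampling step above (hash the session id, compare against the rate) can be sketched as follows. The FNV-1a hash and the helper names fnv1a and isSampled are illustrative choices, not something this page prescribes; any stable hash that spreads ids evenly would serve.

```javascript
// FNV-1a 32-bit hash, chosen here only for illustration.
function fnv1a(str) {
  let h = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0; // multiply by the FNV prime, keep uint32
  }
  return h;
}

// Hypothetical helper: true when the session falls inside the sample.
// `rate` is a fraction between 0 and 1 (for example RUM_SAMPLE_RATE).
function isSampled(sessionId, rate) {
  if (!(rate > 0)) return false; // also rejects NaN and negative rates
  if (rate >= 1) return true;
  // Map the hash onto [0, 1) and keep the session if it lands below the rate.
  return fnv1a(sessionId) / 0x100000000 < rate;
}
```

Because the decision depends only on the session id, the same session is always in or out of the sample, and raising the rate only ever adds sessions to it.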
Threshold Calibration
Do not reuse lab ceilings for field gates: field P75 is the value Google's Core Web Vitals program assesses, and it usually sits well above a single lab run because it includes slower devices and networks. Pull the P75 from your own RUM data per device class and connection, hold P99 as a tail-regression tripwire, and only then set the gate. The matrix below is a representative starting point; calibrate against Percentile-Based Threshold Tuning.
| Device class | Connection profile | LCP (field P75) | INP (field P75) | CLS (field P75) | P99 tripwire (LCP) |
|---|---|---|---|---|---|
| Desktop | Cable / Fiber | 2000 ms | 150 ms | 0.05 | 3500 ms |
| High-end mobile | 4G / LTE | 2500 ms | 200 ms | 0.10 | 4500 ms |
| Mid-range mobile | Fast 3G | 3500 ms | 350 ms | 0.10 | 6000 ms |
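One way to apply the matrix above is to resolve each session to a row before comparing. The sketch below copies the table values into a lookup. The classification rules and the mobile flag are assumptions for illustration only: the beacon payload carries d (device memory) and c (effective connection type), but it does not carry a mobile flag, so that would have to come from your own user-agent or client-hint parsing.

```javascript
// Budgets copied from the matrix; CLS is kept in its native unit here.
const BUDGETS = {
  desktop:  { lcp: 2000, inp: 150, cls: 0.05, lcpP99: 3500 },
  highEnd:  { lcp: 2500, inp: 200, cls: 0.10, lcpP99: 4500 },
  midRange: { lcp: 3500, inp: 350, cls: 0.10, lcpP99: 6000 },
};

// Assumed classification rules, not a standard:
// non-mobile sessions are desktop; mobile sessions with at least 4 GB of
// reported memory on a 4g connection are high-end; everything else is mid-range.
function deviceClass({ mobile, d, c }) {
  if (!mobile) return "desktop";
  return d >= 4 && c === "4g" ? "highEnd" : "midRange";
}
```

A session that reports no device memory (d is 0) falls through to the mid-range row, which is the more conservative bucket for unknown hardware.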
Gate on P75 because it is the percentile at which Core Web Vitals are assessed against the published "good" thresholds, and it is stable enough to act on; watch P99 because tail regressions (a slow region, a broken cache, a bloated third party) show there first while P75 still looks healthy.
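To make the P75 versus P99 distinction concrete, here is a minimal sketch using the nearest-rank method. The method is an assumption: your aggregation store may interpolate or use approximate sketches, so its numbers can differ slightly. The sample LCP values are made up for illustration.

```javascript
// Nearest-rank percentile over raw metric values.
function percentile(values, p) {
  if (values.length === 0) return NaN;
  const sorted = [...values].sort((a, b) => a - b); // numeric sort, copy first
  const rank = Math.ceil((p / 100) * sorted.length); // 1-based rank
  return sorted[Math.max(rank, 1) - 1];
}

// Illustrative LCP samples in milliseconds, one slow outlier in the tail.
const lcp = [1800, 2100, 2400, 2600, 5200, 1900, 2300, 2200];
const p75 = percentile(lcp, 75); // 2400
const p99 = percentile(lcp, 99); // 5200
```

Here P75 stays inside a 2500 ms budget while P99 is already past a 4500 ms tripwire, which is the situation the paragraph above describes: the tail moves first.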
CI Enforcement Snippet
This GitHub Actions job queries the aggregation store for the current 28-day rolling P75 and fails the build when field data has drifted past the budget — gating real-user regressions, not just lab ones. It pairs naturally with the lab gate in Lighthouse CI Configuration & Storage.
name: RUM Budget Gate
on:
  schedule:
    - cron: "0 6 * * *" # daily, after overnight traffic settles
  workflow_dispatch:
jobs:
  rum-gate:
    runs-on: ubuntu-latest
    timeout-minutes: 10
    steps:
      - uses: actions/checkout@v4
      - name: Query field P75 and assert budgets
        env:
          RUM_QUERY_URL: ${{ secrets.RUM_QUERY_URL }}
        run: |
          curl -fsSL "$RUM_QUERY_URL?window=28d&pct=75" -o p75.json
          node -e '
            const b = require("./p75.json");
            // CLS budget is on the stored x1000 integer scale (100 = 0.10).
            const budgets = { LCP: 2500, INP: 200, CLS: 100 };
            let failed = false;
            for (const [m, max] of Object.entries(budgets)) {
              const v = b[m];
              // A missing or null value must fail rather than coerce to 0 and pass.
              const ok = typeof v === "number" && v <= max;
              console.log(`[RUM] ${m} P75=${v} budget=${max} ${ok ? "PASS" : "FAIL"}`);
              if (!ok) failed = true;
            }
            process.exit(failed ? 1 : 0);
          '
Run this on a schedule rather than per-PR, because field percentiles change with traffic, not with a single commit; surface its result on the dashboard described in Visualizing Budget Trends with Grafana so a drift is visible before the gate trips.
Troubleshooting & Edge Cases
- P99 is missing or wildly noisy → your sample size in the tail is too small; raise the sample rate or widen the window, per RUM Sampling Strategies for High-Traffic Sites.
- Beacons dropped on iOS Safari → sendBeacon bodies over ~64 KB are not sent; the call returns false rather than throwing, so the loss is silent unless you check the return value. Keep payloads compact and never batch hundreds of events into one beacon.
- INP never reported → the user never interacted, or you read the metric too early; rely on web-vitals onINP, which reports the final value on visibility change.
- Field P75 far above lab → expected and correct; a lab run uses one controlled device and network profile, while field data includes the slow upper tail of real sessions. Calibrate the gate to field, not lab.
- CLS values look like 0.10 but you stored 100 → you multiplied CLS by 1000 for integer transport; divide on read or the budget comparison will be off by three orders of magnitude.
- Endpoint latency climbs under load → the handler is writing synchronously; make the durable write fire-and-forget and return 204 first.
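The CLS scaling issue in the list above is easiest to avoid by keeping the conversion in one pair of functions used by both the writer and the reader. A minimal sketch, assuming the ×1000 integer convention described on this page; the names CLS_SCALE, encodeValue, and decodeValue are hypothetical.

```javascript
const CLS_SCALE = 1000; // CLS is stored as an integer: score * 1000

// Convert a raw metric value to its integer transport form.
function encodeValue(name, value) {
  return Math.round(name === "CLS" ? value * CLS_SCALE : value);
}

// Convert a stored integer back to the metric's native unit.
function decodeValue(name, stored) {
  return name === "CLS" ? stored / CLS_SCALE : stored;
}
```

With this pair, a CLS of 0.10 is stored as 100 and read back as 0.10, while millisecond metrics such as LCP pass through unchanged apart from rounding.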
Frequently Asked Questions
Why use a custom RUM beacon instead of a managed provider?
A custom beacon gives you the raw event stream, so you can compute the exact percentiles your gate needs, split by your own device classes and routes, and join against your CI baselines. Managed providers often expose only pre-aggregated P75 and rate-limit historical queries. If you already gate on lab data, owning the field pipeline lets you assert both with one budget vocabulary. See Building P75/P99 Aggregation Pipelines.
Should the CI gate read field P75 or P99?
Gate on field P75 because it is the percentile at which Core Web Vitals are assessed against the published "good" thresholds, and it is stable enough to fail a build on without flakiness. Track P99 as a tripwire: alert on it, but do not block merges on a single tail spike, which is usually a transient regional or third-party issue. Calibrate both against Percentile-Based Threshold Tuning.
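The split between a blocking P75 gate and a warn-only P99 tripwire can be sketched as a single evaluation function. The function name evaluate and the shape of the field and budgets objects are assumptions for illustration; the read endpoint may return a different structure.

```javascript
// Hypothetical evaluator: P75 breaches fail the gate, P99 breaches only warn.
// budgets: { LCP: { p75: 2500, p99: 4500 }, ... } where p99 is optional.
// field:   { LCP: { p75: 2400, p99: 5200 }, ... } as read from the store.
function evaluate(field, budgets) {
  const failures = [];
  const warnings = [];
  for (const [metric, { p75, p99 }] of Object.entries(budgets)) {
    const seen = field[metric] || {};
    // A missing P75 counts as a failure so gaps in data cannot pass silently.
    if (typeof seen.p75 !== "number" || seen.p75 > p75) failures.push(metric);
    // The tripwire only applies when a P99 budget is configured.
    if (typeof p99 === "number" && seen.p99 > p99) warnings.push(metric);
  }
  return { pass: failures.length === 0, failures, warnings };
}
```

A CI job would exit non-zero only when pass is false, and route the warnings list to an alert channel or the dashboard instead.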
Will the beacon script slow down the page it measures?
Negligibly, if you make the sampling decision before loading the library, use sendBeacon rather than a blocking request, and keep the payload under 1 KB. The web-vitals library registers passive observers and transmits only on page hide, so it adds no synchronous work to load. Payload discipline is covered in Designing Efficient RUM Beacon Payloads.
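The 1 KB payload target can be checked mechanically. The sketch below measures the UTF-8 size of a payload shaped like the one in the configuration reference; the helper name payloadBytes and the sample values (including the session id) are made up for illustration.

```javascript
// Hypothetical guard: byte size of the JSON body as it would go on the wire.
function payloadBytes(payload) {
  return new TextEncoder().encode(JSON.stringify(payload)).length;
}

// Sample beacon with the same compact keys as the client module.
const sample = {
  s: "123e4567-e89b-12d3-a456-426614174000", // illustrative session id
  n: "LCP",
  v: 2400,
  r: "good",
  d: 8,
  c: "4g",
  u: "/checkout",
};
const withinBudget = payloadBytes(sample) < 1024;
```

Counting bytes rather than string length matters for routes with non-ASCII characters, since each of those can take more than one byte in UTF-8.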