Defining Web Performance Budgets
A performance budget is an engineering contract, not an aspiration. It sets hard ceilings on latency, payload size, and rendering metrics and enforces them before code reaches production, the same way a type checker or a security scanner does. Aspirational targets drift; contracts gate merges. This reference covers the full operational lifecycle: codifying budgets as version-controlled schema, calibrating thresholds against device class and connection profile, enforcing asset-level constraints in CI, wiring observability that closes the loop with field data, and defining the failure modes and escalation paths that make the gate survivable for the team.
The discipline splits into six coupled concerns, each with its own detailed reference: per-metric allocation across LCP, INP, and CLS; JavaScript payload limits; the divergence between mobile and desktop ceilings; third-party script governance; image and media weight; and web font delivery. Get the schema and calibration wrong at the top and every section below inherits the noise. This page is the authoritative top-level spec; the named sections drill into each constraint.
Architecture Overview
A budget system is a closed loop. A version-controlled schema feeds a tool chain that measures every build; the CI pipeline asserts against the schema and gates the merge; observability compares shipped builds against field telemetry and feeds recalibration back into the schema. The diagram below shows how the four layers connect.
Budget Definition as Version-Controlled Schema
Budgets must be codified as version-controlled configuration, not tribal knowledge held in a wiki. A strict YAML or JSON schema that maps directly to your telemetry pipeline turns every threshold change into a reviewable diff. The schema correlates P75 field baselines with lab-derived synthetic targets and applies a controlled delta tolerance to absorb environmental variance between the runner and real devices. When Core Web Vitals Budget Allocation is treated as a hard constraint rather than a guideline, the boundary between shipping velocity and user-experience regression becomes explicit and reviewable.
# performance-budget.yaml
version: "1.0"
environment: "production"
tolerance:
  lab_to_field_delta: "15%"
  percentile: "p75"
thresholds:
  lcp: 2500                 # ms, Largest Contentful Paint ceiling
  cls: 0.1                  # unitless, Cumulative Layout Shift
  inp: 200                  # ms, Interaction to Next Paint
  ttfb: 800                 # ms, Time to First Byte reserve
  total_js_gzipped: 150000  # bytes, initial-route script
  total_css_gzipped: 50000  # bytes, critical + deferred
Enforce schema validation during PR creation. Reject any configuration that omits a percentile definition or lacks explicit tolerance routing, so every budget change undergoes architectural review before it merges. The schema is the contract; the rest of this reference is its enforcement.
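The rejection rule above can be sketched as a small validator. This is a minimal illustration, not a definitive implementation: `validateBudgetConfig` is a hypothetical helper, it assumes the YAML has already been parsed into a plain object by whatever parser the pipeline uses, and it assumes `percentile` and `lab_to_field_delta` both nest under `tolerance`.

```javascript
// Minimal validation sketch for a parsed performance-budget.yaml.
// validateBudgetConfig is a hypothetical helper; the key layout
// (tolerance.percentile, tolerance.lab_to_field_delta) is an assumption.
const REQUIRED_THRESHOLDS = ['lcp', 'cls', 'inp'];

function validateBudgetConfig(config) {
  const errors = [];
  const tolerance = (config && config.tolerance) || {};

  // Reject configs that omit the percentile definition.
  if (!/^p\d{2}$/.test(String(tolerance.percentile || ''))) {
    errors.push('tolerance.percentile must be set, e.g. "p75"');
  }
  // Reject configs without an explicit lab-to-field tolerance.
  if (!/^\d+(\.\d+)?%$/.test(String(tolerance.lab_to_field_delta || ''))) {
    errors.push('tolerance.lab_to_field_delta must be a percentage, e.g. "15%"');
  }
  // Every contracted Core Web Vitals metric needs a positive numeric ceiling.
  const thresholds = (config && config.thresholds) || {};
  for (const metric of REQUIRED_THRESHOLDS) {
    const value = thresholds[metric];
    if (typeof value !== 'number' || !(value > 0)) {
      errors.push(`thresholds.${metric} must be a positive number`);
    }
  }
  return errors; // an empty array means the config passes
}
```

A PR check could call this on the parsed file and exit non-zero when the returned array is non-empty, so a malformed budget never reaches review as a silent change.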
Metric Selection & Threshold Matrix
A single global threshold fails in production. Network latency and CPU constraints force Mobile vs Desktop Budget Divergence, because a desktop-optimized ceiling silently masks mobile regressions. Calibrate P75 and P90 against device class, connection profile, and geographic routing. Weight synthetic lab data (Lighthouse, WebPageTest) at roughly 60% and anchor the remaining 40% in CrUX field data; the hybrid prevents over-fitting to idealized lab conditions while keeping the CI signal actionable.
| Device class | Connection | LCP P75 / P90 (ms) | INP P75 / P90 (ms) | CLS P75 / P90 | TTFB P75 (ms) | Payload P75 (KB) |
|---|---|---|---|---|---|---|
| High-end mobile | 4G / LTE | 2200 / 2600 | 180 / 230 | 0.08 / 0.11 | 700 | 170 |
| Mid-range mobile | Fast 3G | 3500 / 4100 | 300 / 380 | 0.12 / 0.16 | 1100 | 150 |
| Desktop | Cable / Fiber | 1500 / 1900 | 120 / 160 | 0.05 / 0.08 | 500 | 200 |
Implement connection-aware routing in the test harness: throttle CPU to 4x slowdown and network to Fast 3G for the mobile baseline, and fail builds when P90 metrics exceed the calibrated matrix by more than 10%. Derive these numbers from your own field data rather than copying them; the percentile methodology is the subject of Percentile-Based Threshold Tuning.
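As a rough sketch of the P90 rule, the check below compares measured P90 values against the matrix and flags only metrics that are more than 10% over their ceiling. The profile keys and the `findP90Breaches` name are hypothetical, and the ceilings are copied from the example matrix rather than derived from real field data.

```javascript
// P90 ceilings copied from the matrix above; profile keys are hypothetical ids.
const P90_CEILINGS = {
  'high-end-mobile': { lcp: 2600, inp: 230, cls: 0.11 },
  'mid-range-mobile': { lcp: 4100, inp: 380, cls: 0.16 },
  desktop: { lcp: 1900, inp: 160, cls: 0.08 },
};
const P90_TOLERANCE = 0.10; // fail only when a metric is more than 10% over

function findP90Breaches(profile, measuredP90) {
  const ceilings = P90_CEILINGS[profile];
  if (!ceilings) throw new Error(`Unknown profile: ${profile}`);

  const breaches = [];
  for (const [metric, ceiling] of Object.entries(ceilings)) {
    const measured = measuredP90[metric];
    if (typeof measured !== 'number') continue; // metric not collected in this run
    const limit = ceiling * (1 + P90_TOLERANCE);
    if (measured > limit) breaches.push({ metric, measured, ceiling, limit });
  }
  return breaches;
}
```

Running the check once per profile keeps a healthy desktop result from masking a mobile breach.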
Asset-Level Constraints & Implementation
Byte budgets must be enforced across the entire critical rendering path. Main-thread blocking is contained by enforcing JavaScript Bundle Size Limits alongside route-based code-splitting; the initial route payload should never exceed 150 KB gzipped, and secondary chunks defer through import(). Heavy visual routes layer on Image & Media Weight Budgets to cap hero and gallery bytes, while text rendering is governed by Web Font Performance Budgets — subset to the needed glyphs, ship WOFF2, and cap total font payload near 40 KB.
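One way to picture the byte ceilings in this paragraph is a check over a build report. The sketch below is illustrative only: the report shape (`category`, `bytes`), the category names, and `checkAssetBudgets` are assumptions, and it treats 1 KB as 1000 bytes to match the byte values in the schema.

```javascript
// Byte ceilings named in the paragraph above; category names are hypothetical.
const ASSET_CEILINGS_BYTES = {
  initialJsGzipped: 150 * 1000, // initial-route script, gzipped
  fontsTotal: 40 * 1000,        // total font payload
};

// `assets` is an assumed report shape: [{ category, bytes }, ...].
function checkAssetBudgets(assets) {
  const totals = {};
  for (const { category, bytes } of assets) {
    totals[category] = (totals[category] || 0) + bytes;
  }
  // Return only the categories whose summed size exceeds the ceiling.
  return Object.entries(ASSET_CEILINGS_BYTES)
    .map(([category, ceiling]) => ({ category, ceiling, total: totals[category] || 0 }))
    .filter(({ total, ceiling }) => total > ceiling);
}
```

Summing per category before comparing means two individually small chunks on the initial route still count against the same 150 KB ceiling.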
Uncontrolled vendor payloads are the most common silent budget breach, so every integration declares a maximum execution window and network budget under Third-Party Script Constraints, enforced through Content Security Policy, dynamic loading guards, and fallback timeouts. The loader below enforces a hard timeout and routes failure to graceful degradation rather than letting a stalled vendor block the main thread.
// vendor-loader.js — enforce a per-vendor execution window
// The timer runs from injection until the load event, so it bounds fetch plus
// initial evaluation. It cannot interrupt a script that is already running.
const VENDOR_TIMEOUT_MS = 4000;

function loadVendorScript(src) {
  return new Promise((resolve, reject) => {
    const script = document.createElement('script');
    script.src = src;
    script.async = true;

    // Give up on the vendor once the window elapses.
    const timeout = setTimeout(() => {
      script.remove();
      reject(new Error(`Vendor script exceeded ${VENDOR_TIMEOUT_MS}ms execution window`));
    }, VENDOR_TIMEOUT_MS);

    script.onload = () => { clearTimeout(timeout); resolve(); };
    script.onerror = () => { clearTimeout(timeout); reject(new Error('Vendor script failed to load')); };
    document.head.appendChild(script);
  });
}
Cap inlined critical CSS at 14 KB, apply font-display: swap (or optional for LCP text), and route any vendor failure to a degraded but functional fallback. These ceilings are the values your CI gate asserts.
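Routing a vendor failure to a fallback might look like the wrapper below. It is a sketch under stated assumptions: `loadWithFallback` is a hypothetical helper, and `loader` stands for any promise-returning function such as `loadVendorScript` from vendor-loader.js.

```javascript
// loadWithFallback is a hypothetical wrapper; `loader` is any function that
// returns a promise, such as loadVendorScript from vendor-loader.js.
async function loadWithFallback(loader, src, fallback) {
  try {
    await loader(src);
    return { src, degraded: false };
  } catch (error) {
    // A timeout or network failure lands here; the page keeps working.
    fallback(error);
    return { src, degraded: true, reason: error.message };
  }
}
```

A caller would pass the real loader and a function that renders the degraded experience, for example a static placeholder in place of an embedded widget. Returning the `degraded` flag lets the caller report how often the fallback path is taken.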
CI/CD Gating Integration
Budget validation runs in the PR pipeline as a deterministic synthetic check, gated in three stages: lint-time schema assertions, build-time bundle analysis, and post-deploy synthetic verification with rollback triggers. The Lighthouse CI configuration below asserts the contracted budgets at error level, which fails the run; metrics still under calibration can be asserted at warn level, which reports the breach without failing.
{
  "ci": {
    "collect": {
      "numberOfRuns": 3,
      "settings": {
        "throttlingMethod": "simulate",
        "throttling": {
          "cpuSlowdownMultiplier": 4,
          "rttMs": 150,
          "throughputKbps": 1638.4
        }
      }
    },
    "assert": {
      "assertions": {
        "categories:performance": ["error", { "minScore": 0.85 }],
        "resource-summary:document:size": ["error", { "maxNumericValue": 25000 }],
        "resource-summary:script:size": ["error", { "maxNumericValue": 150000 }],
        "resource-summary:third-party:size": ["error", { "maxNumericValue": 80000 }]
      }
    }
  }
}
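Lighthouse CI also accepts a JavaScript config file, which makes a mix of assertion levels easier to annotate. The fragment below is a sketch only: which metrics count as "still under calibration" is an assumption made for illustration, and the ceilings reuse the LCP, CLS, and script values from performance-budget.yaml.

```javascript
// lighthouserc.js: a sketch of mixed warn/error assertion levels.
// Treating LCP and CLS as "under calibration" is an illustrative assumption.
const config = {
  ci: {
    assert: {
      assertions: {
        // Under calibration: the breach is reported, but the run does not fail.
        'largest-contentful-paint': ['warn', { maxNumericValue: 2500 }],
        'cumulative-layout-shift': ['warn', { maxNumericValue: 0.1 }],
        // Contracted budget: a breach fails the run and blocks the merge.
        'resource-summary:script:size': ['error', { maxNumericValue: 150000 }],
      },
    },
  },
};

module.exports = config;
```

Promoting a metric from `warn` to `error` is then a one-word diff, which keeps the calibration decision visible in review.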
Wire this into a GitHub Actions job that builds, runs the assertions, and surfaces a required status check that branch protection can gate on. Route warnings to Slack for visibility, but block the merge only on error-level breaches.
name: Performance Budget Gate
on:
  pull_request:
    branches: [main]
jobs:
  budget-gate:
    runs-on: ubuntu-latest
    timeout-minutes: 15
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: "20"
          cache: "npm"
      - run: npm ci
      - run: npm run build
      - name: Run Lighthouse CI
        # Assumes @lhci/cli is a devDependency, so npx resolves the local lhci binary.
        run: npx lhci autorun
      - name: Upload reports
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: lighthouse-reports
          path: .lighthouseci/
          # .lighthouseci is a hidden directory; recent v4 releases skip hidden files by default.
          include-hidden-files: true
Require the budget-gate check in branch protection so a breach is unmergeable. The collection settings, storage backend, and assertion semantics are specified in full under Lighthouse CI Configuration & Storage. A --budget-override flag, restricted to performance leads and requiring a documented justification plus a remediation ticket, is the only sanctioned emergency bypass.
Observability & Regression Detection
The gate only catches what the lab can see; field telemetry closes the loop. Aggregated observability bridges CI gate results with production RUM, tracking compliance trends, surfacing regression hotspots, and correlating drops with deployment timestamps. Tag every synthetic run with its Git SHA so a field regression maps cleanly back to the offending commit, and route SLO breaches to engineering channels with automated escalation.
# alert-routing-rules.yaml
rules:
  - name: "budget_slo_breach"
    condition: "p75_lcp > budget_lcp * 1.15"
    duration: "2h"
    severity: "critical"
    channels:
      - slack: "#perf-alerts"
      - pagerduty: "frontend-oncall"
    actions:
      - auto_rollback: true
      - create_jira: "PERF-REGRESSION"
  - name: "budget_drift_warning"
    condition: "p75_cls > budget_cls * 1.20"
    duration: "24h"
    severity: "warning"
    channels:
      - slack: "#perf-monitoring"
    actions:
      - notify: "performance-leads"
Trigger recalibration when P75 field metrics drift more than 15% from the CI baseline across two consecutive deployment windows. The statistical machinery that distinguishes a real regression from runner noise is the subject of Automated Regression Detection.
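The recalibration trigger can be expressed as a small predicate. This sketch assumes drift in either direction counts, that only the two most recent deployment windows are inspected, and that one field P75 value is available per window; `needsRecalibration` is a hypothetical name.

```javascript
const DRIFT_THRESHOLD = 0.15;  // more than 15% away from the CI baseline
const CONSECUTIVE_WINDOWS = 2; // sustained across two deployment windows

// fieldP75ByWindow: one field P75 value per deployment window, oldest first.
function needsRecalibration(ciBaseline, fieldP75ByWindow) {
  if (fieldP75ByWindow.length < CONSECUTIVE_WINDOWS) return false;
  const recent = fieldP75ByWindow.slice(-CONSECUTIVE_WINDOWS);
  // Every recent window must drift past the threshold, in either direction.
  return recent.every(
    (fieldP75) => Math.abs(fieldP75 - ciBaseline) / ciBaseline > DRIFT_THRESHOLD
  );
}
```

Counting downward drift as well is a deliberate choice in this sketch: a field metric that is consistently far better than the lab baseline also means the budget no longer reflects reality.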
Failure Modes & Escalation Paths
A gate the team cannot live with gets disabled. Define the triage path, the rollback triggers, and the exception process before the first red build, so a breach is a routine workflow rather than a fire drill.
- Triage checklist on a red gate: confirm the breach reproduces with numberOfRuns: 5 (to rule out single-sample noise), diff the failing metric against the last green baseline, and identify whether the regression is first-party (a code change) or third-party (a vendor update). Noise is the leading cause of false reds; see Statistical Noise & Flakiness Reduction.
- Rollback triggers: auto-rollback fires when production P75 LCP exceeds the budget by more than 15% for two consecutive hours, or when a breach crosses a 5% conversion-impact threshold. Rollback is automatic; root-cause analysis follows and does not block the revert.
- Exception approval: a hotfix that must bypass the gate requires a time-boxed override ticket with named engineering-director approval and a follow-up issue to restore compliance. Overrides are logged and reviewed; an exception that is never closed becomes a permanent regression.
- Quarterly recalibration: treat budget files as living configuration reviewed alongside the architectural roadmap. Infrastructure upgrades, framework migrations, and shifting user demographics all move the realistic ceiling.
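The rollback triggers in the list above reduce to a two-condition check. The sketch below assumes hourly P75 LCP samples ordered oldest first and reads the conversion trigger as a measured conversion drop above 5%; both the input shape and the `shouldAutoRollback` name are hypothetical.

```javascript
// Decide whether to auto-rollback, following the two triggers described above.
// hourlyP75LcpMs: production P75 LCP per hour, oldest first (assumed shape).
// conversionDropPct: measured drop in conversion, in percent (assumed metric).
function shouldAutoRollback({ budgetLcpMs, hourlyP75LcpMs, conversionDropPct }) {
  const ceiling = budgetLcpMs * 1.15; // budget plus 15%
  const lastTwo = hourlyP75LcpMs.slice(-2);
  const sustainedBreach = lastTwo.length === 2 && lastTwo.every((v) => v > ceiling);
  const conversionBreach = conversionDropPct > 5;
  return sustainedBreach || conversionBreach;
}
```

Keeping the decision a pure function of its inputs makes the trigger easy to replay against historical data before it is trusted to revert production.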
By embedding precise constraints into CI, correlating synthetic validation with field telemetry, and defining the escalation path up front, teams ship at velocity without trading away user experience.
Frequently Asked Questions
What is the difference between a performance budget and a performance goal?
A goal is aspirational and advisory; a budget is a contracted ceiling enforced by CI that blocks a merge when breached. Goals live in slide decks and drift. Budgets live in version-controlled configuration, are asserted on every pull request, and exit non-zero on violation — see Lighthouse CI Configuration & Storage for the assertion mechanics.
Should I set budgets from lab data or field data?
Both. Anchor the realistic ceiling in field P75 from CrUX or your RUM provider, then set the lab assertion 10 to 15 percent tighter to absorb the lab-to-field gap. A common weighting is roughly 60 percent synthetic lab to 40 percent field. Specify the percentile (P75 or P90) and the environment (device class plus connection profile) every time you quote a number; the tuning method is in Percentile-Based Threshold Tuning.
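As a worked example of that answer, the sketch below takes raw field samples, computes P75, and sets the lab ceiling a fixed fraction tighter. It uses the nearest-rank percentile definition, which is one common choice; CrUX and RUM providers may compute percentiles differently. Both function names are hypothetical, and the rounding makes the second helper suitable for millisecond metrics only, not CLS.

```javascript
// Nearest-rank percentile over raw samples (one common definition).
function percentile(samples, p) {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Lab ceiling in ms, set `tightenBy` (0.10 to 0.15) below the field P75.
function labCeilingMsFromField(fieldSamplesMs, tightenBy = 0.15) {
  return Math.round(percentile(fieldSamplesMs, 75) * (1 - tightenBy));
}
```

With field LCP samples of 1800, 2000, 2200, and 2400 ms, the nearest-rank P75 is 2200 ms, so a 15% tighter lab assertion lands at 1870 ms.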
Why do I need separate mobile and desktop budgets?
A single global threshold optimized for desktop hardware hides mobile regressions, because mid-range mobile on Fast 3G has far less CPU and bandwidth headroom. Maintain divergent ceilings per device class as described in Mobile vs Desktop Budget Divergence, and gate each profile independently.