Image & Media Weight Budgets
Images are usually the single largest byte category a page ships and the most common cause of a blown Largest Contentful Paint, yet they are the easiest to leave unbudgeted because a CMS upload bypasses every code review. This is the media-weight layer of the Defining Web Performance Budgets reference: it turns image and video delivery into a budgeted, gated pipeline with per-breakpoint byte ceilings, modern formats, reserved layout boxes, and a CI assertion on resource-summary:image that fails the build when a regression slips through.
The discipline has four interacting parts: the format (AVIF or WebP over JPEG), the responsive set (a srcset/sizes pair that ships the right resolution per viewport), the loading strategy (eager for the hero, lazy for everything below the fold), and the layout reservation (an aspect-ratio box that prevents Cumulative Layout Shift). Get the format and responsive set right and you cut bytes; get loading and reservation right and you protect LCP and CLS. This page is the authoritative spec for all four.
Architecture Overview
Image bytes are not one budget — they are a budget per viewport, because a phone should never download the desktop hero. The diagram below shows how a single source image fans out through an encoding pipeline into a srcset ladder, and how each rung maps to a byte ceiling for the viewport that selects it.
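As a rough illustration of how a rung maps to a ceiling, the sketch below approximates srcset selection as "the smallest rung that covers the slot in device pixels". Real browsers are free to choose differently, and the LADDER constant and pickRung function are hypothetical names that reuse the widths and ceilings from this page.

```javascript
// Hypothetical ladder: widths and per-image ceilings mirror the values used on this page.
const LADDER = [
  { width: 480, maxKb: 40 },
  { width: 960, maxKb: 90 },
  { width: 1600, maxKb: 180 },
];

// Approximation of srcset selection (an assumption, not the exact browser algorithm):
// pick the smallest rung whose width covers the slot in device pixels,
// falling back to the largest rung when none is wide enough.
function pickRung(slotCssPx, devicePixelRatio = 1) {
  const needed = slotCssPx * devicePixelRatio;
  return LADDER.find((rung) => rung.width >= needed) ?? LADDER[LADDER.length - 1];
}

console.log(pickRung(390, 1)); // { width: 480, maxKb: 40 }
console.log(pickRung(390, 2)); // { width: 960, maxKb: 90 }
```

Under this approximation a 2x phone with a 390px slot lands on the 960w rung, so the ceiling that applies depends on device pixel ratio as well as viewport width.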
Prerequisites & Environment
Budgeting media weight requires a build-time encoding step you control and a CI runner that can measure delivered image bytes under emulation. The work assumes first-party budgets and a Lighthouse CI pipeline are already in place.
- sharp ≥ 0.33: the encoding pipeline that produces AVIF and WebP renditions deterministically; pin the version so output bytes are reproducible across machines.
- @lhci/cli ≥ 0.13: supplies the resource-summary:image assertion used to gate total image bytes.
- A responsive markup layer: either hand-authored <picture>/srcset or a framework image component (Eleventy Image, Next/Image) that emits the ladder from a single source.
- An emulation profile: measure delivered bytes against a mid-range mobile device on 4G at P75, the same profile used in Mobile vs Desktop Budget Divergence, because the mobile viewport is where image budgets bite hardest.
Map dynamic values through environment variables:
- STAGING_BASE_URL: the preview origin Lighthouse collects against.
- IMAGE_CDN_BASE: origin for the encoded renditions, kept on the CSP allowlist.
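One way to consume these two variables is to validate them once at startup so a missing or malformed value fails the build early. The readMediaEnv helper below is a hypothetical name and a minimal sketch; it assumes both values are absolute URLs and normalises each to its origin.

```javascript
// Hypothetical helper: reads the two variables named above and fails fast
// when one is missing or is not a valid absolute URL.
function readMediaEnv(env = process.env) {
  const names = ["STAGING_BASE_URL", "IMAGE_CDN_BASE"];
  const out = {};
  for (const name of names) {
    const value = env[name];
    if (!value) throw new Error(`${name} is not set`);
    out[name] = new URL(value).origin; // new URL() throws on an invalid URL
  }
  return out;
}

// Example with placeholder hosts (not real endpoints).
console.log(
  readMediaEnv({
    STAGING_BASE_URL: "https://preview.example.test/app",
    IMAGE_CDN_BASE: "https://img.example.test",
  })
);
```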
Configuration Reference
Two artifacts define the media-weight layer: a budget manifest with per-breakpoint ceilings, and a sharp pipeline that produces renditions matching those ceilings. Both are annotated inline.
# image-budget.yml: per-breakpoint ceilings at P75 mobile 4G
formats: [avif, webp, jpeg] # preference order; jpeg is the fallback
breakpoints:
  - name: mobile
    width: 480
    max_kb: 40 # hero rendition ceiling at this width
  - name: tablet
    width: 960
    max_kb: 90
  - name: desktop
    width: 1600
    max_kb: 180
loading:
  lcp_image: eager # the above-the-fold hero loads eagerly
  below_fold: lazy # everything else defers
cls:
  require_dimensions: true # width/height or aspect-ratio mandatory
// encode.js: sharp pipeline emitting budgeted AVIF/WebP/JPEG renditions
const sharp = require("sharp");

const WIDTHS = [480, 960, 1600];

async function encode(input, outBase) {
  for (const w of WIDTHS) {
    // Resize once per width, then clone the pipeline for each output format.
    const base = sharp(input).resize({ width: w, withoutEnlargement: true });
    await base.clone().avif({ quality: 50, effort: 4 }).toFile(`${outBase}-${w}.avif`);
    await base.clone().webp({ quality: 72 }).toFile(`${outBase}-${w}.webp`);
    await base.clone().jpeg({ quality: 78, mozjpeg: true }).toFile(`${outBase}-${w}.jpg`);
  }
}

module.exports = { encode, WIDTHS };
AVIF at quality: 50 typically lands 30–50% under the equivalent WebP for photographic content; the withoutEnlargement guard prevents upscaling a small source past its native resolution, which wastes bytes for no visual gain.
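To tie the two artifacts together, a build step can compare each rendition's size against the manifest ceiling for its width. The sketch below is one possible check, not part of the pipeline above: findOverBudget and CEILINGS_KB are hypothetical names, the ceilings are copied from image-budget.yml, and the sample byte counts are invented for illustration.

```javascript
// Ceilings copied from image-budget.yml above (KB per hero rendition, keyed by width).
const CEILINGS_KB = { 480: 40, 960: 90, 1600: 180 };

// renditions: [{ file, width, bytes }] -> the entries that exceed their ceiling.
// Widths with no ceiling in the manifest are ignored rather than failed.
function findOverBudget(renditions, ceilingsKb = CEILINGS_KB) {
  return renditions
    .filter((r) => ceilingsKb[r.width] !== undefined)
    .map((r) => ({ ...r, limitBytes: ceilingsKb[r.width] * 1024 }))
    .filter((r) => r.bytes > r.limitBytes);
}

// Illustrative sizes only; in a real build these would come from fs.statSync(file).size.
const sample = [
  { file: "hero-480.avif", width: 480, bytes: 31000 },
  { file: "hero-960.avif", width: 960, bytes: 97500 },
];
console.log(findOverBudget(sample).map((r) => r.file)); // [ 'hero-960.avif' ]
```

A build script could exit non-zero when the returned list is non-empty, which catches an oversized rendition before Lighthouse runs.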
Step-by-Step Implementation
- Encode the source set so every image has renditions at all three widths in all three formats.

    node -e "require('./encode.js').encode('src/hero.jpg','dist/hero')"
    ls dist/hero-*.{avif,webp,jpg}

  Expected output: nine files, hero-480/960/1600 in .avif, .webp, and .jpg.

- Author the responsive markup with a <picture> element so the browser takes the first format it supports, in source order, at the right width, and reserve the box with width/height.

    <picture>
      <source type="image/avif"
              srcset="/hero-480.avif 480w, /hero-960.avif 960w, /hero-1600.avif 1600w"
              sizes="(max-width: 600px) 100vw, 1600px">
      <source type="image/webp"
              srcset="/hero-480.webp 480w, /hero-960.webp 960w, /hero-1600.webp 1600w"
              sizes="(max-width: 600px) 100vw, 1600px">
      <img src="/hero-1600.jpg" width="1600" height="900" alt="Product hero" fetchpriority="high">
    </picture>

- Mark below-fold images lazy and verify the delivered bytes per viewport before wiring CI.

    npx lighthouse "$STAGING_BASE_URL" --only-audits=resource-summary --output=json --quiet \
      | node -e "let d='';process.stdin.on('data',c=>d+=c).on('end',()=>{const i=JSON.parse(d).audits['resource-summary'].details.items.find(x=>x.resourceType==='image');console.log('image bytes:',Math.round(i.transferSize/1024),'KB')})"

  Expected output: image bytes: <n> KB, confirming the total fits the page's image ceiling.
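Hand-writing the srcset strings for every image invites drift between the markup and the encoded widths, so the markup in the second step can be generated from the same width list. The following is a minimal sketch: srcset and pictureMarkup are hypothetical helpers, and they assume trusted, already-escaped attribute values.

```javascript
// Same widths the encode step emits.
const WIDTHS = [480, 960, 1600];

// Builds "base-480.ext 480w, base-960.ext 960w, ..." for one format.
function srcset(base, ext, widths = WIDTHS) {
  return widths.map((w) => `${base}-${w}.${ext} ${w}w`).join(", ");
}

// Emits AVIF then WebP sources and a JPEG <img> fallback for the hero image.
// Assumption: alt, sizes, and base are trusted and already HTML-escaped.
function pictureMarkup({ base, alt, width, height, sizes }) {
  const sources = ["avif", "webp"].map(
    (ext) => `  <source type="image/${ext}" srcset="${srcset(base, ext)}" sizes="${sizes}">`
  );
  const largest = WIDTHS[WIDTHS.length - 1];
  const img =
    `  <img src="${base}-${largest}.jpg" width="${width}" height="${height}" ` +
    `alt="${alt}" fetchpriority="high">`;
  return ["<picture>", ...sources, img, "</picture>"].join("\n");
}

console.log(
  pictureMarkup({
    base: "/hero",
    alt: "Product hero",
    width: 1600,
    height: 900,
    sizes: "(max-width: 600px) 100vw, 1600px",
  })
);
```

A below-fold variant would swap fetchpriority="high" for loading="lazy"; that variant is left out here to keep the sketch short.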
Threshold Calibration
Do not adopt these numbers blind — derive each ceiling from the rendition sharp actually produces for your imagery, set the lab assertion 10–15% tighter to absorb compression variance, and confirm the percentile methodology against Percentile-Based Threshold Tuning. Values are AVIF renditions at P75 on a mid-range mobile device over 4G.
| Breakpoint | Render width | Per-image ceiling (AVIF) | Page image total | LCP image priority |
|---|---|---|---|---|
| Mobile | 480w | 40 KB | 150 KB | Eager, fetchpriority="high" |
| Tablet | 960w | 90 KB | 300 KB | Eager |
| Desktop | 1600w | 180 KB | 500 KB | Eager |
| Below-fold (any) | matched | same as breakpoint | counts to total | Lazy |
The LCP image must never be lazy-loaded — defer only what is below the fold, and map the hero's byte ceiling directly to your LCP budget in Core Web Vitals Budget Allocation. Every image needs explicit dimensions or an aspect-ratio box; a missing reservation is the most common source of media-driven CLS. The per-breakpoint ceilings get their own deeper treatment in Setting Responsive Image Byte Budgets.
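The 10 to 15 percent tightening is plain arithmetic, but writing it down once keeps the lab assertion and the published ceiling from drifting apart. The sketch below assumes the margin is applied to the ceiling in bytes and rounded down; labAssertionBytes is a hypothetical helper name.

```javascript
// Tightens a ceiling (in KB) by a whole-number percentage and returns bytes.
// Integer arithmetic is used so the result does not depend on float rounding.
function labAssertionBytes(ceilingKb, marginPercent = 10) {
  if (!Number.isInteger(marginPercent) || marginPercent < 0 || marginPercent >= 100) {
    throw new RangeError("marginPercent must be an integer in [0, 100)");
  }
  return Math.floor((ceilingKb * 1024 * (100 - marginPercent)) / 100);
}

console.log(labAssertionBytes(150));     // 138240
console.log(labAssertionBytes(150, 15)); // 130560
console.log(labAssertionBytes(150, 0));  // 153600
```

For the 150 KB mobile page total, a 10 percent margin gives 138240 bytes and a 15 percent margin gives 130560 bytes; a zero margin reproduces the untightened 153600.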
CI Enforcement Snippet
This GitHub Actions job builds, collects Lighthouse runs, and asserts the image resource summary, surfacing a required status check that branch protection can gate on.
name: Image Weight Gate
on:
  pull_request:
    branches: [main]
jobs:
  image-budget:
    runs-on: ubuntu-latest
    timeout-minutes: 15
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: "20"
          cache: "npm"
      - run: npm ci
      - run: npm run build # runs encode.js as part of the build
      - name: Assert image budgets
        run: npx lhci autorun
        env:
          LHCI_GITHUB_APP_TOKEN: ${{ secrets.LHCI_GITHUB_APP_TOKEN }}
The matching lighthouserc.json caps image bytes, fails the build on the format, responsive-image, and CLS assertions, and warns on the LCP metric that media weight drives:
{
  "ci": {
    "collect": { "numberOfRuns": 3, "settings": { "preset": "perf" } },
    "assert": {
      "assertions": {
        "resource-summary:image:size": ["error", { "maxNumericValue": 153600 }],
        "modern-image-formats": ["error", { "maxLength": 0 }],
        "uses-responsive-images": ["error", { "maxLength": 0 }],
        "largest-contentful-paint": ["warn", { "maxNumericValue": 2500 }],
        "cumulative-layout-shift": ["error", { "maxNumericValue": 0.1 }]
      }
    }
  }
}
The 153600-byte ceiling targets the mobile viewport; scale it per matrix entry when you fan collection across viewports, and keep this gate distinct from the Web Font Performance Budgets check so a font and an image regression fail with different messages.
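Scaling the ceiling per matrix entry can be done by generating the assertions instead of hard-coding them. The sketch below is one way to do that under stated assumptions: the page totals come from the Threshold Calibration table, assertionsFor is a hypothetical helper, and how the viewport name reaches the config (for example through a matrix-supplied environment variable read in a JavaScript config file) is left to the pipeline.

```javascript
// Page image totals in KB, taken from the Threshold Calibration table.
const PAGE_TOTAL_KB = { mobile: 150, tablet: 300, desktop: 500 };

// Builds the image-related assertions for one matrix entry.
function assertionsFor(viewport) {
  const totalKb = PAGE_TOTAL_KB[viewport];
  if (totalKb === undefined) throw new Error(`unknown viewport: ${viewport}`);
  return {
    "resource-summary:image:size": ["error", { maxNumericValue: totalKb * 1024 }],
    "cumulative-layout-shift": ["error", { maxNumericValue: 0.1 }],
  };
}

console.log(JSON.stringify(assertionsFor("mobile")["resource-summary:image:size"]));
// ["error",{"maxNumericValue":153600}]
console.log(JSON.stringify(assertionsFor("desktop")["resource-summary:image:size"]));
// ["error",{"maxNumericValue":512000}]
```

The mobile entry reproduces the 153600-byte value in the JSON config, which makes it easy to confirm the generated and hand-written configs agree.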
Troubleshooting & Edge Cases
- AVIF larger than WebP for flat graphics → AVIF wins on photographs but can lose on simple logos and screenshots. The browser takes the first <source> it supports, not the smallest, so decide per asset at build time: list the smaller format first, or ship only PNG/WebP for non-photographic content.
- CMS upload bypasses the budget → run encode.js on upload via a hook or build step so editor-supplied images enter the same pipeline; an unbudgeted CMS path is the classic regression.
- LCP image lazy-loaded by a framework default → many image components lazy-load everything; explicitly set the hero to eager with fetchpriority="high" or LCP will regress despite the bytes fitting.
- Layout shift despite dimensions → width/height attributes give the browser an aspect ratio, but a stylesheet that overrides the width without height: auto, or an art-directed source with a different ratio, can still shift the layout; reserve the box with an aspect-ratio CSS rule as well as the width/height attributes.
- sizes mismatch ships the wrong rendition → an inaccurate sizes attribute makes the browser download a 1600w image into a 480px slot; align sizes to the real rendered width per breakpoint.
- Animated GIFs blow the budget → convert GIFs to muted autoplay <video> (WebM/MP4); a looping GIF is often 10× the bytes of the equivalent video.
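For the first case in the list above, the source order can be decided from measured sizes rather than assumed. The sketch below sorts one asset's candidate formats by byte size so the smallest is emitted first; orderSources is a hypothetical helper and the byte counts are invented for illustration.

```javascript
// candidates: [{ type, bytes }] for one asset at one width.
// Returns MIME types ordered smallest first, the order in which the
// <source> elements would be emitted. The input array is not mutated.
function orderSources(candidates) {
  return [...candidates].sort((a, b) => a.bytes - b.bytes).map((c) => c.type);
}

// Illustrative numbers: a flat logo where WebP happens to beat AVIF.
const logo = [
  { type: "image/avif", bytes: 18400 },
  { type: "image/webp", bytes: 12100 },
];
console.log(orderSources(logo)); // [ 'image/webp', 'image/avif' ]
```

The JPEG or PNG fallback stays on the <img> element either way, so only the <source> candidates need to be ordered.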
Frequently Asked Questions
Should I budget image bytes per breakpoint or per page?
Both, but the per-breakpoint ceiling is the load-bearing one. A single page total hides that a phone is downloading a desktop rendition, so set a ceiling for each viewport's rendition and a page total that the breakpoints roll up into. The CI gate asserts the page total under resource-summary:image:size at the mobile viewport, which is where the budget bites hardest. The per-breakpoint method is detailed in Setting Responsive Image Byte Budgets.
How do images cause layout shift, and how do I budget against it?
An image with no reserved box collapses to zero height until it loads, then pushes content down — that displacement is Cumulative Layout Shift. Reserve the box with width/height attributes plus an aspect-ratio CSS rule, and assert cumulative-layout-shift at 0.1 in CI. Map the budget to Core Web Vitals Budget Allocation.
Is AVIF always the right format to budget for?
For photographic content, usually yes: AVIF typically lands 30 to 50 percent under WebP at equivalent quality. For flat graphics, logos, and screenshots it can be larger, and the browser takes the first <source> it supports rather than the smallest, so keep AVIF first in the <picture> only where it is actually the smallest rendition; otherwise list WebP first or drop the AVIF source for that asset. Budget against the AVIF rendition since that is what most modern browsers download.