WebPageTest Private Instance Setup

Deploying a private WebPageTest instance eliminates public endpoint variability, enforces deterministic network conditions, and provides the infrastructure required for strict performance budget gating. By isolating test runners within your VPC or dedicated bare-metal environment, engineering teams gain predictable latency baselines, controlled browser versions, and direct access to raw HAR/WPT data. This architecture is foundational to the broader Lighthouse CI & WebPageTest Integration ecosystem, enabling automated pass/fail gates that block regressions before they reach production.

Architecture Decision Matrix

| Criteria | Public Endpoint | Private Instance |
| --- | --- | --- |
| Network Isolation | Shared, unpredictable routing | VPC-scoped, zero egress noise |
| Browser/OS Control | Fixed provider pool | Custom Docker/VM provisioning |
| CI Gating Latency | High variance, queue contention | Deterministic, SLA-bound execution |
| Data Sovereignty | Third-party retention | Full ownership, custom lifecycle |

Network Isolation Topology

[CI Runner] ----> [WPT Server (Port 80/4000)] ----> [Agent Pool (Chrome/Firefox/Safari)]
      ^                     |                                     |
      |                     v                                     v
[GitHub/GitLab]       [Redis Queue]                  [Local Network / CDN Edge]
                            |                                     |
                            v                                     v
                    [PR Comment Bot]                 [S3/GCS Result Bucket]

CI Gating Threshold Definition

| Metric | Soft Warning | Hard Fail | Enforcement Scope |
| --- | --- | --- | --- |
| LCP (Desktop) | > 2.5s | > 3.5s | All PRs to main |
| CLS (Mobile) | > 0.1 | > 0.25 | Release branches |
| TTFB (Prod) | > 800ms | > 1200ms | Canary deployments |
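The threshold table can be collapsed into a single gate function for CI scripts. This is a minimal sketch: the numeric values come from the table above, while the metric keys and the pass/warn/fail labels are naming assumptions for your own parser.

```javascript
// Soft/hard thresholds from the gating table.
// LCP and TTFB are in milliseconds; CLS is unitless.
const THRESHOLDS = {
  "lcp-desktop": { warn: 2500, fail: 3500 },
  "cls-mobile": { warn: 0.1, fail: 0.25 },
  "ttfb-prod": { warn: 800, fail: 1200 },
};

// Returns "pass", "warn", or "fail" for a measured value.
function gate(metric, value) {
  const t = THRESHOLDS[metric];
  if (!t) throw new Error(`Unknown metric: ${metric}`);
  if (value > t.fail) return "fail";
  if (value > t.warn) return "warn";
  return "pass";
}
```

A "warn" result maps to a PR annotation, a "fail" result to a blocked merge, per the enforcement scopes above.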

Infrastructure Provisioning & Docker Orchestration

Provisioning requires a centralized server node and horizontally scalable agent workers. Use Docker Compose for local validation before migrating to Kubernetes or ECS.

Docker Compose Topology (Server + 3 Agents)

version: "3.8"
services:
  wpt-server:
    image: webpagetest/server:latest
    ports:
      - "80:80"
      - "4000:4000"
    environment:
      - SERVER_LOCATION=us-east-1
      - REDIS_HOST=redis
      - AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
      - AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
    depends_on:
      - redis

  wpt-agent-1:
    image: webpagetest/agent:latest
    environment:
      - SERVER_URL=http://wpt-server:4000/work/
      - LOCATION_ID=us-east-1:Chrome
      - AGENT_KEY=${AGENT_KEY_1}
    depends_on:
      - wpt-server

  wpt-agent-2:
    image: webpagetest/agent:latest
    environment:
      - SERVER_URL=http://wpt-server:4000/work/
      - LOCATION_ID=us-east-1:Firefox
      - AGENT_KEY=${AGENT_KEY_2}
    depends_on:
      - wpt-server

  wpt-agent-3:
    image: webpagetest/agent:latest
    environment:
      - SERVER_URL=http://wpt-server:4000/work/
      - LOCATION_ID=us-east-1:Safari
      - AGENT_KEY=${AGENT_KEY_3}
    depends_on:
      - wpt-server

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"

Environment Variable Reference

| Variable | Scope | Description |
| --- | --- | --- |
| SERVER_URL | Agent | Base URL for work polling (http://<server>:4000/work/) |
| AGENT_KEY | Agent | Unique auth token for worker registration |
| LOCATION_ID | Agent | Browser/OS identifier mapped in the server's locations.ini |
| SERVER_LOCATION | Server | Primary region tag for result routing |

Systemd Auto-Restart Unit

[Unit]
Description=WebPageTest Private Instance
After=docker.service
Requires=docker.service

[Service]
Restart=always
ExecStart=/usr/bin/docker compose -f /opt/wpt/docker-compose.yml up
ExecStop=/usr/bin/docker compose -f /opt/wpt/docker-compose.yml down

[Install]
WantedBy=multi-user.target

Healthcheck Verification

#!/usr/bin/env bash
set -euo pipefail
SERVER_URL="${WPT_SERVER_URL:-http://localhost:4000}"
# "|| true" keeps set -e from aborting before we can report the failure
RESPONSE=$(curl -s -o /dev/null -w "%{http_code}" "${SERVER_URL}/getLocations.php" || true)
if [[ "$RESPONSE" -eq 200 ]]; then
  echo "WPT server healthy; location list reachable."
  exit 0
else
  echo "WPT server unreachable (HTTP ${RESPONSE})"
  exit 1
fi

API Configuration & Endpoint Routing

The server node routes test requests via locations.ini and enforces authentication through scoped API keys. Restrict key permissions to test and result scopes only. Validate payloads before submission so that malformed test definitions never consume queue capacity.
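The pre-submission validation can be as small as a single function in the CI trigger script. This sketch checks the three fields the queue actually needs; the exact bounds (e.g. a maximum of 9 runs) are illustrative assumptions, not limits WPT itself imposes.

```javascript
// Validate a test definition before it is submitted to the server,
// so malformed requests never consume queue capacity.
function validateTestPayload(payload) {
  const errors = [];
  try {
    const u = new URL(payload.url);
    if (!["http:", "https:"].includes(u.protocol)) errors.push("url must be http(s)");
  } catch {
    errors.push("url is not a valid URL");
  }
  if (typeof payload.location !== "string" || !payload.location.includes(":")) {
    errors.push("location must look like '<region>:<browser>'");
  }
  if (!Number.isInteger(payload.runs) || payload.runs < 1 || payload.runs > 9) {
    errors.push("runs must be an integer between 1 and 9 (illustrative bound)");
  }
  return { ok: errors.length === 0, errors };
}
```

Rejecting bad payloads in CI, before the HTTP call, keeps queue-depth alerts meaningful.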

locations.ini Configuration

[locations]
1=us-east-1
default=us-east-1

[us-east-1]
1=us-east-1-Chrome
label=US East Primary

[us-east-1-Chrome]
label=Chrome Stable
browser=Chrome
connectivity=LAN

Automated Test Trigger (cURL)

# runtest.php expects URL-encoded form fields (not JSON); the API key is passed as "k"
curl -s -X POST "http://wpt-server:4000/runtest.php" \
  --data-urlencode "url=https://staging.example.com" \
  --data-urlencode "location=us-east-1:Chrome" \
  --data-urlencode "f=json" \
  --data-urlencode "runs=3" \
  --data-urlencode "video=1" \
  --data-urlencode "lighthouse=1" \
  --data-urlencode "k=${WPT_API_KEY}"

Response Parsing (Node.js)

async function parseWPTResponse(testUrl) {
  const res = await fetch(testUrl);
  const data = await res.json();
  if (data.statusCode === 200) {
    return {
      testId: data.data.id,
      summaryUrl: data.data.summaryCSV,
      statusUrl: data.data.jsonUrl
    };
  }
  throw new Error(`WPT API Error: ${data.statusText}`);
}

Rate Limiting & Queue Thresholds

  • Max Concurrent Tests: 15 per location
  • Queue Depth Alert: > 50 pending triggers
  • API Rate Limit: 120 requests/minute per key
  • Payload Validation: Enforce url, location, and runs schema before dispatch. For detailed endpoint validation and payload formatting for headless execution, reference Configuring WebPageTest API for Automated Testing to ensure consistent test run parameters.
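The 120 requests/minute per-key limit can be enforced in front of the API with a fixed-window counter. This is an in-process sketch under that assumption; a production deployment would more likely hold the counters in Redis, which the stack already runs.

```javascript
// Fixed-window rate limiter: at most `limit` calls per key per window.
function createRateLimiter(limit = 120, windowMs = 60_000) {
  const windows = new Map(); // key -> { start, count }
  return function allow(key, now = Date.now()) {
    const w = windows.get(key);
    if (!w || now - w.start >= windowMs) {
      windows.set(key, { start: now, count: 1 });
      return true;
    }
    w.count += 1;
    return w.count <= limit;
  };
}
```

Calls beyond the limit within the window return false, which the API layer should translate into an HTTP 429.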

CI Pipeline Integration & Performance Budget Gating

Embedding the private instance into CI requires a trigger-poll-parse-gate sequence. Hard thresholds block merges, while soft warnings generate PR annotations without halting deployment.

GitHub Actions Workflow

name: WPT Performance Gate
on:
  pull_request:
    branches: [main]

env:
  WPT_SERVER: ${{ secrets.WPT_SERVER }}

jobs:
  run-wpt:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Trigger WPT Test
        id: trigger
        run: |
          # Test the PR's deployed staging URL (STAGING_URL is a repository
          # variable you configure), not the repository's API URL.
          RESPONSE=$(curl -s -X POST "${WPT_SERVER}/runtest.php" \
            --data-urlencode "url=${{ vars.STAGING_URL }}" \
            --data-urlencode "location=us-east-1:Chrome" \
            --data-urlencode "f=json" \
            --data-urlencode "k=${{ secrets.WPT_API_KEY }}")
          echo "test_id=$(echo "$RESPONSE" | jq -r '.data.id')" >> "$GITHUB_OUTPUT"

      - name: Poll Until Complete
        run: |
          # Bounded loop: give up after ~15 minutes instead of hanging the job
          for i in $(seq 1 60); do
            STATUS=$(curl -s "${WPT_SERVER}/jsonResult.php?test=${{ steps.trigger.outputs.test_id }}" | jq -r '.statusCode')
            [[ "$STATUS" == "200" ]] && exit 0
            sleep 15
          done
          echo "::error::Test did not complete within the polling window"
          exit 1

      - name: Evaluate Budget
        run: |
          METRICS=$(curl -s "${WPT_SERVER}/jsonResult.php?test=${{ steps.trigger.outputs.test_id }}")
          LCP=$(echo "$METRICS" | jq '.data.average.firstView.metrics.LargestContentfulPaint')
          if (( $(echo "$LCP > 3500" | bc -l) )); then
            echo "::error::LCP exceeded hard budget (3.5s)"
            exit 1
          fi

Performance Budget Schema

{
  "budgets": [
    {
      "path": "/*",
      "resourceSizes": [{ "resourceType": "script", "budget": 250000 }],
      "timings": [
        { "metric": "lcp", "budget": 3500, "severity": "error" },
        { "metric": "cls", "budget": 0.25, "severity": "warning" }
      ]
    }
  ]
}

Fail-Fast vs Soft-Warning Logic

  • Fail-Fast: Exit 1 immediately when error severity threshold is breached. Blocks PR merge.
  • Soft-Warning: Log ::warning:: annotation, attach Lighthouse report to PR, allow merge.
  • Matrix Optimization: When structuring parallel execution and cross-branch matrix testing, integrate GitHub Actions Performance Matrices to optimize runner allocation and prevent CI bottlenecks.
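The fail-fast and soft-warning branches can share one evaluator that walks the budget schema's `timings` array and emits GitHub Actions workflow commands. This is a sketch: the budget shape matches the schema shown earlier, while the measured-metrics object is an assumed output of your result parser.

```javascript
// Map budget violations to GitHub Actions annotations and an exit code.
// severity "error"   -> ::error:: annotation + non-zero exit (blocks merge)
// severity "warning" -> ::warning:: annotation only (merge allowed)
function evaluateBudget(timings, measured) {
  let exitCode = 0;
  const annotations = [];
  for (const t of timings) {
    const value = measured[t.metric];
    if (value === undefined || value <= t.budget) continue;
    annotations.push(`::${t.severity}::${t.metric} ${value} exceeded budget ${t.budget}`);
    if (t.severity === "error") exitCode = 1;
  }
  return { exitCode, annotations };
}
```

The CI step prints each annotation and exits with `exitCode`, so a single budget file drives both behaviors.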

Result Storage & Historical Tracking

Raw WPT outputs must be persisted, indexed, and correlated with Lighthouse CI reports for longitudinal analysis. Use object storage with lifecycle policies to manage retention costs.

Cloud Storage Bucket Policy (AWS S3)

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::123456789012:role/ci-pipeline-role" },
      "Action": ["s3:PutObject", "s3:GetObject"],
      "Resource": "arn:aws:s3:::wpt-results/*"
    },
    {
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": "arn:aws:s3:::wpt-results/*",
      "Condition": { "Bool": { "aws:SecureTransport": "false" } }
    }
  ]
}

Metadata Indexing Schema

| Field | Type | Example |
| --- | --- | --- |
| commit_sha | string | a1b2c3d4... |
| branch | string | feature/perf-optimization |
| environment | string | staging |
| timestamp | ISO8601 | 2024-05-20T14:30:00Z |
| test_id | string | 240520_143000_abc123 |
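A convenient way to make those fields queryable is to encode the high-cardinality ones into the object key and attach the rest as object metadata. The key layout below is an assumed convention (WPT does not prescribe one); a date prefix keeps lifecycle rules and range scans cheap.

```javascript
// Build a deterministic object key and metadata record for one test run,
// using the indexing fields from the schema above.
function buildResultRecord({ commitSha, branch, environment, testId, timestamp }) {
  const date = timestamp.slice(0, 10); // YYYY-MM-DD prefix for lifecycle rules
  return {
    key: `${environment}/${date}/${commitSha}/${testId}.har`,
    metadata: {
      commit_sha: commitSha,
      branch,
      environment,
      timestamp,
      test_id: testId,
    },
  };
}
```

The returned `key` and `metadata` would be passed to your storage client's put-object call; the metadata doubles as the row inserted into the indexing table.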

Data Retention & Pruning Cron

#!/usr/bin/env bash
# Delete results older than 90 days (requires GNU date). For a fixed retention
# window an S3 lifecycle rule is simpler; this script suits ad-hoc pruning.
set -euo pipefail
CUTOFF=$(date -d "90 days ago" +%s)
# `aws s3 ls` emits: <date> <time> <size> <key>; parsing its timestamp avoids
# one head-object call per object, and `read -r ... key` keeps keys with spaces intact.
aws s3 ls s3://wpt-results/ --recursive | while read -r mod_date mod_time _size key; do
  if [[ $(date -d "${mod_date} ${mod_time}" +%s) -lt $CUTOFF ]]; then
    aws s3 rm "s3://wpt-results/${key}"
  fi
done

Cross-Reference Query (WPT ↔ LHCI)

SELECT 
 w.commit_sha,
 w.test_id AS wpt_test,
 lci.report_url AS lhci_report,
 w.lcp_value,
 lci.performance_score
FROM wpt_results w
JOIN lhci_reports lci ON w.commit_sha = lci.commit_sha
WHERE w.environment = 'staging'
ORDER BY w.timestamp DESC
LIMIT 50;

When detailing cross-tool result aggregation, retention windows, and SQL/NoSQL indexing strategies, align your schema with Lighthouse CI Configuration & Storage to maintain unified dashboarding.


Resilience, Failure Handling & Agent Recovery

Private instances require automated watchdogs to detect agent drift, clear stuck queues, and enforce fallback routing during network degradation.

Agent Watchdog Script (Bash)

#!/usr/bin/env bash
AGENT_URL="${WPT_SERVER}/getLocations.php"
MAX_RETRIES=3
RETRY_DELAY=10

for i in $(seq 1 "$MAX_RETRIES"); do
  STATUS=$(curl -s -o /dev/null -w "%{http_code}" "$AGENT_URL")
  if [[ "$STATUS" -eq 200 ]]; then
    echo "Agent pool healthy."
    exit 0
  fi
  echo "Attempt $i failed. Retrying in ${RETRY_DELAY}s..."
  sleep "$RETRY_DELAY"
done

echo "Agent pool unresponsive. Triggering restart..."
systemctl restart wpt-private-instance

Exponential Backoff Configuration

retry_policy:
 max_attempts: 4
 initial_interval: 5s
 max_interval: 60s
 multiplier: 2.0
 jitter: true
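The retry policy above maps onto a pure delay function; each YAML key corresponds one-to-one to a parameter here, and jitter is implemented as a uniform random fraction of the computed delay ("full jitter").

```javascript
// Delay (ms) before attempt n (1-based) under the retry policy:
// initial_interval * multiplier^(n-1), capped at max_interval,
// optionally scaled by full jitter.
function backoffDelayMs(
  attempt,
  { initialMs = 5000, maxMs = 60000, multiplier = 2.0, jitter = true } = {},
  rand = Math.random
) {
  const base = Math.min(initialMs * Math.pow(multiplier, attempt - 1), maxMs);
  return jitter ? rand() * base : base;
}
```

With `max_attempts: 4` the deterministic schedule is 5s, 10s, 20s, 40s; jitter spreads retries so a fleet of agents does not hammer the server in lockstep after an outage.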

Log Parsing Regex (Common Failures)

# Browser crash
\b(?:ERR_CONNECTION_REFUSED|ERR_NETWORK_CHANGED|WebDriverError)\b
# Queue timeout
\b(?:Test execution timed out|No agent available for location)\b
# HAR corruption
\b(?:Invalid HAR format|Missing entries array)\b
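Those three signatures can drive a small log classifier that the watchdog runs over agent output to decide which recovery path applies. The failure labels are naming assumptions; the regexes are the ones listed above.

```javascript
// Classify an agent log line against the failure-signature regexes above.
const FAILURE_PATTERNS = [
  ["browser_crash", /\b(?:ERR_CONNECTION_REFUSED|ERR_NETWORK_CHANGED|WebDriverError)\b/],
  ["queue_timeout", /\b(?:Test execution timed out|No agent available for location)\b/],
  ["har_corruption", /\b(?:Invalid HAR format|Missing entries array)\b/],
];

// Returns the first matching failure label, or null for a healthy line.
function classifyLogLine(line) {
  for (const [label, re] of FAILURE_PATTERNS) {
    if (re.test(line)) return label;
  }
  return null;
}
```

A `browser_crash` label maps to the agent auto-restart row of the fallback matrix, `queue_timeout` to rerouting, and `har_corruption` to discarding the run.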

Graceful Degradation Fallback Matrix

| Failure Mode | Fallback Action | CI Impact |
| --- | --- | --- |
| Single Agent Crash | Auto-restart via watchdog | None (queue rebalances) |
| Location Unavailable | Route to secondary region | +15s latency, soft warning |
| Redis Queue Full | Pause CI trigger, alert Slack | Hard fail after 5m |
| Network Partition | Cache last known baseline, skip test | Soft warning, manual review |
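The routing rows of the matrix reduce to a location picker that walks an ordered preference list. This is a sketch under stated assumptions: the candidate list and the `isHealthy` probe (e.g. a check against getLocations.php) are supplied by the caller.

```javascript
// Pick the first healthy location from an ordered preference list.
// Returns { location, degraded } so CI can emit a soft warning when the
// primary region was skipped, or null when every candidate is down
// (network partition: fall back to cached baseline and skip the test).
function pickLocation(candidates, isHealthy) {
  for (let i = 0; i < candidates.length; i++) {
    if (isHealthy(candidates[i])) {
      return { location: candidates[i], degraded: i > 0 };
    }
  }
  return null;
}
```

A `degraded: true` result corresponds to the "+15s latency, soft warning" row; `null` corresponds to the network-partition row.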

When implementing fallback routing and CI retry logic for flaky network conditions, apply Handling WebPageTest Location Failures to prevent pipeline deadlocks and ensure deterministic pass/fail states.


Validation Checklist & Go-Live Criteria

Execute the following verification sequence before routing production traffic to the private instance. Require explicit engineering sign-off.

Pre-Flight Validation Table

| Check | Command/Action | Expected Result |
| --- | --- | --- |
| Server Reachability | curl -I http://<server>:4000 | HTTP/1.1 200 OK |
| Agent Registration | /getLocations.php | All 3 locations ready: true |
| API Auth | curl -X POST /runtest.php -d 'k=...' | 200 with testId |
| Budget Evaluation | Run synthetic test against main | Passes LCP/CLS thresholds |
| Storage Write | Upload dummy HAR to S3/GCS | Object persists, metadata indexed |

Go/No-Go Decision Matrix

| Metric | Go Criteria | No-Go Criteria |
| --- | --- | --- |
| Queue Latency | < 2s avg | > 10s avg |
| Agent Uptime | 99.9% (7d) | < 95% (7d) |
| Budget Accuracy | ±3% variance | > 10% variance |
| CI Integration | Zero false positives | > 2 flaky gates/week |

Rollback Procedure

  1. Disable private instance routing in CI config (revert the WPT_SERVER env var to the public endpoint).
  2. Drain active queue: curl -X POST /clearQueue.php -d 'location=all'
  3. Archive current Docker volumes: docker compose down -v && tar -czf wpt-backup.tar.gz /opt/wpt/data
  4. Re-deploy previous stable compose manifest.

Post-Deployment Smoke Test

#!/usr/bin/env bash
set -euo pipefail
TEST_URL="https://example.com"
# runtest.php expects URL-encoded form fields, not a JSON body
RESPONSE=$(curl -s -X POST "${WPT_SERVER}/runtest.php" \
  --data-urlencode "url=${TEST_URL}" \
  --data-urlencode "location=us-east-1:Chrome" \
  --data-urlencode "f=json")
TEST_ID=$(echo "$RESPONSE" | jq -r '.data.id')
echo "Smoke test initiated: ${TEST_ID}"
sleep 30
STATUS=$(curl -s "${WPT_SERVER}/jsonResult.php?test=${TEST_ID}" | jq -r '.statusCode')
[[ "$STATUS" == "200" ]] && echo "✅ Smoke test passed" || echo "❌ Smoke test failed"