WebPageTest Private Instance Setup

Deploying a private WebPageTest instance eliminates public endpoint variability, enforces deterministic network conditions, and provides the infrastructure required for strict performance budget gating. By isolating test runners within your VPC or dedicated bare-metal environment, engineering teams gain predictable latency baselines, controlled browser versions, and direct access to raw HAR/WPT data. This architecture is foundational to the broader Lighthouse CI & WebPageTest Integration ecosystem, enabling automated pass/fail gates that block regressions before they reach production.

Architecture Decision Matrix

Criteria	Public Endpoint	Private Instance
Network Isolation	Shared, unpredictable routing	VPC-scoped, zero egress noise
Browser/OS Control	Fixed provider pool	Custom Docker/VM provisioning
CI Gating Latency	High variance, queue contention	Deterministic, SLA-bound execution
Data Sovereignty	Third-party retention	Full ownership, custom lifecycle

Network Isolation Topology

[CI Runner] --> [WPT Server (Port 80/4000)] --> [Agent Pool (Chrome/Firefox/Safari)]
 ^ | |
 | v v
[GitHub/GitLab] [Redis Queue] [Local Network / CDN Edge]
 | |
 v v
[PR Comment Bot] [S3/GCS Result Bucket]

CI Gating Threshold Definition

Metric	Soft Warning	Hard Fail	Enforcement Scope
LCP (Desktop)	> 2.5s	> 3.5s	All PRs to `main`
CLS (Mobile)	> 0.1	> 0.25	Release branches
TTFB (Prod)	> 800ms	> 1200ms	Canary deployments

Infrastructure Provisioning & Docker Orchestration

Provisioning requires a centralized server node and horizontally scalable agent workers. Use Docker Compose for local validation before migrating to Kubernetes or ECS.

Docker Compose Topology (Server + 3 Agents)

version: "3.8"
services:
 wpt-server:
 image: webpagetest/server:latest
 ports:
 - "80:80"
 - "4000:4000"
 environment:
 - SERVER_LOCATION=us-east-1
 - REDIS_HOST=redis
 - AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
 - AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
 depends_on:
 - redis
 - wpt-agent-1
 - wpt-agent-2
 - wpt-agent-3

 wpt-agent-1:
 image: webpagetest/agent:latest
 environment:
 - SERVER_URL=http://wpt-server:4000/work/
 - LOCATION_ID=us-east-1:Chrome
 - AGENT_KEY=${AGENT_KEY_1}
 depends_on:
 - redis

 wpt-agent-2:
 image: webpagetest/agent:latest
 environment:
 - SERVER_URL=http://wpt-server:4000/work/
 - LOCATION_ID=us-east-1:Firefox
 - AGENT_KEY=${AGENT_KEY_2}
 depends_on:
 - redis

 wpt-agent-3:
 image: webpagetest/agent:latest
 environment:
 - SERVER_URL=http://wpt-server:4000/work/
 - LOCATION_ID=us-east-1:Safari
 - AGENT_KEY=${AGENT_KEY_3}
 depends_on:
 - redis

 redis:
 image: redis:7-alpine
 ports:
 - "6379:6379"

Environment Variable Reference

Variable	Scope	Description
`WPT_SERVER`	Agent	Base URL for work polling (`http://<server>:4000/work/`)
`AGENT_KEY`	Agent	Unique auth token for worker registration
`LOCATION_ID`	Agent	Browser/OS identifier mapped in `settings.ini`
`SERVER_LOCATION`	Server	Primary region tag for result routing

Systemd Auto-Restart Unit

[Unit]
Description=WebPageTest Private Instance
After=docker.service
Requires=docker.service

[Service]
Restart=always
ExecStart=/usr/bin/docker compose -f /opt/wpt/docker-compose.yml up
ExecStop=/usr/bin/docker compose -f /opt/wpt/docker-compose.yml down

[Install]
WantedBy=multi-user.target

Healthcheck Verification

#!/usr/bin/env bash
set -euo pipefail
SERVER_URL="${WPT_SERVER_URL:-http://localhost:4000}"
RESPONSE=$(curl -s -o /dev/null -w "%{http_code}" "${SERVER_URL}/getLocations.php")
if [[ "$RESPONSE" -eq 200 ]]; then
 echo "WPT Server healthy. Agents registered."
 exit 0
else
 echo "WPT Server unreachable (HTTP ${RESPONSE})"
 exit 1
fi

API Configuration & Endpoint Routing

The server node routes test requests via settings.ini and enforces authentication through scoped API keys. Restrict key permissions to test and result scopes only. Validate payloads before submission to prevent malformed test definitions from consuming queue capacity.

`settings.ini` Configuration

[locations]
1=us-east-1
label=US East Primary

[us-east-1]
1=us-east-1-Chrome
label=Chrome Stable
browser=chrome
latency=0
connectivity=LAN

Automated Test Trigger (cURL)

curl -X POST "http://wpt-server:4000/runtest.php" \
 -H "Content-Type: application/json" \
 -d '{
 "url": "https://staging.example.com",
 "location": "us-east-1:Chrome",
 "f": "json",
 "runs": 3,
 "video": 1,
 "lighthouse": 1,
 "api_key": "${WPT_API_KEY}"
 }'

Response Parsing (Node.js)

async function parseWPTResponse(testUrl) {
 const res = await fetch(testUrl);
 const data = await res.json();
 if (data.statusCode === 200) {
 return {
 testId: data.data.id,
 summaryUrl: data.data.summaryCSV,
 statusUrl: data.data.jsonUrl
 };
 }
 throw new Error(`WPT API Error: ${data.statusText}`);
}

Rate Limiting & Queue Thresholds

Max Concurrent Tests: 15 per location
Queue Depth Alert: > 50 pending triggers
API Rate Limit: 120 requests/minute per key
Payload Validation: Enforce url, location, and runs schema before dispatch. For detailed endpoint validation and payload formatting for headless execution, reference Configuring WebPageTest API for Automated Testing to ensure consistent test run parameters.

CI Pipeline Integration & Performance Budget Gating

Embedding the private instance into CI requires a trigger-poll-parse-gate sequence. Hard thresholds block merges, while soft warnings generate PR annotations without halting deployment.

GitHub Actions Workflow

name: WPT Performance Gate
on:
 pull_request:
 branches: [main]

jobs:
 run-wpt:
 runs-on: ubuntu-latest
 steps:
 - uses: actions/checkout@v4
 - name: Trigger WPT Test
 id: trigger
 run: |
 RESPONSE=$(curl -s -X POST "${WPT_SERVER}/runtest.php" \
 -H "Content-Type: application/json" \
 -d '{"url":"${{ github.event.pull_request.head.repo.url }}","location":"us-east-1:Chrome","f":"json"}')
 echo "test_id=$(echo $RESPONSE | jq -r '.data.id')" >> $GITHUB_OUTPUT

 - name: Poll Until Complete
 run: |
 while true; do
 STATUS=$(curl -s "${WPT_SERVER}/jsonResult.php?test=${{ steps.trigger.outputs.test_id }}" | jq -r '.statusCode')
 [[ "$STATUS" == "200" ]] && break
 sleep 15
 done

 - name: Evaluate Budget
 run: |
 METRICS=$(curl -s "${WPT_SERVER}/jsonResult.php?test=${{ steps.trigger.outputs.test_id }}")
 LCP=$(echo $METRICS | jq '.data.average.firstView.metrics.LargestContentfulPaint')
 if (( $(echo "$LCP > 3500" | bc -l) )); then
 echo "::error::LCP exceeded hard budget (3.5s)"
 exit 1
 fi

Performance Budget Schema

{
 "budgets": [
 {
 "path": "/*",
 "resourceSizes": [{"resourceType": "script", "budget": 250000}],
 "timings": [
 {"metric": "lcp", "budget": 3500, "severity": "error"},
 {"metric": "cls", "budget": 0.25, "severity": "warning"}
 ]
 }
 ]
}

Fail-Fast vs Soft-Warning Logic

Fail-Fast: Exit 1 immediately when error severity threshold is breached. Blocks PR merge.
Soft-Warning: Log ::warning:: annotation, attach Lighthouse report to PR, allow merge.
Matrix Optimization: When structuring parallel execution and cross-branch matrix testing, integrate GitHub Actions Performance Matrices to optimize runner allocation and prevent CI bottlenecks.

Result Storage & Historical Tracking

Raw WPT outputs must be persisted, indexed, and correlated with Lighthouse CI reports for longitudinal analysis. Use object storage with lifecycle policies to manage retention costs.

Cloud Storage Bucket Policy (AWS S3)

{
 "Version": "2012-10-17",
 "Statement": [
 {
 "Effect": "Allow",
 "Principal": {"AWS": "arn:aws:iam::123456789012:role/ci-pipeline-role"},
 "Action": ["s3:PutObject", "s3:GetObject"],
 "Resource": "arn:aws:s3:::wpt-results/*"
 },
 {
 "Effect": "Deny",
 "Principal": "*",
 "Action": "s3:*",
 "Resource": "arn:aws:s3:::wpt-results/*",
 "Condition": {"Bool": {"aws:SecureTransport": "false"}}
 }
 ]
}

Metadata Indexing Schema

Field	Type	Example
`commit_sha`	string	`a1b2c3d4...`
`branch`	string	`feature/perf-optimization`
`environment`	string	`staging`
`timestamp`	ISO8601	`2024-05-20T14:30:00Z`
`test_id`	string	`240520_143000_abc123`

Data Retention & Pruning Cron

#!/usr/bin/env bash
# Delete results older than 90 days
aws s3 ls s3://wpt-results/ --recursive | \
 awk '{print $4}' | \
 while read -r key; do
 UPLOAD_DATE=$(aws s3api head-object --bucket wpt-results --key "$key" --query 'LastModified' --output text)
 if [[ $(date -d "$UPLOAD_DATE" +%s) -lt $(date -d "90 days ago" +%s) ]]; then
 aws s3 rm "s3://wpt-results/$key"
 fi
 done

Cross-Reference Query (WPT ↔ LHCI)

SELECT 
 w.commit_sha,
 w.test_id AS wpt_test,
 lci.report_url AS lhci_report,
 w.lcp_value,
 lci.performance_score
FROM wpt_results w
JOIN lhci_reports lci ON w.commit_sha = lci.commit_sha
WHERE w.environment = 'staging'
ORDER BY w.timestamp DESC
LIMIT 50;

When detailing cross-tool result aggregation, retention windows, and SQL/NoSQL indexing strategies, align your schema with Lighthouse CI Configuration & Storage to maintain unified dashboarding.

Resilience, Failure Handling & Agent Recovery

Private instances require automated watchdogs to detect agent drift, clear stuck queues, and enforce fallback routing during network degradation.

Agent Watchdog Script (Bash)

#!/usr/bin/env bash
AGENT_URL="${WPT_SERVER}/getLocations.php"
MAX_RETRIES=3
RETRY_DELAY=10

for i in $(seq 1 $MAX_RETRIES); do
 STATUS=$(curl -s -o /dev/null -w "%{http_code}" "$AGENT_URL")
 if [[ "$STATUS" -eq 200 ]]; then
 echo "Agent pool healthy."
 exit 0
 fi
 echo "Attempt $i failed. Retrying in ${RETRY_DELAY}s..."
 sleep $RETRY_DELAY
done

echo "Agent pool unresponsive. Triggering restart..."
systemctl restart wpt-private-instance

Exponential Backoff Configuration

retry_policy:
 max_attempts: 4
 initial_interval: 5s
 max_interval: 60s
 multiplier: 2.0
 jitter: true

Log Parsing Regex (Common Failures)

# Browser crash
\b(?:ERR_CONNECTION_REFUSED|ERR_NETWORK_CHANGED|WebDriverError)\b
# Queue timeout
\b(?:Test execution timed out|No agent available for location)\b
# HAR corruption
\b(?:Invalid HAR format|Missing entries array)\b

Graceful Degradation Fallback Matrix

Failure Mode	Fallback Action	CI Impact
Single Agent Crash	Auto-restart via watchdog	None (queue rebalances)
Location Unavailable	Route to secondary region	+15s latency, soft warning
Redis Queue Full	Pause CI trigger, alert Slack	Hard fail after 5m
Network Partition	Cache last known baseline, skip test	Soft warning, manual review

When implementing fallback routing and CI retry logic for flaky network conditions, apply Handling WebPageTest Location Failures to prevent pipeline deadlocks and ensure deterministic pass/fail states.

Validation Checklist & Go-Live Criteria

Execute the following verification sequence before routing production traffic to the private instance. Require explicit engineering sign-off.

Pre-Flight Validation Table

Check	Command/Action	Expected Result
Server Reachability	`curl -I http://<server>:4000`	`HTTP/1.1 200 OK`
Agent Registration	`/getLocations.php`	All 3 locations `ready: true`
API Auth	`curl -X POST /runtest.php -d '{"api_key":"..."}'`	`200` with `testId`
Budget Evaluation	Run synthetic test against `main`	Passes LCP/CLS thresholds
Storage Write	Upload dummy HAR to S3/GCS	Object persists, metadata indexed

Go/No-Go Decision Matrix

Metric	Go Criteria	No-Go Criteria
Queue Latency	< 2s avg	> 10s avg
Agent Uptime	99.9% (7d)	< 95% (7d)
Budget Accuracy	±3% variance	> 10% variance
CI Integration	Zero false positives	> 2 flaky gates/week

Rollback Procedure

Disable private instance routing in CI config (WPT_SERVER env var revert to public).
Drain active queue: curl -X POST /clearQueue.php -d '{"location":"all"}'
Archive current Docker volumes: docker compose down -v && tar -czf wpt-backup.tar.gz /opt/wpt/data
Re-deploy previous stable compose manifest.

Post-Deployment Smoke Test

#!/usr/bin/env bash
set -euo pipefail
TEST_URL="https://example.com"
RESPONSE=$(curl -s -X POST "${WPT_SERVER}/runtest.php" \
 -H "Content-Type: application/json" \
 -d "{\"url\":\"${TEST_URL}\",\"location\":\"us-east-1:Chrome\",\"f\":\"json\"}")
TEST_ID=$(echo "$RESPONSE" | jq -r '.data.id')
echo "Smoke test initiated: ${TEST_ID}"
sleep 30
STATUS=$(curl -s "${WPT_SERVER}/jsonResult.php?test=${TEST_ID}" | jq -r '.statusCode')
[[ "$STATUS" == "200" ]] && echo "✅ Smoke test passed" || echo "❌ Smoke test failed"

WebPageTest Private Instance Setup #

Architecture Decision Matrix #

Network Isolation Topology #

CI Gating Threshold Definition #

Infrastructure Provisioning & Docker Orchestration #

Docker Compose Topology (Server + 3 Agents) #

Environment Variable Reference #

Systemd Auto-Restart Unit #

Healthcheck Verification #

API Configuration & Endpoint Routing #

settings.ini Configuration #

Automated Test Trigger (cURL) #

Response Parsing (Node.js) #

Rate Limiting & Queue Thresholds #

CI Pipeline Integration & Performance Budget Gating #

GitHub Actions Workflow #

Performance Budget Schema #

Fail-Fast vs Soft-Warning Logic #

Result Storage & Historical Tracking #

Cloud Storage Bucket Policy (AWS S3) #

Metadata Indexing Schema #

Data Retention & Pruning Cron #

Cross-Reference Query (WPT ↔ LHCI) #

Resilience, Failure Handling & Agent Recovery #

Agent Watchdog Script (Bash) #

Exponential Backoff Configuration #

Log Parsing Regex (Common Failures) #

Graceful Degradation Fallback Matrix #

Validation Checklist & Go-Live Criteria #

Pre-Flight Validation Table #

Go/No-Go Decision Matrix #

Rollback Procedure #

Post-Deployment Smoke Test #