WebPageTest Private Instance Setup
Deploying a private WebPageTest instance eliminates public endpoint variability, enforces deterministic network conditions, and provides the infrastructure required for strict performance budget gating. By isolating test runners within your VPC or dedicated bare-metal environment, engineering teams gain predictable latency baselines, controlled browser versions, and direct access to raw HAR/WPT data. This architecture is foundational to the broader Lighthouse CI & WebPageTest Integration ecosystem, enabling automated pass/fail gates that block regressions before they reach production.
Architecture Decision Matrix
| Criteria | Public Endpoint | Private Instance |
|---|---|---|
| Network Isolation | Shared, unpredictable routing | VPC-scoped, zero egress noise |
| Browser/OS Control | Fixed provider pool | Custom Docker/VM provisioning |
| CI Gating Latency | High variance, queue contention | Deterministic, SLA-bound execution |
| Data Sovereignty | Third-party retention | Full ownership, custom lifecycle |
Network Isolation Topology
[CI Runner] --> [WPT Server (Port 80/4000)] --> [Agent Pool (Chrome/Firefox/Safari)]
^ | |
| v v
[GitHub/GitLab] [Redis Queue] [Local Network / CDN Edge]
| |
v v
[PR Comment Bot] [S3/GCS Result Bucket]
CI Gating Threshold Definition
| Metric | Soft Warning | Hard Fail | Enforcement Scope |
|---|---|---|---|
| LCP (Desktop) | > 2.5s | > 3.5s | All PRs to main |
| CLS (Mobile) | > 0.1 | > 0.25 | Release branches |
| TTFB (Prod) | > 800ms | > 1200ms | Canary deployments |
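The threshold table above can be expressed as a small lookup plus a classifier. The following is a minimal sketch, not part of any WebPageTest tooling: the names THRESHOLDS and classifyMetric are hypothetical, and it assumes LCP and TTFB are reported in milliseconds and CLS as a unitless score.

```javascript
// Thresholds copied from the table above (assumed units: ms for LCP/TTFB, unitless CLS).
const THRESHOLDS = {
  lcp: { soft: 2500, hard: 3500 },
  cls: { soft: 0.1, hard: 0.25 },
  ttfb: { soft: 800, hard: 1200 },
};

// Hypothetical helper: returns "pass", "warn", or "fail" for one measured value.
function classifyMetric(name, value) {
  const limits = THRESHOLDS[name];
  if (!limits) {
    throw new Error(`Unknown metric: ${name}`);
  }
  if (typeof value !== "number" || Number.isNaN(value)) {
    throw new Error(`Metric ${name} has no numeric value`);
  }
  if (value > limits.hard) return "fail"; // hard fail blocks the gate
  if (value > limits.soft) return "warn"; // soft warning only annotates
  return "pass";
}

console.log(classifyMetric("lcp", 2900)); // prints "warn"
```

Values exactly on a threshold pass, because the table defines breaches with a strict "greater than".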
Infrastructure Provisioning & Docker Orchestration
Provisioning requires a centralized server node and horizontally scalable agent workers. Use Docker Compose for local validation before migrating to Kubernetes or ECS.
Docker Compose Topology (Server + 3 Agents)
version: "3.8"
services:
  wpt-server:
    image: webpagetest/server:latest
    ports:
      - "80:80"
      - "4000:4000"
    environment:
      - SERVER_LOCATION=us-east-1
      - REDIS_HOST=redis
      - AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
      - AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
    depends_on:
      - redis
  # Agents poll the server for work, so each agent starts after the server.
  wpt-agent-1:
    image: webpagetest/agent:latest
    environment:
      - SERVER_URL=http://wpt-server:4000/work/
      - LOCATION_ID=us-east-1:Chrome
      - AGENT_KEY=${AGENT_KEY_1}
    depends_on:
      - wpt-server
  wpt-agent-2:
    image: webpagetest/agent:latest
    environment:
      - SERVER_URL=http://wpt-server:4000/work/
      - LOCATION_ID=us-east-1:Firefox
      - AGENT_KEY=${AGENT_KEY_2}
    depends_on:
      - wpt-server
  wpt-agent-3:
    image: webpagetest/agent:latest
    environment:
      - SERVER_URL=http://wpt-server:4000/work/
      - LOCATION_ID=us-east-1:Safari
      - AGENT_KEY=${AGENT_KEY_3}
    depends_on:
      - wpt-server
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
Environment Variable Reference
| Variable | Scope | Description |
|---|---|---|
| WPT_SERVER | Agent | Base URL for work polling (http://<server>:4000/work/) |
| AGENT_KEY | Agent | Unique auth token for worker registration |
| LOCATION_ID | Agent | Browser/OS identifier mapped in settings.ini |
| SERVER_LOCATION | Server | Primary region tag for result routing |
Systemd Auto-Restart Unit
[Unit]
Description=WebPageTest Private Instance
After=docker.service
Requires=docker.service
[Service]
Restart=always
ExecStart=/usr/bin/docker compose -f /opt/wpt/docker-compose.yml up
ExecStop=/usr/bin/docker compose -f /opt/wpt/docker-compose.yml down
[Install]
WantedBy=multi-user.target
Healthcheck Verification
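#!/usr/bin/env bash
set -euo pipefail
SERVER_URL="${WPT_SERVER_URL:-http://localhost:4000}"
# When the server is unreachable, curl prints 000 and exits non-zero.
# "|| true" stops "set -e" from aborting before the failure branch can report it.
RESPONSE=$(curl -s -o /dev/null -w "%{http_code}" "${SERVER_URL}/getLocations.php" || true)
if [[ "$RESPONSE" == "200" ]]; then
  echo "WPT Server healthy (getLocations.php returned 200)."
  exit 0
else
  echo "WPT Server unreachable (HTTP ${RESPONSE:-000})"
  exit 1
fi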
API Configuration & Endpoint Routing
The server node routes test requests via settings.ini and enforces authentication through scoped API keys. Restrict key permissions to test and result scopes only. Validate payloads before submission to prevent malformed test definitions from consuming queue capacity.
settings.ini Configuration
[locations]
1=us-east-1
label=US East Primary
[us-east-1]
1=us-east-1-Chrome
label=Chrome Stable
browser=chrome
latency=0
connectivity=LAN
Automated Test Trigger (cURL)
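# The single-quoted JSON is closed around ${WPT_API_KEY} so the shell expands it;
# inside single quotes the variable would be sent as literal text.
curl -X POST "http://wpt-server:4000/runtest.php" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://staging.example.com",
    "location": "us-east-1:Chrome",
    "f": "json",
    "runs": 3,
    "video": 1,
    "lighthouse": 1,
    "api_key": "'"${WPT_API_KEY}"'"
  }'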
Response Parsing (Node.js)
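async function parseWPTResponse(testUrl) {
  const res = await fetch(testUrl);
  // Surface transport-level failures before attempting to parse the body.
  if (!res.ok) {
    throw new Error(`WPT HTTP Error: ${res.status}`);
  }
  const data = await res.json();
  if (data.statusCode === 200) {
    return {
      // runtest.php returns the test identifier as data.testId.
      testId: data.data.testId,
      summaryUrl: data.data.summaryCSV,
      statusUrl: data.data.jsonUrl
    };
  }
  throw new Error(`WPT API Error: ${data.statusText}`);
}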
Rate Limiting & Queue Thresholds
- Max Concurrent Tests: 15 per location
- Queue Depth Alert: > 50 pending triggers
- API Rate Limit: 120 requests/minute per key
- Payload Validation: Enforce the url, location, and runs schema before dispatch.
For detailed endpoint validation and payload formatting for headless execution, reference Configuring WebPageTest API for Automated Testing to ensure consistent test run parameters.
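The payload validation rule above could be sketched as a pre-dispatch check. This is an illustrative helper rather than a WebPageTest API: validateTestPayload and the knownLocations list are hypothetical, and the upper bound of 9 runs is an assumed policy value, not something the source specifies.

```javascript
// Hypothetical validator covering only the three fields named above.
// Returns a list of human-readable errors; an empty list means the payload may be dispatched.
function validateTestPayload(payload, knownLocations) {
  const errors = [];

  try {
    const parsed = new URL(payload.url); // throws on missing or relative URLs
    if (!["http:", "https:"].includes(parsed.protocol)) {
      errors.push("url must use http or https");
    }
  } catch {
    errors.push("url is missing or not a valid absolute URL");
  }

  if (!knownLocations.includes(payload.location)) {
    errors.push(`location "${payload.location}" is not configured`);
  }

  // Assumption: cap runs at 9 to protect queue capacity.
  if (!Number.isInteger(payload.runs) || payload.runs < 1 || payload.runs > 9) {
    errors.push("runs must be an integer between 1 and 9");
  }

  return errors;
}

const configured = ["us-east-1:Chrome"];
console.log(
  validateTestPayload(
    { url: "https://staging.example.com", location: "us-east-1:Chrome", runs: 3 },
    configured
  )
); // prints []
```

Rejecting a malformed payload in the CI job costs nothing, whereas a bad test definition that reaches the server occupies a queue slot until it fails.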
CI Pipeline Integration & Performance Budget Gating
Embedding the private instance into CI requires a trigger-poll-parse-gate sequence. Hard thresholds block merges, while soft warnings generate PR annotations without halting deployment.
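The polling step of that sequence benefits from an explicit deadline so a stuck test cannot hold a CI runner indefinitely. Below is a minimal sketch under stated assumptions: pollUntilComplete and the caller-supplied checkStatus function are hypothetical names, and the status handling assumes the server reports 1xx while a test is queued or running, 200 on completion, and 4xx on error.

```javascript
// Poll a caller-supplied async status check until completion or timeout.
// checkStatus is hypothetical: it should resolve to the numeric statusCode of the test.
async function pollUntilComplete(checkStatus, { intervalMs = 15000, timeoutMs = 600000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const statusCode = await checkStatus();
    if (statusCode === 200) return true; // results are ready to parse
    if (statusCode >= 400) {
      throw new Error(`Test failed with status ${statusCode}`);
    }
    // Still queued or running: wait before asking again.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return false; // timed out; the caller decides whether this is a hard fail
}
```

A caller would typically pass a function that fetches the test's JSON result and returns its statusCode field, then run the gate only when the promise resolves to true.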
GitHub Actions Workflow
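name: WPT Performance Gate
on:
  pull_request:
    branches: [main]
jobs:
  run-wpt:
    runs-on: ubuntu-latest
    env:
      # Assumed repository variables: the private WPT base URL and the deployed page to test.
      # (The pull request's repo.url is a GitHub API address, not a testable page.)
      WPT_SERVER: ${{ vars.WPT_SERVER }}
      TARGET_URL: ${{ vars.STAGING_URL }}
    steps:
      - uses: actions/checkout@v4
      - name: Trigger WPT Test
        id: trigger
        run: |
          PAYLOAD=$(jq -n --arg url "$TARGET_URL" '{url: $url, location: "us-east-1:Chrome", f: "json"}')
          RESPONSE=$(curl -s -X POST "${WPT_SERVER}/runtest.php" \
            -H "Content-Type: application/json" \
            -d "$PAYLOAD")
          echo "test_id=$(echo "$RESPONSE" | jq -r '.data.testId')" >> "$GITHUB_OUTPUT"
      - name: Poll Until Complete
        run: |
          # Give up after 40 polls (10 minutes) instead of looping forever.
          for attempt in $(seq 1 40); do
            STATUS=$(curl -s "${WPT_SERVER}/jsonResult.php?test=${{ steps.trigger.outputs.test_id }}" | jq -r '.statusCode')
            [[ "$STATUS" == "200" ]] && exit 0
            sleep 15
          done
          echo "::error::WPT test did not complete within 10 minutes"
          exit 1
      - name: Evaluate Budget
        run: |
          METRICS=$(curl -s "${WPT_SERVER}/jsonResult.php?test=${{ steps.trigger.outputs.test_id }}")
          LCP=$(echo "$METRICS" | jq -r '.data.average.firstView.metrics.LargestContentfulPaint')
          if [[ "$LCP" == "null" ]]; then
            echo "::error::LCP missing from WPT result"
            exit 1
          fi
          if (( $(echo "$LCP > 3500" | bc -l) )); then
            echo "::error::LCP exceeded hard budget (3.5s)"
            exit 1
          fi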
Performance Budget Schema
{
  "budgets": [
    {
      "path": "/*",
      "resourceSizes": [{"resourceType": "script", "budget": 250000}],
      "timings": [
        {"metric": "lcp", "budget": 3500, "severity": "error"},
        {"metric": "cls", "budget": 0.25, "severity": "warning"}
      ]
    }
  ]
}
Fail-Fast vs Soft-Warning Logic
- Fail-Fast: Exit 1 immediately when an error severity threshold is breached. Blocks PR merge.
- Soft-Warning: Log a ::warning:: annotation, attach Lighthouse report to PR, allow merge.
- Matrix Optimization: When structuring parallel execution and cross-branch matrix testing, integrate GitHub Actions Performance Matrices to optimize runner allocation and prevent CI bottlenecks.
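The fail-fast and soft-warning rules above can be applied to the timings array of a budget entry. The sketch below is one possible reading, not a reference implementation: evaluateBudget and the shape of the measured object (metric name mapped to a number) are assumptions made for illustration.

```javascript
// Evaluate measured values against the "timings" entries of one budget object.
// Returns GitHub Actions annotation lines plus the exit code the CI step should use.
function evaluateBudget(budget, measured) {
  const annotations = [];
  for (const { metric, budget: limit, severity } of budget.timings) {
    const value = measured[metric];
    if (value === undefined || value <= limit) continue; // within budget or not measured

    const message = `${metric} ${value} exceeded budget ${limit}`;
    if (severity === "error") {
      // Fail-fast: stop at the first error-severity breach.
      annotations.push(`::error::${message}`);
      return { annotations, exitCode: 1 };
    }
    annotations.push(`::warning::${message}`); // soft warning, merge stays allowed
  }
  return { annotations, exitCode: 0 };
}

const exampleBudget = {
  path: "/*",
  timings: [
    { metric: "lcp", budget: 3500, severity: "error" },
    { metric: "cls", budget: 0.25, severity: "warning" },
  ],
};
console.log(evaluateBudget(exampleBudget, { lcp: 3000, cls: 0.3 }).exitCode); // prints 0
```

Because the function returns at the first error, warnings listed after that entry are not reported. A team that prefers a complete report could collect every breach first and compute the exit code at the end.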
Result Storage & Historical Tracking
Raw WPT outputs must be persisted, indexed, and correlated with Lighthouse CI reports for longitudinal analysis. Use object storage with lifecycle policies to manage retention costs.
Cloud Storage Bucket Policy (AWS S3)
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {"AWS": "arn:aws:iam::123456789012:role/ci-pipeline-role"},
      "Action": ["s3:PutObject", "s3:GetObject"],
      "Resource": "arn:aws:s3:::wpt-results/*"
    },
    {
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": "arn:aws:s3:::wpt-results/*",
      "Condition": {"Bool": {"aws:SecureTransport": "false"}}
    }
  ]
}
Metadata Indexing Schema
| Field | Type | Example |
|---|---|---|
| commit_sha | string | a1b2c3d4... |
| branch | string | feature/perf-optimization |
| environment | string | staging |
| timestamp | ISO8601 | 2024-05-20T14:30:00Z |
| test_id | string | 240520_143000_abc123 |
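A record following this schema could be validated and turned into a storage key before upload. This is a sketch under assumptions: buildResultRecord is a hypothetical helper, and the environment/date/test_id key layout is an illustrative choice, not something the schema prescribes.

```javascript
const REQUIRED_FIELDS = ["commit_sha", "branch", "environment", "timestamp", "test_id"];

// Hypothetical helper: checks the schema fields and derives an object-storage key.
function buildResultRecord(meta) {
  for (const field of REQUIRED_FIELDS) {
    if (typeof meta[field] !== "string" || meta[field] === "") {
      throw new Error(`Missing metadata field: ${field}`);
    }
  }
  if (Number.isNaN(Date.parse(meta.timestamp))) {
    throw new Error(`timestamp is not a parseable date: ${meta.timestamp}`);
  }
  // Assumed key layout: <environment>/<YYYY-MM-DD>/<test_id>.json
  const day = meta.timestamp.slice(0, 10);
  return {
    key: `${meta.environment}/${day}/${meta.test_id}.json`,
    metadata: { ...meta },
  };
}

const record = buildResultRecord({
  commit_sha: "a1b2c3d4",
  branch: "feature/perf-optimization",
  environment: "staging",
  timestamp: "2024-05-20T14:30:00Z",
  test_id: "240520_143000_abc123",
});
console.log(record.key); // prints "staging/2024-05-20/240520_143000_abc123.json"
```

Putting the environment and date in the key prefix lets lifecycle rules and listing queries target a slice of results without reading each object's metadata.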
Data Retention & Pruning Cron
#!/usr/bin/env bash
# Delete results older than 90 days
set -euo pipefail
BUCKET="wpt-results"
# ISO dates (YYYY-MM-DD) sort lexicographically, so a string comparison is enough.
CUTOFF=$(date -d "90 days ago" +%F) # requires GNU date
# "aws s3 ls" prints: date, time, size, key. Reading the key as the last field
# keeps object names that contain spaces intact.
aws s3 ls "s3://${BUCKET}/" --recursive | \
while read -r day _time _size key; do
  if [[ -n "$key" && "$day" < "$CUTOFF" ]]; then
    aws s3 rm "s3://${BUCKET}/${key}"
  fi
done
Cross-Reference Query (WPT ↔ LHCI)
SELECT
w.commit_sha,
w.test_id AS wpt_test,
lci.report_url AS lhci_report,
w.lcp_value,
lci.performance_score
FROM wpt_results w
JOIN lhci_reports lci ON w.commit_sha = lci.commit_sha
WHERE w.environment = 'staging'
ORDER BY w.timestamp DESC
LIMIT 50;
When detailing cross-tool result aggregation, retention windows, and SQL/NoSQL indexing strategies, align your schema with Lighthouse CI Configuration & Storage to maintain unified dashboarding.
Resilience, Failure Handling & Agent Recovery
Private instances require automated watchdogs to detect agent drift, clear stuck queues, and enforce fallback routing during network degradation.
Agent Watchdog Script (Bash)
#!/usr/bin/env bash
set -u
# Abort early if WPT_SERVER is unset, so a missing variable never triggers a restart.
: "${WPT_SERVER:?WPT_SERVER must be set}"
AGENT_URL="${WPT_SERVER}/getLocations.php"
MAX_RETRIES=3
RETRY_DELAY=10
for i in $(seq 1 "$MAX_RETRIES"); do
  STATUS=$(curl -s -o /dev/null -w "%{http_code}" "$AGENT_URL")
  if [[ "$STATUS" == "200" ]]; then
    echo "Agent pool healthy."
    exit 0
  fi
  echo "Attempt $i failed. Retrying in ${RETRY_DELAY}s..."
  sleep "$RETRY_DELAY"
done
echo "Agent pool unresponsive. Triggering restart..."
systemctl restart wpt-private-instance
Exponential Backoff Configuration
retry_policy:
  max_attempts: 4
  initial_interval: 5s
  max_interval: 60s
  multiplier: 2.0
  jitter: true
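To make the policy values concrete, the delay before each retry can be computed as follows. This is a sketch, not the behavior of any specific retry library: backoffDelayMs is a hypothetical helper, intervals are converted to milliseconds, and reading "jitter: true" as full jitter (a uniform pick between 0 and the capped delay) is an assumption.

```javascript
// The policy above, with intervals expressed in milliseconds.
const retryPolicy = {
  maxAttempts: 4,
  initialIntervalMs: 5000,
  maxIntervalMs: 60000,
  multiplier: 2.0,
  jitter: true,
};

// Delay to wait after failed attempt number `attempt` (1-based).
// `random` is injectable so the jitter can be made deterministic in tests.
function backoffDelayMs(attempt, policy, random = Math.random) {
  const uncapped = policy.initialIntervalMs * policy.multiplier ** (attempt - 1);
  const base = Math.min(uncapped, policy.maxIntervalMs);
  // Assumed interpretation of jitter: pick uniformly in [0, base).
  return policy.jitter ? Math.floor(random() * base) : base;
}

const withoutJitter = { ...retryPolicy, jitter: false };
console.log([1, 2, 3].map((n) => backoffDelayMs(n, withoutJitter))); // prints [ 5000, 10000, 20000 ]
```

With four attempts there are at most three waits, so the 60s cap is never reached under this policy. It only takes effect if max_attempts or the multiplier is raised.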
Log Parsing Regex (Common Failures)
# Browser crash
\b(?:ERR_CONNECTION_REFUSED|ERR_NETWORK_CHANGED|WebDriverError)\b
# Queue timeout
\b(?:Test execution timed out|No agent available for location)\b
# HAR corruption
\b(?:Invalid HAR format|Missing entries array)\b
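These patterns could be wired into a small classifier that tags each log line with a failure category. The sketch below reuses the three expressions verbatim; the category names and the classifyLogLine helper are hypothetical.

```javascript
// The three patterns above, each paired with an illustrative category name.
const FAILURE_PATTERNS = [
  {
    category: "browser_crash",
    pattern: /\b(?:ERR_CONNECTION_REFUSED|ERR_NETWORK_CHANGED|WebDriverError)\b/,
  },
  {
    category: "queue_timeout",
    pattern: /\b(?:Test execution timed out|No agent available for location)\b/,
  },
  {
    category: "har_corruption",
    pattern: /\b(?:Invalid HAR format|Missing entries array)\b/,
  },
];

// Returns the first matching category, or null when the line matches none.
function classifyLogLine(line) {
  const match = FAILURE_PATTERNS.find(({ pattern }) => pattern.test(line));
  return match ? match.category : null;
}

console.log(classifyLogLine("net::ERR_CONNECTION_REFUSED at https://staging.example.com")); // prints "browser_crash"
```

The expressions are declared without the g flag, so repeated calls to test() carry no state between lines.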
Graceful Degradation Fallback Matrix
| Failure Mode | Fallback Action | CI Impact |
|---|---|---|
| Single Agent Crash | Auto-restart via watchdog | None (queue rebalances) |
| Location Unavailable | Route to secondary region | +15s latency, soft warning |
| Redis Queue Full | Pause CI trigger, alert Slack | Hard fail after 5m |
| Network Partition | Cache last known baseline, skip test | Soft warning, manual review |
When implementing fallback routing and CI retry logic for flaky network conditions, apply Handling WebPageTest Location Failures to prevent pipeline deadlocks and ensure deterministic pass/fail states.
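The fallback matrix above can be encoded as a lookup so that CI scripts resolve a failure mode to one deterministic outcome. This is a sketch only: the mode identifiers, the resolveFallback helper, and the "pass"/"warn"/"wait"/"fail" gate labels are hypothetical names introduced for illustration.

```javascript
const FIVE_MINUTES_MS = 5 * 60 * 1000;

// Maps a failure mode from the matrix to an action and a CI gate state.
// elapsedMs is how long the failure has persisted; it only matters for a full queue.
function resolveFallback(mode, elapsedMs = 0) {
  switch (mode) {
    case "single_agent_crash":
      return { action: "auto-restart via watchdog", gate: "pass" };
    case "location_unavailable":
      return { action: "route to secondary region", gate: "warn" };
    case "redis_queue_full":
      // Pause and alert first; escalate to a hard fail after five minutes.
      return {
        action: "pause CI trigger and alert Slack",
        gate: elapsedMs >= FIVE_MINUTES_MS ? "fail" : "wait",
      };
    case "network_partition":
      return { action: "use cached baseline and skip test", gate: "warn" };
    default:
      throw new Error(`Unknown failure mode: ${mode}`);
  }
}

console.log(resolveFallback("redis_queue_full", 6 * 60 * 1000).gate); // prints "fail"
```

Throwing on an unknown mode keeps the pipeline from silently passing when a new failure type appears that the matrix does not cover.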
Validation Checklist & Go-Live Criteria
Execute the following verification sequence before routing production traffic to the private instance. Require explicit engineering sign-off.
Pre-Flight Validation Table
| Check | Command/Action | Expected Result |
|---|---|---|
| Server Reachability | curl -I http://<server>:4000 | HTTP/1.1 200 OK |
| Agent Registration | /getLocations.php | All 3 locations ready: true |
| API Auth | curl -X POST /runtest.php -d '{"api_key":"..."}' | 200 with testId |
| Budget Evaluation | Run synthetic test against main | Passes LCP/CLS thresholds |
| Storage Write | Upload dummy HAR to S3/GCS | Object persists, metadata indexed |
Go/No-Go Decision Matrix
| Metric | Go Criteria | No-Go Criteria |
|---|---|---|
| Queue Latency | < 2s avg | > 10s avg |
| Agent Uptime | 99.9% (7d) | < 95% (7d) |
| Budget Accuracy | ±3% variance | > 10% variance |
| CI Integration | Zero false positives | > 2 flaky gates/week |
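The decision matrix leaves a gap between the Go and No-Go columns (for example, a queue latency of 5s). One way to handle that is a third "review" state, sketched below. The goNoGo function, its input field names, and the "review" band are assumptions for illustration; the numeric limits come from the table.

```javascript
// Classify one value: `isGo` and `isNoGo` are predicates taken from the table columns.
function band(value, isGo, isNoGo) {
  if (isGo(value)) return "go";
  if (isNoGo(value)) return "no-go";
  return "review"; // assumed state for values between the two columns
}

// Hypothetical input shape: seven-day averages gathered from monitoring.
function goNoGo(m) {
  const checks = {
    queueLatency: band(m.queueLatencySec, (v) => v < 2, (v) => v > 10),
    agentUptime: band(m.agentUptimePct, (v) => v >= 99.9, (v) => v < 95),
    budgetAccuracy: band(m.budgetVariancePct, (v) => v <= 3, (v) => v > 10),
    ciIntegration: band(m.flakyGatesPerWeek, (v) => v === 0, (v) => v > 2),
  };
  const results = Object.values(checks);
  let decision = "go";
  if (results.includes("no-go")) decision = "no-go";
  else if (results.includes("review")) decision = "review";
  return { decision, checks };
}

console.log(
  goNoGo({ queueLatencySec: 1.2, agentUptimePct: 99.95, budgetVariancePct: 2, flakyGatesPerWeek: 0 }).decision
); // prints "go"
```

Any single No-Go check blocks the rollout, and a "review" result is meant to prompt the explicit engineering sign-off rather than an automatic decision.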
Rollback Procedure
- Disable private instance routing in CI config (revert the WPT_SERVER env var to the public endpoint).
- Drain active queue: curl -X POST /clearQueue.php -d '{"location":"all"}'
- Stop the stack, then archive its data: docker compose down && tar -czf wpt-backup.tar.gz /opt/wpt/data (omit -v so that volumes are not deleted before the archive is taken).
- Re-deploy previous stable compose manifest.
Post-Deployment Smoke Test
#!/usr/bin/env bash
set -euo pipefail
TEST_URL="https://example.com"
RESPONSE=$(curl -s -X POST "${WPT_SERVER}/runtest.php" \
  -H "Content-Type: application/json" \
  -d "{\"url\":\"${TEST_URL}\",\"location\":\"us-east-1:Chrome\",\"f\":\"json\"}")
TEST_ID=$(echo "$RESPONSE" | jq -r '.data.testId')
echo "Smoke test initiated: ${TEST_ID}"
sleep 30
STATUS=$(curl -s "${WPT_SERVER}/jsonResult.php?test=${TEST_ID}" | jq -r '.statusCode')
if [[ "$STATUS" == "200" ]]; then
  echo "✅ Smoke test passed"
else
  # Exit non-zero so a calling pipeline registers the failure.
  echo "❌ Smoke test failed (statusCode ${STATUS})"
  exit 1
fi