Load testing reveals how your application behaves under stress before real users discover its limits. This guide covers load testing strategies, tools, and how to interpret results so you can build applications that scale reliably.
Why Load Test?
Discover Before Users Do
Production failures are expensive:
- Lost revenue during downtime
- Damaged reputation
- Emergency debugging under pressure
Load testing reveals:
- Maximum concurrent users
- Response time degradation patterns
- Resource bottlenecks
- Breaking points
Types of Performance Tests
| Test Type | Purpose | Duration |
|---|---|---|
| Load Test | Normal expected load | Minutes to hours |
| Stress Test | Beyond normal capacity | Until failure |
| Spike Test | Sudden traffic surge | Short bursts |
| Soak Test | Sustained load | Hours to days |
| Breakpoint Test | Find maximum capacity | Incremental increase |
Load Testing Tools
k6 (Recommended)
Modern, developer-friendly tool written in Go:
k6 uses JavaScript to define test scenarios, making it accessible to developers already familiar with the language. The following example demonstrates a basic load test with ramping traffic patterns that simulate real user behavior.
// load-test.js
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  stages: [
    { duration: '2m', target: 100 }, // Ramp up
    { duration: '5m', target: 100 }, // Stay at 100 users
    { duration: '2m', target: 0 },   // Ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<500'], // 95% under 500ms
    http_req_failed: ['rate<0.01'],   // Error rate under 1%
  },
};

export default function () {
  const response = http.get('https://myapp.com/api/products');
  check(response, {
    'status is 200': (r) => r.status === 200,
    'response time < 500ms': (r) => r.timings.duration < 500,
  });
  sleep(1);
}
The stages configuration creates a realistic traffic pattern: gradual ramp-up, sustained load, and graceful ramp-down. The thresholds section defines pass/fail criteria that can fail your CI pipeline if performance degrades.
You can run k6 tests from the command line with various options to adjust virtual users and duration.
# Run test
k6 run load-test.js
# Run with more VUs
k6 run --vus 200 --duration 10m load-test.js
Apache JMeter
Industry-standard GUI tool:
JMeter has been the go-to load testing tool for years. While more verbose than k6, it offers a visual interface that non-developers often find more approachable. The configuration is XML-based, which you can generate from the GUI.
<!-- test-plan.jmx -->
<ThreadGroup>
  <stringProp name="ThreadGroup.num_threads">100</stringProp>
  <stringProp name="ThreadGroup.ramp_time">60</stringProp>
  <stringProp name="ThreadGroup.duration">300</stringProp>
</ThreadGroup>
Artillery
Node.js-based, YAML configuration:
Artillery strikes a balance between simplicity and power. Its YAML-based configuration is easy to read and version control, making it a good choice for teams that prefer declarative configuration.
# artillery.yml
config:
  target: "https://myapp.com"
  phases:
    - duration: 120
      arrivalRate: 10
      name: "Warm up"
    - duration: 300
      arrivalRate: 50
      name: "Sustained load"

scenarios:
  - name: "Browse products"
    flow:
      - get:
          url: "/api/products"
      - think: 2
      - get:
          url: "/api/products/{{ $randomNumber(1, 100) }}"
The think directive simulates real user behavior by adding pauses between requests. This prevents unrealistic request patterns that would never occur in production.
artillery run artillery.yml
Locust
Python-based with real-time web UI:
Locust lets you define user behavior in Python, offering full programming language flexibility. The web UI provides real-time monitoring during test execution, making it easy to observe how your application responds.
# locustfile.py
import random

from locust import HttpUser, task, between

class WebsiteUser(HttpUser):
    wait_time = between(1, 3)

    @task(3)
    def view_products(self):
        self.client.get("/api/products")

    @task(1)
    def view_product_detail(self):
        product_id = random.randint(1, 100)
        self.client.get(f"/api/products/{product_id}")

    def on_start(self):
        # Login once per user
        self.client.post("/api/login", json={
            "email": "test@example.com",
            "password": "password"
        })
The @task decorator weights define relative probability: here, users browse products three times as often as they view details. The on_start method runs once per simulated user for setup tasks like authentication.
locust -f locustfile.py --host=https://myapp.com
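The weight arithmetic can be sketched outside Locust. With weights 3 and 1, roughly 75% of iterations hit the product list. This standalone Python illustration mimics what a weighted scheduler does (it is not Locust's actual implementation):

```python
import random

# Task weights as declared with @task(3) and @task(1)
tasks = {"view_products": 3, "view_product_detail": 1}

# Each task's share of iterations is weight / total weight
total = sum(tasks.values())
probabilities = {name: w / total for name, w in tasks.items()}
print(probabilities)  # {'view_products': 0.75, 'view_product_detail': 0.25}

# Picking one task the way a weighted scheduler would:
choice = random.choices(list(tasks), weights=tasks.values(), k=1)[0]
```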
Realistic Test Scenarios
User Journey Simulation
Real users don't just hammer a single endpoint. They browse, pause to read, and follow logical paths through your application. This test simulates an e-commerce shopping flow with realistic timing between actions.
// k6: Realistic e-commerce flow
import http from 'k6/http';
import { group, sleep } from 'k6';

export default function () {
  group('Browse', function () {
    http.get('https://myapp.com/');
    sleep(2);
    http.get('https://myapp.com/api/products?category=electronics');
    sleep(3);
  });

  group('Product Detail', function () {
    const productId = Math.floor(Math.random() * 100) + 1;
    http.get(`https://myapp.com/api/products/${productId}`);
    sleep(5);
  });

  group('Add to Cart', function () {
    http.post('https://myapp.com/api/cart', JSON.stringify({
      product_id: 42,
      quantity: 1,
    }), {
      headers: { 'Content-Type': 'application/json' },
    });
    sleep(2);
  });

  group('Checkout', function () {
    // Only 10% proceed to checkout
    if (Math.random() < 0.1) {
      http.post('https://myapp.com/api/checkout', JSON.stringify({
        payment_method: 'card',
      }), {
        headers: { 'Content-Type': 'application/json' },
      });
    }
  });
}
The group function organizes metrics by user action, making it easier to identify which part of the journey is slowest. The 10% checkout rate reflects real conversion funnels.
Data-Driven Testing
Testing with a variety of user accounts reveals issues that single-user tests miss, such as cache effectiveness and database query patterns across different data sets. Loading test data from files lets you simulate realistic user diversity.
// k6: Load test data from file
import { SharedArray } from 'k6/data';
import http from 'k6/http';

const users = new SharedArray('users', function () {
  return JSON.parse(open('./test-users.json'));
});

export default function () {
  const user = users[__VU % users.length];

  const loginRes = http.post('https://myapp.com/api/login', JSON.stringify({
    email: user.email,
    password: user.password,
  }), {
    headers: { 'Content-Type': 'application/json' },
  });

  const token = loginRes.json('token');

  http.get('https://myapp.com/api/profile', {
    headers: { 'Authorization': `Bearer ${token}` },
  });
}
SharedArray loads test data once and shares it across all virtual users, minimizing memory usage. The modulo operation distributes users evenly across the available test accounts.
Key Metrics
Response Time Metrics
Understanding the difference between average and percentile metrics is crucial for meaningful performance analysis. Averages can hide problems that percentiles reveal.
Avg Response Time: 150ms # Average (misleading)
Median (p50): 120ms # Half of requests faster
p90: 250ms # 90% of requests faster
p95: 400ms # 95% of requests faster
p99: 850ms # 99% of requests faster
Max: 2500ms # Worst case
Focus on percentiles, not averages. The p95 and p99 show what slow users experience.
A few slow requests can hide behind a healthy average. If your p99 is 10x your p50, you have a tail latency problem that needs investigation.
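To see why averages mislead, compute both on a skewed sample. A small Python sketch with made-up latencies (nearest-rank percentiles, not tied to any tool's output):

```python
def percentile(samples, p):
    """Nearest-rank percentile: the value at position ceil(p/100 * n)."""
    ordered = sorted(samples)
    k = max(0, -(-len(ordered) * p // 100) - 1)  # ceil via negated floor div
    return ordered[int(k)]

# 95 fast requests and 5 slow outliers (milliseconds)
latencies = [100] * 95 + [2000] * 5

mean = sum(latencies) / len(latencies)
print(mean)                       # 195.0 - looks healthy
print(percentile(latencies, 50))  # 100 - the median is healthy too
print(percentile(latencies, 99))  # 2000 - the tail tells the real story
```

The mean sits at 195 ms even though no single request took 195 ms: only the p99 exposes the 2-second outliers your slowest users actually experience.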
Throughput Metrics
Throughput tells you how much work your system can handle. Watch for error rates that climb as load increases.
Requests/second: 500 rps
Successful requests: 49,500
Failed requests: 500
Error rate: 1%
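The same figures can be derived from raw counts. A quick sketch (the 100-second duration is an assumption chosen to make the arithmetic match):

```python
total_requests = 50_000
failed = 500
duration_seconds = 100  # hypothetical test duration

throughput = total_requests / duration_seconds  # requests per second
error_rate = failed / total_requests

print(f"{throughput:.0f} rps")        # 500 rps
print(f"{error_rate:.1%} error rate") # 1.0% error rate
```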
Resource Metrics
Monitor during tests:
- CPU utilization
- Memory usage
- Disk I/O
- Network bandwidth
- Database connections
- Queue depth
Analyzing Results
Response Time Degradation
This pattern shows how response times typically degrade as load increases. The goal is to identify your performance cliff: the point where response times become unacceptable.
| Load | Response Time | Notes |
|---|---|---|
| 50 VU | 100ms | |
| 100 VU | 150ms | |
| 200 VU | 300ms | |
| 300 VU | 800ms | Performance cliff |
| 400 VU | 2500ms | Unacceptable |
| 500 VU | Timeout | Breaking point |
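You can flag the cliff programmatically by looking for the first load level where latency grows much faster than load. A hypothetical sketch over numbers like the ones above:

```python
def find_cliff(results, factor=2.0):
    """Return the first (vus, ms) pair where latency more than doubled
    while the load itself did not; None if no cliff is found."""
    for (prev_vu, prev_ms), (vu, ms) in zip(results, results[1:]):
        if ms / prev_ms > factor and vu / prev_vu <= factor:
            return vu, ms
    return None

results = [(50, 100), (100, 150), (200, 300), (300, 800), (400, 2500)]
print(find_cliff(results))  # (300, 800) - latency jumped 2.7x for 1.5x load
```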
Identifying Bottlenecks
When tests reveal problems, these symptom-to-cause mappings help you start investigating in the right place. Most performance issues fall into predictable patterns.
| Symptom | Likely Cause |
|---|---|
| CPU at 100% | Application code, no caching |
| Memory growing | Memory leaks, no limits |
| Database CPU high | Missing indexes, N+1 queries |
| Disk I/O high | Too much logging, no SSD |
| Connection pool exhausted | Pool too small, slow queries |
Database Analysis
When you suspect the database is the bottleneck, these PostgreSQL queries help identify problematic queries and connection issues. Enable pg_stat_statements before your load test to capture query performance data.
-- Find slow queries during load test
-- (PostgreSQL 13+ renamed these columns to mean_exec_time / total_exec_time)
SELECT query, calls, mean_time, total_time
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 20;
-- Check connection count
SELECT count(*) FROM pg_stat_activity;
-- Lock contention
SELECT * FROM pg_locks WHERE NOT granted;
The lock query reveals blocking issues that cause request pile-ups. If you see many ungranted locks, you likely have contention problems.
Laravel-Specific Testing
Testing Authenticated Routes
Laravel's Sanctum authentication requires proper CSRF handling during load tests. This example shows the complete flow for authenticated API testing, including token extraction.
// k6: Laravel Sanctum authentication
import http from 'k6/http';

export function setup() {
  // Get CSRF token
  const csrfRes = http.get('https://myapp.com/sanctum/csrf-cookie');

  // Login (Laravel URL-encodes the XSRF-TOKEN cookie, so decode it first)
  const loginRes = http.post('https://myapp.com/login', JSON.stringify({
    email: 'test@example.com',
    password: 'password',
  }), {
    headers: {
      'Content-Type': 'application/json',
      'X-XSRF-TOKEN': decodeURIComponent(csrfRes.cookies['XSRF-TOKEN'][0].value),
    },
  });

  return {
    cookies: loginRes.cookies,
  };
}

export default function (data) {
  http.get('https://myapp.com/api/user', {
    cookies: data.cookies,
  });
}
The setup function runs once before all virtual users start, establishing authentication that's shared across the test. This prevents login endpoint overload.
Testing Queue Performance
Background job processing is often overlooked during load testing. Create jobs in bulk to verify your queue workers can keep up with production traffic. This artisan command helps you stress-test your queue infrastructure.
// Create many jobs for queue testing
Artisan::command('test:queue-load {count=1000}', function (int $count) {
    for ($i = 0; $i < $count; $i++) {
        // Assumes order IDs 1-100 exist; findOrFail surfaces missing IDs
        // instead of silently dispatching with null
        ProcessOrder::dispatch(Order::findOrFail(rand(1, 100)));
    }
    $this->info("Dispatched {$count} jobs");
});
While jobs are processing, monitor queue depth and worker performance to identify if workers are keeping up.
# Monitor queue during test
php artisan queue:work --verbose &
watch -n 1 'php artisan queue:monitor redis:default'
CI/CD Integration
GitHub Actions
Automated load testing in CI catches performance regressions before they reach production. Run these tests nightly or before major releases to establish performance baselines.
# .github/workflows/load-test.yml
name: Load Test

on:
  schedule:
    - cron: '0 2 * * *'  # Nightly
  workflow_dispatch:

jobs:
  load-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install k6
        run: |
          curl -L https://github.com/grafana/k6/releases/download/v0.47.0/k6-v0.47.0-linux-amd64.tar.gz | tar xz
          sudo mv k6-v0.47.0-linux-amd64/k6 /usr/local/bin/

      - name: Run load test
        # k6 exits non-zero when a threshold fails, which fails this step
        run: k6 run --out json=results.json tests/load/smoke.js

      - name: Upload results
        if: always()  # Keep results even when thresholds fail
        uses: actions/upload-artifact@v4
        with:
          name: load-test-results
          path: results.json
k6 exits with a non-zero status when any threshold fails, so the test step itself fails the build, creating accountability for performance as part of your development process.
Threshold Gates
Define strict thresholds that must pass for your build to succeed. These act as performance contracts that prevent gradual degradation over time.
// k6: Strict thresholds for CI
export const options = {
  thresholds: {
    http_req_duration: [
      'p(50)<200',  // Median under 200ms
      'p(95)<500',  // 95th percentile under 500ms
      'p(99)<1000', // 99th percentile under 1s
    ],
    http_req_failed: ['rate<0.01'], // Less than 1% errors
    http_reqs: ['rate>100'],        // At least 100 rps
  },
};
Start with generous thresholds and tighten them as you optimize. Unrealistic thresholds that always fail train teams to ignore them.
Performance Baselines
Establishing Baselines
Before optimizing, you need to know where you stand. Run baseline tests regularly to track performance over time and detect regressions early.
// Baseline test: Run weekly, compare results
export const options = {
  scenarios: {
    baseline: {
      executor: 'constant-vus',
      vus: 50,
      duration: '5m',
    },
  },
};
Tracking Over Time
Store results with timestamps to build a performance history. Simple scripts can compare results across runs to detect regressions automatically.
# Store results with timestamp
k6 run --out json=results/$(date +%Y%m%d).json load-test.js
# Compare with previous
python compare-results.py results/20240115.json results/20240122.json
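A compare script can be as simple as diffing p95 values between two summary files. This hypothetical sketch of `compare-results.py` assumes each file follows k6's `--summary-export` layout with a `metrics.http_req_duration` entry; adjust the keys to match your actual output:

```python
# compare-results.py - flag p95 regressions between two k6 summary files
import json
import sys

def p95_regression(old_path, new_path, tolerance=0.10):
    """Return (old_p95, new_p95, regressed). A regression is a new p95
    more than `tolerance` (10% by default) above the old one."""
    def load_p95(path):
        with open(path) as f:
            # Key layout assumed from k6's --summary-export output
            return json.load(f)["metrics"]["http_req_duration"]["p(95)"]

    old_p95, new_p95 = load_p95(old_path), load_p95(new_path)
    return old_p95, new_p95, new_p95 > old_p95 * (1 + tolerance)

if __name__ == "__main__" and len(sys.argv) >= 3:
    old, new, regressed = p95_regression(sys.argv[1], sys.argv[2])
    print(f"p95: {old}ms -> {new}ms")
    sys.exit(1 if regressed else 0)
```

The non-zero exit on regression lets the script double as a CI gate.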
Common Issues and Solutions
Connection Limits
When testing reveals connection issues, these nginx settings help handle higher concurrent connection counts. Adjust based on your expected peak load.
# nginx.conf - Increase worker connections
events {
    worker_connections 4096;
}

http {
    # Increase keepalive
    keepalive_timeout 65;
    keepalive_requests 1000;
}
Database Connection Pool
Stock Laravel on PHP-FPM opens a database connection per request rather than pooling, so application-level pool settings like these apply when you run under a long-lived runtime such as Laravel Octane on Swoole. Tune the values based on your load test findings to balance connection availability with database resource usage.
// config/database.php (Octane/Swoole-style pool settings; exact keys vary by runtime)
'mysql' => [
    'pool' => [
        'min_connections' => 10,
        'max_connections' => 100,
    ],
],
Start with conservative values and increase based on observed connection usage. Too many connections can overwhelm your database server.
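For a starting point on pool size, a widely cited rule of thumb (popularized by HikariCP's documentation, offered here as a heuristic rather than a guarantee) is roughly twice the database server's core count plus its effective disk spindles:

```python
def pool_size_estimate(db_cores, effective_spindles=1):
    """Rule-of-thumb pool size: connections = cores * 2 + spindles."""
    return db_cores * 2 + effective_spindles

# e.g. an 8-core database server with one SSD-backed volume
print(pool_size_estimate(8))  # 17
```

Validate the estimate under load: the right number is the smallest pool that keeps queries from queueing.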
PHP-FPM Tuning
PHP-FPM settings directly impact how many concurrent requests your application can handle. These settings work well for a mid-sized application, but you should adjust based on your specific workload.
; php-fpm.conf
pm = dynamic
pm.max_children = 50
pm.start_servers = 10
pm.min_spare_servers = 5
pm.max_spare_servers = 20
pm.max_requests = 500
The pm.max_requests setting limits the impact of memory leaks by recycling workers after a set number of requests.
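A common way to pick pm.max_children (a heuristic, not an official formula) is to divide the RAM you can spare for PHP by the average resident memory of one FPM worker:

```python
def fpm_max_children(total_ram_mb, reserved_mb, avg_worker_mb):
    """Rough pm.max_children estimate: leftover RAM / per-worker RSS.
    Measure avg_worker_mb on your own box (e.g. with ps or smem)."""
    return (total_ram_mb - reserved_mb) // avg_worker_mb

# e.g. an 8 GB box with 2 GB reserved for OS/cache and ~60 MB per worker
print(fpm_max_children(8192, 2048, 60))  # 102
```

Sizing this way prevents the swap-induced latency spikes that appear when too many workers compete for memory under load.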
Redis Connection Issues
Under load, Redis connections can become a bottleneck. Persistent connections reduce connection overhead significantly by reusing connections across requests.
// config/database.php
'redis' => [
    'client' => 'phpredis',

    'default' => [
        'persistent' => true,
        'persistent_id' => 'myapp',
        'read_timeout' => 60,
    ],
],
The phpredis client generally performs better than predis under high load. The persistent_id ensures connections are reused across requests.
Best Practices
- Test in production-like environment - Same hardware, data volume, network
- Use realistic data - Don't test with empty database
- Simulate real user behavior - Think times, varied paths
- Monitor everything - Application, database, network, queues
- Test regularly - Catch regressions early
- Start small - Smoke tests before full load tests
- Document findings - Track improvements over time
- Test failure modes - What happens when dependencies fail?
Conclusion
Load testing is essential for production confidence. Start with simple smoke tests, establish baselines, and gradually increase sophistication. Use k6 or similar modern tools for developer-friendly testing, integrate tests into CI/CD, and always test in production-like environments. The goal is discovering limits and bottlenecks before your users do.