How to Perform Load Testing on a Distributed System with k6
In a monolithic world, load testing was a relatively straightforward affair: point a tool at a single endpoint and increase the pressure. In today's landscape of distributed systems, microservices, and serverless functions, this approach is dangerously insufficient. A modern system's performance is not a single number; it's an emergent property of dozens of discrete, interconnected services communicating over a network. A bottleneck might not be in your primary API gateway but in a downstream authentication service, a saturated message queue, or a poorly indexed database table three hops away.
Understanding this systemic behavior under load is paramount. Failing to do so doesn't just risk slow response times; it risks cascading failures, resource exhaustion, and catastrophic outages.
This article provides a technical, actionable guide for CTOs and senior engineers on implementing robust load testing for distributed systems using k6. We will bypass high-level theory and focus on practical implementation, from scripting complex user journeys to executing tests at scale on Kubernetes and, most critically, correlating client-side metrics with your server-side observability stack.

Why k6 for Distributed Systems?
While tools like Apache JMeter have been mainstays, k6 offers a modern, developer-centric approach particularly suited for distributed architectures:
- High Performance, Low Footprint: k6 is written in Go. It uses a single process and an event-loop-based architecture, allowing it to generate significant load from a single machine with minimal CPU and memory overhead. This is crucial for cost-effective testing.
- Developer-First Scripting (JavaScript ES6): Tests are written in JavaScript, which lowers the barrier to entry: your engineers don't need to learn a new domain-specific language or navigate a complex UI. The scripts are code; treat them as such: version-control them, code-review them, and modularize them.
- Built-in Metrics & Thresholds: k6 provides critical metrics (p95/p99 latency, request rates, error rates) out of the box. More importantly, it allows you to define explicit pass/fail criteria (Thresholds) directly in your script, making it ideal for CI/CD integration.
- Extensibility: k6 supports gRPC, WebSockets, Kafka, and other protocols common in distributed systems, not just HTTP (see the gRPC sketch after this list).
- Observability Integration: k6 is designed to plug directly into modern observability stacks, shipping metrics to Prometheus, Grafana, Datadog, and New Relic.
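To make the extensibility point concrete, here is a minimal sketch of a gRPC test using k6's built-in `k6/net/grpc` module. The proto file, host, service, and method names are hypothetical placeholders; substitute one of your own internal services:

```javascript
import grpc from 'k6/net/grpc';
import { check } from 'k6';

const client = new grpc.Client();
// Hypothetical proto definition; point this at one of your own .proto files
client.load(['.'], 'inventory.proto');

export default function () {
  // Hypothetical internal host; plaintext assumes no TLS inside the cluster
  client.connect('inventory.internal:50051', { plaintext: true });

  // Hypothetical fully-qualified RPC name; replace with your own
  const res = client.invoke('inventory.InventoryService/CheckStock', {
    product_id: 'abc-123',
  });

  check(res, {
    'gRPC status is OK': (r) => r && r.status === grpc.StatusOK,
  });

  client.close();
}
```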
Phase 1: Scripting Complex User Scenarios
A distributed system rarely serves a single, stateless request. Users follow a flow: they log in (hitting an auth service), browse a catalog (hitting a product service), and place an order (hitting an order service, which may trigger a payment service and an inventory service). Your test script must model this reality.
Core k6 Script Structure
A k6 script has two main parts: the `options` object, which defines the load profile, and the `default` function, which contains the logic executed by each Virtual User (VU).
```javascript
import http from 'k6/http';
import { check, sleep, group } from 'k6';

// 1. OPTIONS: Define the load profile
export const options = {
  stages: [
    { duration: '1m', target: 100 }, // Ramp-up to 100 VUs over 1 minute
    { duration: '3m', target: 100 }, // Stay at 100 VUs for 3 minutes
    { duration: '1m', target: 0 },   // Ramp-down to 0 VUs
  ],
  thresholds: {
    // 95% of requests must complete below 500ms
    'http_req_duration': ['p(95)<500'],
    // The error rate must stay below 1% (i.e., 99%+ of requests succeed)
    'http_req_failed': ['rate<0.01'],
    // The 'User Login' group must have a 99.9% check success rate
    'checks{group:::User Login}': ['rate>0.999'],
  },
};

// 2. DEFAULT FUNCTION: The VU logic
export default function () {
  const BASE_URL = 'https://api.your-system.com';
  let authToken;

  // Group 1: User Login (Auth Service)
  group('User Login', () => {
    const loginPayload = JSON.stringify({
      email: `user_${__VU}@example.com`, // Parameterize data per VU
      password: 'supersecretpassword',
    });
    const loginParams = {
      headers: { 'Content-Type': 'application/json' },
    };
    const res = http.post(`${BASE_URL}/v1/auth/login`, loginPayload, loginParams);
    check(res, {
      'login successful (status 200)': (r) => r.status === 200,
      'auth token received': (r) => r.status === 200 && !!r.json('token'),
    });
    if (res.status === 200 && res.json('token')) {
      authToken = res.json('token');
    }
  });

  // Only proceed if login was successful
  if (!authToken) {
    return; // Abort this iteration
  }

  const authParams = {
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${authToken}`,
    },
  };

  // Group 2: Browse Products (Product Service)
  group('Browse Products', () => {
    const res = http.get(`${BASE_URL}/v2/products?category=electronics`, authParams);
    check(res, {
      'get products successful (status 200)': (r) => r.status === 200,
    });
    sleep(1.5); // Simulate user think time
  });

  // Group 3: Place Order (Order Service)
  group('Place Order', () => {
    const orderPayload = JSON.stringify({
      productId: 'abc-123',
      quantity: 1,
    });
    const res = http.post(`${BASE_URL}/v1/orders`, orderPayload, authParams);
    check(res, {
      'order placement successful (status 201)': (r) => r.status === 201,
    });
  });

  sleep(2); // Wait before starting a new session
}
```
Key Takeaways from this script:
- Groups: We use `group()` to organize requests into logical transactions (`User Login`, `Browse Products`). This provides aggregated metrics for each step, allowing you to pinpoint which part of the flow is failing or slow.
- Checks: `check()` validates responses. These are not assertions; they don't stop the test. They collect pass/fail metrics, which we then use in our `thresholds`.
- Thresholds: This is your SLO/SLA as code. The test will return a non-zero exit code (failing your CI pipeline) if `p(95)` latency exceeds 500ms or if the error rate climbs above 1%.
- Data Parameterization: We use `__VU` (a k6-specific variable for the Virtual User ID) to create unique usernames. In a real test, you would load this from a shared data array or file to avoid hitting caches and to simulate real-world variability (see the sketch after this list).
- State: We capture the `authToken` from the login response and pass it in subsequent requests, simulating a real, stateful user session.
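Here is a minimal sketch of that data-parameterization pattern using k6's built-in `SharedArray`, which loads the file once in the init context and shares it read-only across all VUs. The `users.json` file and its fields are assumptions; adapt them to your own test data:

```javascript
import http from 'k6/http';
import { SharedArray } from 'k6/data';

// Loaded once, shared read-only across VUs.
// Assumes a file like: [{ "email": "...", "password": "..." }, ...]
const users = new SharedArray('users', function () {
  return JSON.parse(open('./users.json'));
});

export default function () {
  // Give each VU a distinct credential set to avoid cache hits (__VU starts at 1)
  const user = users[(__VU - 1) % users.length];

  const payload = JSON.stringify({ email: user.email, password: user.password });
  http.post('https://api.your-system.com/v1/auth/login', payload, {
    headers: { 'Content-Type': 'application/json' },
  });
}
```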
Phase 2: Executing at Distributed Scale
A single k6 instance, while efficient, cannot simulate the load of millions of users. For a distributed system, you must run a distributed test. The goal is to generate load from multiple "load generator" machines, all orchestrated by a single controller.
While k6 Cloud offers a managed, "push-button" solution for this, a self-hosted approach on Kubernetes provides maximum control and cost-effectiveness for a technical organization. We achieve this using the k6-operator.
The k6-operator introduces a Custom Resource Definition (CRD) to Kubernetes, allowing you to define a distributed load test declaratively, just like a `Deployment` or a `Service`.

Step-by-Step: Distributed Testing with k6-Operator
1. Prerequisite: Install the k6-operator
```bash
# Ensure you are on the correct K8s context
kubectl apply -f https://github.com/grafana/k6-operator/releases/latest/download/bundle.yaml
```
2. Package Your k6 Script in a ConfigMap
The operator needs access to your `script.js`. The simplest way is to load it into a `ConfigMap`:
```bash
kubectl create configmap my-load-test-script --from-file=script.js
```
3. Define the `K6` Custom Resource
Create a YAML file (e.g., `test-run.yaml`) to define the distributed test. This is where the power lies.
```yaml
apiVersion: k6.io/v1alpha1
kind: K6
metadata:
  name: my-distributed-test
spec:
  # 1. Parallelism: Number of k6 worker pods to spin up
  parallelism: 10
  # 2. Script: Reference the ConfigMap created in Step 2
  script:
    configMap:
      name: my-load-test-script
      file: script.js
  # 3. Arguments: Pass k6 CLI flags (e.g., VUs, duration).
  # These override the 'options' in the script, allowing for dynamic test profiles.
  # 1000 VUs total, split across 10 pods; the -o flag enables the Prometheus output.
  arguments: --vus 1000 --duration 10m -o experimental-prometheus-rw
  # 4. Observability: Send metrics to your stack
  runner:
    env:
      # Example: Configure k6 to output to Prometheus Remote-Write
      - name: K6_PROMETHEUS_RW_SERVER_URL
        value: "http://prometheus-remote-write-endpoint.monitoring.svc.cluster.local/api/v1/write"
      - name: K6_PROMETHEUS_RW_TREND_STATS
        value: "p(95),p(99),min,max,avg,med"
```
4. Execute the Test
Simply apply the manifest:
```bash
kubectl apply -f test-run.yaml
```
Kubernetes will now do the following:
- Read the `K6` resource.
- Spin up one `k6-controller` pod.
- Spin up 10 `k6-worker` pods (as defined by `parallelism: 10`).
- The controller automatically distributes the 1000 VUs (`--vus 1000`) among the workers (100 VUs each).
- Each worker pod runs the same `script.js`, streaming its metrics to your configured backend (e.g., Prometheus).
You now have a scalable, repeatable, and declarative load testing framework running natively on your own infrastructure.
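A few commands are useful while the run is in flight. This is a sketch; the job name below is illustrative, as the operator derives real names from the `K6` resource:

```bash
# Watch the runner pods spin up and complete
kubectl get pods -w

# Tail one runner's output (illustrative job name)
kubectl logs -f job/my-distributed-test-1

# Clean up after the run so the same manifest can be re-applied
kubectl delete -f test-run.yaml
```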
Phase 3: The Critical Link: Correlating Client & Server Metrics
This is the single most important, and most frequently missed, step.
Running the test in Phase 2 will tell you what happened from the client's perspective (e.g., "The `/v1/orders` endpoint p95 latency spiked to 3000ms"). It will not tell you why.
The "why" is on your servers:
- Did the
order-service
pod run out of CPU? - Did its database connection pool exhaust?
- Did a downstream gRPC call to the
inventory-service
time out? - Was there a spike in Kafka consumer lag?
To find the "why," you must correlate the k6 client-side metrics with your server-side observability data on a single, shared timeline.
How to Implement Correlation
1. Inject Trace Context from k6
Your distributed tracing system (e.g., OpenTelemetry, Jaeger, Datadog APM) relies on context propagation, typically via HTTP headers like `traceparent`. Your k6 script must generate and inject these headers so that the requests it generates are included in your server-side traces.
```javascript
import http from 'k6/http';
import { check, group } from 'k6';
import { uuidv4 } from 'https://jslib.k6.io/k6-utils/1.4.0/index.js';

// ... (options and other setup) ...

const BASE_URL = 'https://api.your-system.com';

export default function () {
  let authToken;
  // ... (login flow from Phase 1 populates authToken) ...

  // Generate a unique trace ID for this entire user flow.
  // W3C trace-context IDs are hex-only, so strip the UUID hyphens.
  const traceId = uuidv4().replace(/-/g, '');
  const spanId = uuidv4().replace(/-/g, '').substring(0, 16);

  // W3C Trace Context header
  const traceparent = `00-${traceId}-${spanId}-01`;

  const authParams = {
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${authToken}`,
      'traceparent': traceparent, // <-- INJECT THE TRACE HEADER
    },
  };

  group('Browse Products', () => {
    // This request will now be picked up by your APM/tracing backend
    const res = http.get(`${BASE_URL}/v2/products?category=electronics`, authParams);
    check(res, { /* ... */ });
  });

  // ... (rest of script) ...
}
```
2. Build the Unified Dashboard
With the k6-operator shipping k6 metrics to Prometheus (Phase 2) and your script injecting trace IDs (Phase 3), all your data is now in one place.
In Grafana (or your preferred tool), build a dashboard that layers these two data sources:
- Top Panel (Client-Side):
  - `k6_http_reqs_total` (Request Rate)
  - `k6_http_req_duration_p95` (P95 Latency)
  - `k6_http_req_failed_rate` (Error Rate)
  - `k6_vus` (Active Virtual Users)
- Bottom Panels (Server-Side):
  - Per-Service: CPU/Memory usage, pod counts (HPA activity).
  - Database: Query throughput, query latency, connection counts.
  - Queues: Message queue depth (e.g., Kafka lag, RabbitMQ queue size).
  - Network: Ingress/Egress bandwidth, connection errors.
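For example, the client-side panels might be driven by queries like these (a sketch, assuming the metric names produced by the `experimental-prometheus-rw` output with the trend stats configured in Phase 2):

```promql
# Client-side request rate over the last minute
rate(k6_http_reqs_total[1m])

# P95 request duration, published as a gauge by the trend-stats setting
k6_http_req_duration_p95
```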
Now, when you run your load test, you can watch this dashboard. When you see a spike in `k6_http_req_duration_p95`, you look directly below it. You will see the corresponding spike in database connections, the flatlining of a downstream service's pods, or the HPA scaling up a new node.
You have moved from "the site is slow" to "the site is slow because the `order-service` p99 latency is high, which correlates directly with 95% CPU saturation on the `payment-service` deployment, which is failing its health checks." This is an actionable insight.

Conclusion
Load testing a distributed system with k6 is not a one-time event; it's a continuous practice. By scripting realistic scenarios, executing at scale with the k6-operator, and—most importantly—building unified observability dashboards, you transform load testing from a simple pass/fail check into a powerful performance engineering and debugging tool.
By integrating these practices into your CI/CD pipeline, you establish a performance baseline, protect against regressions, and give your engineering teams the high-fidelity data they need to build resilient, scalable, and high-performance systems. This is no longer just "testing"; it is a foundational component of modern systems architecture and operational excellence.
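As a concrete starting point, here is a minimal sketch of such a pipeline gate, assuming GitHub Actions and the official `grafana/k6` Docker image; adapt it to your own CI system. Because thresholds make k6 exit non-zero on breach, the job fails the pipeline automatically:

```yaml
name: load-test
on: [pull_request]

jobs:
  k6-smoke:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Run the Phase 1 script; a breached threshold fails this step
      - name: Run k6
        run: docker run --rm -i -v "$PWD":/scripts grafana/k6 run /scripts/script.js
```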