Architecting a Scalable, Server-Side A/B Testing Engine

In the modern landscape of software development, deploying features based on intuition is a liability. For Chief Technology Officers and Senior Engineers, the objective is to build an experimentation infrastructure that is robust, performant, and statistically rigorous. While third-party tools exist, implementing a custom, server-side solution often yields better latency control and deeper integration with domain-specific logic.

This article details the architectural implementation of a deterministic A/B testing framework. We will explore user bucketing strategies, persistence layers, and the integration of telemetry, positioning this guide for teams leveraging high-velocity product engineering services for startups.

1. Architectural Strategy: Client-Side vs. Server-Side

Before writing code, one must choose the execution context. Client-side testing (via JS snippets) is easier to implement but suffers from the "Flash of Original Content" (FOOC) and performance degradation. For enterprise-grade applications, Server-Side Experimentation is the superior choice.

Key Advantages of Server-Side Implementation:

  • Performance: Decisions are made before the HTML is rendered or the JSON response is sent.
  • Consistency: Omnichannel consistency (Web, Mobile, Email) is guaranteed as the "source of truth" lies in the backend.
  • Security: Sensitive logic remains hidden from the client browser.

2. The Mathematics of Deterministic Bucketing

The core of any A/B testing engine is the Bucketing Algorithm. We need a function that maps a User ID to a variant (e.g., Control vs. Treatment) consistently without needing a database lookup for every request. This requires statistically independent hashing.

We utilize a deterministic hash function (like MD5 or MurmurHash) combined with a "Salt" (the Experiment ID).

$$\text{Assignment} = \text{Hash}(\text{UserID} + \text{ExperimentID}) \bmod 100$$

If the resulting integer falls within the defined traffic allocation (e.g., 0-49 for Control, 50-99 for Treatment), the user is assigned accordingly.
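As a quick worked example, using a hypothetical hash output, suppose the hash of a particular UserID and ExperimentID combination is 3,476,211,987:

$$3{,}476{,}211{,}987 \bmod 100 = 87$$

Under a 50/50 split, 87 falls in the 50-99 range, so this user would consistently be served the Treatment variant on every request.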

3. Implementation in TypeScript

Below is a production-ready implementation using TypeScript and murmurhash for uniform distribution.

Step A: The Experiment Interface

First, define the structure of an experiment. This ensures type safety across your engineering team.

// types/experiment.ts

export enum Variant {
  CONTROL = 'control',
  TREATMENT = 'treatment',
  OFF = 'off' // Fallback
}

export interface ExperimentConfig {
  id: string; // Unique Salt
  name: string;
  trafficAllocation: number; // 0 to 100
  variants: Variant[];
}

// Example Configuration
export const NEW_CHECKOUT_FLOW: ExperimentConfig = {
  id: 'exp_checkout_2024_v1',
  name: 'New One-Page Checkout',
  trafficAllocation: 50, // 50% of users participate
  variants: [Variant.CONTROL, Variant.TREATMENT]
};

Step B: The Deterministic Bucketing Service

We use the murmurhash library for its speed and avalanche properties, ensuring an even split of users.

// services/ExperimentService.ts
import murmurhash from 'murmurhash';
import { ExperimentConfig, Variant } from '../types/experiment';

export class ExperimentService {
  
  /**
   * Determines the variant for a given user deterministically.
   * @param userId - The unique identifier of the user (UUID).
   * @param experiment - The experiment configuration object.
   * @returns The selected Variant.
   */
  public getVariant(userId: string, experiment: ExperimentConfig): Variant {
    // 1. Create a composite key to ensure independence between experiments
    const hashKey = `${userId}:${experiment.id}`;

    // 2. Generate a deterministic integer using MurmurHash v3
    // (murmurhash.v3 returns an unsigned 32-bit integer)
    const hashValue = murmurhash.v3(hashKey);

    // 3. Normalize to the 0-99 range
    const normalizedValue = hashValue % 100;

    // 4. Check if user is excluded based on traffic allocation
    if (normalizedValue >= experiment.trafficAllocation) {
      return Variant.OFF;
    }

    // 5. Assign Variant (Simple 50/50 Split Logic)
    // For complex multi-variant splits, use weighted ranges.
    const variantIndex = hashValue % experiment.variants.length;
    return experiment.variants[variantIndex];
  }
}

This code ensures that User A will always see the same variant for Experiment X, regardless of the server node processing the request, without hitting a database like Redis or PostgreSQL.
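For illustration, a minimal usage sketch might look like the following (the user ID and import paths here are hypothetical; in practice the service would be instantiated once and injected where needed):

// Example usage (illustrative sketch)
import { ExperimentService } from './services/ExperimentService';
import { NEW_CHECKOUT_FLOW } from './types/experiment';

const experimentService = new ExperimentService();

// 'user_7f3a9c12' is a hypothetical user identifier; any stable ID works
const variant = experimentService.getVariant('user_7f3a9c12', NEW_CHECKOUT_FLOW);

// The same user and experiment always produce the same result,
// on any server node, with no network round trip.
console.log(`Bucketed into: ${variant}`);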

4. Telemetry and Event Tracking

Assigning a variant is only half the battle. You must track the assignment to correlate it with conversion metrics. This is often where data engineering services intersect with product engineering.

When the getVariant method is called, an "Exposure Event" should be emitted to your analytics pipeline (e.g., Segment, Snowflake, or a custom solution).

// services/AnalyticsService.ts

interface ExposureEvent {
  event: 'experiment_exposure';
  userId: string;
  experimentId: string;
  variant: string;
  timestamp: string;
}

export function trackExposure(userId: string, experimentId: string, variant: string) {
  const eventPayload: ExposureEvent = {
    event: 'experiment_exposure',
    userId,
    experimentId,
    variant,
    timestamp: new Date().toISOString()
  };

  // Push to message queue (e.g., Kafka, SQS) or Analytics API
  console.log('Telemetry Emitted:', JSON.stringify(eventPayload));
}

5. Middleware Integration (Express.js Example)

To apply this seamlessly, integrate the logic into your middleware chain. This allows downstream controllers to simply check req.variant without worrying about the hashing logic; a brief controller sketch follows the middleware below.

// middleware/experimentMiddleware.ts
import { Request, Response, NextFunction } from 'express';
import { ExperimentService } from '../services/ExperimentService';
import { NEW_CHECKOUT_FLOW, Variant } from '../types/experiment';
import { trackExposure } from '../services/AnalyticsService';

// Assumes the Express Request type has been augmented elsewhere
// (e.g. in a types/express.d.ts declaration file) with:
//   user?: { id: string };
//   variant?: Variant;

const experimentService = new ExperimentService();

export const checkoutExperimentMiddleware = (req: Request, res: Response, next: NextFunction) => {
  const userId = req.user?.id; // Assuming auth middleware ran previously

  if (!userId) {
    // Fallback for unauthenticated users: serve the control experience
    req.variant = Variant.CONTROL;
    return next();
  }

  const variant = experimentService.getVariant(userId, NEW_CHECKOUT_FLOW);
  
  // Attach decision to request object for Controller access
  req.variant = variant;

  // Track exposure immediately
  if (variant !== Variant.OFF) {
    trackExposure(userId, NEW_CHECKOUT_FLOW.id, variant);
  }

  next();
};
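To close the loop, a downstream controller can branch on the attached decision. The following is a minimal sketch (the route handler and template names are hypothetical, and it relies on the Request augmentation noted above):

// controllers/checkoutController.ts (illustrative sketch)
import { Request, Response } from 'express';
import { Variant } from '../types/experiment';

export function renderCheckout(req: Request, res: Response) {
  // req.variant was attached by checkoutExperimentMiddleware
  if (req.variant === Variant.TREATMENT) {
    // Hypothetical template for the new one-page checkout
    return res.render('checkout-one-page');
  }

  // Control and OFF both receive the existing multi-step checkout
  return res.render('checkout-classic');
}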

Conclusion

Implementing a custom A/B testing framework allows engineering teams to maintain full control over latency, security, and user experience. By leveraging deterministic hashing, you eliminate the need for costly state management of user assignments, making your application stateless and easier to scale.

For organizations aiming to accelerate their development cycle, 4Geeks offers specialized product engineering services for startups. Whether you need custom software development to build these frameworks from scratch, or DevOps engineering services to automate your deployment pipelines, 4Geeks stands as a premier partner for technical excellence.

FAQs

What are the primary benefits of server-side A/B testing compared to client-side testing?

Server-side A/B testing significantly improves performance by making variant decisions before the HTML is rendered or the response is sent, eliminating the "Flash of Original Content" (FOOC) common in client-side scripts. Additionally, it ensures omnichannel consistency across web, mobile, and email platforms, while keeping sensitive experiment logic secure and hidden from the user's browser.

How does deterministic bucketing work without a database lookup?

Deterministic bucketing relies on a mathematical approach rather than database storage to assign users to test variants. By using a hash function (such as MurmurHash) on a combination of the User ID and a unique Experiment ID (salt), the system generates a consistent integer. This ensures that a specific user is always mapped to the same variant (e.g., Control or Treatment) purely through calculation, reducing latency and infrastructure costs.

Why is telemetry essential for a custom A/B testing framework?

Merely assigning a user to a variant is insufficient for analysis; you must also verify that the user actually experienced the change. Telemetry handles this by emitting an "exposure event" to an analytics pipeline (like Segment or Snowflake) the moment a variant is assigned. This data is crucial for correlating the test condition with downstream metrics, such as conversion rates or engagement, to prove statistical significance.
