Dynamic A/B Testing with NGINX Plus

Rick Nelson
Published January 29, 2018

The key‑value store feature was introduced in NGINX Plus R13 for HTTP traffic and extended to TCP/UDP (stream) traffic in NGINX Plus R14. This feature provides an API for dynamically maintaining values that can be used as part of the NGINX Plus configuration, without requiring a reload of the configuration. There are many possible use cases for this feature, and I have no doubt that our customers will find a variety of ways to take advantage of it.

This blog post describes one use case, dynamically altering how the Split Clients module is used to do A/B testing.

The Key-Value Store

The NGINX Plus API can be used to maintain a set of key‑value pairs which NGINX Plus can access at runtime. For example, let’s look at the use case where you want to keep a denylist of client IP addresses that are not allowed to access your site (or particular URLs).

The key is the client IP address, which is captured in the $remote_addr variable. The value is a variable named $denylist_status, set to 1 to indicate that the client IP address is denylisted and to 0 to indicate that it is not.

To configure this, we follow these steps:

  • Set up a shared‑memory zone to store the key‑value pairs (the keyval_zone directive)
  • Give the zone a name
  • Specify the maximum amount of memory to allocate for it
  • Optionally, specify a state file to store the entries so they persist across NGINX Plus restarts

For the state file, we have previously created the /etc/nginx/state_files directory and made it writable by the unprivileged user that runs the NGINX worker processes (as defined by the user directive elsewhere in the configuration). Here we include the state parameter to the keyval_zone directive to create the file denylist.json for storing the key‑value pairs:

keyval_zone zone=denylist:64k
            state=/etc/nginx/state_files/denylist.json;

In NGINX Plus R16 and later, we can take advantage of two additional key‑value features:

  • Set an expiration time for the entries in a key‑value store, by adding the timeout parameter to the keyval_zone directive. For example, to denylist addresses for two hours, add timeout=2h.
  • Synchronize the key‑value store across a cluster of NGINX Plus instances, by adding the sync parameter to the keyval_zone directive. You must also include the timeout parameter in this case.

So to expand our example to use a synchronized key‑value store of addresses that are denylisted for two hours, the directive becomes:

keyval_zone zone=denylist:64k timeout=2h sync
            state=/etc/nginx/state_files/denylist.json;

For detailed instructions on setting up synchronization for key-value stores, see the NGINX Plus Admin Guide.
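In brief, synchronization relies on the zone_sync module, which is configured in the stream context. Here is a minimal sketch, in which the hostname nginx-cluster.example.com is a placeholder for a name that resolves to all cluster members:

stream {
    resolver 10.0.0.53 valid=20s;   # assumed internal DNS resolver

    server {
        # Dedicated port for cluster synchronization traffic
        listen 9000;
        zone_sync;
        zone_sync_server nginx-cluster.example.com:9000 resolve;
    }
}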

Next we add the keyval directive to define the key‑value pair. We specify that the key is the client IP address ($remote_addr) and that the value is assigned to the $denylist_status variable:

keyval $remote_addr $denylist_status zone=denylist;
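With this in place, blocking denylisted clients takes just a simple check on $denylist_status. Here is a minimal sketch of how a server block might enforce the denylist (returning status code 403 is our choice for this example):

server {
    listen 80;

    # Reject requests from denylisted client IP addresses; the 'if'
    # condition is true when $denylist_status is 1 and false when it
    # is 0 or empty (address not in the store)
    if ($denylist_status) {
        return 403;
    }

    # ...
}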

To create a pair in the key‑value store, use an HTTP POST request. For example:

# curl -iX POST -d '{"10.11.12.13":1}' http://localhost/api/3/http/keyvals/denylist

To modify the value in an existing key‑value pair, use an HTTP PATCH request. For example:

# curl -iX PATCH -d '{"10.11.12.13":0}' http://localhost/api/3/http/keyvals/denylist

To remove a key‑value pair, use an HTTP PATCH request to set the value to null. For example:

# curl -iX PATCH -d '{"10.11.12.13":null}' http://localhost/api/3/http/keyvals/denylist

Split Clients for A/B Testing

The Split Clients module allows you to split incoming traffic between upstream groups based on a request characteristic of your choice. You define the split as the percentage of incoming traffic to forward to the different upstream groups. A common use case is testing the new version of an application by sending a small proportion of traffic to it and the remainder to the current version. In our example, we’re sending 5% of the traffic to the upstream group for the new version, appversion2, and the remainder (95%) to the current version, appversion1.

We’re splitting the traffic based on the client IP address in the request, so we set the split_clients directive’s first parameter to the NGINX variable $remote_addr. With the second parameter we set the variable $upstream to the name of the upstream group.

Here’s the basic configuration:

split_clients $remote_addr $upstream {
    5%  appversion2;
    *   appversion1;
}

upstream appversion1 {
   # ...
}

upstream appversion2 {
   # ...
}

server {
    listen 80;
    location / {
        proxy_pass http://$upstream;
    }
}

Using the Key-Value Store with Split Clients

Prior to NGINX Plus R13, if you wanted to change the percentages for the split, you had to edit the configuration file and reload the configuration. Using the key‑value store, you simply change the percentage value stored in the key‑value pair and the split changes accordingly, without the need for a reload.

Building on the use case in the previous section, let’s say we have decided that we want NGINX Plus to support the following options for how much traffic gets sent to appversion2: 0%, 5%, 10%, 25%, 50%, and 100%. We also want to base the split on the Host header (captured in the NGINX variable $host). The following NGINX Plus configuration implements this functionality.

First we set up the key‑value store:

keyval_zone zone=split:64k state=/etc/nginx/state_files/split.json;
keyval $host $split_level zone=split;

As mentioned for the initial use case, in an actual deployment it makes sense to base the split on a request characteristic like the client IP address, $remote_addr. In a simple test using a tool like curl, however, all the requests come from a single IP address, so there is no split to observe.

For the test, we instead base the split on a value that is more random: $request_id. To make it easy to transition the configuration from test to production, we create a new variable in the server block, $client_ip, setting it to $request_id for testing and to $remote_addr for production; switching between the two is the one‑line change shown below.
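These are the two set directives as they appear in the server block of the full configuration later (for testing, the production line stays commented out):

#set $client_ip $remote_addr; # Production
set $client_ip $request_id; # For testing only

With $client_ip defined, we then set up the split_clients configuration.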

The variable for each split percentage (split0 for 0%, split5 for 5%, and so on) is set in a separate split_clients directive:

split_clients $client_ip $split0 {
    *   appversion1;
}
split_clients $client_ip $split5 {
    5%  appversion2;
    *   appversion1;
}
split_clients $client_ip $split10 {
    10% appversion2;
    *   appversion1;
}
split_clients $client_ip $split25 {
    25% appversion2;
    *   appversion1;
}
split_clients $client_ip $split50 {
    50% appversion2;
    *   appversion1;
}
split_clients $client_ip $split100 {
    *   appversion2;
}

Now that we have the key‑value store and split_clients configured, we can set up a map to set the $upstream variable to the upstream group specified in the appropriate split variable:

map $split_level $upstream {
    0        $split0;
    5        $split5;
    10       $split10;
    25       $split25;
    50       $split50;
    100      $split100;
    default  $split0;
}

Finally, we have the rest of the configuration for the upstream groups and the virtual server. Note that we have also configured the NGINX Plus API, which is used both for the key‑value store and by the live activity monitoring dashboard (the new style of dashboard introduced in NGINX Plus R14):

upstream appversion1 {
    zone appversion1 64k;
    server 192.168.50.100;
    server 192.168.50.101;
}

upstream appversion2 {
    zone appversion2 64k;
    server 192.168.50.102;
    server 192.168.50.103;
}

server {
    listen 80;
    status_zone test;
    #set $client_ip $remote_addr; # Production
    set $client_ip $request_id; # For testing only

    location / {
        proxy_pass http://$upstream;
    }

    location /api {
        api write=on;
        # in production, directives restricting access
    }

    location = /dashboard.html {
        root /usr/share/nginx/html;
    }
}

Using this configuration, we can now control how traffic is split between the appversion1 and appversion2 upstream groups by sending an API request that sets the $split_level value for a hostname. For example, the following two requests direct 5% of the traffic for www.example.com, and 25% of the traffic for www2.example.com, to the appversion2 upstream group:

# curl -iX POST -d '{"www.example.com":5}' http://localhost/api/3/http/keyvals/split# curl -iX POST -d '{"www2.example.com":25}' http://localhost/api/3/http/keyvals/split

To change the value for www.example.com to 10:

# curl -iX PATCH -d '{"www.example.com":10}' http://localhost/api/3/http/keyvals/split

To clear a value:

# curl -iX PATCH -d '{"www.example.com":null}' http://localhost/api/3/http/keyvals/split

After each one of these requests, NGINX Plus immediately starts using the new split value.
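One quick way to observe the effect (a sketch; it assumes jq is installed for readability) is to send a batch of test requests and then compare the per‑upstream request counters exposed by the NGINX Plus API:

# for i in $(seq 1 100); do curl -s -o /dev/null -H "Host: www.example.com" http://localhost/; done
# curl -s http://localhost/api/3/http/upstreams/appversion1 | jq '[.peers[].requests] | add'
# curl -s http://localhost/api/3/http/upstreams/appversion2 | jq '[.peers[].requests] | add'

With the split for www.example.com set to 10, roughly 10% of the requests land on appversion2.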

Here is the full configuration file:

# Set up a key‑value store to specify the percentage to send to each
# upstream group based on the 'Host' header.

keyval_zone zone=split:64k state=/etc/nginx/state_files/split.json;
keyval $host $split_level zone=split;

split_clients $client_ip $split0 {
    *   appversion1;
}
split_clients $client_ip $split5 {
    5%  appversion2;
    *   appversion1;
}
split_clients $client_ip $split10 {
    10% appversion2;
    *   appversion1;
}
split_clients $client_ip $split25 {
    25% appversion2;
    *   appversion1;
}
split_clients $client_ip $split50 {
    50% appversion2;
    *   appversion1;
}
split_clients $client_ip $split100 {
    *   appversion2;
}

map $split_level $upstream {
    0        $split0;
    5        $split5;
    10       $split10;
    25       $split25;
    50       $split50;
    100      $split100;
    default  $split0;
}

upstream appversion1 {
    zone appversion1 64k;
    server 192.168.50.100;
    server 192.168.50.101;
}

upstream appversion2 {
    zone appversion2 64k;
    server 192.168.50.102;
    server 192.168.50.103;
}

server {
    listen 80;
    status_zone test;

    # In each 'split_clients' block above, '$client_ip' controls which 
    # application receives each request. For a production application, we set it
    # to '$remote_addr' (the client IP address). But when testing from just one 
    # client, '$remote_addr' is always the same; to get some randomness, we set 
    # it to '$request_id' instead.

    #set $client_ip $remote_addr; # Production
    set $client_ip $request_id; # Testing only

    location / {
        proxy_pass http://$upstream;
    }

    # Configure the NGINX Plus API and dashboard. For production, add directives 
    # to restrict access to the API, for example 'allow' and 'deny'.
    location /api {
        api write=on;
        # in production, directives restricting access
    }

    location = /dashboard.html {
        root /usr/share/nginx/html;
    }
}

Conclusion

This is just one example of what you can do with the key‑value store. You can use a similar approach for request‑rate limiting, bandwidth limiting, or connection limiting.
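For example, here is a minimal sketch of key‑value‑driven request‑rate limiting (the zone and variable names are hypothetical): clients flagged in the store with the value 1 are rate‑limited, while everyone else passes through unlimited, because requests with an empty key are not counted against the limit.

keyval_zone zone=ratelimited:64k;
keyval $remote_addr $rate_limited zone=ratelimited;

map $rate_limited $limit_key {
    1       $binary_remote_addr;  # flagged clients are limited
    default "";                   # empty key: no limit applied
}

limit_req_zone $limit_key zone=per_client:10m rate=10r/s;

server {
    listen 80;

    location / {
        limit_req zone=per_client;
        proxy_pass http://appversion1;
    }
}

Flagging a client then follows the same API pattern as before:

# curl -iX POST -d '{"10.11.12.13":1}' http://localhost/api/3/http/keyvals/ratelimited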

If you don’t already have NGINX Plus, start your free 30‑day trial and give it a try.


"This blog post may reference products that are no longer available and/or no longer supported. For the most current information about available F5 NGINX products and solutions, explore our NGINX product family. NGINX is now part of F5. All previous NGINX.com links will redirect to similar NGINX content on F5.com."