User:FFurnari-WMF/HaproxyGeoIPTest

From Wikitech

Context

Checking whether querying the MaxMind GeoIP database directly from HAProxy to retrieve ISP (and other) information for each request is feasible, compared to generating map files with the same data.

The instructions below are based on a bullseye container with a minimal HAProxy configuration and a custom Lua script that queries the database and sets variables accordingly.

Instructions

Caution: See https://gitlab.wikimedia.org/fabfur/haproxy-maxmind-test for updated versions of the configuration and scripts below!

LUA lookup

  • Build and spawn a dedicated container for the test:
FROM debian:bullseye

ENV container=docker
ENV LC_ALL=C
ENV DEBIAN_FRONTEND=noninteractive

RUN apt-get update && apt-get install -y procps libmaxminddb-dev curl git lua5.3-dev build-essential pkg-config

RUN curl https://haproxy.debian.net/haproxy-archive-keyring.gpg > /usr/share/keyrings/haproxy-archive-keyring.gpg
RUN echo deb "[signed-by=/usr/share/keyrings/haproxy-archive-keyring.gpg]" http://haproxy.debian.net bullseye-backports-2.8 main > /etc/apt/sources.list.d/haproxy.list

RUN apt-get update && apt-get install -y haproxy=2.8.\*
RUN apt-get clean && rm -fr /var/lib/apt/lists/* /tmp/* /var/tmp/*

VOLUME /etc/haproxy/
  • Build and spawn container:
podman build -t localhost/test-haproxy:latest .
podman run --name test-haproxy -it -v $PWD:/etc/haproxy -p 8088:8088 localhost/test-haproxy:latest
  • Clone lua-maxminddb and build the required library for the maxminddb bindings:
git clone https://github.com/fabled/lua-maxminddb.git
cd lua-maxminddb
# Edit Makefile to use lua 5.3 (currently in bullseye)
sed -i 's/lua5\.2/lua5.3/' Makefile
make && make install
  • Download the GeoIP2-ISP.mmdb into a custom directory (I used /etc/haproxy/)
  • Create the Lua lookup script (/etc/haproxy/geoip.lua) that queries the database:
-- geoip.lua
local maxminddb = require("maxminddb")

local dbpath = "/etc/haproxy/GeoIP2-ISP.mmdb"

local db, err = maxminddb.open(dbpath)
if not db then
    core.Alert("Error opening MaxMind DB: " .. err)
    return
end

local function lookup_geoip(txn)
   -- local ip_address = txn.f:src()
   
   -- Use X-Fake-IP header for testing instead of actual src
   local ip_address = tostring(txn:get_var("txn.x_fake_ip"))
   
   core.Alert(ip_address) -- debug, to remove
   
   local result, err = db:lookup(ip_address)

   if not result then
      core.Alert("Error looking up IP: " .. err)
      txn:set_var("txn.isp", "N/A")
      return
   end

   local isp = result:get("isp") or "N/A"
   txn:set_var("txn.isp", isp)
end

core.register_action("lookup_geoip", {"tcp-req", "http-req"}, lookup_geoip)
  • Create a minimal haproxy configuration to make this work (/etc/haproxy/haproxy.cfg) :
global
   log stdout len 16384 local0
   stats timeout 30s
   user haproxy
   group haproxy
   daemon

   lua-load /etc/haproxy/geoip.lua

defaults
    log     global
    mode http
    option httplog
    timeout http-request 10s
    timeout queue 1m
    timeout connect 10s
    timeout client 1m
    timeout server 1m
    timeout check 10s

frontend http
    bind :8088
    mode http 
    # http-response set-header X-Requestctl-Prov %[var(txn.reqctl)]
    # Use X-Fake-IP header for testing instead of actual src
    http-request set-var(txn.x_fake_ip) req.hdr(x-fake-ip)
    http-request lua.lookup_geoip
    http-after-response set-header X-Test-ISP %[var(txn.isp)]
    default_backend ok

backend ok
    http-request return status 200 content-type "text/plain" string "OK!"
  • Start haproxy in the foreground with haproxy -db -V -f /etc/haproxy/haproxy.cfg
  • Issue a simple request to the container: curl -v -H 'x-fake-ip: 8.8.8.8' localhost:8088
  • This should return something like:
*   Trying 127.0.0.1:8088...
* Connected to localhost (127.0.0.1) port 8088 (#0)
> GET / HTTP/1.1
> Host: localhost:8088
> User-Agent: curl/7.88.1
> Accept: */*
> x-fake-ip: 8.8.8.8
> 
< HTTP/1.1 200 OK
< content-length: 3
< content-type: text/plain
< x-test-isp: Google
< 
* Connection #0 to host localhost left intact
OK!

HAProxy map lookup

The same container built above can be reused; obviously there is no need to install the maxminddb bindings for Lua, and the HAProxy configuration is different.

  • Build a map file (in HAProxy map format) containing all MaxMind subnets and the associated ISP names:
# requires golang compiler installed
# requires GeoIP2-ISP.mmdb somewhere 
git clone https://github.com/ross/haproxy-mapper                                                    
cd haproxy-mapper/                                                                                  
go build .                                                                                          
./haproxy-mapper -outdir /etc/haproxy/ -isp -isp-db /etc/haproxy/GeoIP2-ISP.mmdb
  • This will generate a /etc/haproxy/ip_to_isp file (renamed here to ip_to_isp.map for clarity)
  • A minimal HAProxy configuration that achieves the same result as above could be:
global
   log stdout len 16384 local0
   stats timeout 30s
   user haproxy
   group haproxy
   daemon

defaults
    log     global
    mode http
    option httplog
    timeout http-request 10s
    timeout queue 1m
    timeout connect 10s
    timeout client 1m
    timeout server 1m
    timeout check 10s

frontend http
    bind :8088
    mode http 
    # Use X-Fake-IP header for testing instead of actual src
    http-request set-var(txn.isp) req.hdr(x-fake-ip),map_ip(/etc/haproxy/ip_to_isp.map)
    http-after-response set-header x-test-isp %[var(txn.isp)]

    default_backend ok

backend ok
    http-request return status 200 content-type "text/plain" string "OK!"
  • The same curl request as above should produce the same result
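HAProxy's map_ip converter performs a longest-prefix match of the address against the CIDR keys of the map file. A rough Python sketch of the same semantics (the entries are illustrative, and real HAProxy uses an optimized tree rather than a linear scan):

```python
import ipaddress

def map_ip_lookup(mapping, ip):
    """Longest-prefix match over CIDR keys, like HAProxy's map_ip converter."""
    addr = ipaddress.ip_address(ip)
    best_net, best_value = None, None
    for cidr, value in mapping.items():
        net = ipaddress.ip_network(cidr)
        if addr in net and (best_net is None or net.prefixlen > best_net.prefixlen):
            best_net, best_value = net, value
    return best_value

# Illustrative entries in <CIDR> <value> format, as in ip_to_isp.map
isp_map = {
    "8.8.8.0/24": "Google",
    "8.0.0.0/8": "Level 3",
}

print(map_ip_lookup(isp_map, "8.8.8.8"))  # Google (longest prefix wins)
print(map_ip_lookup(isp_map, "8.1.2.3"))  # Level 3
```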

LUA lookup and cache map

As also described in the haproxy-geoip README file, an in-memory map used to cache the retrieved data can be extremely beneficial for subsequent requests. This can be achieved with a configuration like:

global
   log stdout len 16384 local0
   stats timeout 30s
   user haproxy
   group haproxy
   daemon

   lua-load /etc/haproxy/geoip.lua

defaults
    log     global
    mode http
    option httplog
    timeout http-request 10s
    timeout queue 1m
    timeout connect 10s
    timeout client 1m
    timeout server 1m
    timeout check 10s

frontend http
    bind :8088
    mode http 
    # http-response set-header X-Requestctl-Prov %[var(txn.reqctl)]
    # Use X-Fake-IP header for testing instead of actual src
    
    ## lookup from memory
    http-request set-var(txn.x_fake_ip) req.hdr(x-fake-ip)
    acl isp_in_map var(txn.x_fake_ip),map_ip(/etc/haproxy/haproxy_cache_isp.map) -m found

    ## Set the var from the cache map, if present
    http-request set-var(txn.isp) req.hdr(x-fake-ip),map(/etc/haproxy/haproxy_cache_isp.map) if isp_in_map
    
    ## lookup from database if var doesn't exist
    http-request lua.lookup_geoip if !{ var(txn.isp) -m found }
    
    ## write to map if not found
    ## as the map is in-memory, use the runtime API to check the content
    ## See https://www.haproxy.com/documentation/haproxy-runtime-api/reference/show-map/
    http-request set-map(/etc/haproxy/haproxy_cache_isp.map) %[req.hdr(x-fake-ip)] %[var(txn.isp)] if !isp_in_map

    # http-request lua.lookup_geoip
    http-after-response set-header X-Test-ISP %[var(txn.isp)]
    default_backend ok

backend ok
    http-request return status 200 content-type "text/plain" string "OK!"
  • Notice that the map file must exist (touch /etc/haproxy/haproxy_cache_isp.map) but its content is kept entirely in memory by HAProxy (per process). It can be inspected using the show-map runtime API, cannot be read back from the filesystem, and is reset every time the HAProxy process terminates.
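The flow above (check the cache map, fall back to the Lua lookup, write the result back) is a memoization pattern. A minimal Python model, where expensive_lookup is a hypothetical stand-in for the Lua/MaxMind query:

```python
cache = {}  # models the in-memory haproxy_cache_isp.map

def expensive_lookup(ip):
    """Hypothetical stand-in for the Lua/MaxMind database query."""
    database = {"8.8.8.8": "Google"}
    return database.get(ip, "N/A")

def lookup_isp(ip):
    # 1. check the cache map first (acl isp_in_map ... -m found)
    if ip in cache:
        return cache[ip]
    # 2. fall back to the database (http-request lua.lookup_geoip if ...)
    isp = expensive_lookup(ip)
    # 3. store the result for subsequent requests (http-request set-map(...))
    cache[ip] = isp
    return isp

lookup_isp("8.8.8.8")  # first request hits the database
lookup_isp("8.8.8.8")  # second request is served from the cache
```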

Benchmarks

In order to get realistic results, it is useful to first generate a list of random IPs from the <CIDR> <ISP NAME> map file.

The following Python script iterates over each line and generates one random IP from each network, keeping the ISP name as the second column (used by the benchmark script for verification). Usage: ./get_random_ip.py ip_to_isp.map > random_entries.list

#!/usr/bin/env python3
import ipaddress
import random
import sys

def main():
    if len(sys.argv) != 2:
        print(f"Usage: {sys.argv[0]} <input_file>")
        print(f"  Output will be written to stdout\n")
        sys.exit(1)
    
    input_file = sys.argv[1]
    
    # Read all entries from the file
    entries = []
    try:
        with open(input_file, 'r') as f:
            for line in f:
                line = line.strip()
                if line and not line.startswith('#'):
                    parts = line.split(None, 1)  # Split on first whitespace
                    if len(parts) == 2:
                        cidr, name = parts
                        entries.append((cidr, name))
    except Exception as e:
        print(f"Error reading input file: {e}")
        sys.exit(1)

    for cidr, name in entries:
        network = ipaddress.ip_network(cidr)

        # Pick a random offset instead of materializing the whole network,
        # which would exhaust memory on large (especially IPv6) subnets
        if network.version == 4 and network.num_addresses > 2:
            # skip the network and broadcast addresses
            offset = random.randrange(1, network.num_addresses - 1)
        else:
            offset = random.randrange(network.num_addresses)

        random_ip = str(network[offset])
        print(f"{random_ip} {name}")

if __name__ == "__main__":
    main()

A very simple benchmark script to test timings at various concurrency levels. Usage example: ./benchmark.py random_entries.list --target http://localhost:8088 --concurrency 100 --num-requests 1000 --random

#!/usr/bin/env python3

import time
import argparse
import asyncio
# import aiohttp
import statistics
import urllib3
import random
from itertools import cycle, islice
from typing import List, Tuple
import numpy as np

async def main(args):
    ip_entries = []

    with open(args.input_file, 'r') as f:
        l = f.readlines()
        if args.num_requests > len(l):
            print(f"File has only {len(l)} lines, asked {args.num_requests} requests, cycling over...")
        if args.random:
            lines = random.choices(l, k=args.num_requests)
        elif args.num_requests > len(l):
            lines = list(islice(cycle(l), args.num_requests))
        else:
            lines = l[0:args.num_requests]
            
    for line in lines:
        line = line.strip()
        if not line or line.startswith('#'): # remove empty lines or comments
            continue
        parts = line.split(maxsplit=1) # ISP names can contain spaces
        if len(parts) == 2: # just to be sure
            ip, name = parts
            ip_entries.append((ip, name))

    # if something goes wrong
    if not ip_entries:
        print("No valid entries found in input file")
        return

    # ensure concurrency doesn't exceed the number of entries in the file
    concurrency = min(args.concurrency, len(ip_entries))

    print(f"Processing {len(ip_entries)} requests to {args.target} with concurrency {concurrency}")

    # just for readability
    batch_size = concurrency

    # create a list of requests already divided in batches
    batches = []
    for i in range(0, len(ip_entries), batch_size):
        batches.append(ip_entries[i:i+batch_size])


    # To store all requests duration
    all_request_times = []

    # Send requests in batches
    for i, batch in enumerate(batches):
        if args.debug:
            print(f"Batch {i+1} of {len(batches)}")
        request_times = await process_batch(batch, args.target, args.debug)
        all_request_times.extend(request_times)

    # Calculate statistics over all requests timing
    # Time is expressed in seconds so need to convert to ms
    mean_request_time_ms = statistics.mean(all_request_times) * 1000
    print(f"\n# {args.test_name}")
    print("Results:")
    print(f"  Total requests: {len(all_request_times)}, concurrency: {concurrency}")
    print(f"  Mean time: {mean_request_time_ms:.3f} ms")
    print(f"  Percentiles:")
    print(f"  p75\tp95\tp99\tp99.9\tp99.99")
    print(f"  {1000 * np.percentile(all_request_times,75):.3f}\t{1000 * np.percentile(all_request_times,95):.3f}\t{1000 * np.percentile(all_request_times,99):.3f}\t{1000 * np.percentile(all_request_times,99.9):.3f}\t{1000 * np.percentile(all_request_times,99.99):.3f}")
    print(f"  Min time: {1000 * min(all_request_times):.3f} ms")
    print(f"  Max time: {1000 * max(all_request_times):.3f} ms")
            
async def process_batch(batch: List[Tuple[str, str]], target: str, debug: bool) -> List[float]:
    """Process a batch of requests concurrently
    """
    tasks = []
    for ip, name in batch:
        tasks.append(make_request(ip, name, target, debug))

    results = await asyncio.gather(*tasks)

    # Filter only successful requests with request time (rt) > 0
    request_times = [rt for success, rt in results if rt > 0]

    # Check eventually if sent header matches with expected value
    for (ip, name), (header_match, t) in zip(batch, results):
        status = "OK" if header_match else "KO"
        if debug:
            print(f"header_match: {status}\tIP: {ip}\tISP: {name}\tElapsed time: {1000 * t:.3f} ms")

    return request_times

async def make_request(ip: str, name: str, target_host: str, debug: bool) -> Tuple[bool, float]:
    """Actually performs the request
    """
    headers = {"X-Fake-IP": ip}
    start = time.time()

    # one connection for each request to simulate actual clients;
    # urllib3.request is blocking, so run it in a thread to keep the batch concurrent
    try:
        resp = await asyncio.to_thread(urllib3.request, "GET", target_host, headers=headers)
        elapsed = time.time() - start
        res_header = resp.headers.get("X-Test-ISP")
        header_match = res_header == name if res_header else False
        return header_match, elapsed

        ## Version with aiohttp, not used anymore as we only need one
        ## connection per request
        # async with aiohttp.ClientSession() as session:
        #     async with session.get(target_host, headers=headers) as resp:
        #         elapsed = time.time() - start
        #         if debug:
        #             print(f"Elapsed: {1000 * elapsed:.3f}")
        #         # check if the response header matches
        #         res_header = resp.headers.get("X-Test-ISP")
        #         header_match = res_header == name if res_header else False
        #         return header_match, elapsed
    except Exception as e:
        print(f"Error while performing request for {ip}: {str(e)}")
        return False, 0.0
        
if __name__ == '__main__':
    parser = argparse.ArgumentParser(description="Simple benchmarking tool")
    parser.add_argument("input_file", help="Input file containing the map in <CIDR> <NAME> format")
    parser.add_argument("--target", default="http://localhost:8088", help="Target host url for requests (default: http://localhost:8088)")
    parser.add_argument("--concurrency", type=int, default=1, help="Number of concurrent requests (default: 1, max: 10000)")
    parser.add_argument("--num-requests", type=int, default=100, help="Total number of requests to perform (default: 100)")
    parser.add_argument("--random", action='store_true', default=False, help="Randomize entries (allows duplicates) (default: False)")
    parser.add_argument("--debug", action='store_true', default=False, help="Debug mode (default: False)")
    parser.add_argument("--test-name", type=str, default="test", help="Test name (will only be printed as header for later reference)")
    args = parser.parse_args()
    if args.concurrency > 10000:
        print("[WARN] Too many concurrent requests, limiting to 10000")
        args.concurrency = 10000
    asyncio.run(main(args))

Benchmark results

On local container

Caution: All durations are expressed in milliseconds!
100k requests, concurrency = 100
Test name p50 p75 p95 p99 p99.9 p99.99 max time
no-fetch 0.108 0.108 0.130 0.149 0.208 0.258 1.631
lua-fetch 0.117 0.118 0.147 0.161 0.212 0.275 1.721
lua-fetch-w-cache 0.367 0.546 0.711 0.759 0.940 2.467 4.045
mapfile 0.112 0.115 0.136 0.152 0.203 0.268 2.111
1M requests, concurrency = 1000
Test name p50 p75 p95 p99 p99.9 p99.99 max time
no-fetch 0.108 0.108 0.130 0.150 0.202 0.287 1.690
lua-fetch 0.116 0.117 0.147 0.166 0.218 0.311 1.760
lua-fetch-w-cache N/A N/A N/A N/A N/A N/A N/A
mapfile 0.112 0.112 0.136 0.157 0.212 0.295 2.012
1M requests, concurrency = 10k
Test name p50 p75 p95 p99 p99.9 p99.99 max time
no-fetch 0.109 0.110 0.132 0.151 0.210 0.636 49.645
lua-fetch 0.117 0.118 0.148 0.168 0.226 0.778 56.741
lua-fetch-w-cache N/A N/A N/A N/A N/A N/A N/A
mapfile 0.113 0.113 0.137 0.156 0.217 0.632 51.585
The lua-fetch-w-cache benchmark with 1M requests took too long to complete and has been excluded from the benchmark tables

On depooled production host with custom haproxy configuration (cp7001)

925981 requests, concurrency = 1k
Test name p50 p75 p95 p99 p99.9 p99.99 max time
no-fetch 4.113 8.973 20.428 35.873 82.317 133.600 188.274
lua-fetch 14.192 15.630 19.697 23.395 29.772 99.082 106.179
lua-fetch-w-cache N/A N/A N/A N/A N/A N/A N/A
mapfile 5.382 10.779 25.036 40.805 120.702 267.143 400.454

Benchmark results (scratchpad)

Using python (urllib3 < 2.0) on cp7001
 Total requests: 1000, concurrency: 10
 p50   p75     p95     p99     p99.9   p99.99
 0.389 0.407   0.442   0.502   0.718   2.102
 Min time: 0.356 ms
 Max time: 2.256 ms
 Total requests: 10000, concurrency: 100
 p50   p75     p95     p99     p99.9   p99.99
 0.357 0.372   0.409   0.466   0.609   2.250
 Min time: 0.333 ms
 Max time: 5.034 ms

Processing 100000 requests to http://127.0.0.1:8087 with concurrency 1000

 Total requests: 100000, concurrency: 1000
 p50   p75     p95     p99     p99.9   p99.99
 0.356 0.367   0.392   0.421   0.518   1.456
 Min time: 0.316 ms
 Max time: 2.273 ms
Using python (urllib3 < 2.0) on same local container as haproxy
 Total requests: 1000, concurrency: 10
 p50   p75     p95     p99     p99.9   p99.99
 0.169 0.176   0.191   0.231   0.267   1.096
 Min time: 0.152 ms
 Max time: 1.188 ms
 Total requests: 10000, concurrency: 100
 p50   p75     p95     p99     p99.9   p99.99
 0.164 0.171   0.185   0.211   0.259   0.316
 Min time: 0.149 ms
 Max time: 1.238 ms
 Total requests: 100000, concurrency: 1000
 p50   p75     p95     p99     p99.9   p99.99
 0.162 0.170   0.186   0.220   0.255   0.399
 Min time: 0.152 ms
 Max time: 1.177 ms
Using benchmark-curl script targeting local container

(No random header sent, just plain requests with fixed X-Fake-IP set to 8.8.8.8, no header check on response)

The script runs something like curl -s -Z --parallel-max 100 -w '%{time_starttransfer}\n' -H 'X-Fake-IP: 8.8.8.8' -o /dev/null http://127.0.0.1:8087/[1-10000] and calculates the quantiles over the results
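The quantile calculation over curl's time_starttransfer samples can be sketched in Python; percentile below reimplements the linear-interpolation method that np.percentile uses by default (the sample values are illustrative):

```python
def percentile(data, q):
    """q-th percentile with linear interpolation (np.percentile's default method)."""
    s = sorted(data)
    k = (len(s) - 1) * q / 100
    f = int(k)
    c = min(f + 1, len(s) - 1)
    return s[f] + (s[c] - s[f]) * (k - f)

# time_starttransfer samples from curl, in seconds (illustrative values)
times = [0.095, 0.097, 0.102, 0.110, 0.566, 0.101, 0.099, 0.103, 0.098, 0.100]

for q in (50, 75, 95, 99, 99.9, 99.99):
    print(f"p{q}: {percentile(times, q) * 1000:.3f} ms")
```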

1000 requests, concurrency = 10
Test name p50 p75 p95 p99 p99.9 p99.99 max time
no-fetch 0.097 0.102 0.110 0.566 1.109 1.258 1.274
lua-fetch 0.105 0.140 0.269 0.981 3.349 3.561 3.574
lua-fetch-w-cache N/A N/A N/A N/A N/A N/A N/A
mapfile 0.085 0.097 0.114 0.267 0.426 0.909 0.963
10k requests, concurrency = 100
Test name p50 p75 p95 p99 p99.9 p99.99 max time
no-fetch 0.788 0.823 0.969 2.679 4.045 4.446 5.058
lua-fetch 0.859 0.933 1.194 3.035 5.411 5.483 5.492
lua-fetch-w-cache N/A N/A N/A N/A N/A N/A N/A
mapfile 0.834 0.902 1.000 3.916 8.421 8.992 8.999
100k requests, concurrency = 1k
Test name p50 p75 p95 p99 p99.9 p99.99 max time
no-fetch 2.531 2.863 3.322 3.885 43.639 71.288 76.532
lua-fetch 7.402 7.634 8.234 8.696 14.895 15.418 15.449
lua-fetch-w-cache N/A N/A N/A N/A N/A N/A N/A
mapfile 7.630 7.870 8.418 8.973 16.124 16.284 24.289
1M requests, concurrency = 10k
Test name p50 p75 p95 p99 p99.9 p99.99 max time
no-fetch 7.771 8.030 8.652 9.175 9.807 11.003 13.149
lua-fetch 2.731 2.908 3.196 3.500 4.595 45.803 53.295
lua-fetch-w-cache N/A N/A N/A N/A N/A N/A N/A
mapfile 7.701 7.981 8.641 9.265 10.010 12.729 14.940
Using benchmark-curl script targeting cp7001
1000 requests, concurrency = 10
Test name p50 p75 p95 p99 p99.9 p99.99 max time
no-fetch 0.126 0.143 0.164 0.201 0.755 0.760 0.761
lua-fetch 0.128 0.143 0.161 0.240 0.978 1.061 1.070
lua-fetch-w-cache
mapfile 0.121 0.142 0.156 0.221 0.927 2.288 2.439
10k requests, concurrency = 100
Test name p50 p75 p95 p99 p99.9 p99.99 max time
no-fetch 1.181 1.369 1.571 1.762 7.451 8.234 8.243
lua-fetch 1.216 1.347 1.480 2.858 7.269 8.006 8.014
lua-fetch-w-cache
mapfile 1.160 1.384 1.596 1.716 7.459 8.175 8.179
100k requests, concurrency = 1k
Test name p50 p75 p95 p99 p99.9 p99.99 max time
no-fetch 0.616 0.689 0.775 0.817 0.907 3.310 4.039
lua-fetch 0.623 0.690 0.766 1.238 1.585 3.391 4.130
lua-fetch-w-cache
mapfile 0.620 0.694 0.788 0.836 1.064 3.361 4.096
1M requests, concurrency = 10k
Test name p50 p75 p95 p99 p99.9 p99.99 max time
no-fetch 0.646 0.716 0.787 0.845 1.191 2.641 3.836
lua-fetch 0.659 0.729 0.805 1.154 1.655 2.605 4.197
lua-fetch-w-cache
mapfile 0.653 0.722 0.794 0.851 1.179 2.698 4.244

TODO

  • Include memory / cpu usage in benchmark results
  • Verify lua-maxminddb library license