User:Alexandros Kosiaris/Benchmarking kubernetes apps

From Wikitech
Some of the contents of this page are no longer accurate (heapster is deprecated, helm commands are outdated, etc). See also https://github.com/thesocialdev/mediawiki-services-profiler

Let's see if we can work this out in a way that warrants its own Wikitech page so others can benefit and amend it

What I just did (and is probably worth reproducing)

Setting everything up

On a machine with direct internet access (i.e. not via some HTTP proxy)

Install minikube

Follow the instructions on https://kubernetes.io/docs/tasks/tools/install-minikube/

Note that you will probably need a hypervisor as well. Others might work too, but virtualbox is the recommended one

Install kubectl

Follow the instructions on https://kubernetes.io/docs/tasks/tools/install-kubectl/

Install helm

Follow the instructions on https://docs.helm.sh/using_helm/#installing-helm (you will need to download the appropriate version for your OS and place it in the $PATH, or %PATH% if you are on Windows)

Do not install helm 3; the pipeline doesn't support it yet

Install apache benchmark (ab)

  • On a Debian/Ubuntu system apt-get install apache2-utils
  • FILL IN INFO for redhat based systems
  • On Mac OS X it is already there, nothing is required
  • FILL IN INFO for windows

Optionally you can also install curl for debugging. https://curl.haxx.se/

Start up

 $ minikube start

wait it out

 $ helm init 

wait it out. You can run

 $ helm version

and if it returns successfully, you can proceed to the next step.
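
Instead of re-running the command by hand, the waiting can be scripted. The following is a minimal sketch, not part of the original procedure: wait_until is a hypothetical helper that re-runs any command until it succeeds or a maximum number of attempts is reached.

```shell
# wait_until MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: re-runs COMMAND until it exits 0, sleeping
# DELAY_SECONDS between attempts, and gives up after MAX_TRIES attempts.
wait_until() {
  max_tries=$1
  delay=$2
  shift 2
  try=1
  while ! "$@" >/dev/null 2>&1; do
    if [ "$try" -ge "$max_tries" ]; then
      echo "gave up after $try tries: $*" >&2
      return 1
    fi
    try=$((try + 1))
    sleep "$delay"
  done
}

# Example (not run here): poll "helm version" every 5 seconds, up to 60 times.
# wait_until 60 5 helm version
```

The same helper can be reused for any other step that says "wait it out", as long as there is a command whose exit status tells you the step is done.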

TODO: This will probably need some changes in the future, but for now it's fine

Enable heapster.

 $ minikube addons enable heapster


Open heapster's grafana (Yes, note that heapster actually embeds grafana, so the command below will do so)

 $ minikube addons open heapster

There seems to be a race condition and the second command (minikube addons open heapster) may complain about a service not being annotated yet. Ignore it and re-run the command after a few seconds

After the grafana "Getting started" dashboard opens in your browser, ignore it, navigate to the Pods dashboard and select default as the namespace. The pod names will be populated later; note that you will have to refresh the dashboard to see them.

You might have to wait a bit before the graphs start showing data, depending on your internet connection. You can proceed with the rest in the meantime.

The following assumes your chart has not yet been merged in the repo. If it already has, the git pull part is not required.

 $ git clone https://gerrit.wikimedia.org/r/operations/deployment-charts
 $ cd deployment-charts
 $ git pull <new chart change> # e.g. git pull https://gerrit.wikimedia.org/r/operations/deployment-charts refs/changes/26/479026/2
 $ cd charts

Visit the gerrit change whose artifact you want to deploy and click "Expand all" to get the corresponding image and tag. TODO: Insert pic here

Alternatively you can list all versions.

 $ curl https://docker-registry.wikimedia.org/v2/wikimedia/blubber/tags/list | jq '.'
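
If the list is long, the newest tag can be picked out with jq. This is a sketch under two assumptions: that tags start with a timestamp (as in 20181210183809-production), so that sorting them as strings also sorts them by age, and that you only care about tags with a given suffix. latest_tag is a hypothetical helper name.

```shell
# latest_tag SUFFIX
# Hypothetical helper: reads a registry "tags/list" JSON document on stdin
# and prints the greatest tag (by string order) that ends in SUFFIX.
# Prints "null" if no tag matches.
latest_tag() {
  jq -r --arg suffix "$1" '[.tags[] | select(endswith($suffix))] | sort | last'
}

# Example (not run here):
# curl -s https://docker-registry.wikimedia.org/v2/wikimedia/blubber/tags/list | latest_tag -production
```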

Using the preferred version you obtained from the previous step (or one you know already), install the chart

 $ helm install --set main_app.version=<version> <chart>

As an example:

 $ helm install --set main_app.version=20181210183809-production blubberoid

wait it out (alternatively pass --wait to helm install)

Execute the commands about setting MINIKUBE_HOST and SERVICE_PORT in your shell (you can skip that if you feel like hardcoding them)
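
If you do not have those commands at hand, the two variables can usually be derived as sketched below. This assumes the chart exposes the service as a NodePort (the usual way to reach a service from outside minikube). print_service_env is a hypothetical helper and the service name you pass to it is whatever kubectl get svc shows for your release.

```shell
# print_service_env SERVICE_NAME
# Hypothetical helper: prints "export" lines for MINIKUBE_HOST and
# SERVICE_PORT. Assumes SERVICE_NAME is a NodePort service in the current
# namespace and that its first port is the one to benchmark.
print_service_env() {
  service_name=$1
  host=$(minikube ip) || return 1
  port=$(kubectl get svc "$service_name" -o jsonpath='{.spec.ports[0].nodePort}') || return 1
  echo "export MINIKUBE_HOST=$host"
  echo "export SERVICE_PORT=$port"
}

# Example (not run here), with a placeholder service name:
# eval "$(print_service_env <service name>)"
```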

Actual benchmarking

Note in grafana the CPU and memory use of the pod under nominal conditions (aka zero stress). These values go into the requests stanza of resources in your values.yaml.

Now here comes the hard part.

Run the following process for every one of the endpoints your service exposes. The idea is to stress test all endpoints so that we know how the service behaves under nominal/normal/high loads. Choose your payloads; you know them better than anyone else.

We will be using blubber here as an example. The blubber service exposes a variable endpoint called /[variant] where [variant] is one of the variants passed to blubber via a POST request.

Using the example from the blubber docs, a valid POST payload to send to blubber would be:

version: v3
base: docker-registry.wikimedia.org/nodejs-slim
apt: { packages: [librsvg2-2] }
lives:
  in: /srv/service
variants:
  build:
    base: docker-registry.wikimedia.org/nodejs-devel
    apt: { packages: [librsvg2-dev, git, pkg-config, build-essential] }
    node: { requirements: [package.json] }
    runs: { environment: { LINK: g++ } }
  test:
    includes: [build]
    entrypoint: [npm, test]

This contains 2 variants, /test and /build. Let's benchmark /test

Saving the above stanza in a file called datafile and using the shell variables set above, we run the following

 $ curl --data-binary @datafile http://${MINIKUBE_HOST}:${SERVICE_PORT}/test

And it returns successfully

Then transform this into an actual ab test

 $ ab -n30000 -c1 -T 'application/x-www-form-urlencoded' -p datafile http://${MINIKUBE_HOST}:${SERVICE_PORT}/test

That's 30k requests with a concurrency of one. Note the results in grafana and then gradually increase the concurrency.

An easy, hands-off way to increase the concurrency is a for loop

 $ for i in `seq 1 30`; do ab -n30000 -c${i} -T 'application/x-www-form-urlencoded' -p datafile http://${MINIKUBE_HOST}:${SERVICE_PORT}/test; done

This will send 900k requests in total to the service, in batches of 30k with concurrency increasing from 1 to 30. This might be overkill, depending on the nature of your service. Feel free to tweak both the number of requests and the concurrency level. The important thing is that you end up with good numbers for containing 1 instance of the service (in production there will be many).
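
A variation on that loop keeps the ab output of every run in its own file, so the numbers can be inspected later without re-running anything. This is a sketch only: run_sweep and the ab-c<N>.txt file naming are made up for illustration, and the defaults simply mirror the 30 x 30k run above.

```shell
# run_sweep URL DATAFILE OUTDIR [MAX_CONCURRENCY] [REQUESTS]
# Hypothetical helper: runs ab once per concurrency level from 1 to
# MAX_CONCURRENCY (default 30) with REQUESTS requests each (default 30000)
# and saves each run's output as OUTDIR/ab-c<concurrency>.txt.
run_sweep() {
  url=$1
  datafile=$2
  outdir=$3
  max_c=${4:-30}
  requests=${5:-30000}
  mkdir -p "$outdir" || return 1
  c=1
  while [ "$c" -le "$max_c" ]; do
    ab -n "$requests" -c "$c" -T 'application/x-www-form-urlencoded' \
       -p "$datafile" "$url" > "$outdir/ab-c$c.txt" 2>&1
    c=$((c + 1))
  done
}

# Example (not run here):
# run_sweep "http://${MINIKUBE_HOST}:${SERVICE_PORT}/test" datafile results
```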

Again, depending on the nature and the complexity of the service, the above benchmark might be too naive. In that case you are encouraged to use other benchmarking suites. Two of note are locust.io and vegeta.

Interpreting the results

After this is done, mark the maximal CPU+memory usage in the graphs. If you are lucky, the memory is going to be an easy one. This is your limits stanza in values.yaml.

CPU is way more difficult to interpret. For starters, it depends heavily on the architecture of the host being tested on. In minikube it will probably be very limited, probably 1 core if the app is written decently. If it's not, it's probably prudent to profile the endpoint and figure out if it can be improved. On higher CPU counts, depending heavily on the app (is it multithreaded/multiprocess?), this will vary.

If a maximum is reached despite increasing concurrency, you got your number. If not, it means you have a very scalable service in your hands. Congratulations! Set a value of 1 as a default so that something meaningful is there for others; you will be overriding it in production.
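
To keep the four measured numbers together, a small helper can print them in the shape of a Kubernetes resources stanza. This is only a sketch: print_resources is a hypothetical helper, and where exactly this stanza sits inside your chart's values.yaml is an assumption you should check against the chart itself.

```shell
# print_resources IDLE_CPU IDLE_MEMORY MAX_CPU MAX_MEMORY
# Hypothetical helper: prints a resources stanza with the idle measurements
# as requests and the maximal measurements as limits. Values are passed
# through verbatim, so use Kubernetes quantities (e.g. 100m, 200Mi).
print_resources() {
  cat <<EOF
resources:
  requests:
    cpu: $1
    memory: $2
  limits:
    cpu: $3
    memory: $4
EOF
}

# Example (not run here), with placeholders for your own measurements:
# print_resources <idle cpu> <idle memory> <max cpu> <max memory>
```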

Other things to take into account

Grep the ab output for 'Requests per second'. If the number sounds good for 1 instance of the service (in production we will have multiple), that's fine. If not, it's probably prudent to figure out why. The graphs will tell you why (is your service CPU bound? is it memory bound? network bound? IO bound?)
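
If the ab output of each run was saved to a file, the grepping can be done in one go. A minimal sketch follows; rps_from_ab and rps_summary are hypothetical helper names, and the only thing relied upon is the 'Requests per second:' line that ab prints in its report.

```shell
# rps_from_ab FILE
# Hypothetical helper: prints the mean "Requests per second" value found in
# a file containing the output of one ab run.
rps_from_ab() {
  awk -F: '/^Requests per second:/ { split($2, f, " "); print f[1] }' "$1"
}

# rps_summary FILE...
# Prints one "<file><TAB><requests per second>" line per given file.
rps_summary() {
  for f in "$@"; do
    printf '%s\t%s\n' "$f" "$(rps_from_ab "$f")"
  done
}

# Example (not run here), assuming the outputs were saved under results/:
# rps_summary results/*.txt
```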

Do the same for 'Failed requests'. If you have anything other than 0, figure out why. It may be that you found a bug (like a race condition) or simply that the service is crumbling under the load. In that case it's important to understand the breaking point. It's not always easy to figure out. It might be fully related to the current load, at which point the amount of concurrency applied will tell you how many users a single instance of the service will support, or it might be a byproduct of the benchmarking process. Applications written in frameworks/languages with lazy garbage collection processes might fall in this category, as the benchmarking does not allow them to GC in time and they end up consuming the entire memory limit of the container before being killed. Whatever the reason, try and figure it out and, most importantly, document it. Ask for help from SREs in this.
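
The check can be scripted for saved ab output as well. The sketch below uses hypothetical helper names (ab_errors, ab_clean). Besides 'Failed requests', it also looks at the 'Non-2xx responses' line, which ab prints when the service answered with an error status, since those responses are reported separately from failed requests.

```shell
# ab_errors FILE
# Hypothetical helper: prints "<failed requests> <non-2xx responses>" taken
# from a file containing the output of one ab run. A missing line counts as 0.
ab_errors() {
  awk -F: '
    /^Failed requests:/   { failed = $2 + 0 }
    /^Non-2xx responses:/ { non2xx = $2 + 0 }
    END { print failed + 0, non2xx + 0 }
  ' "$1"
}

# ab_clean FILE
# Succeeds only if both counters are zero.
ab_clean() {
  [ "$(ab_errors "$1")" = "0 0" ]
}

# Example (not run here), assuming the outputs were saved under results/:
# for f in results/*.txt; do ab_clean "$f" || echo "$f: $(ab_errors "$f")"; done
```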