Kubernetes/Kubernetes Workshop/Using WMF Kubernetes to run your services


Overview

At the end of this module, you should be able to:

  • Run your services on WMF Kubernetes.

Guidelines

Note: This is a draft version, and it is prone to errors and might not work as intended.

In this module, you will deploy a service called calc, a simple HTTP server written in Python 3 that performs essential calculator functions via an API.

You can use the application by calling its API; for example, curl http://localhost:8080/api?2+5 returns a JSON-formatted answer of 7. You can also navigate to the webpage and use the interactive form.

The code is hosted on GitHub and can be executed locally with the command python3 server.py. To run the application, you need python3, the ply module for parsing, and the psutil module for memory reporting. You can install the modules using pip; Debian and Ubuntu also provide them as native packages.
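
The commands below sketch both install paths; the Debian package names (python3-ply, python3-psutil) are the standard ones and may differ on other distributions.

$ pip3 install ply psutil                            # pip-based installation
$ sudo apt-get install python3-ply python3-psutil    # or the native Debian/Ubuntu packages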

For anything beyond local testing, you need a repository on Wikimedia's Gerrit. To request one, see the request wiki. You can access the calc application on Gerrit.

Download and Test the Application

1. Clone the repository from Gerrit.

2. Run the service and access it via browser or curl:

$ python3 server.py

3. Open a new terminal window and test the application using curl:

$ curl  http://0.0.0.0:8080/api?2+5
{"operation":"2+5","result":"7"}

4. The tests directory contains some tests that check if the code is functional. Run the tests:

$ pip install pytest
$ pytest tests/test01.py
$ pytest server.py

These tests check only the parser and the calculation part of the application. They do not test the webserver portion.

Note: The tests expect the service to be running on port 8080 and depend on the pytest and requests modules.

Step 1 - Using Blubber to Generate Dockerfiles

In this step, you will run the service on Docker. In your Dockerfile, you will use Python3 as a base image and install the ply and psutil modules.

Then you will copy the single source file and run it.

5. Your Dockerfile should be similar to:

Dockerfile
FROM python:latest
COPY server.py  /
RUN pip3 install ply psutil
EXPOSE 8080
CMD ["python3", "server.py"]

6. Build your image:

$ docker build --tag calc .
$ docker run -d -p 8080:8080 calc:latest
$ curl http://localhost:8080/api?4+4
{"operation":"4+4","result":"8"}

7. You can run the provided tests in the tests directory using pytest. The test involves

  • making many HTTP calls to the server,
  • adding, subtracting, multiplying, and dividing basic numbers.

$ pytest tests/test01.py

Note: The test command is pytest-3 when pytest is installed via apt-get on a Debian or Ubuntu machine, and pytest when installed via pip.
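
As a quick illustration of the two names (a sketch; the apt path assumes a Debian or Ubuntu machine):

$ sudo apt-get install python3-pytest   # provides the pytest-3 command
$ pytest-3 tests/test01.py

$ pip3 install pytest                   # provides the pytest command
$ pytest tests/test01.py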

To test the code within Docker, use the Dockerfile below. The Dockerfile copies the test01.py file and changes the Docker entrypoint to a script called Entrypoint.sh.

1. This script calls both the server and the test programs.

Dockerfile

FROM python:latest
COPY Entrypoint.sh /
COPY tests/test01.py /
COPY server.py /
RUN pip3 install ply psutil pytest requests
RUN chmod u+x Entrypoint.sh
EXPOSE 8080
CMD ["./Entrypoint.sh"]

Entrypoint.sh

#!/bin/sh
python3 server.py testing & pytest test01.py

2. Build and run your image:

$ docker build --tag calc .
$ docker run -d -p 8080:8080 calc:latest
$ curl http://localhost:8080/api?4+4
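
To see the pytest output from this container, capture the container ID printed by docker run and read its logs. This is a sketch; if an earlier calc container is still bound to port 8080, stop it first.

$ docker ps                          # find (and stop) any earlier container still holding port 8080
$ CID=$(docker run -d -p 8080:8080 calc:latest)
$ docker logs -f "$CID"              # shows the pytest output; the container exits when the tests finish
$ docker wait "$CID"                 # prints the container's exit code (pytest's return code)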

WMF Blubber

To get the service to run under production Kubernetes, you need to use blubber to generate your Dockerfile based on a WMF image. You can access a python3 image on WMF's registry, and blubber also allows you to install python3 modules via pip.

The following blubber YAML file works and generates a usable Dockerfile. Note that the server creates local files for the parser and therefore needs write access to the filesystem, which is why the file sets runs: { insecurely: true } (this is insecure). As an exercise, try removing that setting and tracking down the error it generates.

It is best to minimize the production installation. That might require some experimentation and re-creation of the application's environment to determine a minimal module footprint, especially if you are developing on a non-Debian system.

A bare-bones Docker image such as debian:buster, used interactively via the command docker run -it debian:buster /bin/bash, might be a helpful starting point. Disciplined use of a venv environment might also work well.
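
One possible way to probe for the minimal footprint, assuming your working directory contains server.py and you bind-mount it into the container:

$ docker run -it --rm -v "$PWD":/src -w /src debian:buster /bin/bash
# inside the container, add packages one at a time until the server starts cleanly:
apt-get update && apt-get install -y python3 python3-ply
python3 server.py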

The only minimal dependency coming out of that process is the ply module. In general, you should use the native Debian packages rather than the python/pip way of installing modules, as you will be working with a tested stack of software.

In your case, ply is available at the Debian level as the python3-ply package, which makes that approach possible. You will walk through both paths:

1. For the pip-based installation, you list the dependencies in a requirements.txt file:

requirements.txt

ply==3.11
pytest==6.2.2

The pytest module is only necessary for testing; it is not required in production. Use WMF's python3 image, install ply and pytest, and do some interactive testing to see if everything works.

calcblubber.yaml

version: v4
base: docker-registry.wikimedia.org/python3:0.0.2
runs: { insecurely: true }
apt: { packages: [python3-setuptools] }
variants:
 buildcalc:
     python:
         version: python3
         requirements: [requirements.txt]
     copies:
         - from: local
           source: ./server.py
           destination: ./server.py
     entrypoint: ["python3", "server.py"]

2. To fetch the Dockerfile, you will make a curl request:

$ curl -s -H 'content-type: application/yaml' --data-binary @calcblubber.yaml https://blubberoid.wikimedia.org/v1/buildcalc > Dockerfile

3. Rebuild the image and test it:

$ docker build --tag calc .
$ docker run -d -p 8080:8080 calc:latest
$ curl http://localhost:8080/api?4+4

Also, try pytest instead of python3 in the entrypoint line to see how a test runs under a WMF-style Docker image.
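
For example (a sketch): change the buildcalc entrypoint in calcblubber.yaml to ["pytest", "server.py"], regenerate the Dockerfile as above, and watch the container exit with pytest's return code. The calc-test tag is just a local name used here for illustration.

$ curl -s -H 'content-type: application/yaml' --data-binary @calcblubber.yaml https://blubberoid.wikimedia.org/v1/buildcalc > Dockerfile
$ docker build --tag calc-test .
$ docker run --rm calc-test:latest
$ echo $?     # 0 if the tests passed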

Step 2 - Deploying your Application to WMF’s Kubernetes Deployment Pipeline

To get your code running on the Kubernetes cluster, use the Deployment Pipeline. The Deployment Pipeline is a project developed by the WMF Release Engineering team. It provides a structured set of tools that automate the building, testing, publishing, and executing of Docker images. You have already seen one of the tools in action: blubber, which generates the Dockerfile.

The deployment pipeline is built on open-source tools such as Gerrit, Jenkins, Zuul, and Kubernetes, combined with locally developed tools that tie them together into a coherent workflow.

In this step, you will walk through the lifecycle of the deployment pipeline. This walkthrough includes as many details as possible. It is a multi-step, complex process, but adapting your code and procedures to the deployment pipeline's norms is a one-time investment that enables friction-free releases afterwards.

1. Request a project/repository on Gerrit through the Gerrit Request page. Request blubber-doc/example/calculator-service. Once you create the repository, certain features need to be enabled.

The integration of your project with the deployment pipeline happens through the config.yaml file in the .pipeline directory of the repository. You will use a file that defines two pipelines: test_pl and publish_pl.

These names are free form and follow no special formatting, so you can select them in a way that documents their function. Your file references the calcblubber-debian.yaml file that you used to generate the Dockerfile and the testcalc and buildcalc variants specified in it.

2. The pipeline's intended use is:

  • test_pl: build an image and test it with pytest server.py (or pytest-3 if pytest is installed using Debian's apt command). The build references the variant to build (testcalc), and run is set to true.
  • publish_pl: push a production-grade image to the WMF repository to be used in a deployment. Here the variant buildcalc builds an image that executes server.py, rather than just testing it.

Note: Running the production image in the test pipeline will not be successful. The Continuous Integration (CI) service expects the image to exit with a return code of 0 for a clean run or an error code otherwise. The production image never exits, so it is not a suitable image to test.
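
You can see the distinction locally (a sketch, using the image names from Step 1; calc-test is the tag suggested in the earlier sketch):

$ docker run --rm calc-test:latest && echo "clean run"   # test variant: pytest runs and the container exits
$ docker run --rm calc:latest                            # production variant: the server never exits; stop it with Ctrl+C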

.pipeline/config.yaml

pipelines:
 test_pl:
   blubberfile: calcblubber-debian.yaml
   stages:
     - name: run-test
       build: testcalc
       run: true
 publish_pl:
   blubberfile: calcblubber-debian.yaml
   stages:
     - name: production
       build: buildcalc
       publish:
         image:
           tags: [stable]

.pipeline/calcblubber-debian.yaml

version: v4
base: docker-registry.wikimedia.org/python3:0.0.2
runs: { insecurely: true }
apt: { packages: [python3-ply, python3-pytest] }
variants:
 testcalc:
     copies:
         - from: local
           source: ./server.py
           destination: ./server.py
     entrypoint: ["pytest-3", "server.py"]
 buildcalc:
     copies:
         - from: local
           source: ./server.py
           destination: ./server.py
     entrypoint: ["python3", "server.py"]

The repository itself needs to contain the server.py file. In addition, Jenkins and zuul have to be informed about the new pipelines by adding information in the respective config files: jjb/project-pipelines.yaml and zuul/layout.yaml. See PipelineLib/Guides/How to configure CI for your project for details.

1. Notice that the page uses a test pipeline (as an example), whereas you use test_pl and publish_pl.

jjb/project-pipelines.yaml

- project:
   # blubber-doc/example/calculator-service
   name: calculator-service
   pipeline:
     - test_pl
     - publish_pl
   jobs:
     # trigger-calculator-service-pipeline-test_pl
     # trigger-calculator-service-pipeline-publish_pl
     - 'trigger-{name}-pipeline-{pipeline}'
     # calculator-service-pipeline-test_pl
     # calculator-service-pipeline-publish_pl
      - '{name}-pipeline-{pipeline}'

2. Notice that in the zuul file, test refers to a Zuul pipeline, not to a pipeline defined in your .pipeline/config.yaml.

zuul/layout.yaml

  - name: blubber-doc/example/calculator-service
    test:
      - trigger-calculator-service-pipeline-test_pl
    gate-and-submit:
      # all test jobs must have a gate and submit pipeline defined
      - noop
    postmerge:
      - trigger-calculator-service-pipeline-publish_pl

These files are all under source control in Gerrit, and you can edit them. In case of problems, reach out to the release engineering team. After the edits are approved, release engineering needs to perform several steps to tell Jenkins and zuul about the new pipelines. Making a push to the repository will execute the test_pl pipeline, building and testing the image.

Looking at Gerrit's output, you will notice a link; if you do not see a link there, you can check this website. You should see output similar to the contents on the integration server.

Great - you have run a test CI job successfully. You have not needed code reviews for your project so far, as you are still testing.

Step 3 - Testing Client Requests

The tests/test01.py file uses the requests module to make "real" client HTTP requests and verifies their correctness, checking the returned JSON, etc. You need to run the server and the test client concurrently for this test. Make the following modifications:

  • Entrypoint is not python3 server.py anymore but a shell script called Entrypoint.sh.
  • You copy test01.py from the tests directory.
  • You copy Entrypoint.sh.
  • You need the requests module in addition to the other python modules.
  • Entrypoint.sh calls python3 server.py & and then pytest-3 test01.py.

The changes are all in Entrypoint.sh and calcblubber-debian.yaml.

calcblubber-debian.yaml

version: v4
base: docker-registry.wikimedia.org/python3:0.0.2
runs: { insecurely: true }
apt: { packages: [python3-ply, python3-pytest, python3-requests] }
variants:
 testcalc:
     copies:
         - from: local
           source: ./server.py
           destination: ./server.py
         - from: local
           source: ./tests/test01.py
           destination: ./test01.py
         - from: local
           source: ./Entrypoint.sh
           destination: ./Entrypoint.sh
     entrypoint: ["./Entrypoint.sh"]
 buildcalc:
     copies:
         - from: local
           source: ./server.py
           destination: ./server.py
     entrypoint: ["python3", "server.py"]

Entrypoint.sh

#!/bin/sh
python3 server.py testing & pytest-3 test01.py

  • Test the change locally:
$ curl -s -H 'content-type: application/yaml' --data-binary @calcblubber-debian.yaml https://blubberoid.wikimedia.org/v1/testcalc > Dockerfile
$ docker build .
$ docker run -d <image>
$ docker logs <container>

View a recent run.

Hands-on Demo: Deploying your application to WMF’s production pipeline, running tests and getting a security review

Once the tests have run successfully in CI, you will create a production-ready Docker image. Merging a new change in Gerrit will execute a second CI pipeline, the one you specified in the gate-and-submit and postmerge sections of the zuul config file.

The pipeline will build a docker image with a different entrypoint that runs the server.py program, and then the pipeline pushes the image to the WMF repository. View the Docker images on WMF's registry.

Note: Since the homepage is built hourly, updates to the homepage might be delayed.
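
Before moving to minikube, you can also pull the published image and smoke-test it directly with Docker; the image name below is the one used in calcdeployment.yaml further down.

$ docker pull docker-registry.wikimedia.org/wikimedia/blubber-doc-example-calculator-service:stable
$ docker run -d -p 8080:8080 docker-registry.wikimedia.org/wikimedia/blubber-doc-example-calculator-service:stable
$ curl http://localhost:8080/api?9-2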

You should run and test the created image locally on minikube using the following Kubernetes YAML files, applied with kubectl. The image is pulled from WMF's Docker registry.

Also, specify replicas and CPU and memory limits. Your application is small, so one replica, 0.1 CPU (100m), and 64 MiB of memory are enough. For a larger application, deeper performance tests should guide those numbers.

calcdeployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
 name: calc
 labels:
   app: calc
spec:
 replicas: 1
 strategy:
   type: RollingUpdate
 selector:
   matchLabels:
     app: calc
 template:
   metadata:
     labels:
       app: calc
   spec:
     containers:
      - name: calc
        image: docker-registry.wikimedia.org/wikimedia/blubber-doc-example-calculator-service:stable
        imagePullPolicy: Always
        resources:
          requests:
            memory: "64Mi"
            cpu: "100m"
          limits:
            memory: "64Mi"
            cpu: "100m"

calcservice.yaml

kind: Service
apiVersion: v1
metadata:
 name: calc
spec:
 selector:
   app: calc
 type: LoadBalancer
 ports:
 - protocol: TCP
   port: 80
   targetPort: 8080

  • After applying both files on your local minikube with the kubectl apply -f calcdeployment.yaml and kubectl apply -f calcservice.yaml commands, you should be able to access the application via the URL returned by running minikube service calc, as sketched after this list.
  • Make sure to get a security review of the application.
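
A sketch of the full local sequence, using standard kubectl and minikube commands:

$ kubectl apply -f calcdeployment.yaml
$ kubectl apply -f calcservice.yaml
$ kubectl get pods -l app=calc                    # wait until the pod is Running
$ minikube service calc --url                     # prints the URL the service is reachable at
$ curl "$(minikube service calc --url)/api?2+5"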

Helm for Production Releases

Now that you have a working image built according to WMF's specifications, you can push your image to production. Pushing your image to production requires configuration files and several decisions and conversations with SRE.

It is typically a one-time process, and once all parameters are defined, subsequent code releases are quick and independent. The process is as follows:

1. Let the SRE team know about the new service by filing a Phabricator task. Ignore the parts about puppet, as they do not apply to microservices on Kubernetes. The task for calculator-service is T273807.

2. In the deployment-charts repository, you need to create a set of helm configuration files for production-ready Kubernetes using the create_new_service.sh script. This creates the calculator-service directory, which contains the helm template files that define the release.

Beyond just running the container as you have done so far, this process provides additional functionality. For example, it automatically fronts the service with an Envoy proxy that provides HTTPS termination and usage monitoring.

SRE ServiceOps assists if needed, but the basic steps are:

  • clone the deployment-charts repo,
  • run the create_new_service.sh script, and
  • when prompted, provide the name, port, and image name: calculator-service, 8080, and blubber-doc/example/calculator-service.

Example output:

$ ./create_new_service.sh
Please input the name of the service
calculator-service
Please input the port the application is listening on
8080
Please input the docker image to use:
blubber-doc-example-calculator-service:stable
~/deployment-charts/charts/calculator-service ~/gwm/deployment-charts
~/deployment-charts
~/deployment-charts/charts/calculator-service/templates ~/gwm/deployment-charts
~/deployment-charts
~/deployment-charts/charts/calculator-service/templates ~/gwm/deployment-charts
~/deployment-charts
You can edit your chart (if needed!) at ./deployment-charts/charts/calculator-service

The files created are the baseline configuration to be run under Kubernetes. They deal with the service itself and are prepared for several mandatory (TLS encryption) and recommended (Prometheus monitoring) supporting service areas.

The following files have been created and provide configuration options for:

  • calculator-service/Chart.yaml - chart metadata such as the name, version, and description
  • calculator-service/values.yaml - the base file for setting ports, resources, etc.
  • calculator-service/default-network-policy.yaml - the default Kubernetes NetworkPolicy for the service
  • calculator-service/templates/configmap.yaml - for configuring the equivalent Kubernetes concepts
  • calculator-service/templates/deployment.yaml - for configuring the equivalent Kubernetes concepts
  • calculator-service/templates/secret.yaml - for configuring the equivalent Kubernetes concepts
  • calculator-service/templates/service.yaml - for configuring the equivalent Kubernetes concepts
  • calculator-service/templates/tests/test-service-checker.yaml - uses the swagger specification of the service to provide a test of the service during helm lint. Our service does not have a swagger spec yet, so we will disable that test.

Our chart files need editing:

  • We do not use statsd, so all references to monitoring need to be deleted:
      • Values.monitoring in the templates for the deployment, network policy, and configmap
      • delete the config/prometheus-statsd.conf file and the config directory
  • We do not have a swagger specification for our API (yet…), so delete templates/tests/test-service-checker.yaml as well.
  • We do not use Prometheus for monitoring, so set that to false in templates/deployment.yaml.
  • In the charts/calculator-service/values.yaml file:
      • set CPU to 0.1 (100m) and memory to 100Mi for all requests and limits
      • define two environment variables, CALC_VERSION and CALC_TESTMODE, in the public config section
      • choose between production and minikube for the service deployment
      • set the readiness probe path to /healthz
      • delete the monitoring: section

Commit your changes and select a reviewer from the Service Operations team. Review the helm lint results that are produced when you push the change to see if everything passed and whether jenkins-bot awarded a +2.
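
You can also run the same linter locally before pushing (a sketch, assuming helm is installed and you are at the root of the deployment-charts checkout):

$ helm lint charts/calculator-service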

Once the review is done, which is likely to take multiple cycles depending on the complexity of the service, you can +2 the change (no submit necessary), and a bot will copy the charts into our official ChartMuseum. View the implementation. You can check the chart by downloading the index.yaml file.
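
A sketch of checking for the chart in the published index; the repository URL here is an assumption and may differ:

$ curl -s https://helm-charts.wikimedia.org/stable/index.yaml | grep -A 3 calculator-service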

Find more information on Chart Museum at WMF at ChartMuseum.

deployment.yaml - latest vs stable

Summary: You have used the deployment pipeline tools to configure your service in production. To achieve this, you had to:

  • file a ticket with SRE,
  • define a TLS port,
  • run the create_new_service.sh script to generate the necessary config files for helm,
  • modify the files to describe your service characteristics around monitoring, TLS, etc. and
  • shepherd your service through the review cycle.

Finally, you store your chart in the Chart Museum, ready for the next step.

Create a helmfile

Besides helm, you use an additional mechanism called helmfile. Helmfile applies a template to the helm charts. The template changes the charts depending on which environment you are pushing the deployment to.

For example, you may want a different number of replicas or a different log level in staging than in production. The helmfile allows you to set these variables per environment.

You configure helmfile through YAML files. In the helmfile for calculator-service, helmfile.d/services/calculator-service/helmfile.yaml, you specify one replica for staging and two replicas for production.
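
A sketch of typical helmfile invocations; it assumes environments named staging and production are defined in the helmfile, and the exact WMF deployment workflow may differ:

$ helmfile -e staging diff        # show what would change in staging
$ helmfile -e staging apply       # deploy to staging
$ helmfile -e production apply    # deploy to production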

If you are working on your service, this wiki page sheds more light on deployments with helmfile, while this one covers deploying a service to staging.

Once the helmfile is pushed to Gerrit under XXX, reviewed, and committed, you are ready to release your code/image to the various environments:

  • Staging: helmfile xxx yyy
  • Production

Getting external traffic to your service is a manual step because WMF uses its own installation of LVS-based load balancers. These load balancers are not integrated into Kubernetes. The Site Reliability Engineering (SRE) team will set up the load balancers and the necessary Domain Name Service (DNS) entries.

Note: The helm linter can only check the chart's syntax after the charts have been created; it depends on their existence.

Next Module

Add-on Module: Load Testing

Previous Module