Kubernetes/Kubernetes Workshop/Step 4


Step 4: More on YAML and scale

Kubernetes has basic autoscaling built in through the Horizontal Pod Autoscaler (HPA). Let’s run a sample application to test it.

Hands-on: Horizontal Pod Autoscaling (HPA)

In order to use HPA we need to collect performance metrics, which requires running Kubernetes with a metrics server. Minikube does not enable one by default.

Enable the metrics-server addon for minikube.

It’s helpful to use shorter intervals (scaling delays of 1 minute and a sync period of 10 seconds) to see more immediate action.

  • minikube addons enable metrics-server
  • minikube start --extra-config=controller-manager.horizontal-pod-autoscaler-upscale-delay=1m --extra-config=controller-manager.horizontal-pod-autoscaler-downscale-delay=1m --extra-config=controller-manager.horizontal-pod-autoscaler-sync-period=10s --extra-config=controller-manager.horizontal-pod-autoscaler-downscale-stabilization=1m

Then define a deployment and run it in a way that is too heavy for a single replica. HPA should spin up new replicas automatically.

Here is an index.php that simulates load: each request runs a CPU-heavy loop, so the page answers slowly. Save it in your build directory.

index.php

<?php
 $x = 0.0001;
 for ($i = 0; $i <= 1000000; $i++) {
   $x += sqrt($x);
 }
 echo "OK!";
?>

Rebuild an image for testing. We are reusing the Dockerfile from before, but are running a different index.php. Dockerfile:

FROM ubuntu
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update
RUN apt-get install -y apache2 php php-mysql libapache2-mod-php
COPY index.php /var/www/html
EXPOSE 80
CMD ["apachectl","-DFOREGROUND"]
  • docker build . --tag <userid>/hasapp
  • docker push <userid>/hasapp

Define a deployment and a service via YAML files.

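hasdeployment.yaml - notice the CPU resources specified: we tell Kubernetes that the container is limited to 500m CPU, i.e. ½ of a CPU, and that it requests 200m CPU. The HPA measures CPU utilization relative to the request, so the CPU request matters here. Memory can be constrained in the same resources block, e.g. 200Mi, which is about 210 MB.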

From: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/

“Meaning of CPU - Limits and requests for CPU resources are measured in cpu units. One cpu, in Kubernetes, is equivalent to 1 vCPU/Core for cloud providers and 1 hyperthread on bare-metal Intel processors.

Fractional requests are allowed. A Container with spec.containers[].resources.requests.cpu of 0.5 is guaranteed half as much CPU as one that asks for 1 CPU. The expression 0.1 is equivalent to the expression 100m, which can be read as "one hundred millicpu". Some people say "one hundred millicores", and this is understood to mean the same thing. A request with a decimal point, like 0.1, is converted to 100m by the API, and precision finer than 1m is not allowed. For this reason, the form 100m might be preferred.

CPU is always requested as an absolute quantity, never as a relative quantity; 0.1 is the same amount of CPU on a single-core, dual-core, or 48-core machine.

Meaning of memory - Limits and requests for memory are measured in bytes. You can express memory as a plain integer or as a fixed-point integer using one of these suffixes: E, P, T, G, M, K. You can also use the power-of-two equivalents: Ei, Pi, Ti, Gi, Mi, Ki. For example, the following represent roughly the same value: 128974848, 129e6, 129M, 123Mi”
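To make the unit rules in the quote concrete, here is a small Python sketch that converts quantity strings to plain numbers. It is not the parser Kubernetes uses; the function names cpu_to_millicores and memory_to_bytes are made up for this illustration, and the sketch only covers the suffixes listed in the quote.

```python
from decimal import Decimal

# Multipliers for the suffixes listed in the quoted documentation.
# Accepting both "k" and "K" is an assumption of this sketch.
DECIMAL_SUFFIXES = {"k": 10**3, "K": 10**3, "M": 10**6, "G": 10**9,
                    "T": 10**12, "P": 10**15, "E": 10**18}
BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30,
                   "Ti": 2**40, "Pi": 2**50, "Ei": 2**60}


def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as '500m' or '0.5' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(Decimal(quantity) * 1000)


def memory_to_bytes(quantity: str) -> int:
    """Convert a memory quantity such as '123Mi', '129M' or '129e6' to bytes."""
    # Check the two-letter binary suffixes first so 'Mi' is not read as 'M'.
    for suffixes in (BINARY_SUFFIXES, DECIMAL_SUFFIXES):
        for suffix, factor in suffixes.items():
            if quantity.endswith(suffix):
                return int(Decimal(quantity[: -len(suffix)]) * factor)
    return int(Decimal(quantity))  # plain integer or exponent form like 129e6


print(cpu_to_millicores("0.1"))   # 100
print(memory_to_bytes("123Mi"))   # 128974848
print(memory_to_bytes("129M"))    # 129000000
```

The output shows why the quote calls 129M and 123Mi "roughly the same value": they differ by about 25 kB.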

hasdeployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
 name: hasapp
 labels:
   app: hasapp
spec:
 replicas: 1
 strategy:
   type: RollingUpdate
 selector:
   matchLabels:
     app: hasapp
 template:
   metadata:
     labels:
       app: hasapp
   spec:
     containers:
       - name: hasapp
         image: <userid>/hasapp:latest
         imagePullPolicy: Always
         resources:
           limits:
             cpu: 500m
           requests:
             cpu: 200m

And the service: hasappservice.yaml:

apiVersion: v1
kind: Service
metadata:
 name: hasapp
 labels:
   app: hasapp
spec:
 ports:
 - port: 80
 selector:
   app: hasapp
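A service only routes traffic to pods whose labels contain every key/value pair of its selector. The pod template in hasdeployment.yaml labels the pods with app: hasapp, so the service selector has to use that same label. The following Python sketch illustrates the matching rule; selector_matches is a hypothetical helper written for this page, not a Kubernetes API.

```python
def selector_matches(selector: dict, pod_labels: dict) -> bool:
    """Return True if every key/value pair of the selector is present in the pod labels."""
    return all(pod_labels.get(key) == value for key, value in selector.items())


# Labels taken from the pod template in hasdeployment.yaml.
pod_labels = {"app": "hasapp"}

print(selector_matches({"app": "hasapp"}, pod_labels))  # True
print(selector_matches({"run": "hasapp"}, pod_labels))  # False
```

On a running cluster, "kubectl get endpoints hasapp" shows whether the service actually found any pods.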

Start the deployment and the service, both with “kubectl create -f”. Once the deployment runs we can use the “kubectl top pods” command to get information about resource usage. Here is an example - the first output appears when not enough data has been collected yet, the second one is the expected output:

$ kubectl top pods
W0820 23:04:37.216456  193681 top_pod.go:274] Metrics not available for pod default/hasapp-8675d7dc65-2bdjg, age: 172h23m59.216444465s
error: Metrics not available for pod default/hasapp-8675d7dc65-2bdjg, age: 172h23m59.216444465s

$ kubectl top pods
NAME                      CPU(cores)   MEMORY(bytes)
hasapp-8675d7dc65-2bdjg   0m           19Mi

The metrics server functionality is crucial for the HPA test, so if the above does not yield information on CPU usage, something went wrong.

And define the HPA, where the important fields are:

  • minReplicas: 1
  • maxReplicas: 10
  • targetCPUUtilizationPercentage: 50

meaning that we will have between 1 and 10 replicas and that the autoscaler will try to keep the average CPU utilization across the pods at 50% of the requested CPU.
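The percentage is measured against the CPU request of the container, not against a whole CPU, which is why values above 100% are possible. Here is a minimal Python sketch of that arithmetic, assuming a CPU request of 200m and a limit of 500m for the hasapp container (treat these numbers as assumptions of the example); the function name is made up.

```python
def cpu_utilization_percent(usage_millicores: int, request_millicores: int) -> float:
    """CPU utilization the way the HPA reports it: usage relative to the CPU request."""
    return 100.0 * usage_millicores / request_millicores


REQUEST = 200  # assumed CPU request in millicores
LIMIT = 500    # assumed CPU limit in millicores

print(cpu_utilization_percent(100, REQUEST))    # 50.0, exactly on target
print(cpu_utilization_percent(240, REQUEST))    # 120.0, above target
print(cpu_utilization_percent(LIMIT, REQUEST))  # 250.0, the most a single pod can reach
```

With these assumed values a pod running at its limit shows up as 250%, so the TARGETS column of “kubectl get hpa” cannot exceed that for a single pod.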

Algorithm description from: https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/

“From the most basic perspective, the Horizontal Pod Autoscaler controller operates on the ratio between desired metric value and current metric value:

desiredReplicas = ceil[currentReplicas * ( currentMetricValue /desiredMetricValue )]

For example, if the current metric value is 200m, and the desired value is 100m, the number of replicas will be doubled, since 200.0/100.0 == 2.0. If the current value is instead 50m, we'll halve the number of replicas, since 50.0/100.0 == 0.5. We'll skip scaling if the ratio is sufficiently close to 1.0 (within a globally-configurable tolerance which defaults to 0.1).”
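The quoted formula can be tried out with a few lines of Python. This is only a sketch of the basic ratio rule plus the tolerance check and the minReplicas/maxReplicas bounds; the real controller handles more cases (missing metrics, pods that are not ready, stabilization). The function name and its default arguments are chosen for this illustration.

```python
import math


def desired_replicas(current_replicas: int, current_value: float, desired_value: float,
                     min_replicas: int = 1, max_replicas: int = 10,
                     tolerance: float = 0.1) -> int:
    """Apply desiredReplicas = ceil(currentReplicas * current / desired)."""
    ratio = current_value / desired_value
    # Skip scaling when the ratio is close enough to 1.0.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    # Keep the result within the bounds set on the HPA object.
    return max(min_replicas, min(max_replicas, wanted))


print(desired_replicas(1, 200, 100))  # 2, doubled as in the quote
print(desired_replicas(4, 50, 100))   # 2, halved as in the quote
print(desired_replicas(3, 105, 100))  # 3, within tolerance so no change
print(desired_replicas(1, 120, 50))   # 3, one pod at 120% with a 50% target
```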

has.yaml:

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
 name: demohas
spec:
 scaleTargetRef:
   apiVersion: apps/v1
   kind: Deployment
   name: hasapp
 minReplicas: 1
 maxReplicas: 10
 targetCPUUtilizationPercentage: 50
  • kubectl create -f has.yaml
  • kubectl get hpa
$ kubectl get hpa  
NAME      REFERENCE           TARGETS   MINPODS   MAXPODS   REPLICAS   AGE  
demohas   Deployment/hasapp   0%/50%    1         10        1          7d4h

If the output shows <unknown> under “TARGETS”, metrics are not available yet or the metrics server is not functioning. The metrics server is fundamental for HPA: without metrics no scaling will happen.


Now we need to generate some load on the deployment.

  • kubectl get pods -o wide
    • Get the ip for the hasapp pod
  • Start a temporary pod to access the page repeatedly: kubectl run -it --rm loadgen --image=busybox -- /bin/sh
    • On the pod, i.e. in busybox run: while true; do wget -q -O- http://<ip hasapp pod>/index.php; done

On the host run “kubectl get hpa” repeatedly to check on the HPA. Sample output after a couple of minutes.

$ kubectl get hpa
NAME      REFERENCE           TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
demohas   Deployment/hasapp   120%/50%   1         10        1          11m

$ kubectl get hpa
NAME      REFERENCE           TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
demohas   Deployment/hasapp   120%/50%   1         10        1          11m

$ kubectl get hpa
NAME      REFERENCE           TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
demohas   Deployment/hasapp   120%/50%   1         10        3          11m

$ kubectl get hpa
NAME      REFERENCE           TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
demohas   Deployment/hasapp   250%/50%   1         10        3          12m

$ kubectl get hpa
NAME      REFERENCE           TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
demohas   Deployment/hasapp   250%/50%   1         10        3          12m

$ kubectl get hpa
NAME      REFERENCE           TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
demohas   Deployment/hasapp   50%/50%   1         10        5          14m
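If you want to log these values instead of reading them by eye, the table can be split into fields. The Python sketch below parses the default text output shown above; parse_hpa_table is a hypothetical helper, and it assumes that no column value contains spaces. For anything more serious, “kubectl get hpa -o json” is the more robust source.

```python
def parse_hpa_table(output: str) -> list:
    """Parse the default table printed by `kubectl get hpa` into a list of dicts."""
    lines = [line for line in output.splitlines() if line.strip()]
    header = lines[0].split()
    # Each row becomes a dict keyed by the column names of the header line.
    return [dict(zip(header, line.split())) for line in lines[1:]]


# Sample taken from the output shown above.
sample = """\
NAME      REFERENCE           TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
demohas   Deployment/hasapp   120%/50%   1         10        3          11m
"""

row = parse_hpa_table(sample)[0]
current, target = row["TARGETS"].split("/")
print(row["REPLICAS"], current, target)  # 3 120% 50%
```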

Once we shut down the load generator, replicas should come down as well, but maybe delay that test until after you have checked out the dashboard in one of the next Hands-on steps.
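Replicas do not drop immediately because of the downscale stabilization window (set to 1m in the minikube start command above): when scaling down, the controller looks at the replica counts it recommended during that window and uses the highest one. Here is a rough Python sketch of that rule, with made-up timestamps and a hypothetical function name.

```python
def stabilized_replicas(recommendations: list, now: float, window_seconds: float = 60) -> int:
    """Pick the highest recommendation made within the stabilization window.

    recommendations is a list of (timestamp_seconds, desired_replicas) tuples
    and is assumed to contain at least one entry inside the window.
    """
    recent = [replicas for timestamp, replicas in recommendations
              if now - timestamp <= window_seconds]
    return max(recent)


# Hypothetical history: the load stops shortly after t=30s.
history = [(0, 5), (30, 3), (60, 1), (90, 1)]

print(stabilized_replicas(history, now=90))   # 3, the t=30s value is still in the window
print(stabilized_replicas(history, now=130))  # 1, only the t=90s value remains
```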

Hands-on: checking YAML

There are a number of ways to check the syntax of the k8s YAML files. The kubeval program checks basic k8s syntax and kube-score offers a more opinionated take. There are also other options: for example, polaris has a base set of rules but can easily be extended with custom rules (for example, if you want to limit repository usage to approved repositories only).

Kubeval

wget https://github.com/instrumenta/kubeval/releases/latest/download/kubeval-linux-amd64.tar.gz
tar xf kubeval-linux-amd64.tar.gz
sudo cp kubeval /usr/local/bin
  • kubeval ./pywpchksumbot.yaml

Kube-score

wget https://github.com/zegl/kube-score/releases/download/v1.7.3/kube-score_1.7.3_linux_amd64
chmod 755 kube-score_1.7.3_linux_amd64
sudo cp kube-score_1.7.3_linux_amd64 /usr/local/bin/kube-score
  • kube-score score ./pywpchksumbot.yaml

References: https://learnk8s.io/validating-kubernetes-yaml

Hands-on: the dashboard and Lens

  • minikube dashboard
    • Try it out and check out the resources that are defined.

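(On a VM without a desktop, “minikube dashboard --url” prints the dashboard URL instead of opening a browser; reaching it from your workstation still needs some form of forwarding, e.g. an SSH tunnel or X forwarding.)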

  • Check out Lens for a similar view into the cluster. On Ubuntu it is available as a snap, or see: https://github.com/lensapp/lens

Shut down everything by deleting the deployment, HPA, service, etc. and bring down minikube.