Kubernetes/Helm

From Wikitech

scap-helm (actual name to be determined)

scap-helm is a thin shim around helm that facilitates our day-to-day operations. It loops over the Kubernetes clusters (2 at the moment, eqiad and codfw), slightly amends the arguments to protect against some very common mistakes, and uses one single HELM_HOME so that deployers do not have to set one up manually in their home directories.

Usage is simple:

  scap-helm <service> <standard helm commands and parameters>

scap-helm honours two environment variables: CLUSTER and NAMESPACE. We don't expect NAMESPACE to be used often, but CLUSTER is really useful for limiting a command's action to a single cluster. Run CLUSTER=codfw scap-helm or CLUSTER=eqiad scap-helm to limit the command to the respective cluster.
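As a rough illustration of the behaviour described above, the loop over clusters and the two environment variables can be sketched as a shell function. This is not the real scap-helm source: the function name scap_helm_sketch is hypothetical, the default namespace (the service name) is an assumption based on the mathoid examples, and the function only prints what it would run, because how the real tool selects a cluster is not documented on this page.

```shell
# Hypothetical sketch of the wrapper's loop; NOT the real scap-helm source.
# It prints the helm invocations instead of running them.
scap_helm_sketch() {
  service="$1"
  shift
  # CLUSTER limits the run to a single cluster; the default is both.
  clusters="${CLUSTER:-eqiad codfw}"
  # NAMESPACE overrides the namespace; assumed to default to the service name.
  namespace="${NAMESPACE:-$service}"
  for cluster in $clusters; do
    # Cluster and namespace are shown as a label, not as real helm flags.
    echo "[cluster=${cluster} namespace=${namespace}] helm $*"
  done
}
```

Calling scap_helm_sketch mathoid list prints one line per cluster, while CLUSTER=codfw scap_helm_sketch mathoid list prints only the codfw line.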

Production charts and values are kept under /srv/scap-helm on deploy1001.eqiad.wmnet. If you need to modify a chart's values or run scap-helm, please do it there.

Examples

List all helm releases for a specific service

 $ scap-helm mathoid list

 NAME      	REVISION	UPDATED                 	STATUS  	CHART        	NAMESPACE
 production	5       	Tue Apr 17 12:50:56 2018	DEPLOYED	mathoid-0.0.4	mathoid  
 NAME      	REVISION	UPDATED                 	STATUS  	CHART        	NAMESPACE
 production	3       	Tue Apr 17 12:51:07 2018	DEPLOYED	mathoid-0.0.4	mathoid

Note that both clusters are listed here. Both have a release named "production" in the mathoid namespace, with chart version 0.0.4 and status DEPLOYED, at revisions 5 and 3 respectively.
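Because the output of both clusters is concatenated, it can be handy to check mechanically that they run the same chart. The following is a minimal sketch, assuming the column layout shown above (CHART is the second-to-last column); the function name distinct_charts is made up for this example.

```shell
# Reads `scap-helm <service> list` output on stdin and prints each distinct
# CHART value once. A single line of output means all clusters agree.
distinct_charts() {
  # Skip the repeated header rows and blank lines, keep the CHART column.
  awk '$1 != "NAME" && NF > 0 { print $(NF-1) }' | sort -u
}
```

It would be used as scap-helm mathoid list | distinct_charts.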

Get the status of a release

 $ scap-helm mathoid status production
LAST DEPLOYED: Tue Apr 17 12:50:56 2018
NAMESPACE: mathoid
STATUS: DEPLOYED

RESOURCES:
==> v1/Service
NAME                TYPE      CLUSTER-IP    EXTERNAL-IP  PORT(S)          AGE
mathoid-production  NodePort  10.64.72.230  10.2.2.20    10044:10042/TCP  6h

==> v1beta1/Deployment
NAME                DESIRED  CURRENT  UP-TO-DATE  AVAILABLE  AGE
mathoid-production  20       20       20          20         6h

==> v1/Pod(related)
NAME                                 READY  STATUS   RESTARTS  AGE
mathoid-production-1947255237-01pr4  2/2    Running  0         5h
mathoid-production-1947255237-4rc70  2/2    Running  0         5h
...

==> v1/Secret
NAME                              TYPE    DATA  AGE
mathoid-production-secret-config  Opaque  0     6h

==> v1/ConfigMap
NAME                       DATA  AGE
production-metrics-config  1     6h
config-production          1     6h


NOTES:
Thank you for installing mathoid.

Your release is named mathoid-production.

To learn more about the release, try:

  $ helm status mathoid-production
  $ helm get mathoid-production

Output has been edited for brevity: the second cluster is omitted and not all pods are shown.

Get values for a release (capacity, port, probes)

  $ scap-helm mathoid get values -a production

config:
  private: {}
  public: {}
docker:
  pull_policy: IfNotPresent
  registry: docker-registry.discovery.wmnet
helm_scaffold_version: 0.1
logging:
  enabled: false
  fluent_bit_version: latest
  limits:
    cpu: 100m
    memory: 200Mi
  output_match: '*'
main_app:
  image: wikimedia/mediawiki-services-mathoid
  limits:
    cpu: 1
    memory: 400Mi
  liveness_probe:
    httpGet:
      path: /_info
      port: 10044
  port: 10044
  readiness_probe:
    httpGet:
      path: /_info
      port: 10044
  version: build-39
monitoring:
  enabled: true
  image_version: latest
resources:
  replicas: 20
service:
  deployment: production
  externalIP: 10.2.1.20
  port: 10042

Output has been edited for brevity by only including one cluster. We can see that the capacity is 20 pods, that the port exposed to the outside is 10042, and that the port the service listens on is 10044. Also displayed are the version (build-39), the readiness and liveness probes, and other configuration.
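When only the capacity is of interest, the values output can be filtered instead of read by eye. This is a small sketch, assuming the two-space YAML indentation shown above; it is not a general YAML parser, and the function name replicas_from_values is made up for this example.

```shell
# Reads `get values -a` output on stdin and prints resources.replicas.
# Assumes the layout shown above; not a general YAML parser.
replicas_from_values() {
  awk '
    # A line starting in column 1 is a top-level key: remember whether it
    # is the "resources:" section.
    /^[^ ]/ { in_resources = ($1 == "resources:") }
    in_resources && $1 == "replicas:" { print $2 }
  '
}
```

It would be used as scap-helm mathoid get values -a production | replicas_from_values, which prints one number per cluster unless CLUSTER is set.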

Install a new release for a specific service

Installing a new release is accomplished with the following command. Note, however, that installing a new release is not expected to be a common operation.

$ scap-helm mathoid install -n foo stable/mathoid --set resources.replicas=2 
NAME:   foo
LAST DEPLOYED: Tue Apr 17 15:43:20 2018
NAMESPACE: mathoid
STATUS: DEPLOYED

RESOURCES:
==> v1beta1/Deployment
NAME         DESIRED  CURRENT  UP-TO-DATE  AVAILABLE  AGE
mathoid-foo  2        2        2           0          0s

==> v1/NetworkPolicy
NAME         POD-SELECTOR             AGE
mathoid-foo  app=mathoid,release=foo  0s

==> v1/Pod(related)
NAME                         READY  STATUS             RESTARTS  AGE
mathoid-foo-370633718-hqhlc  0/2    ContainerCreating  0         0s
mathoid-foo-370633718-tc2dm  0/2    ContainerCreating  0         0s

==> v1/Secret
NAME                       TYPE    DATA  AGE
mathoid-foo-secret-config  Opaque  0     0s

==> v1/ConfigMap
NAME                DATA  AGE
config-foo          1     0s
foo-metrics-config  1     0s

==> v1/Service
NAME         TYPE      CLUSTER-IP    EXTERNAL-IP  PORT(S)          AGE
mathoid-foo  NodePort  10.64.72.188  <none>       10044:30001/TCP  0s


NOTES:
Thank you for installing mathoid.

Your release is named mathoid-foo.

To learn more about the release, try:

  $ helm status mathoid-foo
  $ helm get mathoid-foo

Upgrading a release

Almost anything that has to do with changing a service is about upgrading a release in our environment. Rolling out a new version, increasing or decreasing capacity, changing resource limits, IPs, ports, configuration and so on all entail upgrading a release.

Just upgrade to the new version of a chart

 $ scap-helm mathoid upgrade production stable/mathoid

Release "production" has been upgraded. Happy Helming!
LAST DEPLOYED: Wed Apr 18 06:48:18 2018
NAMESPACE: mathoid
STATUS: DEPLOYED
...

Increase/decrease capacity

Let's say you need more capacity because traffic increased due to some event. First find the current number of running pods, either via get values -a as shown above or from the DESIRED column of the Deployment resource in the output of the status command. Then upgrade the release with the new replica count:

$ scap-helm mathoid upgrade foo stable/mathoid --set resources.replicas=5
Release "foo" has been upgraded. Happy Helming!
LAST DEPLOYED: Wed Apr 18 06:56:30 2018
NAMESPACE: mathoid
STATUS: DEPLOYED
...
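The step of reading the current capacity from the status command can also be scripted. The sketch below assumes the status layout shown earlier on this page (a "==> .../Deployment" section with NAME and DESIRED as the first two columns); the function name desired_from_status is made up for this example.

```shell
# Reads `scap-helm <service> status <release>` output on stdin and prints the
# DESIRED column of every row in the Deployment section.
desired_from_status() {
  awk '
    # Each resource section starts with "==> <kind>"; track whether we are
    # inside the Deployment one.
    /^==> /                   { in_deploy = ($2 ~ /Deployment$/); next }
    in_deploy && $1 == "NAME" { next }   # skip the column header
    in_deploy && NF >= 2      { print $2 }
  '
}
```

It would be used as CLUSTER=eqiad scap-helm mathoid status production | desired_from_status.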

Specify a new software version

Changing the software version is again an upgrade.

$ scap-helm mathoid upgrade foo stable/mathoid --set main_app.version=build-42
Release "foo" has been upgraded. Happy Helming!
LAST DEPLOYED: Wed Apr 18 06:56:30 2018
NAMESPACE: mathoid
STATUS: DEPLOYED
...

Rollback a release

Rolling back a release is also really easy. First find out which revision the release is currently at.

$ scap-helm mathoid history foo
REVISION	UPDATED                 	STATUS    	CHART        	DESCRIPTION     
1       	Wed Apr 18 06:47:43 2018	SUPERSEDED	mathoid-0.0.5	Install complete
2       	Wed Apr 18 06:56:30 2018	DEPLOYED  	mathoid-0.0.5	Upgrade complete

Then let's say revision 2 is problematic. Rolling back to revision 1 is:


$ scap-helm mathoid rollback foo 1
Rollback was a success! Happy Helming!

Note that a rollback is effectively an upgrade under the hood. This is evident if you re-run the history command.

$ scap-helm mathoid history foo
REVISION	UPDATED                 	STATUS    	CHART        	DESCRIPTION     
1       	Wed Apr 18 06:47:43 2018	SUPERSEDED	mathoid-0.0.5	Install complete
2       	Wed Apr 18 06:56:30 2018	SUPERSEDED	mathoid-0.0.5	Upgrade complete
3       	Wed Apr 18 07:01:01 2018	DEPLOYED  	mathoid-0.0.5	Rollback to 1

Note that there is a new revision 3 that is effectively a copy of revision 1.
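Finding the revision a release is currently at can be done by filtering the history output. This is a minimal sketch, assuming the column layout shown above, where the five-field timestamp puts STATUS in the seventh whitespace-separated field; the function name deployed_revision is made up for this example.

```shell
# Reads `scap-helm <service> history <release>` output on stdin and prints
# the revision number(s) whose STATUS is DEPLOYED.
deployed_revision() {
  # Data rows start with a numeric revision; the header row does not.
  # Field 7 is STATUS because the UPDATED timestamp spans fields 2 to 6.
  awk '$1 ~ /^[0-9]+$/ && $7 == "DEPLOYED" { print $1 }'
}
```

It would be used as CLUSTER=eqiad scap-helm mathoid history foo | deployed_revision. Without CLUSTER the history of each cluster is printed in turn, so one number per cluster is returned.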