User:Brouberol
Useful snippets
Local productivity
pcc
This is a bash function I have in my zshrc
# This function allows me to run `pcc` in my puppet repository checkout
# and have it automatically start a PCC run for the gerrit change number
# associated with the patch's Change-Id, without having to copy and paste
# anything.
pcc () {
    if [[ $PWD != "${WMF_HOME}/puppet" ]]
    then
        echo "Not in the puppet root dir. Exiting."
    else
        change_id=$(git show | grep Change-Id | awk '{ print $2 }')
        gerrit_change_id=$(curl -s "https://gerrit.wikimedia.org/r/changes/${change_id}" | sed 1d | jq -r "._number")
        ./utils/pcc ${gerrit_change_id}
    fi
}
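A minimal usage example, assuming WMF_HOME is exported and the patch to check is currently checked out:
$ cd "${WMF_HOME}/puppet"
$ pcc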
prepare-commit-msg
I automatically inject this script into my git hooks to avoid having to manually type Bug: TXXXXXX in my commit messages. Note: it relies on the Phabricator ticket number appearing in the git branch name.
#!/bin/bash
#
# Inspects the branch name and checks if it contains a Phabricator ticket number (i.e. T123456).
# If yes, the commit message will be automatically suffixed with "Bug: {num}".
#
# Useful for looking through git history and relating a commit or group of commits
# back to a user story.
#
# Note: this uses bash-only features ([[ ]], BASH_REMATCH), hence the bash shebang.
BRANCH_NAME=$(git rev-parse --abbrev-ref HEAD 2>/dev/null)

# Ensure BRANCH_NAME is not empty and is not in a detached HEAD state (i.e. rebase).
# SKIP_PREPARE_COMMIT_MSG may be used as an escape hatch to disable this hook,
# while still allowing other githooks to run.
if [ ! -z "$BRANCH_NAME" ] && [ "$BRANCH_NAME" != "HEAD" ] && [ "$SKIP_PREPARE_COMMIT_MSG" != 1 ]; then
    PREFIX_PATTERN='T[0-9]{6,7}'
    [[ $BRANCH_NAME =~ $PREFIX_PATTERN ]]
    PREFIX=${BASH_REMATCH[0]}
    PREFIX_IN_COMMIT=$(grep -c "Bug: $PREFIX" "$1")
    # Ensure PREFIX exists in BRANCH_NAME and is not already present in the commit message
    if [[ -n "$PREFIX" ]] && ! [[ $PREFIX_IN_COMMIT -ge 1 ]]; then
        # printf, rather than echo, so that the leading newline is reliably interpreted
        printf "\nBug: %s\n" "${PREFIX}" >> "$1"
    fi
fi
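To enable it, the script needs to live in the repository's .git/hooks directory under the name prepare-commit-msg and be executable. A sketch, assuming the script is saved as prepare-commit-msg in the current directory, and with a made-up branch name:
$ cp prepare-commit-msg "${WMF_HOME}/puppet/.git/hooks/prepare-commit-msg"
$ chmod +x "${WMF_HOME}/puppet/.git/hooks/prepare-commit-msg"
$ git switch -c T123456-tune-airflow-scheduler
$ git commit -m "Tune the airflow scheduler"  # the resulting commit message ends with "Bug: T123456"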
Synchronizing a directory to a distant host (macOS only)
This script keeps a local directory synchronized to a distant host. I often use it to make local changes to a chart, sync them to the deployment server, and apply them in a test namespace to check whether my change works as intended.
#!/bin/bash
set -euo pipefail

cwd=$(pwd)
watched_dir=${1#"$cwd"/}
destination=$2

function sync {
    local to_sync=$1
    rsync --exclude=.git --relative -L --delete -vrae 'ssh -p 22' ${to_sync} ${destination}
}

sync ${watched_dir} ${destination}
fswatch ${watched_dir} | while read -r file; do
    sync $( echo "${file}" | sed "s@${cwd}/@@") "${destination}"
done
Example: $ keepinsync charts/airflow deploy1003.eqiad.wmnet:~/workbench
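The script relies on rsync and fswatch being available locally; on macOS, fswatch is typically installed through Homebrew (assuming Homebrew is your package manager):
$ brew install fswatch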
Integrate gerrit repository workspaces with VSCode GitLens
I'm a heavy user of VSCode workspaces to switch between different repositories. I also use multi-root workspaces to gather similar-looking projects (ex: all custom docker image definitions) under a single workspace. The following script iterates over each workspace file and writes configuration to display the repository's full name (eg operations/dns instead of just dns) in the sidebar, as well as some folder-local settings configuring GitLens to follow our gerrit URL convention, so that I can right-click > Copy As > Copy Remote File URL to get the gitiles link to the file/line I clicked on, and share it on IRC/Slack.
#!/usr/bin/env python3
"""
For each workspace, iterate over each folder (some workspaces are multi-root)
and perform the following actions:
- for gerrit repos, add a 'name' metadata in the workspace file, containing
  the full repo path as seen in gerrit (ex: operations/dns). This full name
  will be shown in the VSCode left sidebar
- for gerrit repos, create a local .vscode/settings.json file allowing GitLens
  to generate a remote link for a specific commit/file/line, to share on slack/irc
"""

import glob
import json
import subprocess
import os
import re

from operator import itemgetter
from urllib.parse import urlparse
from pathlib import Path

GERRIT_BASE_URL = "https://gerrit.wikimedia.org/r/plugins/gitiles"
GITLENS_GERRIT_REMOTE_SETTINGS = {
    "gitlens.remotes": [
        {
            "domain": "gerrit.wikimedia.org",
            "type": "Custom",
            "name": "WMF",
            "protocol": "https",
            "urls": {
                "repository": GERRIT_BASE_URL + "/${repo}",
                "branches": GERRIT_BASE_URL + "/${repo}/branches",
                "branch": GERRIT_BASE_URL + "/${repo}/commits/${branch}",
                "commit": GERRIT_BASE_URL + "/${repo}/commit/${id}",
                "file": GERRIT_BASE_URL + "/${repo}?path=${file}${line}",
                "fileInBranch": GERRIT_BASE_URL + "/${repo}/+/${branch}/${file}${line}",
                "fileInCommit": GERRIT_BASE_URL + "/${repo}/+/${id}/${file}${line}",
                "fileLine": "#${line}",
                "fileRange": "#${start.line}-${end.line}",
            },
        }
    ]
}

workspace_files = glob.glob(
    str(Path("~/.vscode/workspaces/*.code-workspace").expanduser())
)
for workspace_file in workspace_files:
    print(workspace_file)
    workspace_config = json.load(open(workspace_file))
    # directories under /code/ are my own projects
    if "/code/" in workspace_config["folders"][0]["path"]:
        continue
    for i, folder in enumerate(workspace_config["folders"]):
        folder_name = folder["path"].split("/wmf/")[-1]
        cmd = [
            f"""cd ~/wmf/{folder_name} && git remote show origin -n | grep 'Fetch URL'"""
        ]
        output = subprocess.check_output(cmd, shell=True).decode("utf-8").strip()
        url_tokens = urlparse(output.replace("Fetch URL: ", ""))
        repo_type = "gerrit" if "gerrit" in url_tokens.netloc else "gitlab"
        path = re.sub(r"^/r", "", url_tokens.path)
        path = path.lstrip("/")
        path = re.sub(r"\.git$", "", path)
        if repo_type == "gerrit":
            workspace_config["folders"][i]["name"] = path
            folder_local_settings_dir = (
                Path.cwd() / "wmf" / folder_name / folder["path"] / ".vscode"
            ).resolve()
            os.makedirs(folder_local_settings_dir, exist_ok=True)
            with open(
                folder_local_settings_dir / "settings.json", "w"
            ) as local_settings_f:
                json.dump(GITLENS_GERRIT_REMOTE_SETTINGS, local_settings_f, indent=2)
    workspace_config["folders"] = sorted(
        workspace_config["folders"], key=itemgetter("path")
    )
    with open(workspace_file, "w") as out:
        json.dump(workspace_config, out, indent=2)
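After a run, a gerrit folder entry in a .code-workspace file ends up looking something like this (hypothetical local path):
{
  "folders": [
    {
      "path": "/Users/brouberol/wmf/dns",
      "name": "operations/dns"
    }
  ]
}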
Add a button to Gerrit to quickly copy a markdown link to the patch
The following Greasemonkey script adds a button to any gerrit patch review webpage, allowing me to copy a [title](link) markdown link to the page into my clipboard, to paste into Slack.
// ==UserScript==
// @name         Add Patch Link Button to Gerrit
// @namespace    http://tampermonkey.net/
// @version      1.0
// @description  Adds a button to Gerrit pages to create a markdown link with the patch name
// @match        https://gerrit.wikimedia.org/r/c/*
// @grant        none
// ==/UserScript==

(function() {
    'use strict';

    // Function to create the button element
    function createButton() {
        const button = document.createElement('button');
        button.textContent = 'MD';
        button.style.cssText = `
            position: fixed;
            bottom: 3em;
            right: 1em;
            z-index: 1000;
            background-color: #4CAF50;
            color: white;
            padding: 10px 15px;
            border: none;
            border-radius: 4px;
            cursor: pointer;
        `;
        button.onclick = () => {
            const patchName = document.querySelector('#pg-app').shadowRoot.querySelector("#app-element").shadowRoot.querySelector('gr-change-view').shadowRoot.querySelector('.headerSubject').textContent.trim();
            console.log(patchName);
            const markdownLink = `[${patchName}](${window.location.href})`;
            navigator.clipboard.writeText(markdownLink).then(() => {
                console.log('copied!');
            });
        };
        return button;
    }

    // Function to inject the button into the page
    function injectButton() {
        const button = createButton();
        document.body.appendChild(button);
    }

    injectButton();
})();
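Clicking the MD button puts something like the following in the clipboard (hypothetical patch title and change number):
[airflow: bump the scheduler memory limits](https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/123456)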
Customize my gerrit dashboard
I find that the default gerrit dashboard can be quite crowded and displays old patches that I don't care about. To customize it, I have revived https://github.com/julienvey/gerrit-dash-creator by converting it to python3 (https://github.com/brouberol/gerrit-dash-creator). I then defined a gerrit-dashboard.ini file and used the forked gerrit-dash-creator tool to generate a gerrit dashboard URL:
[dashboard]
title = My Dashboard
description = Balthazar's dashboard
[section "Ready to deploy"]
query = owner:self is:open label:Code-Review=1
[section "Incoming Reviews"]
query = is:open NOT owner:self NOT is:wip reviewer:self NOT label:Code-Review=1
[section "Outgoing reviews"]
query = is:open owner:self NOT is:wip
[section "Your Turn"]
query = attention:self NOT is:merged
[section "CCed on"]
query = is:open NOT is:wip cc:self
[section "WIP"]
query = owner:self is:wip NOT is:abandoned
$ gerrit-dash-creator ~/Documents/gerrit-dashboard.ini
#/dashboard/?title=My+Dashboard&foreach=&Ready+to+deploy=owner%3Aself+is%3Aopen+label%3ACode-Review%3D1&Incoming+Reviews=is%3Aopen+NOT+owner%3Aself+NOT+is%3Awip+reviewer%3Aself+NOT+label%3ACode-Review%3D1&Outgoing+reviews=is%3Aopen+owner%3Aself+NOT+is%3Awip&Your+Turn=attention%3Aself+NOT+is%3Amerged&CCed+on=is%3Aopen+NOT+is%3Awip+cc%3Aself&WIP=owner%3Aself+is%3Awip+NOT+is%3Aabandoned
I finally added that link to https://gerrit.wikimedia.org/r/settings/#Menu
Kubernetes
Rendering a chart with syntax highlighting
This tool renders a chart template, sorts the YAML output by resource kind/name and colorizes it, while omitting the resources rendered by the vendored templates.
#!/bin/bash
chart=$1
shift
helm template $chart "$@" | k8sfmt | bat --theme 'Monokai Extended' -lyaml
The k8sfmt command it pipes through is the following Python script:
#!/usr/bin/env python3
import yaml
import sys

yaml_data = sys.stdin.read()
include_all_resources = len(sys.argv) > 1 and sys.argv[1] == "--all"
yaml_blocks = yaml_data.split("---\n")
yaml_resources = {}
for i, block in enumerate(yaml_blocks):
    yaml_resource = yaml.safe_load(block)
    if not yaml_resource:
        continue
    elif (
        yaml_resource["kind"] == "ConfigMap"
        and yaml_resource["metadata"]["name"].endswith("envoy-config-volume")
        and not include_all_resources
    ):
        # skip resources rendered by the vendored templates, unless --all was passed
        continue
    elif yaml_resource["kind"] in ("VirtualService", "DestinationRule", "Gateway", "Certificate"):
        continue
    yaml_resources[(yaml_resource["kind"], yaml_resource["metadata"]["name"])] = i

# sort the remaining resources by (kind, name) and re-emit them as a multi-document YAML stream
sorted_yaml_resources = sorted(yaml_resources)
for yaml_resource in sorted_yaml_resources:
    index = yaml_resources[yaml_resource]
    print("---")
    print(yaml_blocks[index])
render-chart charts/growthbook will render the growthbook chart with the chart's default values, sort all resources in kind/name alphabetical order, and then pipe the output to bat using YAML syntax highlighting.
Preview changes to a rendered chart based on current changes
I oftentimes want to see what the diff between the latest published version of a chart and my current local changes would look like once rendered. To make it easy to preview such a diff and prevent accidental mistakes, I came up with this super simple chart-diff script:
#!/bin/bash
rendered_chart_with_local_changes=$(mktemp)
rendered_chart_without_changes=$(mktemp)
helm template "$@" > ${rendered_chart_with_local_changes}
git stash > /dev/null
helm template "$@" > ${rendered_chart_without_changes}
git stash pop > /dev/null
# I use https://github.com/dandavison/delta to visualize the colored diff but `bat -l diff`
# would work as well, or nothing at all
diff -u ${rendered_chart_without_changes} ${rendered_chart_with_local_changes} | delta
Example:
diff --git a/charts/airflow/templates/deployment.yaml b/charts/airflow/templates/deployment.yaml
index 05744510..baf166f6 100644
--- a/charts/airflow/templates/deployment.yaml
+++ b/charts/airflow/templates/deployment.yaml
@@ -50,6 +50,7 @@ spec:
- "--depth=1" {{/* Performs a shallow clone */}}
- "--sparse-checkout-file=/etc/gitsync/sparse-checkout.conf"
- "--one-time"
+ {{- include "base.helper.restrictedSecurityContext" . | indent 8 }}
volumeMounts:
- name: airflow-dags
mountPath: "{{ $.Values.gitsync.root_dir }}"
$ chart-diff charts/airflow -f helmfile.d/dse-k8s-services/airflow-test-k8s/values-cloudnative-pg-cluster-dse-k8s-eqiad.yaml -f helmfile.d/dse-k8s-services/airflow-test-k8s/values.yaml
--- /var/folders/7r/k3tzzsls5zn02k199m7wh22r0000gn/T/tmp.4mlBgvzzOP 2024-09-03 10:23:24
+++ /var/folders/7r/k3tzzsls5zn02k199m7wh22r0000gn/T/tmp.DFNevyvsBL 2024-09-03 10:23:24
@@ -706,6 +706,15 @@
- "--depth=1"
- "--sparse-checkout-file=/etc/gitsync/sparse-checkout.conf"
- "--one-time"
+ securityContext:
+ allowPrivilegeEscalation: false
+ capabilities:
+ drop:
+ - ALL
+ runAsNonRoot: true
+ seccompProfile:
+ type: RuntimeDefault
volumeMounts:
- name: airflow-dags
mountPath: "/dags"
Validate a rendered chart
I wanted a tool that would validate the content of a locally-rendered chart, to find potential inconsistencies and mistakes before they hit CI. I created the following chart-validate tool:
#!/bin/bash
set -eu
set -o pipefail
tmp_dir=$(mktemp -d)
chart=$1
shift
pushd ~/wmf/deployment-charts > /dev/null
chart-render ${chart} "$@" | kubectl-slice --input-file - --output-dir ${tmp_dir} --quiet
popd > /dev/null
pushd ${tmp_dir} > /dev/null
kubectl validate --version 1.23 --local-crds ~/wmf/deployment-charts/charts/calico-crds/templates/ . 2>&1 | python -c "import sys;print(sys.stdin.read().replace('\n\n','\n'))"
popd > /dev/null
Example:
$ chart-validate charts/airflow \
-f helmfile.d/dse-k8s-services/airflow-test-k8s/values-cloudnative-pg-cluster-dse-k8s-eqiad.yaml \
-f helmfile.d/dse-k8s-services/airflow-test-k8s/values.yaml
configmap-airflow-bash-executables.yaml...OK
configmap-airflow-config.yaml...OK
configmap-airflow-connections.yaml...OK
configmap-airflow-kerberos-client-config.yaml...OK
configmap-airflow-webserver-config.yaml...OK
configmap-gitsync-sparse-checkout-file.yaml...OK
deployment-airflow-scheduler.yaml...ERROR
spec.template.spec.containers[0].cmd: Invalid value: value provided for unknown field
deployment-airflow-webserver.yaml...OK
networkpolicy-airflow-release-name-egress-external-services-gitlab.yaml...OK
networkpolicy-airflow-release-name-egress-external-services-kafka.yaml...OK
networkpolicy-airflow-release-name-egress-external-services-kerberos.yaml...OK
networkpolicy-airflow-release-name-egress-external-services-postgresql.yaml...OK
networkpolicy-airflow-release-name-egress-external-services-wikimail.yaml...OK
networkpolicy-airflow-release-name.yaml...OK
pod-airflow-release-name-service-checker.yaml...OK
secret-airflow-release-name-secret-config.yaml...OK
service-airflow-release-name-tls-service.yaml...OK
Error: validation failed
Oh, right. The field is named command, not cmd. I make the change, and re-run chart-validate.
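The change itself is a one-word fix in the scheduler's deployment template, along these lines (a sketch; the exact file and surrounding context are assumptions):
--- a/charts/airflow/templates/deployment.yaml
+++ b/charts/airflow/templates/deployment.yaml
@@ ... @@
-          cmd: [...]
+          command: [...]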
$ chart-validate charts/airflow \
-f helmfile.d/dse-k8s-services/airflow-test-k8s/values-cloudnative-pg-cluster-dse-k8s-eqiad.yaml \
-f helmfile.d/dse-k8s-services/airflow-test-k8s/values.yaml
configmap-airflow-bash-executables.yaml...OK
configmap-airflow-config.yaml...OK
configmap-airflow-connections.yaml...OK
configmap-airflow-kerberos-client-config.yaml...OK
configmap-airflow-webserver-config.yaml...OK
configmap-gitsync-sparse-checkout-file.yaml...OK
deployment-airflow-scheduler.yaml...OK
deployment-airflow-webserver.yaml...OK
networkpolicy-airflow-release-name-egress-external-services-gitlab.yaml...OK
networkpolicy-airflow-release-name-egress-external-services-kafka.yaml...OK
networkpolicy-airflow-release-name-egress-external-services-kerberos.yaml...OK
networkpolicy-airflow-release-name-egress-external-services-postgresql.yaml...OK
networkpolicy-airflow-release-name-egress-external-services-wikimail.yaml...OK
networkpolicy-airflow-release-name.yaml...OK
pod-airflow-release-name-service-checker.yaml...OK
secret-airflow-release-name-secret-config.yaml...OK
service-airflow-release-name-tls-service.yaml...OK
Test out local changes in a test application in Kubernetes
Sometimes, you want to test your changes out before submitting a patch, to make sure things work appropriately. Editing the values and charts in the /srv/deployment-charts folder of the deployment server is out of the question, because it would prevent any subsequent git pull from running, thus blocking any change from being pulled and deployed.
What you can do though, is sync your chart and helmfile directories to your home directory on the deployment server, slightly tweak the helmfile to use the local chart instead of the one published in chart-museum, and apply from there.
To do so, I use the keepinsync script, defined in https://wikitech.wikimedia.org/wiki/User:Brouberol#Synchronizing_a_directory_to_a_distant_host_(macOS_only):
brouberol@local-workstation:~/wmf/deployment-charts $ keepinsync charts/airflow deployment.eqiad.wmnet:~/deployment-charts
# in another tab/terminal
brouberol@local-workstation:~/wmf/deployment-charts $ keepinsync helmfile.d/dse-k8s-services/airflow-test-k8s deployment.eqiad.wmnet:~/deployment-charts
Now, make the following change to the helmfile.yaml file, to make sure that a) you don't spam IRC with your changes and b) helmfile uses your local chart.
--- a/helmfile.d/dse-k8s-services/airflow-test-k8s/helmfile.yaml
+++ b/helmfile.d/dse-k8s-services/airflow-test-k8s/helmfile.yaml
@@ -13,22 +13,6 @@ helmDefaults:
- --kubeconfig
- {{ .Environment.Values | get "kubeConfig" (printf "/etc/kubernetes/airflow-test-k8s-deploy-%s.config" .Environment.Name) }}
-hooks:
- - events: ["prepare"]
- command: "helmfile_log_sal"
- args:
- [
- "{{`{{.HelmfileCommand}}`}}",
- "[{{`{{ .Environment.Name }}`}}] START helmfile.d/dse-k8s-services/airflow-test-k8s: {{`{{.HelmfileCommand}}`}}",
- ]
- - events: ["cleanup"]
- command: "helmfile_log_sal"
- args:
- [
- "{{`{{.HelmfileCommand}}`}}",
- "[{{`{{ .Environment.Name }}`}}] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: {{`{{.HelmfileCommand}}`}}",
- ]
-
releases:
- name: postgresql-airflow-test-k8s
namespace: airflow-test-k8s
@@ -46,7 +30,7 @@ releases:
- name: production
namespace: airflow-test-k8s
- chart: wmf-stable/airflow
+ chart: ../../../charts/airflow
# Allow to forcibly recreate pods by passing --state-values-set roll_restart=1 on the command line
recreatePods: {{ if (hasKey .Environment.Values "roll_restart") }}{{ eq .Environment.Values.roll_restart "1" }}{{ else }}false{{end}}
# This template gets applied for every release, all of which are applied in every environment
Once the changes are rsynced to the deployment server, you can simply run helmfile from the deployment-charts folder in your home directory on the deployment server.
brouberol@deploy2002:~$ cd deployment-charts/helmfile.d/dse-k8s-services/airflow-test-k8s/
brouberol@deploy2002:~/deployment-charts/helmfile.d/dse-k8s-services/airflow-test-k8s$ helmfile -e dse-k8s-eqiad diff --context=5
Building dependency release=production, chart=../../../charts/airflow
skipping missing values file matching "values.yaml"
Comparing release=production, chart=../../../charts/airflow
Comparing release=postgresql-airflow-test-k8s, chart=wmf-stable/cloudnative-pg-cluster
airflow-test-k8s, airflow-gitsync, Deployment (apps) has changed:
...
command: ["git-sync"]
args:
- "--repo=https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags.git"
- "--root=/dags"
- "--link=airflow_dags"
- - "--ref=main"
- - "--period=300s"
+ - "--ref=T380765"
+ - "--period=60s"
- "--depth=1"
- "--sparse-checkout-file=/etc/gitsync/sparse-checkout.conf"
securityContext:
allowPrivilegeEscalation: false
capabilities:
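If the diff looks right, the change can then be rolled out from the same directory by swapping diff for apply:
brouberol@deploy2002:~/deployment-charts/helmfile.d/dse-k8s-services/airflow-test-k8s$ helmfile -e dse-k8s-eqiad apply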
Useful remote commands
How to strace the process running in a container
This allows us to run strace on a running docker container, on a Kubernetes host.
Example: strace-container mpic-staging-5f488948c-2g64g mpic-staging -e 'trace=!epoll_wait'
strace-container() {
    local pod_name=$1
    local container_name=$2
    shift
    shift
    local container_id=$(sudo docker ps | grep "${container_name}_${pod_name}" | awk '{ print $1 }')
    local container_root_pid=$(sudo docker top ${container_id} | grep -v UID | awk '{ print $2 }')
    # any additional argument will be passed to `strace`
    sudo strace -f -p ${container_root_pid} $@
}
How to run a command in the network namespace of a container
This function allows us to run a command available on a Kubernetes worker host, within a container network namespace. It can be useful to, say, check whether a container process can open a TCP connection to a given host/port.
Example: nsenter-container aqs-http-gateway-main-56945ccf8b-fk8tz aqs-http-gateway-main telnet 10.2.2.38 8082
nsenter-container() {
    local pod_name=$1
    local container_name=$2
    shift
    shift
    local container_id=$(sudo docker ps | grep "${container_name}_${pod_name}" | awk '{ print $1 }')
    local container_root_pid=$(sudo docker top ${container_id} | grep -v UID | awk '{ print $2 }' | head -n 1)
    # any additional argument will be passed to `nsenter`
    sudo nsenter --target ${container_root_pid} --net $@
}
How to run tcpdump on a pod IP
tcpdump-pod-ip() {
    local pod_ip=$1
    shift
    # Any additional argument is passed to `tcpdump`
    sudo tcpdump -i any host ${pod_ip} $@
}
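A hypothetical invocation, looking up the pod IP first and then restricting the capture to port 8080 (the IP and port are made up; extra arguments extend the tcpdump filter expression):
$ kubectl get pod superset-staging-8678649994-grb8z -o jsonpath='{.status.podIP}'
10.67.131.49
$ tcpdump-pod-ip 10.67.131.49 and port 8080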
Link a Calico Network Policy to Pods, IPs and ports
Pods "subscribe" to a Calico Network policy using a selector and a service name, allowing ingress from / egress to that service. However, all of this is hard to inspect, as it requires multiple kubectl describe
calls: one to inspect the calico network policy, one to inspect the service and associated endpoints, and one to list the pods matching the selector.
The following script allows to resolve all these selectors to actual pod names and service IP/port data.
brouberol@deploy1002:~$ kube_env superset-deploy dse-k8s-eqiad
brouberol@deploy1002:~$ ./inspect_calico_networkpolicy --help
usage: inspect_calico_networkpolicy [-h] [-n NAMESPACE] [-f {plain,json}]
Inspect the Calico NetworkPolicy objects in a given namespace and link each of
them to the set of attached Pods.
optional arguments:
-h, --help show this help message and exit
-n NAMESPACE, --namespace NAMESPACE
The namespace to inspect (default: the current one)
-f {plain,json}, --format {plain,json}
Output format (default=plain)
brouberol@deploy1002:~$ ./inspect_calico_networkpolicy
[+] NetworkPolicy superset-staging-egress-external-services-cas
Pods:
- superset-staging-8678649994-grb8z
Service cas-idp -> ips=208.80.153.12, 208.80.154.80, 2620:0:860:1:208:80:153:12, 2620:0:861:3:208:80:154:80, port=TCP/443
[+] NetworkPolicy superset-staging-egress-external-services-druid
Pods:
- superset-staging-8678649994-grb8z
Service druid-analytics -> ips=10.64.21.11, 10.64.36.101, 10.64.5.17, 10.64.5.36, 10.64.53.13, 2620:0:861:104:10:64:5:17, 2620:0:861:104:10:64:5:36, 2620:0:861:105:10:64:21:11, 2620:0:861:106:10:64:36:101, 2620:0:861:108:10:64:53:13, port=TCP/8081,TCP/8082,TCP/8083
Service druid-public -> ips=10.64.131.9, 10.64.132.12, 10.64.135.9, 10.64.16.171, 10.64.48.227, 2620:0:861:102:10:64:16:171, 2620:0:861:107:10:64:48:227, 2620:0:861:10a:10:64:131:9, 2620:0:861:10b:10:64:132:12, 2620:0:861:10e:10:64:135:9, port=TCP/8081,TCP/8082,TCP/8083
[+] NetworkPolicy superset-staging-egress-external-services-kerberos
Pods:
- superset-staging-8678649994-grb8z
Service kerberos-kdc -> ips=10.192.48.190, 10.64.0.112, 2620:0:860:104:10:192:48:190, 2620:0:861:101:10:64:0:112, port=UDP/88,TCP/88
[+] NetworkPolicy superset-staging-egress-external-services-presto
Pods:
- superset-staging-8678649994-grb8z
Service presto-analytics -> ips=10.64.138.7, 10.64.142.6, 2620:0:861:100:10:64:138:7, 2620:0:861:114:10:64:142:6, port=TCP/8280,TCP/8281
brouberol@deploy1002:~$ ./inspect_calico_networkpolicy --format json
{
"superset-staging-egress-external-services-cas": {
"pods": [
"superset-staging-8678649994-grb8z"
],
"services": [
{
"name": "cas-idp",
"ips": [
"208.80.153.12",
"208.80.154.80",
"2620:0:860:1:208:80:153:12",
"2620:0:861:3:208:80:154:80"
],
"ports": [
"TCP/443"
]
}
]
},
"superset-staging-egress-external-services-druid": {
"pods": [
"superset-staging-8678649994-grb8z"
],
"services": [
{
"name": "druid-analytics",
"ips": [
"10.64.21.11",
"10.64.36.101",
"10.64.5.17",
"10.64.5.36",
"10.64.53.13",
"2620:0:861:104:10:64:5:17",
"2620:0:861:104:10:64:5:36",
"2620:0:861:105:10:64:21:11",
"2620:0:861:106:10:64:36:101",
"2620:0:861:108:10:64:53:13"
],
"ports": [
"TCP/8081",
"TCP/8082",
"TCP/8083"
]
},
{
"name": "druid-public",
"ips": [
"10.64.131.9",
"10.64.132.12",
"10.64.135.9",
"10.64.16.171",
"10.64.48.227",
"2620:0:861:102:10:64:16:171",
"2620:0:861:107:10:64:48:227",
"2620:0:861:10a:10:64:131:9",
"2620:0:861:10b:10:64:132:12",
"2620:0:861:10e:10:64:135:9"
],
"ports": [
"TCP/8081",
"TCP/8082",
"TCP/8083"
]
}
]
},
"superset-staging-egress-external-services-kerberos": {
"pods": [
"superset-staging-8678649994-grb8z"
],
"services": [
{
"name": "kerberos-kdc",
"ips": [
"10.192.48.190",
"10.64.0.112",
"2620:0:860:104:10:192:48:190",
"2620:0:861:101:10:64:0:112"
],
"ports": [
"UDP/88",
"TCP/88"
]
}
]
},
"superset-staging-egress-external-services-presto": {
"pods": [
"superset-staging-8678649994-grb8z"
],
"services": [
{
"name": "presto-analytics",
"ips": [
"10.64.138.7",
"10.64.142.6",
"2620:0:861:100:10:64:138:7",
"2620:0:861:114:10:64:142:6"
],
"ports": [
"TCP/8280",
"TCP/8281"
]
}
]
}
}
#!/usr/bin/env python3
"""
Inspect the Calico NetworkPolicy objects in a given namespace and link
each of them to the set of attached Pods.
"""

import argparse
import subprocess
import json
import sys

from collections import defaultdict
from typing import Optional, List

WARNING = "\033[93m"
ENDC = "\033[0m"

log_format = None
# number of warnings emitted while parsing; used to decide the exit code
warnings = 0


def parse_args():
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument(
        "-n",
        "--namespace",
        help="The namespace to inspect (default: the current one)",
        type=str,
    )
    parser.add_argument(
        "-f",
        "--format",
        help="Output format (default=plain)",
        choices=("plain", "json"),
        default="plain",
    )
    return parser.parse_args()


def log(msg):
    if log_format == "plain":
        print(msg)


def warn(msg: str):
    sys.stderr.write(f"{WARNING}{msg}{ENDC}\n")


def kubectl_get(
    resource_type: str,
    resource_name: Optional[str] = None,
    label_selector: Optional[str] = None,
    namespace: Optional[str] = None,
) -> List:
    cmd = ["kubectl", "get", resource_type]
    if label_selector:
        cmd.extend(["-l", label_selector])
    if namespace:
        cmd.extend(["-n", namespace])
    cmd.extend(["-o", "json"])
    if resource_name:
        cmd.append(resource_name)
    returncmd = subprocess.run(cmd, capture_output=True)
    try:
        data = json.loads(returncmd.stdout)
    except json.decoder.JSONDecodeError:
        return {} if resource_name else []
    if resource_name:
        return data
    return data["items"]


def calico_selector_to_label_selector(calico_selector: str) -> str:
    label_selector = []
    tokens = calico_selector.split(" && ")
    for token in tokens:
        selector = token.replace(" == ", "=")
        label_selector.append(selector.replace("'", ""))
    return ",".join(label_selector)


def parse_network_policy_egress_destinations(egress_destinations: list[dict]) -> list[dict]:
    global warnings
    services = []
    for egress_destination in egress_destinations:
        egress_service = egress_destination["destination"]["services"]
        egress_service_endpoints = kubectl_get(
            "endpoints",
            egress_service["name"],
            namespace=egress_service["namespace"],
        )
        if not egress_service_endpoints:
            warn(
                f"The Service {egress_service['name']} does not exist. The network policy won't have any effect."
            )
            warnings += 1
            continue
        ips = [
            addr["ip"] for addr in egress_service_endpoints["subsets"][0]["addresses"]
        ]
        ports = [
            f"{port['protocol']}/{port['port']}"
            for port in egress_service_endpoints["subsets"][0]["ports"]
        ]
        log(
            f"    Service {egress_service['name']} -> ips={', '.join(ips)}, port={','.join(ports)}"
        )
        service = {"name": egress_service["name"], "ips": ips, "ports": ports}
        services.append(service)
    return services


def parse_network_policy_ingress_rules(ingress_rules: list[dict]) -> list[dict]:
    services = []
    for ingress_rule in ingress_rules:
        source, destination = ingress_rule["source"], ingress_rule["destination"]
        source_service, source_ns = (
            source["services"]["name"],
            source["services"]["namespace"],
        )
        destination_service, destination_namespace = (
            destination["services"]["name"],
            destination["services"]["namespace"],
        )
        destination_service_ports = destination["services"].get("ports", ["*"])
        log(
            f"    Service {source_service}.{source_ns} -> {destination_service}.{destination_namespace}:{'/'.join(destination_service_ports)}"
        )
        service = {
            "source": {
                "name": source_service,
                "namespace": source_ns,
            },
            "destination": {
                "name": destination_service,
                "namespace": destination_namespace,
                "ports": destination_service_ports,
            },
        }
        services.append(service)
    return services


def main():
    global log_format
    output = defaultdict(lambda: defaultdict(lambda: defaultdict(dict)))
    args = parse_args()
    log_format = args.format
    network_policies = kubectl_get(
        "networkpolicies.crd.projectcalico.org", namespace=args.namespace
    )
    for network_policy in network_policies:
        network_policy_name = network_policy["metadata"]["name"]
        namespace = network_policy["metadata"]["annotations"][
            "meta.helm.sh/release-namespace"
        ]
        selector = network_policy["spec"]["selector"]
        label_selector = calico_selector_to_label_selector(selector)
        log(f"[+] NetworkPolicy {network_policy_name}")
        pods = kubectl_get("pod", label_selector=label_selector, namespace=namespace)
        output[network_policy_name]["pods"] = [pod["metadata"]["name"] for pod in pods]
        if not pods:
            log("    No pods found")
        else:
            log("    Pods:")
            for pod in pods:
                log(f"    - {pod['metadata']['name']}")
        if "egress" in network_policy["spec"]:
            output[network_policy_name]["egress"]["services"] = (
                parse_network_policy_egress_destinations(
                    network_policy["spec"]["egress"]
                )
            )
        if "ingress" in network_policy["spec"]:
            output[network_policy_name]["ingress"]["services"] = (
                parse_network_policy_ingress_rules(network_policy["spec"]["ingress"])
            )
    if args.format == "json":
        print(json.dumps(output, indent=2))
    sys.exit(0 if not warnings else 1)


if __name__ == "__main__":
    main()
Get CPU usage per user
for user in $(cat /etc/passwd | cut -d: -f1); do printf "$user " && top -b -n 1 -u $user | awk 'NR>7 { sum += $9; } END { print sum; }' ; done | sort -n -k2
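The same one-liner, spread out with comments (assuming procps top's default batch output, in which the process table starts after the header lines and %CPU is the 9th column):
for user in $(cut -d: -f1 /etc/passwd); do
    printf "%s " "$user"
    # -b: batch mode, -n 1: a single iteration, -u: only this user's processes
    # NR>7 skips top's header lines; $9 is the %CPU column
    top -b -n 1 -u "$user" | awk 'NR>7 { sum += $9 } END { print sum }'
done | sort -n -k2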
Finding the published docker image name and tag from the logs of a Gitlab image publishing pipeline
Repositories such as https://gitlab.wikimedia.org/repos/data-engineering/airflow publish multiple docker images every time a commit hits main, and we find ourselves in need of manually fetching the tag from each publish job log. The following CLI tool will do that for us:
#!/usr/bin/env python3
"""Extract published docker image names and tags from publishing gitlab CI jobs"""

import gitlab
import argparse
import sys
import requests
import re
import logging

logging.basicConfig(format="[%(asctime)s] %(levelname)s - %(message)s", level="INFO")


def parse_args():
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument(
        "-r", "--repository", help="The name of the gitlab repository", required=True
    )
    parser.add_argument(
        "-p",
        "--pipeline",
        help="ID of the publishing pipeline. If none is passed, the latest pipeline to have run on main will be used.",
    )
    return parser.parse_args()


def main():
    args = parse_args()
    gl = gitlab.Gitlab("https://gitlab.wikimedia.org")
    try:
        project = gl.projects.get(args.repository)
    except gitlab.exceptions.GitlabGetError:
        print(f"Project {args.repository} not found")
        sys.exit(1)
    if args.pipeline:
        try:
            pipeline = project.pipelines.get(args.pipeline)
        except gitlab.exceptions.GitlabGetError:
            print(f"Pipeline {args.pipeline} not found for project {args.repository}")
            sys.exit(1)
    else:
        pipeline = project.pipelines.latest(ref="main")
    logging.info(f"Fetching job details from pipeline {pipeline.web_url}")
    publish_jobs = [
        job for job in pipeline.jobs.list() if job.name.startswith("publish:")
    ]
    if not publish_jobs:
        logging.warning(f"No publish jobs were found in pipeline {pipeline.id}")
        sys.exit(1)
    elif len([job for job in publish_jobs if job.status == "success"]) < len(publish_jobs):
        print("The pipeline is still running. Aborting.")
        sys.exit(1)
    images = []
    for publish_job in publish_jobs:
        logging.info(
            f"Fetching image details from job {publish_job.name}: {publish_job.web_url}"
        )
        publish_job_log = requests.get(
            f"https://gitlab.wikimedia.org/{args.repository}/-/jobs/{publish_job.id}/raw"
        ).text
        for match in re.finditer(
            r"#\d+ pushing manifest for (.+) [\d\.]+s done", publish_job_log
        ):
            images.append(match.group(1))
    for image in images:
        print(image)


if __name__ == "__main__":
    main()
$ get-image-tags-from-pipeline -r repos/data-engineering/airflow
[2024-11-28 17:06:27,755] INFO - Fetching job details from pipeline https://gitlab.wikimedia.org/repos/data-engineering/airflow/-/pipelines/83423
[2024-11-28 17:06:28,221] INFO - Fetching image details from job publish:airflow-base: https://gitlab.wikimedia.org/repos/data-engineering/airflow/-/jobs/400204
docker-registry.discovery.wmnet/repos/data-engineering/airflow:2024-11-20-130943-4ca02ad5b46df5377ad1535657c302a6ea919340@sha256:61c8740217be4a4e13c3d85deca9b9659e4128ac3399f3cf912e07e9c1f2ad75
Quickly set up an environment to test local deployment-charts changes remotely (macOS / iTerm2 only)
I have to test out local changes in a test application in Kubernetes (see the section above) often enough that I wanted to set up the development environment as quickly as possible. I'm not a tmux user; I tend to stick to iTerm2, which allows me to script terminal interactions in Python. Use Scripts > New Python Script to define a Python script interacting with your terminal.
I have defined the following script to set up the whole dev env, which consists of 2 keepinsync commands, an ssh session in which I run the helmfile commands, and an ssh session streaming the pod states.
#!/usr/bin/env python3.10
import iterm2

# This script was created with the "basic" environment which does not support adding dependencies
# with pip.

WMF_ROOT = "/Users/brouberol/wmf"
DEPLOYMENT_CHARTS = "deployment-charts"
DEPLOY_SERVER = "deployment.eqiad.wmnet"
CHART = "airflow"
INSTANCE = "airflow-test-k8s"

commands_by_panel_position = {
    "top_left": [
        f"cd {WMF_ROOT}/{DEPLOYMENT_CHARTS}",
        f"keepinsync ./charts/{CHART} {DEPLOY_SERVER}:~/{DEPLOYMENT_CHARTS}",
    ],
    "bottom_left": [
        f"cd {WMF_ROOT}/{DEPLOYMENT_CHARTS}",
        f"keepinsync ./helmfile.d/dse-k8s-services/{INSTANCE} {DEPLOY_SERVER}:~/{DEPLOYMENT_CHARTS}",
    ],
    "top_right": [
        f"ssh {DEPLOY_SERVER}",
        f"cd {DEPLOYMENT_CHARTS}/helmfile.d/dse-k8s-services/{INSTANCE}",
    ],
    "bottom_right": [
        f"ssh {DEPLOY_SERVER}",
        f"kube_env {INSTANCE}-deploy dse-k8s-eqiad",
        "kubectl get pod -w",
    ],
}


async def main(connection):
    app = await iterm2.async_get_app(connection)
    window = app.current_terminal_window
    panels = {}
    if window is not None:
        # Create a new tab
        tab = await window.async_create_tab()
        # Store a reference to the left pane (initial session)
        panels["top_left"] = tab.current_session
        # Create a right pane by splitting the left one vertically
        panels["top_right"] = await panels["top_left"].async_split_pane(vertical=True)
        # Split the right pane into top and bottom panes
        panels["bottom_right"] = await panels["top_right"].async_split_pane(
            vertical=False
        )
        # Split the left pane into top and bottom using the stored reference
        panels["bottom_left"] = await panels["top_left"].async_split_pane(
            vertical=False
        )
        # Execute commands
        for position, commands in commands_by_panel_position.items():
            for command in commands:
                await panels[position].async_send_text(f"{command}\n")
    else:
        print("No current window")


iterm2.run_until_complete(main)
I can then click on Scripts > deployment-charts-airflow-test-k8s.py to have everything set up for me.