Machine Learning/LiftWing/ML-Sandbox/Configuration

From Wikitech

Installation + Configuration script for ML-Sandbox.

Summary

This is a guide for installing the KServe stack locally using WMF tools and images. The install steps diverge from the official KServe quick_install script in order to run on WMF infrastructure. Upstream changes to the YAML configs were first published in the KServe chart's README in the deployment-charts repository. The config.yaml that we apply in production lives in deployment-charts/custom_deploy.d/istio/ml-serve.

Software pre-requisites

Before we set up our local cluster, we need to locally install: Minikube, kubectl, Helm, Istioctl, Minio, s3cmd.

Install on macOS

# Install current version of Istioctl
curl -L https://istio.io/downloadIstio | ISTIO_VERSION=1.15.7 TARGET_ARCH=arm64 sh -

# You can install remaining software using homebrew
brew install minikube
brew install kubectl 
brew install helm
brew install s3cmd
brew install minio/stable/mc

Install on Linux

Many of these packages are available in the WMF APT repository.

# Install Minikube
curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
sudo install minikube-linux-amd64 /usr/local/bin/minikube

# Install packages from our APT repository
sudo apt install helm
sudo apt install istioctl -y

To install the remaining software (kubectl, Minio, s3cmd), please follow the upstream documentation for each tool.
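As a hedged sketch (the exact install sources are up to you), the remaining tools can be fetched from their upstream release channels on Linux amd64; the kubectl version pin below is an assumption chosen to match the cluster version used later in this guide:

```shell
# Sketch: upstream installs for the remaining tools (Linux amd64).
# The kubectl version pin is an assumption; match it to your cluster version.
KUBECTL_VERSION="v1.23.14"
KUBECTL_URL="https://dl.k8s.io/release/${KUBECTL_VERSION}/bin/linux/amd64/kubectl"
echo "${KUBECTL_URL}"
# curl -LO "${KUBECTL_URL}" && sudo install kubectl /usr/local/bin/kubectl
# curl -LO https://dl.min.io/client/mc/release/linux-amd64/mc && sudo install mc /usr/local/bin/mc
# sudo apt install s3cmd
```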

Start Minikube cluster

Start the cluster matching our production Kubernetes version:

minikube start --kubernetes-version=v1.23.14
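A quick sanity check, assuming the cluster started successfully, is to confirm the node is up and reports the pinned version:

```shell
# Sanity check: the cluster node should be Ready and on the pinned version.
WANT_K8S="v1.23.14"
echo "expecting Kubernetes ${WANT_K8S}"
# minikube status
# kubectl get nodes -o wide   # the VERSION column should show v1.23.14
```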

Install Istio operator on the cluster

Istio namespace

Run the command below in your terminal to create the Istio namespace:

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: istio-system
  labels:
    istio-injection: disabled
EOF

Istio operator

To install the operator, first create the file `istio-minimal-operator.yaml` with the following manifest:

apiVersion: install.istio.io/v1beta1
kind: IstioOperator
spec:
  values:
    global:
      proxy:
        autoInject: disabled
      useMCP: false
      # The third-party-jwt is not enabled on all k8s.
      # See: https://istio.io/docs/ops/best-practices/security/#configure-third-party-service-account-tokens
      jwtPolicy: first-party-jwt

  meshConfig:
    accessLogFile: /dev/stdout

  addonComponents:
    pilot:
      enabled: true

  components:
    ingressGateways:
      - name: istio-ingressgateway
        enabled: true
      - name: cluster-local-gateway
        enabled: true
        label:
          istio: cluster-local-gateway
          app: cluster-local-gateway
        k8s:
          service:
            type: ClusterIP
            ports:
            - port: 15020
              targetPort: 15021
              name: status-port
            - port: 80
              name: http2
              targetPort: 8080
            - port: 443
              name: https
              targetPort: 8443

Next you can apply the manifest using istioctl:

istioctl manifest apply -f istio-minimal-operator.yaml -y
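Once the manifest is applied, you can verify against the live cluster that istiod and the two gateways defined above came up; this is a sketch of the checks:

```shell
# Expected workloads in istio-system after the operator manifest is applied.
for deploy in istiod istio-ingressgateway cluster-local-gateway; do
  echo "expect deployment: ${deploy}"
done
# kubectl get pods -n istio-system
# kubectl get svc cluster-local-gateway -n istio-system   # ClusterIP, ports 15020/80/443
```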

Clone deployment charts

To deploy Knative and KServe, we'll use the charts in our deployment-charts repository.

git clone "https://gerrit.wikimedia.org/r/operations/deployment-charts"

Install Calico NetworkPolicy CRDs

The Knative and KServe charts use NetworkPolicy CRDs from Calico, so make sure to install those CRDs first:

kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.1/manifests/crds.yaml
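A sketch of a spot-check that the Calico CRDs registered (run against the live cluster):

```shell
# Sketch: spot-check one of the Calico CRDs after applying the manifest.
CRD="networkpolicies.crd.projectcalico.org"
echo "checking for CRD ${CRD}"
# kubectl get crd "${CRD}"
```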

Deploy Knative

Available Images

To find the newest available Knative images, you can check the WMF Docker registry.
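As a sketch, assuming the registry exposes the standard Docker Registry HTTP API v2, you can list the available tags for an image such as the serving queue:

```shell
# Hypothetical tag listing via the Docker Registry v2 API.
REPO="knative-serving-queue"
TAGS_URL="https://docker-registry.wikimedia.org/v2/${REPO}/tags/list"
echo "${TAGS_URL}"
# curl -s "${TAGS_URL}"   # requires network access
```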

Create knative-serving namespace

First, let’s create a namespace for knative-serving:

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: knative-serving
  labels:
    serving.knative.dev/release: "v1.7.2"
EOF

Deploy Knative charts

First, let's deploy the CRDs:

helm install knative-serving-crds deployment-charts/charts/knative-serving-crds

Next, you can install the serving chart:

helm install knative-serving deployment-charts/charts/knative-serving

Next, we need to configure the deployment ConfigMap with our queue sidecar image and the registries that skip tag resolving:

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-deployment
  namespace: knative-serving
data:
  queueSidecarImage: docker-registry.wikimedia.org/knative-serving-queue:1.7.2-7
  registriesSkippingTagResolving: "kind.local,ko.local,dev.local,docker-registry.wikimedia.org,index.docker.io"
EOF
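A sketch of verifying the Knative control plane and the ConfigMap against the live cluster:

```shell
# Sketch: verify the Knative control plane picked up the ConfigMap.
NS="knative-serving"
echo "checking namespace ${NS}"
# kubectl get pods -n "${NS}"   # activator, autoscaler, controller, webhook, ...
# kubectl get configmap config-deployment -n "${NS}" -o yaml
```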

Deploy KServe

Images

Create kserve namespace

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Namespace
metadata:
  labels:
    control-plane: kserve-controller-manager
    controller-tools.k8s.io: "1.0"
    istio-injection: disabled
  name: kserve
EOF

Deploy Kserve charts

First, let's install the base kserve chart:

helm install kserve deployment-charts/charts/kserve

Next, we can install the kserve-inference chart:

helm install kserve-inference deployment-charts/charts/kserve-inference
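A sketch of verifying the KServe controller against the live cluster:

```shell
# Sketch: verify the KServe controller started.
NS="kserve"
echo "checking namespace ${NS}"
# kubectl get pods -n "${NS}"   # kserve-controller-manager should be Running
# kubectl get crd inferenceservices.serving.kserve.io
```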

Install self-signed certificates

We have everything needed to run KServe; however, we still need to deal with the TLS certificate. We will use the self-signed-ca script available in the KServe repo: https://raw.githubusercontent.com/kserve/kserve/v0.11.2/hack/self-signed-ca.sh

First, delete the existing secrets:

kubectl delete secret kserve-webhook-server-cert -n kserve
kubectl delete secret kserve-webhook-server-secret -n kserve

Now copy the self-signed script and execute it:

curl -sL https://raw.githubusercontent.com/kserve/kserve/v0.11.2/hack/self-signed-ca.sh > self-signed-ca.sh
chmod +x self-signed-ca.sh
./self-signed-ca.sh
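A sketch of confirming the script recreated the webhook certificate secret (run against the live cluster):

```shell
# Sketch: the script should have recreated the webhook certificate secret.
SECRET="kserve-webhook-server-cert"
echo "expect secret ${SECRET} in namespace kserve"
# kubectl get secret "${SECRET}" -n kserve
```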

Deploy kserve-test namespace

Now, we can create the namespace where we will deploy our services:

kubectl create namespace kserve-test

Deploy Minio

This is an optional step for using Minio for model storage in your development cluster. In production we use Thanos Swift to store our model binaries; however, we can use something more ad hoc for local development.

This will mostly follow the document here: https://github.com/kserve/website/blob/main/docs/modelserving/kafka/kafka.md

Create Minio Service

First we create a file called minio.yaml, with the following contents:

---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: minio
  name: minio
  namespace: kserve-test
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: minio
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: minio
    spec:
      containers:
      - args:
        - server
        - /data
        env:
        - name: MINIO_ACCESS_KEY
          value: minio
        - name: MINIO_SECRET_KEY
          value: minio123
        image: minio/minio:RELEASE.2020-10-18T21-54-12Z
        imagePullPolicy: IfNotPresent
        name: minio
        ports:
        - containerPort: 9000
          protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: minio
  name: minio-service
spec:
  ports:
    - port: 9000
      protocol: TCP
      targetPort: 9000
  selector:
    app: minio
  type: ClusterIP

Now, you can install the minio test instance to your cluster:

kubectl apply -f minio.yaml -n kserve-test
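A sketch of verifying the Minio pod and service against the live cluster:

```shell
# Sketch: verify the Minio deployment and its ClusterIP service.
SELECTOR="app=minio"
echo "checking pods with selector ${SELECTOR}"
# kubectl get pods -n kserve-test --selector="${SELECTOR}"
# kubectl get svc minio-service -n kserve-test
```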

Deploy Secrets to interact with Minio

Now we need to create an s3 secret for minio and attach it to a service account. Create a file `s3-secret.yaml` with the following contents:

apiVersion: v1
kind: Secret
metadata:
  name: storage-secret
  annotations:
     serving.kserve.io/s3-endpoint: minio-service.kserve-test:9000 # replace with your s3 endpoint
     serving.kserve.io/s3-usehttps: "0" # by default 1, for testing with minio you need to set to 0
     serving.kserve.io/s3-verifyssl: "0"
     serving.kserve.io/s3-region: us-east-1
type: Opaque
stringData:
  AWS_ACCESS_KEY_ID: minio
  AWS_SECRET_ACCESS_KEY: minio123
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: sa
secrets:
- name: storage-secret
---

and we can apply it as follows:

kubectl apply -f s3-secret.yaml -n kserve-test

Create storage bucket in Minio

First, we need to port-forward our Minio test app in a different terminal window:

# Run port forwarding command in a different terminal
kubectl port-forward $(kubectl get pod -n kserve-test --selector="app=minio" --output jsonpath='{.items[0].metadata.name}') 9000:9000 -n kserve-test

Now let's add our test instance and create a bucket for model storage:

mc alias set myminio http://127.0.0.1:9000 minio minio123
mc mb myminio/wmf-ml-models
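A sketch of confirming the alias and bucket exist (requires the port-forward above to be running):

```shell
# Sketch: confirm the bucket was created on the test instance.
BUCKET="myminio/wmf-ml-models"
echo "expect bucket ${BUCKET}"
# mc ls myminio          # should list wmf-ml-models/
# mc stat "${BUCKET}"
```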

Upload model to Minio bucket

Upload manually

You should be able to upload a model binary file as follows:

mc cp model.bin myminio/wmf-ml-models/

Upload via model-upload script

You can use the model_upload.sh script to handle model uploads to Minio. First, you need to create an s3cmd config file called ~/.s3cfg:

# Setup endpoint
host_base = 127.0.0.1:9000
host_bucket = 127.0.0.1:9000
bucket_location = us-east-1
use_https = False

# Setup access keys
access_key = minio
secret_key = minio123

# Enable S3 v4 signature APIs
signature_v2 = False

Now you can download the model_upload.sh script and use it on the ML-Sandbox:

curl -sL https://gitlab.wikimedia.org/accraze/ml-utils/-/raw/main/model_upload.sh > model_upload.sh
chmod +x model_upload.sh
./model_upload.sh model.bin articlequality enwiki wmf-ml-models ~/.s3cfg
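A sketch of listing the bucket to confirm the upload landed (uses the ~/.s3cfg created above):

```shell
# Sketch: recursively list the model bucket after the upload.
S3_PATH="s3://wmf-ml-models/"
echo "listing ${S3_PATH}"
# s3cmd -c ~/.s3cfg ls -r "${S3_PATH}"
```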

Deploy InferenceService

Finally, when you create an InferenceService, you can point it at the new Minio bucket (s3://wmf-ml-models); just make sure to set serviceAccountName "sa" on the predictor that has a storage URI.

Example Inference Service spec:

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: enwiki-goodfaith
  annotations:
    sidecar.istio.io/inject: "false"
spec:
  predictor:
    serviceAccountName: sa
    containers:
      - name: kfserving-container
        image: docker-registry.wikimedia.org/wikimedia/machinelearning-liftwing-inference-services-editquality:2021-07-28-204847-production
        env:
          # TODO: https://phabricator.wikimedia.org/T284091
          - name: STORAGE_URI
            value: "s3://wmf-ml-models/"
          - name: INFERENCE_NAME
            value: "enwiki-goodfaith"

Notes

Delete cluster

Sometimes you might need to destroy the cluster and rebuild. Here is a helpful command:

minikube delete --purge --all
minikube start --kubernetes-version=v1.23.14 --cpus 4 --memory 8192 --driver=docker --force