Machine Learning/LiftWing
Lift Wing
A scalable machine learning model serving infrastructure on Kubernetes using KServe.
- Phabricator MVP Task: T272917
Stack
| Software | Version |
|---|---|
| Kubernetes | v1.16.5 |
| Istio | v1.9.5 |
| Knative | v0.18.1 |
| KServe | v0.8.0 |
Istio
Istio is a service mesh in which we run our ML services. It is installed using the istioctl package, which has been added to the WMF APT repository (Debian buster). See packages; we are currently running Istio 1.9.5 (istioctl: 1.9.5-1).
Knative
We use Knative Serving to run serverless containers on Kubernetes using Istio. It also allows for various deployment strategies, such as canary, blue-green, and A/B tests.
Charts
Images
KServe
We use KServe for its custom InferenceService resource. It enables us to expose our ML models as asynchronous micro-services.
Charts
Images
Hosts
eqiad
- ml-serve100[1-4]
codfw
- ml-serve200[1-4]
- ml-staging200[12]
Components
Monitoring
- Grafana - KServe
- Grafana - Knative Serving
Serving
We host our Machine Learning models as Inference Services (isvcs), which are asynchronous micro-services that can transform raw feature data and make predictions. Each inference service has production images that are published in the WMF Docker Registry via the Deployment Pipeline. These images are then used for an isvc configuration in our ml-services helmfile in the operations/deployment-charts repo.
- Model Deployment Guide: Machine Learning/LiftWing/Deploy
- Inference Service Docs: Machine_Learning/LiftWing/Inference Services
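To make the isvc concept concrete, here is a minimal, illustrative sketch of a custom predictor written with the KServe Python SDK, in the spirit of a LiftWing inference service. The class name, model path, and response payload below are placeholders and not the actual production code; see the Inference Service docs above for the real implementations.

```python
import kserve

MODEL_PATH = "/mnt/models/model.bin"  # placeholder path; mounted by the storage-initializer


class ExampleModel(kserve.Model):
    def __init__(self, name: str):
        super().__init__(name)
        self.model = None
        self.ready = False

    def load(self):
        # A real isvc would deserialize the model binary (e.g. a revscoring
        # model) here; reading the raw bytes is only a stand-in.
        with open(MODEL_PATH, "rb") as f:
            self.model = f.read()
        self.ready = True

    def predict(self, request: dict) -> dict:
        # Transform the raw request (e.g. {"rev_id": ...}) into features
        # and return a prediction payload.
        rev_id = request.get("rev_id")
        return {"predictions": {"rev_id": rev_id, "score": None}}


if __name__ == "__main__":
    model = ExampleModel("enwiki-goodfaith")
    model.load()
    kserve.ModelServer().start([model])
```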
Storage
We store model binary files in Swift, an open-source, S3-compatible object store that is widely used across the WMF. The model files are downloaded by the storage-initializer (init container) when an Inference Service pod is created. The storage-initializer mounts the model binary in the pod at /mnt/models/, where it can be loaded by the predictor container.
- Model Upload info: Machine_Learning/LiftWing/Deploy#How_to_upload_a_model_to_Swift
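As a rough sketch of the S3-compatible workflow (not the official upload procedure, which is documented in the link above), a model binary could be pushed to Swift with boto3. The endpoint, bucket, key, and credentials below are placeholders.

```python
import boto3

# Placeholder endpoint and credentials: consult the model upload guide
# linked above for the real values and procedure.
s3 = boto3.client(
    "s3",
    endpoint_url="https://<swift-s3-endpoint>",
    aws_access_key_id="<ACCESS_KEY>",
    aws_secret_access_key="<SECRET_KEY>",
)

# The isvc's storage-initializer later downloads this object and mounts it
# at /mnt/models/ inside the predictor pod.
s3.upload_file(
    Filename="model.bin",
    Bucket="<models-bucket>",
    Key="goodfaith/enwiki/model.bin",
)
```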
Development
We are developing inference services with Docker and testing on the ML Sandbox using our own WMF KServe images & charts.
- KServe Guide: Machine Learning/LiftWing/KServe
- Production Image Development Guide: Machine Learning/LiftWing/Inference Services/Production Image Development
- ML-Sandbox Guide: Machine Learning/LiftWing/ML-Sandbox
We previously used multiple sandbox clusters running MiniKF.
Services
We are serving ML models as Inference Services, which are containerized applications. The code is currently hosted on Gerrit.
- Gerrit mono-repo: https://gerrit.wikimedia.org/r/plugins/gitiles/machinelearning/liftwing/inference-services
- Github mirror: https://github.com/wikimedia/machinelearning-liftwing-inference-services
Current Inference Services
- Revscoring models (migrated from ORES)
| Model Type | Model Name | Kubernetes Namespace | Docker Image |
|---|---|---|---|
| articlequality | enwiki-articlequality, euwiki-articlequality, fawiki-articlequality, frwiki-articlequality, glwiki-articlequality, nlwiki-articlequality, ptwiki-articlequality, ruwiki-articlequality, svwiki-articlequality, trwiki-articlequality, ukwiki-articlequality, wikidatawiki-itemquality | revscoring-articlequality | revscoring |
| draftquality | enwiki-draftquality, ptwiki-draftquality | revscoring-draftquality | |
| damaging | arwiki-damaging, bswiki-damaging, cawiki-damaging, cswiki-damaging, dewiki-damaging, enwiki-damaging, eswikibooks-damaging, eswiki-damaging, eswikiquote-damaging, etwiki-damaging, fawiki-damaging, fiwiki-damaging, frwiki-damaging, hewiki-damaging, hiwiki-damaging, huwiki-damaging, itwiki-damaging, jawiki-damaging, kowiki-damaging, lvwiki-damaging, nlwiki-damaging, nowiki-damaging, plwiki-damaging, ptwiki-damaging, rowiki-damaging, ruwiki-damaging, sqwiki-damaging, srwiki-damaging, svwiki-damaging, ukwiki-damaging, wikidatawiki-damaging, zhwiki-damaging | revscoring-editquality-damaging | |
| goodfaith | arwiki-goodfaith, bswiki-goodfaith, cawiki-goodfaith, cswiki-goodfaith, dewiki-goodfaith, enwiki-goodfaith, eswikibooks-goodfaith, eswiki-goodfaith, eswikiquote-goodfaith, etwiki-goodfaith, fawiki-goodfaith, fiwiki-goodfaith, frwiki-goodfaith, hewiki-goodfaith, hiwiki-goodfaith, huwiki-goodfaith, itwiki-goodfaith, jawiki-goodfaith, kowiki-goodfaith, lvwiki-goodfaith, nlwiki-goodfaith, nowiki-goodfaith, plwiki-goodfaith, ptwiki-goodfaith, rowiki-goodfaith, ruwiki-goodfaith, sqwiki-goodfaith, srwiki-goodfaith, svwiki-goodfaith, ukwiki-goodfaith, wikidatawiki-goodfaith, zhwiki-goodfaith | revscoring-editquality-goodfaith | |
| reverted | bnwiki-reverted, elwiki-reverted, enwiktionary-reverted, glwiki-reverted, hrwiki-reverted, idwiki-reverted, iswiki-reverted, tawiki-reverted, viwiki-reverted | revscoring-editquality-reverted | |
| articletopic | arwiki-articletopic, cswiki-articletopic, enwiki-articletopic, euwiki-articletopic, huwiki-articletopic, hywiki-articletopic, kowiki-articletopic, srwiki-articletopic, ukwiki-articletopic, viwiki-articletopic, wikidatawiki-itemtopic | revscoring-articletopic | |
| drafttopic | arwiki-drafttopic, cswiki-drafttopic, enwiki-drafttopic, euwiki-drafttopic, huwiki-drafttopic, hywiki-drafttopic, kowiki-drafttopic, srwiki-drafttopic, ukwiki-drafttopic, viwiki-drafttopic | revscoring-drafttopic | |
- Language agnostic models
| Model Name | Kubernetes Namespace | Docker Image | Model Card |
|---|---|---|---|
| outlink-topic-model | articletopic-outlink | outlink, outlink-transformer | Language_agnostic_link-based_article_topic_model_card |
| revert-risk-model | experimental | revertrisk | |
Usage
Internal endpoints
Once an InferenceService is deployed, it becomes available internally via the following endpoints:
- https://inference.svc.codfw.wmnet:30443/v1/models/{MODEL_NAME}:predict (codfw)
- https://inference.svc.eqiad.wmnet:30443/v1/models/{MODEL_NAME}:predict (eqiad)
- https://inference.discovery.wmnet:30443/v1/models/{MODEL_NAME}:predict (both)
- https://inference-staging.svc.codfw.wmnet:30443/v1/models/{MODEL_NAME}:predict (staging)
Each request must also set the HTTP Host header: {MODEL_NAME}.{KUBERNETES_NAMESPACE}.wikimedia.org
You can find {MODEL_NAME} and {KUBERNETES_NAMESPACE} in the tables in the previous section.
Note that the revscoring model group has a separate model for each supported wiki, so {MODEL_NAME} combines a wiki database name and a model type, i.e. {wiki}-{model_type}. For example: enwiki-articlequality, arwiki-damaging, bnwiki-reverted, eswikibooks-goodfaith.
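As a small illustration of this convention (the helper below is just an example, not part of LiftWing), the discovery URL and Host header for a model can be assembled like this:

```python
# Build the discovery URL and Host header for a given model and its
# Kubernetes namespace (both taken from the tables above).
def lift_wing_request(model_name: str, namespace: str):
    url = f"https://inference.discovery.wmnet:30443/v1/models/{model_name}:predict"
    headers = {"Host": f"{model_name}.{namespace}.wikimedia.org"}
    return url, headers

# e.g. the enwiki-goodfaith model in the revscoring-editquality-goodfaith namespace:
url, headers = lift_wing_request("enwiki-goodfaith", "revscoring-editquality-goodfaith")
```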
via Curl
To query the enwiki-goodfaith model via curl:
aikochou@stat1004:~$ cat input.json
{ "rev_id": 1083325118 }
aikochou@stat1004:~$ curl "https://inference.discovery.wmnet:30443/v1/models/enwiki-goodfaith:predict" -X POST -d @input.json -i -H "Host: enwiki-goodfaith.revscoring-editquality-goodfaith.wikimedia.org" --http1.1
HTTP/1.1 200 OK
content-length: 209
content-type: application/json; charset=UTF-8
date: Mon, 31 Oct 2022 16:51:54 GMT
server: istio-envoy
x-envoy-upstream-service-time: 361
{"enwiki": {"models": {"goodfaith": {"version": "0.5.1"}}, "scores": {"1083325118": {"goodfaith": {"score": {"prediction": true, "probability": {"false": 0.033641298577500645, "true": 0.9663587014224994}}}}}}}
If you get the error curl: (60) SSL certificate problem: unable to get local issuer certificate, use -k to ignore it (the equivalent of verify=False in Python requests), or pass --cacert /etc/ssl/certs/wmf-ca-certificates.crt to supply the CA certificate.
If you get the error curl: (56) Received HTTP code 403 from proxy after CONNECT, run unset https_proxy to clear your HTTP proxy environment variable.
via Python
To query the outlink-topic-model via Python:
import os
import json
import requests

# Trust the WMF CA so the internal TLS certificate validates.
os.environ['REQUESTS_CA_BUNDLE'] = "/etc/ssl/certs/wmf-ca-certificates.crt"

inference_url = 'https://inference.discovery.wmnet:30443/v1/models/outlink-topic-model:predict'

# The Host header ({MODEL_NAME}.{KUBERNETES_NAMESPACE}.wikimedia.org) routes
# the request to the right InferenceService; the body is JSON.
headers = {
    'Host': 'outlink-topic-model.articletopic-outlink.wikimedia.org',
    'Content-Type': 'application/json',
}
data = {"lang": "en", "page_title": "Wings of Fire (novel series)"}
response = requests.post(inference_url, headers=headers, data=json.dumps(data))
print(response.text)
If you get an error message:
requests.exceptions.ProxyError: HTTPSConnectionPool(host='inference.discovery.wmnet', port=30443): Max retries exceeded with url: /v1/models/outlink-topic-model:predict (Caused by ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 403 Forbidden')))
Run unset https_proxy and run the script again.
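Alternatively, a sketch of a script that sidesteps the proxy environment entirely is to use a requests.Session with trust_env disabled; note that trust_env=False also makes requests ignore REQUESTS_CA_BUNDLE, so the CA bundle is passed explicitly:

```python
import json
import requests

inference_url = "https://inference.discovery.wmnet:30443/v1/models/outlink-topic-model:predict"
headers = {
    "Host": "outlink-topic-model.articletopic-outlink.wikimedia.org",
    "Content-Type": "application/json",
}
data = {"lang": "en", "page_title": "Wings of Fire (novel series)"}

# trust_env=False tells requests to ignore proxy environment variables
# (and REQUESTS_CA_BUNDLE, hence the explicit verify= below).
session = requests.Session()
session.trust_env = False
response = session.post(
    inference_url,
    headers=headers,
    data=json.dumps(data),
    verify="/etc/ssl/certs/wmf-ca-certificates.crt",
)
print(response.text)
```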