Machine Learning/LiftWing/Inference Services/Production Image Development

Summary

This is a guide to developing model-servers and module images using WMF release infrastructure. For example purposes, we will be creating a production image for the draftquality model-server (a.k.a. predictor).

Blubber

We create Dockerfiles using an abstraction for container build configurations called Blubber. If you want to publish a production-ready image to the WMF Docker Registry, you will need to develop a Blubberfile that can be used in the Wikimedia Deployment Pipeline.

Developing a Blubberfile

Let’s start our blubber.yaml file with some basic declarations. We use version v4 of the Blubber spec, define the base image to build from (buster in this case), and set the image to run insecurely so that it can write to the filesystem/cache during testing.

version: v4
base: docker-registry.wikimedia.org/buster:20220109
runs:
  insecurely: true

Next we define our working directory within the generated Dockerfile.

lives:
  in: /srv/draftquality

Variants

Blubber uses the concept of variants to create multi-stage builds. You can have any number of variants, although it is a good idea to have at least a test variant and also a production variant. You can also create a build variant that shares the common configurations that are needed by other variants.

Build

Let’s start with our build variant. We need to declare the Python version we want to run and also point to where the requirement files are located.

variants:
  build:
    python:
      version: python3.7
      requirements: [revscoring/draftquality/model-server/requirements.txt, python/requirements.txt]
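
Blubber takes these requirement files from the build context, so a mistyped path only surfaces once the image build runs. A minimal sketch of a pre-flight check, assuming it is run from the repository root; check_requirements is a hypothetical helper and not part of Blubber:

```shell
# Hypothetical helper, not part of Blubber: report requirement files that are
# missing relative to the current directory (assumed to be the repo root).
check_requirements() {
  rc=0
  for req in "$@"; do
    if [ ! -f "$req" ]; then
      echo "missing requirements file: $req" >&2
      rc=1
    fi
  done
  return "$rc"
}
```

For this variant it would be called as: check_requirements revscoring/draftquality/model-server/requirements.txt python/requirements.txt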

Next, we need to specify which packages we wish to install from APT for the image.

    apt:
      packages:
        - python3-pip
        - python3-dev
        - python3-setuptools
        - g++
        - git
        - gfortran
        - liblapack-dev
        - libopenblas-dev
        - libenchant1c2a

We will also need to run some commands to finish installing all the required assets, which we will do using the builder command.

    builder:
      # FIXME: path hack - see: https://phabricator.wikimedia.org/T267685
      command: ["PYTHONPATH=/opt/lib/python/site-packages", "python3.7", "-m",
        "nltk.downloader", "omw", "sentiwordnet", "stopwords", "wordnet"]

Production

Now let’s create a production variant that relies on packages installed by the build variant. We use the copies key to copy over the model-server source code and the files in the shared Python directory from our local filesystem, together with the nltk data (stopwords) and all other dependencies installed via pip in the build variant.

  production:
    copies:
      - from: local
        source: revscoring/draftquality/model-server
        destination: model-server
      - from: local
        source: python/*.py
        destination: model-server/
      - from: build
        source: /home/somebody/nltk_data
        destination: /home/somebody/nltk_data
      - from: build
        source: /opt/lib/python/site-packages
        destination: /opt/lib/python/site-packages
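
The nltk data is only worth copying if the build variant actually fetched every corpus. A rough sketch of a check that could be run inside a container built from the build variant, assuming the downloader's usual layout of corpora/NAME directories or corpora/NAME.zip archives under the data directory; missing_corpora is a hypothetical helper:

```shell
# Hypothetical helper: print each named corpus that is absent from an
# nltk_data directory. Assumes the layout corpora/<name>/ or corpora/<name>.zip.
missing_corpora() {
  data_dir=$1
  shift
  for name in "$@"; do
    if [ ! -d "$data_dir/corpora/$name" ] && [ ! -f "$data_dir/corpora/$name.zip" ]; then
      echo "$name"
    fi
  done
}
```

Called as missing_corpora /home/somebody/nltk_data omw sentiwordnet stopwords wordnet, it prints nothing when all four corpora are present.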

We define the Python requirements for the production image and the packages we wish to install from APT for the image.

    python:
      version: python3.7
      use-system-flag: false
    apt:
      packages:
        - python3
        - liblapack3
        - libopenblas-base
        - libenchant1c2a
        - aspell-ar
        - aspell-bn
        - aspell-el
        - hunspell-id
        - aspell-is
        - aspell-pl
        - aspell-ro
        - aspell-sv
        - aspell-ta
        - aspell-uk
        - myspell-cs
        - myspell-de-at
        - myspell-de-ch
        - myspell-de-de
        - myspell-es
        - myspell-et
        - myspell-fa
        - myspell-fr
        - myspell-he
        - myspell-hr
        - myspell-hu
        - myspell-lv
        - myspell-nb
        - myspell-nl
        - myspell-pt-pt
        - myspell-pt-br
        - myspell-ru
        - hunspell-bs
        - hunspell-ca
        - hunspell-en-au
        - hunspell-en-us
        - hunspell-en-gb
        - hunspell-eu
        - hunspell-gl
        - hunspell-it
        - hunspell-hi
        - hunspell-sr
        - hunspell-vi
        - voikko-fi
        - wmf-certificates
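
Long package lists like this one are easy to get wrong when entries are added by hand. A small sketch for spotting repeated entries, assuming a single list is piped in as plain "- name" lines; duplicate_items is a hypothetical helper, and it should be fed one list at a time because the same package may legitimately appear in several variants:

```shell
# Hypothetical helper: read YAML list items ("- name") on stdin and print
# every item that occurs more than once.
duplicate_items() {
  sed -n -e 's/[[:space:]]*$//' -e 's/^[[:space:]]*-[[:space:]]*//p' | sort | uniq -d
}
```

An empty result means that no package in the list is repeated.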

Lastly, we define the entrypoint to our application when the image is run as a container.

    entrypoint: ["python3", "model-server/model.py"]

Test

The test variant is similar to the other two, but it is much more lightweight. This variant simply runs some tests via tox (which is configured in the app's tox.ini).

Let’s start by defining the packages we need from APT:

  test:
    apt:
      packages:
        - python3-pip
        - python3-setuptools

Next, let's copy over the source files from our local filesystem:

    copies:
      - from: local
        source: revscoring/draftquality/model-server
        destination: model-server

Let’s define our Python requirements for testing:

    python:
      version: python3.7
      requirements: [revscoring/draftquality/model-server/requirements-test.txt]
      use-system-flag: false

Finally, let's invoke tox to run our tests when the image is run as a container:

    entrypoint: ["tox", "-c", "model-server/tox.ini"]

See the complete blubberfile for the draftquality model-server.
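
Before generating anything, it can be useful to confirm which variants the finished file declares. A rough sketch using awk, assuming variant names are indented by exactly two spaces under a top-level variants: key, as with build: above; it is a plain text match rather than a real YAML parser, and list_variants is a hypothetical name:

```shell
# Hypothetical helper: list the variant names declared in a Blubberfile.
# Assumes two-space indentation directly under a top-level "variants:" key.
list_variants() {
  awk '
    /^variants:/ { in_variants = 1; next }
    /^[^ \t#]/   { in_variants = 0 }   # any other top-level key ends the section
    in_variants && /^  [A-Za-z0-9_-]+:/ {
      sub(/^  /, "")
      sub(/:.*/, "")
      print
    }
  ' "$1"
}
```

For the file built up in this guide, list_variants .pipeline/draftquality/blubber.yaml should print build, production and test, one per line.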


Testing your Blubberfile

Now that we have created a Blubberfile, let's test it using the Blubber service, to ensure that our config generates a working Dockerfile.

Use the following Bash function to call the Blubber service. This ensures that you are using the same version that is deployed for use by Wikimedia CI.

blubber() {
  if [ $# -lt 2 ]; then
    echo 'Usage: blubber config.yaml variant'
    return 1
  fi
  curl -s -H 'content-type: application/yaml' --data-binary @"$1" https://blubberoid.wikimedia.org/v1/"$2"
}

Then we can use the following command to create a Dockerfile based on the production variant. We pipe the blubber response into a file called Dockerfile in our current working directory.

blubber .pipeline/draftquality/blubber.yaml production > Dockerfile
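
Because curl -s prints the response body even when the request fails, a failed request can leave an error message in the Dockerfile instead of build instructions. A sketch of a more defensive variant, assuming the same Blubberoid endpoint as the function above; blubber_to_file is a hypothetical name, and the check for a FROM line is a heuristic rather than real validation:

```shell
# Hypothetical wrapper: write the generated Dockerfile only if the request
# succeeded and the response looks like a Dockerfile.
blubber_to_file() {
  if [ $# -lt 3 ]; then
    echo 'Usage: blubber_to_file config.yaml variant output' >&2
    return 1
  fi
  _blubber_tmp=$(mktemp) || return 1
  # -f makes curl exit non-zero on HTTP errors instead of printing the error body.
  if curl -sf -H 'content-type: application/yaml' --data-binary @"$1" \
       "https://blubberoid.wikimedia.org/v1/$2" > "$_blubber_tmp" \
     && grep -q '^FROM ' "$_blubber_tmp"; then
    mv "$_blubber_tmp" "$3"
  else
    rm -f "$_blubber_tmp"
    echo "blubber_to_file: no Dockerfile generated for variant '$2'" >&2
    return 1
  fi
}
```

It would be called as blubber_to_file .pipeline/draftquality/blubber.yaml production Dockerfile, and it leaves any existing Dockerfile untouched when the request fails.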

Next, we can build our image locally with the following command:

cat Dockerfile | docker build -t SOME-DOCKER-TAG-THAT-YOU-LIKE -f - .

Here, we pipe our newly-generated Dockerfile into the context of our Docker build, tag it and use the current directory as the build context (as specified by the .).

Alternatively, you can pipe the output of the blubber command directly into docker build, as follows:

blubber .pipeline/draftquality/blubber.yaml production | docker build -t SOME-DOCKER-TAG-THAT-YOU-LIKE -f - .

After the build process completes, you should now have a production image that reflects the configuration defined in our Blubberfile.
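
One quick way to confirm that the image reflects the Blubberfile is to compare its entrypoint with the one declared for the production variant. A minimal sketch using docker image inspect, assuming the entrypoint is stored in exec form and printed as compact JSON; check_entrypoint is a hypothetical helper:

```shell
# Hypothetical helper: compare an image's entrypoint (as JSON) with an expected value.
check_entrypoint() {
  actual=$(docker image inspect --format '{{json .Config.Entrypoint}}' "$1") || return 1
  if [ "$actual" != "$2" ]; then
    echo "unexpected entrypoint: $actual" >&2
    return 1
  fi
}
```

For the image built above, this would be check_entrypoint SOME-DOCKER-TAG-THAT-YOU-LIKE '["python3","model-server/model.py"]'.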

To test the image we build, please check the following guides:

Pipeline Configuration

Once you are happy with the image being generated from your Blubberfile, it’s time to configure a pipeline to build the image, run tests and publish the production-ready images to the WMF Docker Registry.

In our inference-services repo, all pipelines are configured via PipelineLib in the .pipeline/config.yaml file. We want to configure two pipelines for the above image: one to build our images and run the tests, and one to publish our production image after the code change is merged.

  draftquality:
    stages:
      - name: run-test
        build: test
        run: true
      - name: production
        build: production

  draftquality-publish:
    blubberfile: draftquality/blubber.yaml
    stages:
      - name: publish
        build: production
        publish:
          image:
            name: '${setup.project}-draftquality'
            tags: [stable]