Debian packaging with dgit and CI

From Wikitech

Some notes on Debian packaging for WMF with dgit. This solves a slightly different problem than "Debian packaging for upload to Debian", since we are interested in i) source code in git and ii) binary packages for installation, and do not care about source packages at all - our preferred form for modification is a git repository instead.

You don't need to write your own CI file for this; you can just tell gitlab to use builddebs.yml@repos/sre/wmf-debci as CI/CD configuration directly.

This document presumes an existing package; if you are starting to package software from scratch, there is a separate tutorial that describes how to do that.

Executive Summary / TL;DR

Suppose you want to update the package foo from the Debian bookworm release and build packages for bullseye for WMF. The process looks like this: First, get the source code, and make your changes:

 dgit clone foo bookworm
 cd foo
 git checkout -b bookworm-wikimedia
 # make the changes you want, then git commit
 gbp dch --since=dgit/bookworm --ignore-branch -R --distribution=bullseye-wikimedia --commit -N NEWVERSION

Then create a suitably-named empty repository on gitlab, and in that new repository, under Settings-CI/CD-General pipelines, set the CI/CD Configuration file to builddebs.yml@repos/sre/wmf-debci. Add the new repository as a remote and push your new branch:

 git remote add gitlab git@gitlab.wikimedia.org:/repos/YOURREPONAME
 git push -u gitlab bookworm-wikimedia

You should now have a CI pipeline that will build your new package, and make the resulting artifacts available for you to inspect and download.

If you want to also have the CI track updates to the Debian suite your package is based on, then you should also push the dgit/bookworm branch (i.e. git push -u gitlab dgit/bookworm), make a project access token and store it in a masked CI variable called DGIT_CI_TOKEN. Then you can schedule a pipeline run against your dgit/bookworm branch and it will automatically check for updates of your package in Debian, and attempt to merge your changes and build new packages as appropriate.

Rationale / Further Details

At WMF, we store our source code in git, and then install binary packages on our systems (via the APT repository); so we don't need source packages. This means we can avoid a lot of unnecessary complexity - e.g. quilt, which essentially embeds an ersatz revision control system into a source package. We can instead use dgit to fetch us the source code for a package and present it to us in a consistent way where the code we see in our git checkout is exactly the code that we are going to build.

Always use dgit clone to fetch package source code

It's important to use dgit clone to fetch the package source code, rather than e.g. looking up the Debian maintainer's source repository and cloning that instead. This is because there are a wide variety of ways that Debian maintainers use to present changes to upstream source code, and sometimes this means that building the tree you clone will result in building unpatched code - you could end up building code without security fixes applied. Using dgit clone means you always end up with a working directory that matches the code you'll actually build, and you can then just treat it like a regular git checkout.

Fetching Source Code

This is as simple as dgit clone packagename suite. Here suite is the codename of the Debian release you want, e.g. bookworm.

If you don't specify a suite, you'll get unstable. If you want a package from Ubuntu, add -d ubuntu to the command-line and specify the Ubuntu release name as the suite (e.g. jammy). If you're fetching a Debian package from the current or previous stable releases, it's worth adding ,-security to the suite (e.g. bookworm,-security), as that means you'll get any security updates that have been made available for that release.

Branch Naming

You should do all your work on a suitably-named branch off the dgit/suite branch that dgit clone created. Your branch should be called suite-wikimedia (or suite-wikimedia-foo if you might want more than one branch tracking the same Debian release), where suite is the Debian suite you want to track, not the suite you want to build for.

If you just want the CI to build your package, then you need only push the suite-wikimedia branch; if you want the CI to also attempt to automatically merge in future changes from Debian, then you need to push the dgit/suite branch for the Debian suite(s) you want to track.
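
The naming convention can be checked mechanically. The following is a minimal sketch, not part of the CI: the function name tracked_suite is hypothetical, and it accepts only the two documented forms (suite-wikimedia and suite-wikimedia-foo), printing the Debian suite that the branch tracks.

```shell
# Hypothetical helper: print the Debian suite a packaging branch tracks,
# or fail if the name does not follow the suite-wikimedia convention.
tracked_suite() (
  branch=$1
  case "$branch" in
    -wikimedia*)
      exit 1 ;;                          # no suite name before the suffix
    *-wikimedia|*-wikimedia-?*)
      echo "${branch%%-wikimedia*}" ;;   # strip "-wikimedia" and anything after it
    *)
      exit 1 ;;                          # not a packaging branch
  esac
)

# The dgit branch to push if you also want CI to track Debian updates:
echo "dgit/$(tracked_suite bookworm-wikimedia-foo)"
```

For bookworm-wikimedia-foo this prints dgit/bookworm, i.e. the branch named after the suite being tracked rather than the suite being built for.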

Making Changes

Treat this like a regular git checkout - make changes, git add, git commit as usual. There's no need to worry about quilt or dpkg-source --commit. Leave the contents of debian/patches alone.

If you want to apply an upstream fix (i.e. effectively to backport it), then you can add upstream's remote and cherry-pick if you like (or apply a patch in the usual manner); it's useful to say git cherry-pick -x in this case, as that will show where the patch came from for future reference.

Always Commit before attempting a build

Running a build attempt can modify your working tree, so always make sure you've committed your changes before attempting a build (which means you can use git clean -xdf and git reset --hard to restore your working tree afterwards). But you can just get the CI system to do the building for you :)

Making a Changelog Entry

Choosing a Version Number, backporting

Append +wmf1 to the version if building from the same distribution (making our version higher) or ~wmf1 if building from the next distribution (making our version lower).

The version number in debian/changelog determines the version number of the binaries you build. The version number wants to be higher than the version Debian ships in the distribution you want to install on, but lower than the version in the next distribution. For example, swift in Debian bookworm is version 2.30.0-4, and in Debian trixie is version 2.31.1-3. So if we're building a local package of swift for bookworm-wikimedia, then it wants to be a version higher than 2.30.0-4 and lower than 2.31.1-3. Achieve that by appending +wmf1 to the version number if building a package from the same release and appending ~wmf1 to the version if building a package from the next release. Then increment that integer when you make the next WMF-specific build.

Continuing our example, if we built a swift package for bookworm-wikimedia based on Debian's package from bookworm, we'd make version 2.30.0-4+wmf1, and if building a package based on Debian's version in trixie, we'd make version 2.31.1-3~wmf1. You can use dpkg --compare-versions to check that your version number does what you want, e.g.

 matthew@tsk:~$ dpkg --compare-versions 2.30.0-4+wmf1 gt 2.30.0-4 ; echo $?
 0

Here we've checked that our proposed version number 2.30.0-4+wmf1 is higher than the distribution version 2.30.0-4.

If you want to backport the same version to multiple distributions, it's usual to embed the Debian release number (e.g. 12 for bookworm) into the version number to disambiguate them - so we'd have 2.31.1-3~wmf12+1 for bookworm and 2.31.1-3~wmf11+1 for bullseye. For an example, look at a changelog that shows a published version and its backport annotation on the relevant branch.
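
The rules above can be written down as a small helper. This is a sketch under assumptions: the function name wmf_version and its argument order are hypothetical, and the release-number form is only shown in this document with the ~ separator, so treat its use with "same" as untested territory.

```shell
# Hypothetical helper: build a WMF version string from a Debian version.
# Usage: wmf_version DEBIAN_VERSION same|next BUILD [DEBIAN_RELEASE_NUMBER]
wmf_version() (
  debver=$1 origin=$2 build=$3 release=${4:-}
  case "$origin" in
    same) sep='+' ;;   # based on the release we build for: must sort higher
    next) sep='~' ;;   # based on the next release: must sort lower
    *) echo "origin must be 'same' or 'next'" >&2; exit 2 ;;
  esac
  if [ -n "$release" ]; then
    # Backporting one version to several releases: embed the release number
    echo "${debver}${sep}wmf${release}+${build}"
  else
    echo "${debver}${sep}wmf${build}"
  fi
)

wmf_version 2.30.0-4 same 1       # 2.30.0-4+wmf1
wmf_version 2.31.1-3 next 1       # 2.31.1-3~wmf1
wmf_version 2.31.1-3 next 1 12    # 2.31.1-3~wmf12+1

# Optional sanity check, only where dpkg is installed
if command -v dpkg >/dev/null 2>&1; then
  dpkg --compare-versions "$(wmf_version 2.31.1-3 next 1)" lt 2.31.1-3 \
    && echo "backport sorts below the trixie version"
fi
```

Incrementing BUILD for each subsequent WMF-specific build keeps later builds sorting above earlier ones.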

Updating the Changelog

If you've made commits with helpful commit messages, then you can use gbp dch to create a changelog entry for you:

 gbp dch --since=dgit/bookworm --ignore-branch -R --distribution=bookworm-wikimedia --commit -N NEWVERSION

Replace NEWVERSION with the correct version number determined as above. The arguments to this command work as follows: --since tells gbp dch which commits to include when making the changelog entry (here we specify the branch tip corresponding to the version we cloned from Debian); --distribution states which distribution to put in the changelog entry (this must be the WMF suite you're building for - the CI parses the changelog to determine which image to use for the build); -N specifies the version number (it will guess wrongly otherwise); --ignore-branch -R --commit tell gbp dch to ignore the branch layout, make a release entry (rather than an unreleased snapshot), and commit the resulting changelog. By default gbp dch will spawn an editor window, so you can tweak the new changelog entry before it gets committed.

If you just want a new blank changelog entry to edit, then you can instead use

 dch -e -D bookworm-wikimedia -bv NEWVERSION

This will set up a new changelog entry for you (but with no changes noted therein) to edit and then commit.

Alternatively, you can just edit (and then commit) debian/changelog by hand to make a suitable entry - the elpa-dpkg-dev-el package contains a helpful Emacs mode for editing Debian changelogs. If doing this, be careful to get the format correct as the requirements are stricter than for changelogs in general.

Create Repository and set up CI for package builds

Make a new repository on WMF gitlab; it needs to be under repos/ in order to have access to CI runners. There will in due course be Policy on where, but for now go with something sensible (e.g. your team may already have a namespace set up). Make an empty repository, and gitlab will tell you the remote to use, which you can add to your checkout thus:

 git remote add gitlab git@gitlab.wikimedia.org:PATH/TO/REPO

Here the new remote is called gitlab, but you can name it whatever you like (I prefer to keep origin for upstream's code); dgit will have created a remote called dgit which you can use to fetch other versions from Debian, and maybe vcs-git for the maintainer's repository (treat with caution, for the same reasons you should use dgit clone rather than cloning the maintainer's repo directly).

Before you first push, set up CI (otherwise, you'll need to push a new commit to trigger any CI to run). Do this by hovering over "Settings" and clicking "CI/CD" from the revealed menu, then clicking "Expand" next to "General pipelines". Type builddebs.yml@repos/sre/wmf-debci into the "CI/CD configuration file" box, then click "Save changes". This uses builddebs.yml from the repos/sre/wmf-debci repo, which sets up jobs that build the tip of branches named suite-wikimedia or suite-wikimedia-*, using the image from the WMF repo that corresponds to the distribution specified in the most recent debian/changelog entry with -wikimedia stripped off (so specify bookworm-wikimedia in the changelog if you want to build on bookworm).

Set up CI to track updates to Debian

Debian sometimes updates packages in its stable suites, typically security fixes or minimally-invasive fixes. You might well want any packages you're deploying to production to contain those fixes :-) With dgit this is reasonably easy - you can just do dgit pull on the relevant dgit/* branch and then merge those changes into your packaging branch like any other branch merge operation. The wrinkle is typically debian/changelog, which of necessity both Debian and WMF will have updated. There's a special program in the dpkg-dev package called dpkg-mergechangelogs that is designed to help with this - it understands the format (and Debian version numbering) so can typically merge debian/changelog for you, assuming you picked good version numbers. dgit clone typically sets this up for you, but you can check by running git config --get merge.dpkg-mergechangelogs.name (if that returns nothing, you don't have the merge driver installed). You can run dgit setup-mergechangelogs to set this up in a repository, or refer to dpkg-mergechangelogs(1) if you want to do it yourself.

Obviously, it would be better to do this automatically! builddebs.yml can do so for you, with a little bit of initial setup. Firstly, make sure you push your dgit/* branch to gitlab; in our example above where we cloned a package from bookworm, this would be:

 git push -u gitlab dgit/bookworm

There is CI that will update that branch when the package is updated in Debian (and push the result), then attempt to merge the changes into any packaging branches that track that suite (e.g. in this case branches called bookworm-wikimedia or bookworm-wikimedia-*), make a new changelog entry, and commit and push the result. That then triggers the usual build process for the packaging branch, resulting in new binaries.

The access token that CI runs with by default (CI_JOB_TOKEN) is read-only, though (there is upstream discussion about making it possible to change that), so you need to create a project access token before these CI jobs can run. To do this, hover over "Settings", and then click on "Access Tokens" in the menu that appears. Give the token a sensible name (e.g. "dgit CI token") and expiry date (the maximum is 12 months), give it the "Maintainer" role, select the "write repository" scope, and click "Create project access token". Take a note of the generated token (gitlab won't show it to you again), then select "CI/CD" from the Settings menu and click "Expand" next to "Variables". Click "Add variable", and a popup appears. In Key, put "DGIT_CI_TOKEN" (this is the value the CI looks for to see if it should try and run the update jobs), paste the token into "Value", deselect the "Protect variable" and "Expand variable reference" flags, select the "Mask variable" flag (you don't want the token value appearing in CI output), and click "Add Variable".

At this point, you can run this CI by hand (click on CI/CD in the side menu, click "Run pipeline" on the top right, select one of your dgit branches, and click "Run pipeline"), and if you update a dgit/* branch yourself (e.g. via dgit pull and git push) then the CI will automatically attempt to merge that into any tracking branches.

But we can automate checking Debian for updates too :) Do this by scheduling a pipeline run: click on "CI/CD" in the left menu, and then select "Schedules" from the expanded left menu. Click "New schedule", give the schedule a sensible name and interval pattern (daily ought to be sufficient) & timezone, and then select a dgit/* branch under "Target branch or tag", and click "Save pipeline schedule". If you have multiple dgit branches you want to track, you need only set up one schedule - the relevant CI job checks each extant dgit/* branch in turn.

Push your code!

Then push your branch - the first branch you push to gitlab becomes the default branch:

 git push -u gitlab bookworm-wikimedia

The CI you set up in the previous step will now attempt to build your package for you, and if it succeeds, you'll have gitlab artifacts containing the output of your build. If you pushed before setting up CI (or are coming to CI setup later on), make another commit on the bookworm-wikimedia branch to kick off a pipeline run - an edit to debian/changelog for example.

Setting Build Options

Many Debian packages support the DEB_BUILD_OPTIONS environment variable; you can set this like any other CI variable. For example, setting it to nocheck will skip the build-time test suite (which might be useful if it fails in our CI environment due to e.g. needing IPv6 networking available).

If you require any packages from the relevant -backports suite to build your package, then set the CI variable USEBACKPORTS to any non-empty value (something like "yes" or "True" is probably most clear; the build job only tests that the variable is non-empty); note that this means that all of the Build-Dependencies will be taken from -backports where available, not simply those necessary to match a versioned dependency.
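
To make the effect of USEBACKPORTS concrete, here is a sketch that prints the apt preferences stanza the build_ci_deb job (shown later on this page) writes, plus a loop illustrating which values switch it on. The function name backports_pin is hypothetical; the stanza text is copied from the job definition.

```shell
# Hypothetical helper: print the apt pin that build_ci_deb writes to
# /etc/apt/preferences.d/SUITE-backports.pref when USEBACKPORTS is set.
backports_pin() (
  suite=$1
  printf 'Package: *\nPin: release a=%s-backports\nPin-Priority: 500\n' "$suite"
)

backports_pin bookworm

# The job uses a plain [ "$USEBACKPORTS" ] test, so any non-empty string
# (including "0" or "no") enables backports; only an empty value disables it.
for value in yes True 0 ""; do
  if [ "$value" ]; then
    echo "USEBACKPORTS='$value' -> backports enabled"
  else
    echo "USEBACKPORTS='$value' -> backports disabled"
  fi
done
```

Pinning every package from -backports at priority 500 puts backports on an equal footing with the base suite, which is why all available Build-Dependencies end up coming from there.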

What the CI does

The CI is defined in builddebs.yml and should be reasonably well commented.

Updates from Debian

The dgit_pull job updates dgit/* branches if the corresponding suite in Debian has been updated, and then the dgit_merge job attempts to merge those changes into appropriate tracking branches. Both jobs set GIT_STRATEGY: clone to ensure a properly clean repo, and both require DGIT_CI_TOKEN to be defined (this should be a project access token so the CI jobs can push changes).

The dgit_pull job additionally does not run on push events (since it would be confusing to have updated the dgit branch and pushed it yourself only to have the CI then try to update the dgit branch further from Debian). It performs a bunch of setup (making sure the dpkg-mergechangelogs merge driver is available, arranging to have a sensible commit name & email, and to be able to push), and then runs roughly the following:

for branch in $(git for-each-ref --format='%(refname:lstrip=3)' refs/remotes/origin); do
  if [[ "$branch" =~ dgit/.+ ]] ; then
    git checkout "$branch" ;
    oldsha=$(git show -s --format=%H) ;
    dgit pull ;
    newsha=$(git show -s --format=%H) ;
    if [ "$newsha" != "$oldsha" ] ; then git push origin "$branch" ; fi ;
  fi ;
done

This checks out every branch named dgit/* and checks to see if dgit pull makes any changes. If so, it pushes those changes.

The dgit_merge job runs whenever a dgit/* branch is updated. Eliding the uninteresting setup, it runs roughly thus:

git checkout "$CI_COMMIT_BRANCH"
SUITE=${CI_COMMIT_BRANCH#dgit/}
for branch in $(git for-each-ref --format='%(refname:lstrip=3)' refs/remotes/origin); do
  if [[ "$branch" =~ ${SUITE}-wikimedia.* ]] ; then
    git checkout "$branch" ;
    oldsha=$(git show -s --format=%H) ;
    distro=$(dpkg-parsechangelog -S distribution) ;
    vsuffix=$(dpkg-parsechangelog -S version | sed -re 's/^.*([~+]+wmf.*$)/\1/') ;
    git merge --no-edit -m "Automatic CI update of $CI_COMMIT_BRANCH" "$CI_COMMIT_BRANCH" ;
    newsha=$(git show -s --format=%H) ;
    if [ "$newsha" != "$oldsha" ]; then
      debver=$(dpkg-parsechangelog -S version) ;
      version="${debver}${vsuffix}" ;
      dch -b -p --force-distribution -D "$distro" -v "$version" "Automatic CI update from tracking branch $CI_COMMIT_BRANCH" ;
      git add debian/changelog ;
      git commit -m "Auto-generated changelog for $version" ;
      git push origin "$branch" ;
    fi
  fi
done

This checks for branches called suite-wikimedia.* where suite is the Debian suite being tracked. For each of those branches, it extracts the distribution and version from the existing changelog entry (and extracts the local version suffix); attempts to automatically merge in the changes from the dgit branch; if that works (and so the tip of the working branch has changed) then it constructs a suitable new version number by adding the previous local version suffix to the new version number from Debian and makes a new entry in debian/changelog; and finally commits the changelog entry and pushes the updated branch.
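
The version arithmetic in that job can be tried in isolation. This sketch reuses the sed expression from dgit_merge (written with -E, which is equivalent to -r in GNU sed); the function names and the Debian version 2.30.0-5 are hypothetical, chosen only for illustration.

```shell
# Extract the local suffix: everything from the ~ or + in front of "wmf"
# to the end of the version string.
wmf_suffix() (
  printf '%s\n' "$1" | sed -E 's/^.*([~+]+wmf.*$)/\1/'
)

# New version = Debian's updated version + the previous local suffix.
merged_version() (
  old_local=$1 new_debian=$2
  echo "${new_debian}$(wmf_suffix "$old_local")"
)

wmf_suffix 2.30.0-4+wmf1                # +wmf1
wmf_suffix 2.31.1-3~wmf12+1             # ~wmf12+1
merged_version 2.30.0-4+wmf1 2.30.0-5   # 2.30.0-5+wmf1
```

Two consequences follow from this approach: the build counter in the suffix is carried over unchanged rather than reset, and a version with no wmf suffix passes through sed unmodified, so the job relies on packaging branches already having a suffix of the expected form.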

Those commits then fire the build jobs for the updated branches (in the same way as you pushing changes to them would).

Package Building

There are two jobs that run for package builds: pickimage and build_ci_deb.

The pickimage job extracts the suite name from debian/changelog, removes the -wikimedia suffix and stores the result in the SUITE environment variable. This is needed by the build_ci_deb job to know which image to pull from the WMF Docker registry to run the build in.
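
As a rough illustration of what pickimage computes, the sketch below reads the distribution from the first line of a changelog and strips the -wikimedia suffix. It is an assumption-laden stand-in rather than the job's actual script: it parses the line with sed so that it runs without dpkg-dev installed, the function name suite_from_changelog is hypothetical, and it assumes the standard "package (version) distribution; urgency=..." header with a single distribution.

```shell
# Hypothetical helper: print the build suite for a changelog file.
suite_from_changelog() (
  changelog=${1:-debian/changelog}
  # Line 1 looks like: "package (version) distribution; urgency=..."
  distro=$(sed -n '1s/^[^ ]* ([^)]*) \([^;]*\);.*$/\1/p' "$changelog")
  # Drop the -wikimedia suffix to get the name used for the build image
  echo "${distro%-wikimedia}"
)

# Try it on a throwaway changelog header
tmp=$(mktemp)
printf '%s\n' 'swift (2.30.0-4+wmf1) bookworm-wikimedia; urgency=medium' > "$tmp"
SUITE=$(suite_from_changelog "$tmp")
echo "docker-registry.wikimedia.org/wmf-debci-${SUITE}"
rm -f "$tmp"
```

With the sample header this prints docker-registry.wikimedia.org/wmf-debci-bookworm, matching the image pattern used by the build_ci_deb job.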

The build_ci_deb job is very simple - it extends the build_ci_deb job from includes.yml to build on branches named .*-wikimedia.* (except where the commit is a tag matching WMFDEBCI.*):

build_ci_deb:
  rules:
    - if: $CI_COMMIT_TAG =~ /^WMFDEBCI.*/
      when: never
    - if: $CI_COMMIT_BRANCH =~ /.*-wikimedia.*$/

The actual work is done in the extended build_ci_deb job from includes.yml (doing it this way means that it can be extended by other users more easily). That is:

build_ci_deb:
  stage: build
  rules: *never
  image: docker-registry.wikimedia.org/wmf-debci-${SUITE}
  script:
    # If USEBACKPORTS is set, tell apt to use packages from backports
    - >
      if [ "$USEBACKPORTS" ]; then
      echo -e "Package: *\nPin: release a=${SUITE}-backports\nPin-Priority: 500" >/etc/apt/preferences.d/${SUITE}-backports.pref ;
      fi
    - apt update
    # Install the build-dependencies specified by this package
    - mk-build-deps -i debian/control -t "apt-get -o Debug::pkgProblemResolver=yes -y --no-install-recommends"
    # Build binary package(s) from the current source tree
    # set DEB_BUILD_OPTIONS in ci if you want to e.g. skip tests
    - dpkg-buildpackage -uc -b
    # Make a new directory for build artifacts
    - mkdir WMF_BUILD_DIR
    # dcmd operates on the changes file and everything named therein
    - dcmd cp ../*.changes WMF_BUILD_DIR/
  artifacts:
    # Tell gitlab where to find the build artifacts
    paths:
      - WMF_BUILD_DIR/
  variables:
    GIT_STRATEGY: clone

This requires the SUITE variable to be set (e.g. by the pickimage job). It installs the necessary dependencies, runs a build attempt, and tells gitlab where to find the resulting build artifacts. It specifies GIT_STRATEGY: clone to ensure a clean build environment (otherwise e.g. rebuilding the same package against two target suites may fail).

Tagging

Iff DGIT_CI_TOKEN is set, then after a successful package build, a tag will be created corresponding to the built version (mangled per DEP14 to make a legal git tag name) and that tag will be pushed to the repository. This step is skipped if the relevant tag already exists. This is done by the imaginatively-named tag_build job. The core of its work is (boilerplate removed for clarity):

tag_build:
  stage: build
  rules:
    - if: $CI_COMMIT_TAG =~ /^WMFDEBCI.*/
      when: never
    - if: $CI_COMMIT_BRANCH =~ /.*-wikimedia.*$/ && $DGIT_CI_TOKEN != null
  needs: ['build_ci_deb']
  script:
    - apt-get -y install ca-certificates dpkg-dev git
    - git fetch -t
    - pversion=$(dpkg-parsechangelog -S version)
    # Version transform per DEP14, trim final newline
    - gversion=$(echo "$pversion" | perl -pe 'y/:~/%_/; s/\.(?=\.|$|lock$)/.#/g; s/\n$//;')
    - >
      if [ -z "$(git tag -l "WMFDEBCI/$gversion")" ]; then
      git tag -a -m "Automatic CI build of version $pversion" "WMFDEBCI/$gversion" ;
      git push origin "WMFDEBCI/$gversion" ;
      fi
  variables:
    GIT_STRATEGY: clone

The rules and needs declarations ensure that this job only runs after a successful package build (and not if the triggering commit is adding a tag); then a tag is generated and pushed only if necessary (i.e. the relevant tag does not yet exist).

Further Reading

Dgit is well-supplied with documentation; the above process is modified from that in dgit-user(7). The dgit(1) manual has links to all the available manuals; if you are starting a new package from scratch (rather than an existing Debian or Ubuntu package), you may find dgit-maint-merge(7) helpful, although that is still a bit more complex than we need (since we don't need to care about source packages). Watch this space for more documentation...