Wikimedia Cloud Services team/EnhancementProposals/Decision request - How to provide a way to install system dependencies for buildpack-based images

From Wikitech

Origin task: phab:T336669

Date of the decision: 2023-06-08

There was no meeting; the decision was made in the task.


Decision taken

Option 2 - limiting packages to those provided by the OS

Rationale

We decided to allow anyone to use the apt buildpack: although this makes the system easier to abuse, it is also easier to develop and to maintain (at least initially), and we can add the extra bits to enforce good usage later if abuse ever becomes a problem.

However, we are limiting packages to those available in the official repositories of the build/run image (Ubuntu Jammy at the time of writing).
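As an illustration of what limiting packages to the OS repositories could look like, the sketch below scans an Aptfile (the file the upstream apt buildpack reads) and reports entries that are not plain package names. It assumes the upstream Aptfile conventions, where a line may also be a URL to a .deb file or a custom repository introduced with `:repo:`. The function name and the example content are hypothetical, and this is not necessarily how the restriction is enforced on Toolforge.

```python
def non_repo_entries(aptfile_text):
    """Return Aptfile entries that are not plain package names."""
    rejected = []
    for raw_line in aptfile_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # Custom repository lines and direct .deb URLs bypass the OS repositories.
        if line.startswith(":repo:") or "://" in line:
            rejected.append(line)
    return rejected


# Hypothetical Aptfile content, for illustration only.
EXAMPLE_APTFILE = """\
# packages from the OS repositories
imagemagick
:repo:deb https://example.org/apt stable main
https://example.org/pool/custom-tool.deb
"""

for entry in non_repo_entries(EXAMPLE_APTFILE):
    print("not allowed:", entry)
```

A check like this could run before the build starts, so that a rejected Aptfile fails early with a clear message.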

Problem

Sometimes tools need system dependencies such as:

   imagemagick
   pstools

Currently, buildpacks do not allow installing any system packages.

Constraints and risks

  • Users who need system packages will not be able to use the build service, and will instead request extra packages in the existing Toolforge Docker images
  • Users might not be able to migrate out of the grid without a big refactor

Options considered

Options

Option 1

Allow installing apt packages using the upstream apt buildpack (https://github.com/heroku/heroku-buildpack-apt).

This can be done by injecting that buildpack into the generated group.toml (after detection).
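A minimal sketch of the injection step, assuming the detected group has already been read into a list of id/version entries (in the Cloud Native Buildpacks lifecycle the detector writes it to group.toml as an array of `[[group]]` tables). The buildpack id, the versions and the function names are hypothetical, and placing the apt buildpack first is an assumption made so that later buildpacks can see the installed packages.

```python
# Hypothetical buildpack id and version, used only for illustration.
APT_BUILDPACK = {"id": "example/apt", "version": "0.0.1"}


def inject_buildpack(group, buildpack):
    """Return a new group with `buildpack` first, unless it is already present."""
    if any(entry["id"] == buildpack["id"] for entry in group):
        return list(group)
    return [buildpack, *group]


def render_value(value):
    """Render a TOML string or boolean, the only value types expected here."""
    if isinstance(value, bool):
        return "true" if value else "false"
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def render_group_toml(group):
    """Render the entries as an array of [[group]] tables."""
    blocks = []
    for entry in group:
        lines = ["[[group]]"]
        lines += [f"{key} = {render_value(value)}" for key, value in entry.items()]
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n"


# Example group as it might look after detection (values are illustrative).
detected = [{"id": "heroku/python", "version": "0.1.0"}]
print(render_group_toml(inject_buildpack(detected, APT_BUILDPACK)))
```

The injection is idempotent, so running it twice on the same group does not add the buildpack a second time.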

We should put this feature behind an access list, enabled for certain projects after request.
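The access list could be as simple as a mapping from feature name to the projects that requested it. The sketch below is one hypothetical shape for it; the feature names, project names and function name are invented for illustration.

```python
# Hypothetical access list: feature name -> projects that requested it.
ACCESS_LIST = {
    "apt-buildpack": {"example-tool"},
    "custom-buildpacks": {"example-tool", "another-tool"},
}


def is_enabled(feature, project, access_list=ACCESS_LIST):
    """Return True if `project` has been granted `feature`."""
    # Unknown features are treated as disabled for every project.
    return project in access_list.get(feature, set())


print(is_enabled("apt-buildpack", "example-tool"))
print(is_enabled("apt-buildpack", "another-tool"))
```

Keying the list by feature rather than by project keeps it reusable for other gated features.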

Pros:

  • Unblocks any users that need extra dependencies
  • Still pushes users to use the recommended way to pull dependencies with buildpacks (pip/composer/bundler/...)

Cons:

  • Users that need it will have to make an extra request to enable it
  • A bit more complicated code-wise (we have to implement some sort of allowlist, though we might use it for other things too, e.g. multistack/custom buildpacks)

Option 2

Allow installing apt packages using the upstream apt buildpack (https://github.com/heroku/heroku-buildpack-apt).

This can be done by injecting that buildpack into the generated group.toml (after detection).

Enabled for everyone.

Pros:

  • Unblocks any users that need extra dependencies
  • Users that need it have it right away
  • No need for allowlist implementation

Cons:

  • We enable installing any package from anywhere for everyone, potentially welcoming non-open-source code to run on Toolforge
  • Images will be bigger (we currently have a 1G limit per tool set on Harbor, so they cannot grow beyond that).

Option 3

Allow only selected buildpacks for specific libraries, and not the "apt" buildpack.

For example, this buildpack adds a bunch of additional libraries: https://github.com/heroku/heroku-geo-buildpack

Many buildpacks can be found online, and we could create more ourselves.

Pros:

  • More control over what people can install in their images
  • Might be solved the same way that multistack/specific buildpacks would be

Cons:

  • We would need to add those buildpacks individually, as people request them
  • Some libraries might not be available as a buildpack, and creating a custom buildpack is possible but not easy