News/Toolforge Grid Engine deprecation


This page contains information about the deprecation and removal of the Toolforge Grid Engine platform.

What is changing?

The Grid Engine cluster is being decommissioned in accordance with the timeline on this page.

The Toolforge admins are asking tool maintainers to move tools off the grid and report any blocking issues. This work is being tracked on the Phabricator workboard.

Timeline

  • Oct-Dec 2021: Done. Release the Toolforge Jobs Framework. Continue working on Toolforge buildpacks. Migrate Son of Grid Engine to Debian Buster.
  • Oct-Dec 2022: Done. Ask community to begin migrating tools. Collect blocking issues.
  • Jan-Mar 2023: Done. Add features to support identified blocking issues. Explore Kubernetes as a service as a potential migration path. Tool migrations continue.
  • Apr-Jun 2023: Done. Toolforge buildpacks beta. See T267374. Tool migrations continue.
  • Jul-Sep 2023: Done. Toolforge buildpacks multipack support work. See T325799. Tool migrations continue.
  • Oct-Dec 2023: In progress. All tools are now able to be migrated. Tool migrations continue.
    • November 2023: Kick off the grid shutdown process. Notify individual maintainers who still have tools on the grid of shutdown timelines via email, the cloud-announce mailing list, and talk pages.
    • December 2023: Notify maintainers using jsub and related tools about the shutdown timeline.
    • 2023-12-14: Tools owned by unresponsive or unreachable maintainers will be stopped.
  • Jan-Mar 2024: The grid is stopped and the grid infrastructure is deleted.
    • 2024-02-14: The grid is stopped entirely: all tools are stopped and new submissions are no longer possible.
    • 2024-03-14: Grid infrastructure is deleted.

What should I do?

You have a couple of options:

Use case continuity

The following table tracks use case continuity.

Moving from Toolforge Grid Engine to Toolforge Kubernetes

Feature: Job scheduling
Grid Engine: jsub or jstart
Kubernetes: Toolforge jobs
Comment: One-off jobs or continuous jobs. Example:

From Grid Engine

5 * * * * jsub -once -N name-of-tool $HOME/user/bot.php >/dev/null 2>&1

To Kubernetes

$ toolforge-jobs run name-of-tool --command ./user/bot.php --image php8.2 --schedule "@hourly"

Feature: Web services
Grid Engine: webservice
Kubernetes: specify an image and 'kubernetes' as the backend
Comment: Example:

From Grid Engine

$ webservice --backend=gridengine start

To Kubernetes

$ webservice stop
$ webservice --backend=kubernetes php8.2 start

Feature: Multi-language tools
Grid Engine: Native
Kubernetes: Toolforge buildpacks
Comment: Some single-language tools will need updated or new images (like dotnet).
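
The job scheduling example above covers only a scheduled job. As a rough sketch, the three common Grid Engine patterns (a one-off jsub job, a continuous jstart job, and a cron entry that calls jsub) might map onto toolforge-jobs as shown below. The helper name jobs_command and its arguments are illustrative assumptions, not part of Toolforge. It only prints the command so it can be reviewed before being run by hand, and the image still has to be chosen by the maintainer.

```shell
# Print (do not run) a toolforge-jobs command for review.
# jobs_command is a hypothetical helper, not a Toolforge tool.
# Arguments: job name, command, container image, kind.
# kind is "once", "continuous", or a schedule such as "@hourly" or "5 * * * *".
jobs_command() {
    name=$1
    cmd=$2
    image=$3
    kind=$4
    base="toolforge-jobs run $name --command \"$cmd\" --image $image"
    case $kind in
        once)       printf '%s\n' "$base" ;;                       # was: jsub -once
        continuous) printf '%s\n' "$base --continuous" ;;          # was: jstart
        *)          printf '%s\n' "$base --schedule \"$kind\"" ;;  # was: cron + jsub
    esac
}

# Keep the original crontab timing (minute 5 of every hour) instead of "@hourly".
jobs_command name-of-tool ./user/bot.php php8.2 "5 * * * *"
# prints: toolforge-jobs run name-of-tool --command "./user/bot.php" --image php8.2 --schedule "5 * * * *"
```

Passing the original cron expression keeps the exact timing of the old crontab entry, while "@hourly" as used in the example above only asks for one run per hour.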

Why are we doing this?

As outlined in our series of blog posts, Toolforge is powered by two different backend engines, Kubernetes and Grid Engine. These two backends have traditionally offered different features for tool developers. But as time moves forward we’ve learnt that Kubernetes is the future.

See more for a detailed explanation.

Solutions to common problems

Rebuild virtualenv for Python users / python3: not found

Python virtual environments ("venvs") are tied to the underlying system where they are running. Because of that, you will need to delete and re-create your virtual environments using these instructions.
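
The linked instructions remain the reference. As a minimal sketch of what deleting and re-creating a venv involves, assuming a requirements file lists the tool's dependencies, the steps could look like the function below. The name rebuild_venv and both paths are illustrative assumptions. The function is meant to be run inside the same container image the tool will later run in, so that the venv is built against that image's python3.

```shell
# Sketch: delete and re-create a Python virtual environment.
# rebuild_venv, venv_dir and requirements are illustrative names (assumptions).
# Run this inside the container image the tool will use, so the venv is
# built against the python3 it will actually run with.
rebuild_venv() {
    venv_dir=$1
    requirements=$2

    # Refuse to continue without a target directory.
    [ -n "$venv_dir" ] || return 1

    # Remove the old venv; it still points at the previous system's Python.
    rm -rf "$venv_dir"

    # Create a fresh venv with the python3 found in the current environment.
    python3 -m venv "$venv_dir" || return 1

    # Activate it and reinstall the tool's dependencies.
    . "$venv_dir/bin/activate"
    pip install --upgrade pip
    pip install -r "$requirements"
    deactivate
}

# Example call (paths are placeholders):
# rebuild_venv "$HOME/venv" "$HOME/requirements.txt"
```

One way to get the right environment is to put the call in a small script and submit that script as a one-off job using the tool's Python image.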

Tools needing multiple language runtimes

You can build an image for your tool with the dependencies required.

Mono container

Using Mono? See the discussion on a Mono-specific container at phab:T311466.

Requires a system library or tool to be present

If you believe this dependency makes sense to have in the base container, file a ticket. For example, mysqldump (phab:T254636, resolved).

For other needs, note that the buildpack project will support more nuanced and custom container setups.

Pywikibot scripts

See Help:Toolforge/Running Pywikibot scripts on how to easily run Pywikibot scripts using the jobs framework.

Delete a tool

Some tools were experiments that are now done, others were made obsolete by other tools, and some are simply things that the original maintainer is tired of caring for. Maintainers can mark their tools for deletion using the "Disable tool" button on the tool's detail page on https://toolsadmin.wikimedia.org/. Disabling a tool immediately stops any running jobs, including webservices, and prevents maintainers from logging in as the tool. Disabled tools are archived and deleted after 40 days. They can be re-enabled at any time before being archived and deleted.

See also

Communication and support

Support and administration of the WMCS resources is provided by the Wikimedia Foundation Cloud Services team and Wikimedia movement volunteers. Please reach out with questions and join the conversation:

Discuss and receive general support
Stay aware of critical changes and plans
Track work tasks and report bugs

Use a subproject of the #Cloud-Services Phabricator project to track confirmed bug reports and feature requests about the Cloud Services infrastructure itself

Read stories and WMCS blog posts

Read the Cloud Services Blog (for the broader Wikimedia movement, see the Wikimedia Technical Blog)