Portal:Toolforge/Admin/Harbor/maintain-harbor

From Wikitech

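maintain-harbor is a tool that performs housekeeping tasks on our Harbor installation. Currently, these tasks are:

  • deleting Harbor tool projects that have no repositories
  • creating or updating the image retention policy of each Harbor tool project
  • cleaning up stale images in the core Toolforge repositories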

These tasks keep the size of our Harbor installation manageable. Without them, it would quickly outgrow the storage available on its host VM.

The code for maintain-harbor currently lives in Wikimedia GitLab, at https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor.

Key concepts

Harbor Tool projects

These are Harbor projects that store the images produced by a build service build run while authenticated as a particular Toolforge tool. Each Harbor tool project is named after the Toolforge tool whose images it stores.

Harbor image retention policies

These are Harbor objects that tell Harbor how to handle the images in a Harbor tool project's repositories. For example, an image retention policy can tell Harbor to delete all but the latest 5 images in each repository of a Harbor tool project.
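
As a rough illustration of the "keep the latest 5" example, the selection logic can be sketched as below. This is not how Harbor implements retention; the function name and the (digest, push time) tuples are hypothetical and only model one repository.

```python
from datetime import datetime, timezone


def images_to_delete(images, keep=5):
    """Return the images a 'keep the latest N' rule would remove.

    `images` is a list of (digest, push_time) tuples for ONE repository.
    """
    newest_first = sorted(images, key=lambda image: image[1], reverse=True)
    return newest_first[keep:]


# Seven hypothetical images pushed on consecutive days.
repo = [
    (f"sha256:{day:02d}", datetime(2024, 8, day, tzinfo=timezone.utc))
    for day in range(1, 8)
]

stale = images_to_delete(repo, keep=5)
print([digest for digest, _ in stale])  # the two oldest images
```

The rule is applied per repository, so a project with several repositories keeps up to 5 images in each of them.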

Harbor Image Immutability policies

Harbor image immutability policies are Harbor objects that prevent the deletion of images in a Harbor tool project's repositories. Importantly, they take precedence over image retention policies: when one is configured for a project, the images it covers cannot be deleted, not even by an image retention policy.

Delete harbor tool projects with no repositories

This task deletes Harbor tool projects once all their repositories and images have been deleted, which prevents unnecessary bloat.
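
The selection step of this task might look roughly like the sketch below. It assumes each project record carries a repository count (called repo_count here) and that a caller-supplied predicate decides which projects are tool projects; both are assumptions for illustration, not a description of the maintain-harbor code.

```python
def empty_tool_projects(projects, is_tool_project):
    """Return the names of tool projects that contain no repositories."""
    return [
        project["name"]
        for project in projects
        # A missing count is treated as "unknown", never as "empty".
        if is_tool_project(project["name"]) and project.get("repo_count") == 0
    ]


# Hypothetical project listing; only the fields used here are shown.
projects = [
    {"name": "toolforge", "repo_count": 0},
    {"name": "alpha", "repo_count": 3},
    {"name": "beta", "repo_count": 0},
    {"name": "gamma"},
]

# Assumption: every project other than the core "toolforge" one is a tool project.
to_delete = empty_tool_projects(projects, lambda name: name != "toolforge")
print(to_delete)
```

Treating a missing count as unknown errs on the side of keeping a project, which seems the safer default for a destructive task.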

Create or update Harbor tool project's image retention policy

When a Harbor tool project is first created, it has no image retention policy configured. This task creates such a policy for each project, and updates the policies of existing projects whenever we decide to change them.

Stale harbor image cleanup for core Toolforge repositories

We use Harbor to store the images of core Toolforge repositories, such as builds-api, for CI/CD purposes. These are grouped under the toolforge Harbor project, which has an image immutability policy configured, so they can't be cleaned up with a simple Harbor image retention policy. This task cleans up these images by disabling the toolforge Harbor project's image immutability policy, running its own custom image cleanup algorithm on each repository in the project, and then enabling the immutability policy again.
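
The disable, clean up, re-enable sequence can be sketched with a context manager, so that the immutability policy is switched back on even if the cleanup fails midway. FakeHarbor and all of its methods are hypothetical stand-ins, and the "keep the newest N tags" rule is only a placeholder for the tool's own cleanup algorithm, which is not described here.

```python
from contextlib import contextmanager


class FakeHarbor:
    """In-memory stand-in for a Harbor client; every method is hypothetical."""

    def __init__(self, repositories):
        self.repositories = repositories  # repo name -> tags, oldest first
        self.immutable = True
        self.log = []

    def set_immutability(self, enabled):
        self.immutable = enabled
        self.log.append(f"immutability={'on' if enabled else 'off'}")

    def delete_image(self, repo, tag):
        if self.immutable:
            raise RuntimeError("immutable images cannot be deleted")
        self.repositories[repo].remove(tag)
        self.log.append(f"deleted {repo}:{tag}")


@contextmanager
def immutability_disabled(harbor):
    """Disable immutability for the duration of the block, then restore it."""
    harbor.set_immutability(False)
    try:
        yield
    finally:
        harbor.set_immutability(True)


def clean_stale_images(harbor, keep):
    """Placeholder cleanup: keep only the newest `keep` tags per repository."""
    with immutability_disabled(harbor):
        for repo, tags in harbor.repositories.items():
            stale = tags[: max(len(tags) - keep, 0)]  # a copy, safe to iterate
            for tag in stale:
                harbor.delete_image(repo, tag)


harbor = FakeHarbor({"builds-api": ["v1", "v2", "v3", "v4"]})
clean_stale_images(harbor, keep=2)
print(harbor.repositories)
print(harbor.log)
```

Putting the re-enable step in a finally clause matters here: if it were skipped after an error, the core images would stay unprotected until the next successful run.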

How it's set up

maintain-harbor is a command-line tool; you run a housekeeping task by invoking it with the name of that task. It is currently deployed as three separate Toolforge jobs, one for each of the three tasks listed above, all running under the maintain-harbor Toolforge tool account.

If you are a maintainer of the maintain-harbor tool or a tools admin, you can view the Toolforge jobs created by maintain-harbor by doing the following:

user@ubuntu:~/Desktop/maintain-harbor$ ssh login.toolforge.org
...
user@tools-bastion-13:~$ become maintain-harbor
tools.maintain-harbor@tools-bastion-13:~$ toolforge jobs list
+-------------------------------------------+-----------------------+------------------------------------------+
|                 Job name:                 |       Job type:       |                 Status:                  |
+-------------------------------------------+-----------------------+------------------------------------------+
|    mh--delete-empty-tool-projects-cron    | schedule: 23 8 * * *  | Last schedule time: 2024-08-19T08:23:00Z |
| mh--delete-stale-toolforge-artifacts-cron |  schedule: 0 8 * * 3  | Last schedule time: 2024-08-14T08:00:00Z |
|      mh--manage-image-retention-cron      | schedule: 23 10 * * * | Last schedule time: 2024-08-19T10:23:00Z |
+-------------------------------------------+-----------------------+------------------------------------------+
tools.maintain-harbor@tools-bastion-13:~$

The maintain-harbor jobs are currently named as follows:

  • mh--delete-empty-tool-projects-cron
  • mh--delete-stale-toolforge-artifacts-cron
  • mh--manage-image-retention-cron

Common tasks

Deploy maintain-harbor

To deploy maintain-harbor using a Toolforge tool account:

  • Clone the repository into the home directory of the tool account. We currently use the maintain-harbor Toolforge tool account, but any tool will do:
    tools.maintain-harbor@tools-bastion-13:~$ cd $HOME && git clone https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-harbor.git
    
  • Next, create a $HOME/maintain-harbor.yaml config file. You can see the expected config file format by running:
    tools.maintain-harbor@tools-bastion-13:~$ cd $HOME/maintain-harbor && $HOME/venv/bin/python3 -m src.maintain_harbor --show-config
    environments:
      toolsbeta:
        auth_password: my_password
        auth_username: my_username
        base_harbor_api: https://harbor.domain.name/api/v2.0
        do_retentions: true
        toolforge_repo_artifact_limit: 2
    
    tools.maintain-harbor@tools-bastion-13:~/maintain-harbor$
    
  • Finally, to get the maintain-harbor jobs running, execute:
    tools.maintain-harbor@tools-bastion-13:~$ $HOME/maintain-harbor/deploy.sh
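
Before running deploy.sh, it can help to sanity-check the config file against the format printed by --show-config. The sketch below works on the already-parsed config as a Python dict; treating all five keys as required, and the expected types, are assumptions drawn from the sample output rather than from the tool's code.

```python
# Keys and types inferred from the --show-config sample (an assumption).
REQUIRED_KEYS = {
    "auth_username": str,
    "auth_password": str,
    "base_harbor_api": str,
    "do_retentions": bool,
    "toolforge_repo_artifact_limit": int,
}


def config_problems(config):
    """Return a list of human-readable problems; empty means none found."""
    problems = []
    environments = config.get("environments") or {}
    if not environments:
        problems.append("no environments defined")
    for env_name, settings in environments.items():
        for key, expected_type in REQUIRED_KEYS.items():
            if key not in settings:
                problems.append(f"{env_name}: missing {key}")
            elif not isinstance(settings[key], expected_type):
                problems.append(
                    f"{env_name}: {key} should be {expected_type.__name__}"
                )
    return problems


# The sample config from --show-config, as a parsed dict.
sample = {
    "environments": {
        "toolsbeta": {
            "auth_password": "my_password",
            "auth_username": "my_username",
            "base_harbor_api": "https://harbor.domain.name/api/v2.0",
            "do_retentions": True,
            "toolforge_repo_artifact_limit": 2,
        }
    }
}
broken = {"environments": {"toolsbeta": {"auth_username": "my_username"}}}

print(config_problems(sample))
print(config_problems(broken))
```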
    

Checking maintain-harbor status

To view the status of all the maintain-harbor jobs:

tools.maintain-harbor@tools-bastion-13:~$ toolforge jobs list
+-------------------------------------------+-----------------------+------------------------------------------+
|                 Job name:                 |       Job type:       |                 Status:                  |
+-------------------------------------------+-----------------------+------------------------------------------+
|    mh--delete-empty-tool-projects-cron    | schedule: 23 8 * * *  | Last schedule time: 2024-08-19T08:23:00Z |
| mh--delete-stale-toolforge-artifacts-cron |  schedule: 0 8 * * 3  | Last schedule time: 2024-08-14T08:00:00Z |
|      mh--manage-image-retention-cron      | schedule: 23 10 * * * | Last schedule time: 2024-08-19T10:23:00Z |
+-------------------------------------------+-----------------------+------------------------------------------+
tools.maintain-harbor@tools-bastion-13:~$

To view the detailed status of a single job, run toolforge jobs show <job-name> as the maintain-harbor tool:

tools.maintain-harbor@tools-bastion-13:~$ toolforge jobs show mh--delete-empty-tool-projects-cron
+---------------+------------------------------------------------------------------------------------------------------------------------------------------+
| Job name:     | mh--delete-empty-tool-projects-cron                                                                                                      |
+---------------+------------------------------------------------------------------------------------------------------------------------------------------+
| Command:      | cd $HOME/maintain-harbor && $HOME/venv/bin/python3 -m src.maintain_harbor --config $HOME/maintain-harbor.yaml delete-empty-tool-projects |
+---------------+------------------------------------------------------------------------------------------------------------------------------------------+
| Job type:     | schedule: 23 8 * * *                                                                                                                     |
+---------------+------------------------------------------------------------------------------------------------------------------------------------------+
| Image:        | python3.11                                                                                                                               |
+---------------+------------------------------------------------------------------------------------------------------------------------------------------+
| Port:         | none                                                                                                                                     |
+---------------+------------------------------------------------------------------------------------------------------------------------------------------+
| File log:     | yes                                                                                                                                      |
+---------------+------------------------------------------------------------------------------------------------------------------------------------------+
| Output log:   | /data/project/maintain-harbor/mh--delete-empty-tool-projects-cron.out                                                                    |
+---------------+------------------------------------------------------------------------------------------------------------------------------------------+
| Error log:    | /data/project/maintain-harbor/mh--delete-empty-tool-projects-cron.err                                                                    |
+---------------+------------------------------------------------------------------------------------------------------------------------------------------+
| Emails:       | onfailure                                                                                                                                |
+---------------+------------------------------------------------------------------------------------------------------------------------------------------+
| Resources:    | default                                                                                                                                  |
+---------------+------------------------------------------------------------------------------------------------------------------------------------------+
| Mounts:       | all                                                                                                                                      |
+---------------+------------------------------------------------------------------------------------------------------------------------------------------+
| Retry:        | no                                                                                                                                       |
+---------------+------------------------------------------------------------------------------------------------------------------------------------------+
| Health check: | none                                                                                                                                     |
+---------------+------------------------------------------------------------------------------------------------------------------------------------------+
| Status:       | Last schedule time: 2024-08-19T08:23:00Z                                                                                                 |
+---------------+------------------------------------------------------------------------------------------------------------------------------------------+
| Hints:        | No pods were created for this job.                                                                                                       |
+---------------+------------------------------------------------------------------------------------------------------------------------------------------+
tools.maintain-harbor@tools-bastion-13:~$

Stopping and starting the maintain-harbor jobs

There is currently no way to stop and restart an existing Toolforge job. To do this for the maintain-harbor jobs, you need to delete the jobs and recreate them when you want them running again.

To delete a job, run:

tools.maintain-harbor@tools-bastion-13:~$ toolforge jobs delete mh--delete-empty-tool-projects-cron

or, to delete all the maintain-harbor jobs at once, run:

tools.maintain-harbor@tools-bastion-13:~$ toolforge jobs flush

You can recreate all the deleted jobs by running:

tools.maintain-harbor@tools-bastion-13:~$ $HOME/maintain-harbor/deploy.sh

The above command will recreate all the maintain-harbor jobs if they don't exist already.

Checking the logs

The logs for each job can be found in the maintain-harbor tool's home directory, as .out and .err files named after the job.

tools.maintain-harbor@tools-bastion-13:~$ cat $HOME/mh--delete-empty-tool-projects-cron.out
...
tools.maintain-harbor@tools-bastion-13:~$ cat $HOME/mh--delete-empty-tool-projects-cron.err

Manual execution

If you wish to perform any of the maintain-harbor tasks manually, without using toolforge-jobs to schedule it, you might find the contents of $HOME/maintain-harbor/jobs-schedule.yaml useful.

  • To run a single maintain-harbor task outside toolforge jobs, execute the following:
    tools.maintain-harbor@tools-bastion-13:~$ cd $HOME/maintain-harbor && $HOME/venv/bin/python3 -m src.maintain_harbor --config $HOME/maintain-harbor.yaml delete-empty-tool-projects