Media storage/Backups

General architecture

The backup servers are listed in Puppet (hieradata).

Overview of the media backups architecture
  • Worker servers (ms-backup*) are those used to download and upload files on backup and recovery, as well as to pre-process them (e.g. hashing them and checking their integrity).
  • Storage servers (a subset of all backup*) run minio (an S3 API-compatible service) and hold the data long term. For now it uses completely static discovery, as that gives us high flexibility to depool a server.
  • MariaDB database servers (a subset of all db*) store the backup metadata needed to quickly recover and search for specific files from backups.
  • MediaWiki backup source database servers (not used by MediaWiki itself, only containing the same data) are used to retrieve updated file metadata (especially for deleted files, for which there is no available API or script yet).
  • Periodic public API calls to production servers are used to get the latest updates in a "pull" model, which was found reliable enough for keeping Commons backups up to date, rather than getting updates from a jobqueue/stream, as real-time updates were not needed (see the sketch after this list).
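
As an illustration of this pull model, here is a minimal Python sketch that polls the public MediaWiki API of Commons for files uploaded since a given timestamp. It is only a sketch of the approach; the actual worker implementation may differ.

import requests

API = "https://commons.wikimedia.org/w/api.php"

def latest_uploads(since):
    # Ask the public allimages API for files uploaded at or after the
    # given timestamp, in ascending order, so the backup workers can
    # catch up incrementally ("pull" model, no job queue needed).
    params = {
        "action": "query",
        "list": "allimages",
        "aisort": "timestamp",
        "aidir": "ascending",
        "aistart": since,  # e.g. "2024-01-01T00:00:00Z"
        "ailimit": "max",
        "format": "json",
    }
    response = requests.get(API, params=params, timeout=30)
    return response.json()["query"]["allimages"]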

Each datacenter has its own separate set of servers, data and credentials, for redundancy reasons.

At the moment, the clustering functionality of minio is not used, meaning one has to access each storage server (minio server) individually. Sharding is based on the sha256 hash of the file: the hash space is divided evenly among the configured servers, in the order configured.

For example, with:

endpoints:
  - https://backup1004.eqiad.wmnet:9000
  - https://backup1005.eqiad.wmnet:9000
  - https://backup1006.eqiad.wmnet:9000
  - https://backup1007.eqiad.wmnet:9000
  - https://backup1011.eqiad.wmnet:9000

files whose sha256 hash starts (in hexadecimal) with 0-3 go to backup1004, 4-7 to backup1005, 8-B to backup1006 and C-F to backup1007. This is not guaranteed to stay this way, as servers will likely be unavailable for maintenance at times and the number of servers will be expanded, which means eventually one will be forced to use the metadata database to locate the server where a file is stored (this can be done with the recovery script).
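
For illustration, here is a minimal Python sketch of this sharding scheme, assuming four configured endpoints as in the mapping above (the real implementation is part of the mediabackups software and may differ in its details):

ENDPOINTS = [
    "https://backup1004.eqiad.wmnet:9000",
    "https://backup1005.eqiad.wmnet:9000",
    "https://backup1006.eqiad.wmnet:9000",
    "https://backup1007.eqiad.wmnet:9000",
]

def storage_server(sha256_hex):
    # Divide the 16 possible values of the first hexadecimal digit
    # evenly among the endpoints, in the order configured: with 4
    # servers, 0-3 -> backup1004, 4-7 -> backup1005, 8-b -> backup1006
    # and c-f -> backup1007.
    index = int(sha256_hex[0], 16) * len(ENDPOINTS) // 16
    return ENDPOINTS[index]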

Manual search query, recovery and deletion

Querying files

There is a script present on all worker hosts (ms-backup*) that allows querying and searching for files that have been backed up, or that were detected and are pending or failed to be backed up: query-media-file

The script should be run as root (technically, the only rights needed are those of the mediabackup system user: sudo -u mediabackup query-media-file).

The script will interactively ask for the wiki, the search method and its parameters (see #File search methods below).

The script will locate and print all matches found for the given parameters, then quit. Note that backups on the two datacenters are not guaranteed to be exactly the same at all times, so to query the status of a particular file, query each datacenter separately (ms-backup1* hosts will query the eqiad backup and ms-backup2* hosts will query the codfw backup, only).

Recovering files

Example run of restore-media-file for small-scale media recovery

In order to recover one or a few files (e.g. all versions with the same name) from backups, it is possible to use the command line utility restore-media-file, available on the media backup worker hosts (ms-backup*). You should use a server in the same datacenter as the backup (not necessarily the primary DC for Swift), as the script will by default use the local backup storage.

The script's usage is very similar to that of the query script, except that instead of quitting after printing, it will attempt to download the files to the server used to run the recovery.

The script will locate, download and decrypt all chosen files to the local host filesystem after asking for confirmation. Changing directory to /srv/mediabackups before running the command is recommended to avoid disk space issues (there are 200+ GB available there).

The script is expected to upload the files directly to production in the future, but that is not yet available, so a recovery at the moment requires a second, manual upload step to Swift. This script is only intended for small-scale recovery; a different method is planned for larger recovery jobs.

Example (see attached screenshot for a typical session)

SSH to server ms-backup1001 (so the eqiad backup is used) and run restore-media-file as root from the /srv/mediabackups directory:

  • A wiki is asked for: commonswiki
  • A method of searching files: 0 - by title was chosen
  • Parameters: "Crystal 128 yast backup.svg" as title. Spaces in titles will be automatically translated into underscores. If the prefix 'File:' is found on the title, it will be ignored (this is not true for localized versions in other languages). The title will not be automatically capitalized, as collation can vary between PHP versions and between PHP and Python, so use the same capitalization as the original file.
  • 4 files are presented as matching the search criteria, with their details: 1 "public" (the latest version) and 3 "archived" (previous versions). All 4 will be downloaded. Confirmation is required to proceed.
  • If you type "y" for confirmation, the script will attempt to download the files from the backup storage, decrypt them if necessary, and write them to the local filesystem. If there are errors (e.g. a download or decryption fails), they will be logged as such (ERROR), with details of which step couldn't be completed correctly. A total count of successful vs. attempted restores is given at the end. The names of the recovered files in this case will be "Crystal_128_yast_backup.svg", "20100218201009!Crystal_128_yast_backup.svg", "20100218201250!Crystal_128_yast_backup.svg" and "20100218201411!Crystal_128_yast_backup.svg", preserving the original names on production storage where possible. If a name is a duplicate, additional '~' signs will be appended to it (see the sketch after this list).
  • To complete the restore, the backup must be manually uploaded to the production Swift servers. The information in production_container and production_path will likely be useful for that. This has not yet been automated for 2 reasons:
  • Restoration may require additional steps, such as removing existing files or recovering to a different location (this will depend on the kind of restoration wanted), and this needs additional automation that is not yet available
  • Additional coordination with the WMF production media storage maintainers is needed to make sure their feedback is taken into account so that recoveries are done in a safe and fast way (and no accidental recovery, leading to potential data loss, is done).
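
As a side note, the duplicate-name handling mentioned above could look like the following minimal Python sketch (illustrative only, not the script's actual code):

import os

def unique_local_name(name):
    # Append '~' signs until the name no longer collides with a file
    # already present in the recovery directory.
    while os.path.exists(name):
        name += '~'
    return name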

Once those blockers have been solved, a fully automated recovery method will be implemented for both small and large numbers of files. This means that, as of now, it is safe (other than the risk of filling up the ms-backup hosts' disks) to execute random recovery processes for testing purposes, as they won't affect the real media servers.

Deleting files

See also the following section: Batch query, recovery and deletion

Upon request from Trust & Safety, SREs must at the moment (we may want Trust & Safety to be able to self-serve in the future) also delete from the media backups files that have already been deleted from production.

For maximum safety, please consider querying the file first before deletion, to ensure the file about to be deleted is the one expected.

For the deletion, the following actions have to be taken:

  • Trust and Safety has to delete the file from production first (it is important that this happens strictly first, as otherwise the file could be backed up again)
  • The file must disappear from backup storage on both datacenters (the procedure will have to be done twice, as data, metadata and credentials are independent)
  • The reference to the backed-up file has to be deleted from the metadata
  • The file's wiki status has to be marked as hard-deleted to prevent the file from being backed up again by accident

The architecture of the backups was designed with deletion of individual files in mind. In order to simplify those steps, a script, delete-media-file (run it as root), exists on the ms-backup* hosts; it finds the file, deletes it permanently from storage and updates the metadata accordingly, in a single, simple step.

The script follows the same conventions and input as the querying one (see #Querying_files and the example above), with the following differences:

  • By default, the script will run in dry mode, that is, it will do all the steps as if it were really deleting, but it will warn that no actual state change happens. In order to actually delete the file, the script will have to be run with the option: delete-media-file --execute.
  • Before deletion, the script will ask the user to confirm the deletion. When running in non-dry mode, this action cannot be undone! Consider doing a first run in dry mode, or just querying the files first to confirm the deletion targets. An accidental deletion can still be corrected if the file hasn't been deleted from production/Swift, though.
  • A sanity check is performed before the actual deletion, to ensure the file is not publicly available. If the file is seen to be available through public HTTPS requests, rather than returning a 404 Not Found error, the deletion will abort early (see the sketch after this list).
  • Some steps of the deletion can fail individually. For example, a file could be physically deleted while the metadata failed to be deleted or updated. Logs will indicate the error, and the user should judge whether to act manually based on that, or whether it was an expected error (e.g. the metadata was incomplete).
  • Make sure to indicate to Trust & Safety if a file is referenced more than once (e.g. the same file content referred to from multiple names).
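
To illustrate the sanity check mentioned in the list above, here is a minimal Python sketch (hypothetical code, not the script's actual implementation):

import requests

def still_publicly_available(production_url):
    # A file about to be deleted from backups must already be gone from
    # production: anything other than a 404 Not Found means the file is
    # (still) publicly served and the deletion should abort early.
    response = requests.head(production_url, allow_redirects=True, timeout=10)
    return response.status_code != 404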

It is important to remember to perform searches and deletions of files on both datacenters, e.g. ms-backup1001 and ms-backup2001, as they are fully independent for redundancy reasons.

Batch query, recovery and deletion

If a lot of files have to be queried, restored or deleted, doing it one by one can become too lengthy. There is at the moment a single file format that can be used to process multiple files in a single run. The format expected is that of the production deletion log (eraseArchivedFile.php):

jynus@mwmaint2002:~$ mwscript eraseArchivedFile.php --wiki commonswiki --filename "A_deleted_file_upload_name.jpg" --filekey "*" --delete
Purging all thumbnails for file 'A_deleted_file_upload_name.jpg'...done.
Finding deleted versions of file 'A_deleted_file_upload_name.jpg'...
Deleted version 'bskcb87kla1szao983jymuamieolywt.jpg' (20231205153959) of file 'A_deleted_file_upload_name.jpg'
...

It is OK if the file contains extra information or things like extra whitespace; non-relevant lines are ignored.

Deletions will ask for a final confirmation before deleting. The queries will try to match the title, hash and upload date; if 0 or more than 2 matches are found, a warning will be thrown (but by default the script will attempt to delete all matches, even if they don't match the original number of files!).
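
For illustration, parsing that deletion log format could look like the following minimal Python sketch, which matches the 'Deleted version' lines shown above and ignores everything else (an assumption about the parsing, not necessarily how the scripts do it):

import re

# Matches lines such as:
# Deleted version 'bskcb87kla1szao983jymuamieolywt.jpg' (20231205153959) of file 'A_deleted_file_upload_name.jpg'
DELETED_LINE = re.compile(
    r"Deleted version '(?P<storage_name>[^']+)' "
    r"\((?P<upload_date>\d{14})\) of file '(?P<title>[^']+)'"
)

def parse_deletion_log(path):
    # Yield one dict per deleted version; non-relevant lines are ignored.
    with open(path, encoding="utf-8") as log:
        for line in log:
            match = DELETED_LINE.search(line)
            if match:
                yield match.groupdict()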

This deletion log should be saved to a UTF-8 text file and sent (e.g. with scp) to a worker host (e.g. ms-backup1001.eqiad.wmnet).

Then execute it with one of the three commands mentioned above, like this:

# query-media-file files_to_delete.txt

(change query-media-file to delete-media-file or restore-media-file, depending on the action needed).

By default, the script will run in dry mode, that is, it will do all the steps as if it were really deleting, but it will warn that no actual state change happens. In order to actually delete all files and update the metadata, the script will have to be run with the flag: --execute <files-to-delete-txt>
Future formats for file lists, such as CSV, may be implemented at a later time.

Logs

All queries, recoveries and deletions are logged to disk, in addition to screen. To monitor previous actions taken check for logs available at /var/log/mediabackups:

root@ms-backup1001:~$ ls -lha /var/log/mediabackups
total 48K
drwxr-xr-x  2 mediabackup mediabackup 4.0K Jun 29 12:31 .
drwxr-xr-x 35 root        root        4.0K Jun 29 00:00 ..
-rw-r--r--  1 mediabackup root          78 Jun 29 12:31 deletion.log
-rw-r--r--  1 mediabackup root         158 Jun 30 09:03 query.log
-rw-r--r--  1 mediabackup root         32K Jun 29 11:10 recovery.log

File search methods

  1. Title of the file on upload or after rename: The title of the file (img_name, oi_name, fa_name), which will usually be the same as the File page title, without the 'File:' prefix (if that prefix is detected, it will be ignored). Whitespaces will be converted into underscores, as that is how it is managed internally by MediaWiki (see the sketch after this list). It will match all files (current, old or deleted) that were uploaded with that name (or were later renamed to it). Examples: Crystal_128_yast_backup.svg, Moon.webm
  2. sha1sum hash of the file contents, in hexadecimal: The sha1sum of the file contents, a 40-character hexadecimal string, e.g.: 44f60bd0d070b9c16260e95483a6b50467b3ef20
  3. sha1sum hash of the file contents, in MediaWiki's base 36: alternative format for the file contents sha1sum, used internally by the MediaWiki database. E.g.: 81zu1g0k850cbxc0a85yr4vj646wyzk
  4. Original container name and full path as stored on Swift: a couple of parameters, the name of the Swift container and the path when it was backed up, entered separately, e.g. wikipedia-commons-local-public.5c (press enter) and 5/5c/Crystal_128_yast_backup.svg. This is useful when the file has been lost from Swift, but its equivalent name on MediaWiki is unknown.
  5. sha256sum hash of the file contents, in hexadecimal: a 64-character string with the sha256 hash of the file contents. Useful to check duplicates among wikis, or whether the original file is available (e.g. when comparing between datacenters), as it is normally not recorded by MediaWiki. E.g.: 7d45a30698051aa58b18c6728a1a07633b6e8eab971821f96847609492267856
  6. Exact date of the original file upload, as registered in the metadata: the exact timestamp when the file was uploaded, normally available for all files, independently of their current status (so also for archived & deleted files). It will be accepted in ISO or WMF date string format, like this: 2010-02-18 20:14:11 or 20100218201411
  7. Exact date of the latest file archival, as registered in the metadata: the exact timestamp when the file was archived (a new file version of the same name was uploaded); only available on files that were not the latest version. Same format as the upload date.
  8. Exact date of the latest file deletion, as registered in the metadata: the exact timestamp when the file was deleted; only available on files that were soft-deleted, being the latest or an older version. Same format as the upload date.
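
The following minimal Python sketch illustrates the title normalization of method 1 and the base-36 hash format of method 3 (illustrative; MediaWiki zero-pads the base-36 sha1 to 31 characters):

import hashlib

def normalize_title(title):
    # Strip an English 'File:' prefix, if present, and store spaces as
    # underscores, which is how MediaWiki manages titles internally.
    if title.startswith("File:"):
        title = title[len("File:"):]
    return title.replace(" ", "_")

def sha1_base36(contents):
    # MediaWiki stores the sha1 of file contents in base 36, zero-padded
    # to 31 characters, e.g. 81zu1g0k850cbxc0a85yr4vj646wyzk
    number = int(hashlib.sha1(contents).hexdigest(), 16)
    digits = "0123456789abcdefghijklmnopqrstuvwxyz"
    encoded = ""
    while number:
        number, remainder = divmod(number, 36)
        encoded = digits[remainder] + encoded
    return encoded.rjust(31, "0")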

File properties explanation

  • wiki: The name of the wiki as it appears on dblists (not the swift container name): commonswiki, enwiki, frwikivoyage, ...
  • title: The internal name of the file as it appears in the img_name, oi_name and fa_name columns of the image, oldimage and filearchive tables. It normally corresponds to the file page title, except without the File: prefix and with underscores instead of whitespaces.
  • production_container: The Swift container name where it is hosted on production at the time of the backup. For large wikis it can be hashed. E.g.: wikipedia-commons-local-public.5c
  • production_path: the Swift path within the container where it is hosted on production. E.g.: 5/5c/Crystal_128_yast_backup.svg
  • sha1: the sha1 hash of the file, in hexadecimal. E.g.: 44f60bd0d070b9c16260e95483a6b50467b3ef20
  • sha256: the sha256 hash of the file, in hexadecimal. E.g.: 7d45a30698051aa58b18c6728a1a07633b6e8eab971821f96847609492267856
  • size: size in bytes of the file. E.g.: 37845
  • production_status: public if the file is the latest version of a non-deleted file (note that private wikis will show the latest versions as public even if the files are not available to the public); archived if it is an older version of a non-deleted file (note that hidden revisions will show as archived even if they are not available to the public); deleted if it has been soft-deleted (latest version or not); hard-deleted if it has been fully removed from WMF infrastructure (many of those will show as not found, as they have only started to be tracked since around 2022).
  • production_url: URL of the file as publicly available on Wikimedia infrastructure (an upload.wikimedia.org link). Only available for public and archived files; deleted ones will print "None", as they are not publicly downloadable.
  • upload_date: timestamp of when the file was first uploaded, in ISO format. E.g.: 2010-02-18 20:14:11
  • archive_date: timestamp of when the file was overridden by a newer file with the same title (only for archived and deleted files that were not the latest version), in ISO format.
  • delete_date: timestamp of when the file was soft-deleted (only for deleted files), in ISO format.
  • backup_status: one of pending (detected but not yet backed up), processing (inside a batch that is being backed up at this moment), backedup (backup completed, available for restore), duplicate (found to have been previously backed up in a previous process, available for restore), and error (backup process could NOT complete because of some error).
  • backup_date: timestamp of when the file was successfully backed up for the first time (not counting duplicates), in iso format. E.g.: 2021-09-06 06:00:22
  • backup_location: the backup server (and port) where the file is stored. Will be one of backup[12]00[4567]. E.g.: https://backup1005.eqiad.wmnet:9000. At the moment, backup*004 contains all wiki files whose hashes start with the characters 0 to 3; backup*005: 4-7; backup*006: 8-b; backup*007: c-f.
  • backup_container: container/bucket in minio used for backups. Should be mediabackups for all (some storage technologies have issues with more than 1024 buckets)
  • backup_path: path within the backup server where the file is stored. At the moment, the method used is <wiki>/<first 3 sha256 hash characters>/<sha256>[.age] (see the sketch below). E.g.: commonswiki/7d4/7d45a30698051aa58b18c6728a1a07633b6e8eab971821f96847609492267856. The wiki is the name of the project as it appears on dblists. Encrypted files (those from non-public wikis) are encrypted using age and have that extension added to avoid mistakes. The sha256sum refers to the contents before encryption.
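
A minimal Python sketch of the backup_path scheme just described (illustrative):

def backup_path(wiki, sha256_hex, encrypted=False):
    # <wiki>/<first 3 sha256 hash characters>/<sha256>, with an '.age'
    # extension appended for encrypted (non-public wiki) files, e.g.
    # backup_path("commonswiki", "7d45a306...") -> "commonswiki/7d4/7d45a306..."
    path = wiki + "/" + sha256_hex[:3] + "/" + sha256_hex
    return path + ".age" if encrypted else path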

How to access the web UI of minio

Minio login screen

The integrated web client of minio, while simple, makes it easy to manage, list and upload/download files to the backend with a more user-friendly interface.

minio access is firewalled: its service ports are only open to the backup workers and to Prometheus (for metrics gathering) from the same datacenter. To gain access, one needs to tunnel HTTPS on port 9001 to a local port with SSH through a server with access (e.g. a worker server from the same datacenter).

For example:

ssh -L 1234:backup1004.eqiad.wmnet:9001 ms-backup1001.eqiad.wmnet

Change 1234 to any port available for listening on your local machine.

This will tunnel the minio service to the local port 1234 through ms-backup1001.

For the actual active worker and storage servers, consult hieradata.

Minio file browser

Then go to your browser and open https://localhost:1234. The https is important, as non-TLS traffic is not allowed.

Your browser will complain about an untrusted TLS certificate; this is because minio uses the discovery CA, which is only deployed to the WMF production cluster. Either install it on your client PC or click "Accept risk and continue".

A login screen should appear. Credentials are deployed on the worker servers at /etc/mediabackup/mediabackups_recovery.conf. They are also available on private-puppet:hieradata/common.yaml and private-puppet:hieradata/{eqiad,codfw}.yaml.

Do not let your browser remember the credentials, if prompted to do so.

It is highly recommended to use the credentials used for backup recovery (not backup generation), as those are read-only. If writing is needed, it is more reliable to use the command line client (mc), unless you know what you are doing.

After logging in, a browser screen should appear, allowing you to navigate the file structure, download files, etc. The minio browser will not disable the options for deleting or uploading when using read-only credentials; it will only fail to perform them.

Software and WMF deployment and configuration

Further reading