Dumps/Testing

Unit tests

MediaWiki dump maintenance scripts can be tested to some degree via unit and/or integration tests in MediaWiki core or the appropriate extension. See https://github.com/wikimedia/mediawiki/blob/master/tests/phpunit/maintenance/backup_PageTest.php for one such test. New scripts should have tests added along with the new code.

The Python SQL/XML dump scripts have some unit (or perhaps really integration) tests, which can be run from the xmldumps-backup subdirectory of the dumps repo. See https://github.com/wikimedia/operations-dumps/blob/master/xmldumps-backup/test/all_test.sh for those.

Deployment-prep testing

Beyond unit tests, dumps should be tested on a snapshot instance in the deployment-prep project. Currently the test instance is deployment-snapshot03.deployment-prep.eqiad1.wikimedia.cloud. All jobs should be run as the dumpsgen user. Configuration files for the deployment-prep setup are found in /etc/dumps/confs/ and have the extension ".labs". For example, if you are testing something other than the SQL/XML dumps, you would want the file "wikidump.conf.other.labs".
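As a quick way to see which deployment-prep configuration files are available, the directory can be listed and filtered by the ".labs" extension. The following is a minimal sketch; the function name labs_configs is hypothetical and is not part of the dumps repository.

```python
from pathlib import Path


def labs_configs(conf_dir="/etc/dumps/confs"):
    """Return the sorted names of regular files in conf_dir ending in '.labs'.

    Hypothetical helper for illustration only. Returns an empty list if the
    directory does not exist (for example, when run off the test instance).
    """
    directory = Path(conf_dir)
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.glob("*.labs") if p.is_file())


if __name__ == "__main__":
    for name in labs_configs():
        print(name)
```

On the test instance, the printed list should include the "wikidump.conf.other.labs" file mentioned above.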

To access deployment-prep you'll need to go via the bastion; see Help:Accessing Cloud VPS instances#Accessing Cloud VPS instances.

SQL/XML dumps

If you are testing SQL/XML dumps, the dumps repository is at /srv/deployment/dumps/dumps, just as in production. Work from the xmldumps-backup subdirectory, where all of the Python scripts are located.

Other dumps

If you are testing one of the other dumps, you will likely want to run its maintenance script manually first, from the directory /srv/mediawiki/php-master/. Use MWScript.php to invoke it, for example:

/usr/bin/php /srv/mediawiki/multiversion/MWScript.php maintenance/exportSites.php --wiki=commonswiki --help
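If you find yourself repeating the invocation above for several scripts or wikis, it can be assembled programmatically. This is only a sketch: the paths are the ones given on this page, while the helper names mwscript_command and run_mwscript are hypothetical. Run it as the dumpsgen user, as with any other job on the instance.

```python
import shlex
import subprocess

# Paths as given on this page for the deployment-prep snapshot instance.
PHP = "/usr/bin/php"
MWSCRIPT = "/srv/mediawiki/multiversion/MWScript.php"
MW_DIR = "/srv/mediawiki/php-master"


def mwscript_command(script, wiki, *extra_args):
    """Build the argument list for running a maintenance script via MWScript.php."""
    return [PHP, MWSCRIPT, script, f"--wiki={wiki}", *extra_args]


def run_mwscript(script, wiki, *extra_args):
    """Run the maintenance script from MW_DIR; returns a CompletedProcess.

    Hypothetical wrapper: it does not raise on a non-zero exit status, so
    check the returncode attribute of the result yourself.
    """
    cmd = mwscript_command(script, wiki, *extra_args)
    return subprocess.run(cmd, cwd=MW_DIR, check=False)


if __name__ == "__main__":
    # Only print the command here; call run_mwscript() to actually execute it.
    cmd = mwscript_command("maintenance/exportSites.php", "commonswiki", "--help")
    print(shlex.join(cmd))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when a script takes arguments containing spaces.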

Once the maintenance script runs correctly by hand, test the bash script that runs it from cron. These scripts typically reside in /usr/local/bin, where you can see many examples such as dumpcontentxlation.sh.

Output from the dumps

Output is written to directories in /mnt/dumpsdata/xmldatadumps/public for the SQL/XML dumps, and in /mnt/dumpsdata/otherdumps for the other dumps.

Please keep an eye on the free space on the instance; MediaWiki and all of its l10n files live on the same filesystem as the dumps. If you see that your testing is close to filling up the disk, remove some of the older dump run directories, making sure that at least one complete run for each wiki is left intact for prefetch purposes.
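The space check and the choice of which old runs to remove can be sketched as follows. This assumes, and it is only an assumption, that each wiki has its own directory containing one subdirectory per run named by date in YYYYMMDD form. The function names are hypothetical, nothing is deleted, and the sketch does not verify that the runs it keeps are complete, so check that by hand before removing anything.

```python
import re
import shutil
from pathlib import Path

# Assumed run-directory naming: eight digits, e.g. 20240101.
DATE_DIR = re.compile(r"^\d{8}$")


def disk_usage_fraction(path):
    """Return the used fraction (0.0 to 1.0) of the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def removal_candidates(wiki_dir, keep=1):
    """List date-named run directories under wiki_dir, oldest first,
    excluding the newest `keep` runs.

    Nothing is deleted. The newest run is not necessarily a complete one,
    so raise `keep` or inspect the runs manually before removing any.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    runs = sorted(
        (p for p in Path(wiki_dir).iterdir()
         if p.is_dir() and DATE_DIR.match(p.name)),
        key=lambda p: p.name,
    )
    return runs[:-keep]
```

For instance, printing disk_usage_fraction("/mnt/dumpsdata") before and after a test run shows how much room is left, and removal_candidates() on one wiki's directory gives a list to review before deleting anything.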