Dumps/SQL-XML Dumps/Swapping NFS servers
Sometimes you may want to swap the primary and fallback NFS servers for the XML/SQL dumps. If they are both operational, here is the procedure used to do it.
- Make sure dumps are idle and that all data from /data/xmldatadumps/public is current on the fallback server
- Stop puppet on both hosts
- Stop the dumps-rsyncer service on the primary host (systemctl stop dumps-rsyncer.service)
- Check for any /usr/local/bin/rsync..*sh process on the primary and shoot it if there is one.
- Remove the unit file for the rsyncer service outright: it's likely in /usr/lib/systemd rather than in /etc/systemd; a find will turn it up
- Rsync the /data/xmldatadumps/private directory from the primary to the fallback host
  - You might do a dry run first, doing something like
    rsync -av --dry-run --itemize-changes /data/xmldatadumps/private/ dumpsdata1001.eqiad.wmnet::data/xmldatadumps/private/
    (adjust for hostnames, check that the path is still right, etc)
- Rsync the /data/temp directory (used for testing) to the fallback host; you will need to edit /etc/rsyncd.conf on the fallback host to not exclude **temp
  - As with the private directory, you may want to do a dry run first to make sure your paths are right.
- Disable puppet on the xml/sql snapshot hosts (NOT the one running misc dumps)
- Stop the dumps-monitor service on the snapshot running it (systemctl stop dumps-monitor.service)
- Swap the role of the primary and secondary in puppet:
  - Swap the hiera/host yaml files
  - Swap the settings for the two hosts in hieradata/common.yaml and profile/dumps.yaml, making one the nfs server and one an internal server to get copies
  - Do a grep to make sure there's nothing else naming the two servers in the puppet repo that you might need to swap around
  - Comment out the cron jobs for the root and dumpsgen users on the two hosts (puppet will put them back correctly), OTHERWISE YOU MAY HAVE DUPS, VERY BAD
  - Remove /data/xmldatadumps/public/dumpstatusfiles.tar.gz on both dumpsdata hosts.
  - Do a permissions change on the new primary, just in case an unnoticed interrupted rsync left things in a weird state:
    - from /data/xmldatadumps/public dir, chown -R dumpsgen:dumpsgen *wik*
    - from the same dir, chmod a+r .
- Enable puppet on the old primary, run it, make sure it does not start up the rsyncer shell script and that the cron jobs look right (compare to the ones you commented out on the fallback)
- Enable puppet on the old fallback, run it, make sure the rsyncer shell script is started and that the cron jobs look right (compare to the ones you commented out on the old primary)
- Fix up the snapshot hosts:
  - Make sure no job of the dumpsgen user is running on the snapshot
  - Umount the /mnt/dumpsdata nfs share
  - Enable and run puppet
  - Make sure that /mnt/dumpsdata now comes from the new primary host
  - On the snapshot with the monitor, make sure the monitor has started up
- Finally: on the snapshot testbed, try a test run of a small wiki, for example (as the dumpsgen user, make sure you use the TESTS configfile):
python3 ./worker.py --configfile /etc/dumps/confs/wikidump.conf.tests --exclusive --log dewikiversity
- Update Dumps/Dumpsdata hosts with the new server info
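
For the rsync dry-run steps in the procedure, it can help to count how many files rsync still intends to transfer instead of eyeballing the output. The following is a minimal sketch and not part of the documented procedure: count_file_transfers is a hypothetical helper name, and it assumes the --itemize-changes output format described in the rsync man page, where a line starting with < or > followed by f marks a regular file that would be sent or received.

```shell
# Hypothetical helper, not part of the dumps tooling.
# Reads "rsync --dry-run --itemize-changes" output on stdin and prints
# the number of regular files that rsync would transfer.
# Header/footer lines, directory updates and deletions are not counted.
count_file_transfers() {
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so "|| true" keeps the function's exit status clean.
    grep -c '^[<>]f' || true
}
```

One possible use is to pipe the dry run from the procedure into it, for example: rsync -av --dry-run --itemize-changes /data/xmldatadumps/private/ dumpsdata1001.eqiad.wmnet::data/xmldatadumps/private/ | count_file_transfers (adjust the hostname and path as before). A result of 0 only says that no regular files are pending; it does not replace reading the full dry-run output.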
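
Two of the checks in the procedure, that /mnt/dumpsdata comes from the new primary and that the cron jobs are not duplicated, can be scripted roughly as follows. This is a sketch under stated assumptions, not the way the dumps maintainers necessarily do it: all three function names are hypothetical, findmnt (from util-linux) is assumed to be installed on the snapshot hosts, and the NFS mount source is assumed to have the usual host:/path form.

```shell
# Hypothetical helpers, not part of the dumps tooling.

# Print the server part of an NFS mount source such as "host:/some/path".
nfs_source_host() {
    printf '%s\n' "${1%%:*}"
}

# Succeed only if the given mount point is currently served by the
# expected host. Assumes findmnt is available.
mount_comes_from() {
    mount_source="$(findmnt -n -o SOURCE "$1")" || return 1
    [ "$(nfs_source_host "$mount_source")" = "$2" ]
}

# Read a crontab on stdin and print every active entry that appears more
# than once. Comment lines and blank lines are ignored, so jobs that were
# commented out before the swap do not count as duplicates.
duplicate_cron_lines() {
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' | sort | uniq -d
}
```

On a snapshot host, mount_comes_from /mnt/dumpsdata dumpsdata1001.eqiad.wmnet would succeed only if that host (an example name, substitute the new primary) serves the share. On a dumpsdata host, crontab -l -u dumpsgen | duplicate_cron_lines, run as root, should print nothing if puppet restored the cron jobs without duplicates.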