External storage/Maintenance

From Wikitech

How to safely perform maintenance on external storage (ES) boxes.

ES boxes are Apaches running an instance of MySQL for some simple blob storage. A couple of things you need to know:

  1. They come in clusters.
    You can check /h/w/php-1.5/db.php for cluster<->machine assignments
  2. Each cluster has a master and one or more slaves.
    The master is the first to appear in the cluster's list in db.php... but for some perverse reason it is usually the highest-numbered server (e.g., srv146 master, srv145 and srv144 slaves)
    It should always be safe to take down a slave for maintenance.
    For older, read-only clusters, taking the master down is also safe. Reads will fail-over to the slaves.
  3. Only the last couple of clusters are active for writes.
    These are listed in $wgDefaultExternalStore at the end of db.php
    If you're going to shut down the master of one of these clusters, you should temporarily remove that cluster from $wgDefaultExternalStore first; otherwise some page saves will fail while it's down.
  4. If MySQL doesn't automatically start when you reboot the machine, punch it manually!
    /etc/init.d/mysqld start should usually do it.
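
The rules in the list above can be sketched as a small lookup. This is only an illustration, not a tool that exists on the cluster: the cluster names, the second cluster's hostnames, the dictionary layout and the function name are all assumptions, and the real assignments should always be read from db.php.

```python
# Hypothetical data for illustration only. The real cluster<->machine
# assignments live in db.php, and its actual layout may differ.
CLUSTERS = {
    # The first host in each list is treated as the master.
    "cluster1": ["srv146", "srv145", "srv144"],
    "cluster2": ["srv149", "srv148", "srv147"],
}

# Stand-in for $wgDefaultExternalStore: clusters currently active for writes.
WRITE_CLUSTERS = ["cluster2"]


def maintenance_advice(host, clusters=CLUSTERS, write_clusters=WRITE_CLUSTERS):
    """Return a short note on what to check before taking `host` down."""
    for name, hosts in clusters.items():
        if host not in hosts:
            continue
        if host != hosts[0]:
            # Slaves can always be taken down.
            return f"{host} is a slave in {name}: safe to take down."
        if name in write_clusters:
            # Master of a cluster that is active for writes.
            return (f"{host} is the master of write cluster {name}: remove "
                    f"{name} from $wgDefaultExternalStore first.")
        # Master of an older, read-only cluster.
        return (f"{host} is the master of read-only cluster {name}: safe, "
                f"reads fail over to the slaves.")
    return f"{host} is not listed in any cluster."


if __name__ == "__main__":
    for h in ("srv144", "srv146", "srv149"):
        print(maintenance_advice(h))
```

The sketch only encodes the decision rules; it does not touch MySQL or edit any configuration.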

In rarer cases it may be necessary to fix replication or re-clone databases to replace a dead slave. These exercises are left to the reader.