Phabricator/Disaster Recovery

Disaster recovery plan for phabricator.wikimedia.org

For reference, here is a list of patches required to migrate to a new Phabricator server: https://gerrit.wikimedia.org/r/q/topic:phab1002

And here is a more recent migration from phab1001 to phab1003: https://gerrit.wikimedia.org/r/q/%2523phabricator-server-switch

High level overview

  • Switch Phabricator production to phab2001 (in codfw)
    • verify that git-ssh is working on phab2001
    • shut down phd on phab1001
    • manually rsync the git data from phab1001 to phab2001 (the data is already rsynced periodically, so only a one-time refresh is needed)
    • verify that phd works on phab2001
    • switch `phabricator_active_server` to `phab2001` in hiera, `role/common/phabricator/main.yaml`
    • test Phabricator's web interface by locally overriding DNS records to point to phab2001
  • Update DNS / redirect traffic to phab2001
    • update SPF records for the Phabricator hostname (example)
  • Make phab1001 a warm standby for phab2001
    • verify that everything is installed correctly
    • manually rsync the git repositories
    • make sure the rsync cron job is set up correctly to sync from phab2001
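
Two of the manual steps above (the one-time rsync refresh and the web interface test with a local DNS override) can be sketched as a small Python helper that only builds the commands and prints them for review. This is a rough illustration, not the procedure actually used in production: the repository path `/srv/repos/`, the address `192.0.2.10`, the use of rsync over SSH, and the function names are all assumptions made for the example, since this page names the hosts but not those details.

```python
"""Sketch: build (not run) the commands for two manual steps in the overview."""
import shlex

# Hypothetical values: this page names the hosts but not the repository
# path or any IP address, so these are placeholders to replace.
REPO_PATH = "/srv/repos/"      # assumed location of the git data
NEW_SERVER_IP = "192.0.2.10"   # documentation-range placeholder address
SITE = "phabricator.wikimedia.org"


def rsync_refresh_cmd(source_host, repo_path=REPO_PATH, dry_run=True):
    """Return an rsync argv that pulls repo_path from source_host.

    Meant to be run on the destination server. --dry-run is on by
    default so the transfer can be reviewed before anything is written.
    """
    cmd = ["rsync", "--archive", "--itemize-changes"]
    if dry_run:
        cmd.append("--dry-run")
    # A trailing slash makes rsync copy the directory contents,
    # not the directory itself.
    src = repo_path.rstrip("/") + "/"
    cmd += [f"{source_host}:{src}", src]
    return cmd


def web_check_cmd(ip=NEW_SERVER_IP, site=SITE):
    """Return a curl argv that requests the site from a specific address.

    curl's --resolve pins host:port to an address for this one request,
    which overrides DNS locally without editing /etc/hosts.
    """
    return [
        "curl", "--silent", "--show-error", "--head",
        "--resolve", f"{site}:443:{ip}",
        f"https://{site}/",
    ]


if __name__ == "__main__":
    # Print the commands so they can be inspected before being run by hand.
    for argv in (rsync_refresh_cmd("phab1001"), web_check_cmd()):
        print(shlex.join(argv))
```

Building the commands as argument lists keeps the dry run as the default and leaves the decision to execute anything with the operator. Replace the placeholder path and address with the real values before use.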

See Also