Switch Datacenter/DeploymentServer

From Wikitech

This page describes the procedure to switch over the active deployment server from one host to another.

This is suitable for use either as part of the regularly scheduled datacenter switchover or more generally when replacing the active deployment server with a newer one.

In either case, schedule an exclusive window of at least one hour in the Deployments calendar and coordinate with potential deployers.

Procedure

  • Disable puppet on deployment servers. From a cluster-management host:

sudo cumin 'A:deployment-servers' 'disable-puppet "Deployment server switchover - TXXXXXX"'

  • Merge a DNS change that points the deployment.eqiad.wmnet CNAME record to the new active host (see this change for an example).
  • Change the deployment_server and scap::deployment_server variables in hiera (see this change for an example).
  • Run puppet on the new active deployment server: sudo run-puppet-agent -e "Deployment server switchover - TXXXXXX"
  • Run puppet on all the other servers (same command).
  • Using sudo cumin 'A:deployment-servers' 'grep block_deployments /etc/scap.cfg' (on a cluster-management host), verify that:
    • Deployments on the previously active server are blocked
    • Deployments on the newly active server are not blocked
  • The keyholder configuration may have changed without being reloaded while the newly active deployment server was a spare, so restart the keyholder service there and test it:
$ sudo systemctl restart keyholder-proxy.service
$ SSH_AUTH_SOCK=/run/keyholder/proxy.sock ssh -i /etc/keyholder.d/deploy_jenkins -l deploy-jenkins releases1002.eqiad.wmnet
  • Workaround for task T197470: Run the following on all deployment servers after replacing the deployment server URLs accordingly. For example, if switching from deploy1003 to deploy2002:
$ sudo -i
# find /srv/deployment -name DEPLOY_HEAD -print0 | xargs -0 sed -i 's/git_server: deploy1003\.eqiad\.wmnet/git_server: deploy2002.codfw.wmnet/'
  • Test a scap deployment, noting that this may take quite some time on the first attempt: scap sync-world "Test deployment to validate deployment server switchover - TXXXXXX". This will also test helmfile deployments.
  • Email ops@ about the switch of active deployment server and update any IRC channels where the ongoing work is being tracked.
  • Update Deployment_server to reflect the change in active server.
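
The DNS change and the T197470 workaround above can be spot-checked with a small shell sketch. The function names cname_points_to and stale_deploy_heads are hypothetical helpers, not part of scap or any existing tooling. The sketch assumes that dig is installed and that DEPLOY_HEAD files contain a "git_server: <host>" line, as the sed command in the workaround implies.

```shell
#!/bin/sh
# Hypothetical helpers for checking a switchover; not part of scap.

# Succeed if NAME is a CNAME whose target is EXPECTED.
cname_points_to() {
    name=$1
    expected=$2
    # dig +short prints the CNAME target with a trailing dot.
    target=$(dig +short "$name" CNAME)
    # Strip the trailing dot before comparing.
    [ "${target%.}" = "$expected" ]
}

# Print every DEPLOY_HEAD file under ROOT that still references OLD_HOST.
stale_deploy_heads() {
    root=$1
    old_host=$2
    # -F matches the host name as a fixed string, so dots are literal.
    # -l prints only the names of matching files.
    find "$root" -type f -name DEPLOY_HEAD \
        -exec grep -lF "git_server: $old_host" {} +
}
```

For the deploy1003 to deploy2002 example, running cname_points_to deployment.eqiad.wmnet deploy2002.codfw.wmnet should succeed once the DNS change has propagated. Running stale_deploy_heads /srv/deployment deploy1003.eqiad.wmnet should print nothing once the workaround has been applied on that host.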