Decom script

From Wikitech

A real-life example of how to decommission a host using the current method: a Spicerack cookbook that replaced "wmf-decommission-host".

1) SSH to one of the cumin masters: cumin1002.eqiad.wmnet or cumin2002.codfw.wmnet

2) Run the cookbook as a dry run first, for example:

sudo cookbook -d sre.hosts.decommission analytics1032.eqiad.wmnet -t T233080

3) Replace the host name and ticket ID with your own, and remove the "-d" to actually run it.
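
The difference between steps 2 and 3 is only the "-d" flag. As a minimal sketch, the two command lines can be built from the same host name and ticket ID. The function name decom_cmd and its "dry-run"/"run" mode argument are made up for this illustration and are not part of Spicerack. The helper only prints the command, it does not execute it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Spicerack): print the cookbook command
# for a given mode, host and Phabricator task.
#   $1: "dry-run" adds -d, any other value omits it
#   $2: FQDN of the host to decommission
#   $3: Phabricator task ID, e.g. T233080
decom_cmd() {
    local mode="$1" host="$2" task="$3"
    if [ "$mode" = "dry-run" ]; then
        printf 'sudo cookbook -d sre.hosts.decommission %s -t %s\n' "$host" "$task"
    else
        printf 'sudo cookbook sre.hosts.decommission %s -t %s\n' "$host" "$task"
    fi
}

# Print the dry-run command first, then the real one, for the example host.
decom_cmd dry-run analytics1032.eqiad.wmnet T233080
decom_cmd run analytics1032.eqiad.wmnet T233080
```

Printing rather than executing keeps the destructive command a deliberate manual step; copy the second line only after the dry run looks correct.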

Example output of the dry run:

elukey@cumin1001:~$ sudo cookbook -d sre.hosts.decommission analytics1032.eqiad.wmnet -t T233080
DRY-RUN: Executing cookbook sre.hosts.decommission with args: ['analytics1032.eqiad.wmnet', '-t', 'T233080']
DRY-RUN: START - Cookbook sre.hosts.decommission
ATTENTION: destructive action for 1 hosts: analytics1032.eqiad.wmnet
Are you sure to proceed?
Type "done" to proceed
> done
DRY-RUN: Resolved CNAME record for icinga.wikimedia.org: icinga.wikimedia.org. 300 IN CNAME icinga1001.wikimedia.org.
DRY-RUN: MGMT_PASSWORD environment variable not found
Management Password:
DRY-RUN: Scheduling downtime on Icinga server icinga1001.wikimedia.org for hosts: ['analytics1032.eqiad.wmnet']
DRY-RUN: Executing commands ['icinga-downtime -h "analytics1032" -d 14400 -r "Host decommission - elukey@cumin1001 - T233080"'] on 1 hosts: icinga1001.wikimedia.org
DRY-RUN: Downtimed host on Icinga
DRY-RUN: Resolved A record for analytics1032.mgmt.eqiad.wmnet: analytics1032.mgmt.eqiad.wmnet. 3600 IN A 10.65.3.215
DRY-RUN: Management FQDN for analytics1032.eqiad.wmnet is analytics1032.mgmt.eqiad.wmnet
DRY-RUN: Scheduling downtime on Icinga server icinga1001.wikimedia.org for hosts: ['analytics1032.mgmt.eqiad.wmnet']
DRY-RUN: Executing commands ['icinga-downtime -h "analytics1032" -d 14400 -r "Host decommission - elukey@cumin1001 - T233080"'] on 1 hosts: icinga1001.wikimedia.org
DRY-RUN: Downtimed management interface on Icinga
DRY-RUN: Executing commands ['true'] on 1 hosts: analytics1032.eqiad.wmnet
DRY-RUN: Executing commands ["lsblk --all --output 'NAME,TYPE' --paths | awk '/^\\/.* disk$/{ print $1 }' | xargs -I % bash -c '/sbin/wipefs --all --force %*'"] on 1 hosts: analytics1032.eqiad.wmnet
DRY-RUN: Wiped bootloaders
DRY-RUN: Running IPMI command: ipmitool -I lanplus -H analytics1032.mgmt.eqiad.wmnet -U root -E chassis power off
DRY-RUN: Powered off
DRY-RUN: skipping host status write due to dry-run mode for analytics1032 Active -> Decommissioning
DRY-RUN: Set Netbox status to Decommissioning
DRY-RUN: Skip removing host analytics1032.eqiad.wmnet from Debmonitor in DRY-RUN
DRY-RUN: Removed from DebMonitor
DRY-RUN: Executing commands ['puppet node clean analytics1032.eqiad.wmnet', 'puppet node deactivate analytics1032.eqiad.wmnet'] on 1 hosts: puppetmaster1001.eqiad.wmnet
DRY-RUN: Removed from Puppet master and PuppetDB
DRY-RUN: Skip updating Phabricator task T233080 in DRY-RUN with comment: cookbooks.sre.hosts.decommission executed by elukey@cumin1001 for hosts: `analytics1032.eqiad.wmnet`
-  analytics1032.eqiad.wmnet (**PASS**)
  - Downtimed host on Icinga
  - Downtimed management interface on Icinga
  - Wiped bootloaders
  - Powered off
  - Set Netbox status to Decommissioning
  - Removed from DebMonitor
  - Removed from Puppet master and PuppetDB
DRY-RUN: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)

Example output of the actual run:

[cumin1001:~] $ sudo cookbook sre.hosts.decommission labtestcontrol2001.wikimedia.org -t T218021
START - Cookbook sre.hosts.decommission
Scheduling downtime on Icinga server icinga1001.wikimedia.org for hosts: ['labtestcontrol2001.wikimedia.org']
Scheduling downtime on Icinga server icinga1001.wikimedia.org for hosts: ['labtestcontrol2001.mgmt.codfw.wmnet']
Removed host labtestcontrol2001.wikimedia.org from Debmonitor
Updated Phabricator task T218021
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
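
Both example outputs end with an "END (PASS) ... (exit_code=0)" line. If you save the cookbook output to a file, a small check like the following sketch can confirm that line is present. The function name cookbook_passed and the idea of saving the output to a file are assumptions for illustration; the demo writes its own two-line sample copied from the run above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the saved output ends a line with the
# END (PASS) marker shown in the examples. The pattern is not anchored at the
# start of the line, so the "DRY-RUN: " prefixed form matches too.
#   $1: path to a file holding the saved cookbook output (assumed)
cookbook_passed() {
    grep -Eq 'END \(PASS\) - Cookbook sre\.hosts\.decommission \(exit_code=0\)$' "$1"
}

# Demo with the first and last lines of the actual run shown above.
log="$(mktemp)"
printf '%s\n' \
    'START - Cookbook sre.hosts.decommission' \
    'END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)' > "$log"

if cookbook_passed "$log"; then
    echo "decommission cookbook reported PASS"
else
    echo "no PASS line found; check the output by hand"
fi
rm -f "$log"
```

This only looks for the success marker; it says nothing about what a failed run prints, so treat a missing match as a prompt to read the full output.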

You should see logmsgbot and stashbot report the run on #wikimedia-operations, and your Phabricator ticket should be updated automatically.

For an example of what the result looks like on a Phabricator ticket, see https://phabricator.wikimedia.org/T218021#5107910

See also: https://doc.wikimedia.org/spicerack/master/cookbook.html