Server Lifecycle/Reimage

From Wikitech

Reimaging is the clean installation of the OS (operating system) on a host. It can be either the first installation on a new host (imaging) or the upgrade of an existing host to a newer OS version (reimaging). During the reimage process all data on the host is lost, unless a partman recipe that retains the data in specific partitions has been configured (see Example).

Physical hosts

For physical hosts the reimage process is performed through the sre.hosts.reimage cookbook. The --os option is mandatory and specifies which OS version should be installed. When asked for the Management password, find it in the management file on pwstore.

How to run it

For general information on how to run a Cookbook see Spicerack/Cookbooks#Run_a_single_Cookbook.

Check available options

sudo cookbook sre.hosts.reimage -h

Reimage of a generic host that doesn't need any specific options

sudo cookbook sre.hosts.reimage --os bullseye -t T12345 somehost1001

Reimage of a generic host behind LVS

sudo cookbook sre.hosts.reimage --os bullseye --conftool -t T12345 somehost1001

Reimage of a MediaWiki host

sudo cookbook sre.hosts.reimage --os bullseye --conftool --httpbb -t T12345 somehost1001

Image of a new or previously decommissioned host

sudo cookbook sre.hosts.reimage --os bullseye --new -t T12345 somehost1001

Operations performed by the cookbook

Pre-reimage validation operations

  • Ensure the cookbook is run within a screen/tmux
  • Ensure the host to reimage is a physical host
  • If --new is set:
  • If --new is not set:
    • Ensure that the host exists in PuppetDB
  • Check that both the host and its management DNS names resolve correctly
  • Check that the host's IPMI is reachable and it's possible to execute commands
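
The DNS check in the list above can be pictured with a minimal sketch. This is not the cookbook's code: the function name and the injectable resolve parameter are assumptions made for illustration, and the caller is expected to pass both the host name and its management name.

```python
import socket
from typing import Callable, Iterable, List


def unresolved_names(
    names: Iterable[str],
    resolve: Callable[[str], object] = socket.gethostbyname,
) -> List[str]:
    """Return the names that fail to resolve, in input order (hypothetical helper)."""
    failed = []
    for name in names:
        try:
            resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            failed.append(name)
    return failed
```

An empty result means every name resolved. Passing the resolver as a parameter keeps the check testable without real DNS lookups.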

Reimage operations

  • Update the Phabricator task saying that the reimage has been started (if -t/--task is set)
  • Downtime on Icinga (unless --new or --no-downtime are set)
  • Depool from conftool (if -c/--conftool is set)
  • Disable Puppet on the host (without failing if the host is unreachable or Puppet cannot be disabled)
  • Remove the host from Puppet and PuppetDB: Puppet#Misc
  • Delete any existing Puppet certificate for the host
  • Remove the host from Debmonitor
  • Unless --no-pxe is set:
    • Generate a temporary DHCP snippet on the install server in the same datacenter as the host, specifying the OS version to install, and restart the DHCP server
    • Force the next boot to use PXE, via IPMI
    • Reboot the host via IPMI (power cycle or power up based on current power status) and poll until reachable
    • Verify that the host has rebooted into the Debian installer environment
    • Poll until the host gets rebooted by the Debian installer into the new OS
    • Verify that the host has rebooted into a new OS and not again into the Debian installer
    • Delete the temporary DHCP snippet and restart the DHCP server as the assigned IP has been statically configured in the host at this point
  • Mask the provided systemd services (if --mask is set)
  • Generate a Puppet certificate request on the host, poll for the CSR on the active Puppet CA server, verify its fingerprint and sign the new certificate
  • Run Puppet in NOOP mode to compile the catalog and populate PuppetDB with the exported resources. Poll PuppetDB until the Nagios_host resource appears.
  • Downtime the new host in Icinga, first forcing a Puppet run on the Icinga server host so that it picks up all the check definitions generated for the new host by the exported resources
    • This step is always performed and is not affected by the --no-downtime option, which affects only the downtime set prior to the reimage
  • Run Puppet for the first time (this step takes a long time), asking the user what to do on failure
  • Ensure that the BIOS boot parameters are back to normal to prevent an accidental reboot into PXE
  • Run Puppet on the host where the cookbook is running to get the known host key of the new host
  • Reboot the new host and poll until reachable
  • Poll until a successful Puppet run is completed
  • Run Httpbb /srv/deployment/httpbb-tests/appserver/* against the host (if --httpbb is set)
  • Print the command to unmask the masked units, if there are any
  • Force a recheck of all Icinga checks and poll until the host reaches optimal status in Icinga:
    • If the host status is optimal within a few minutes, remove the downtime
    • If the host status is still not optimal after a few minutes, print a warning and tell the user that the downtime has not been removed
  • Print the command to repool any depooled services, if there are any
  • Run the interface_automation.ImportPuppetDB Netbox script for the host to import all the data from PuppetDB and show its results
  • If the host is in PLANNED status in Netbox, update that to STAGED
  • Update the Phabricator task with the result of all the actions performed (if -t/--task is set)
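
Several of the steps above poll until a condition holds and then act on the outcome. The following is a rough sketch of that pattern, using the final Icinga step as the example. The function names, the number of attempts and the delay are all assumptions for illustration and are not taken from the cookbook.

```python
import time
from typing import Callable


def poll_until(
    check: Callable[[], bool],
    attempts: int = 10,
    delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call check() up to `attempts` times; return True as soon as it passes."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            sleep(delay)  # wait before the next try, but not after the last one
    return False


def finish_icinga_step(host_is_optimal: Callable[[], bool], **poll_kwargs) -> str:
    """Decide what to do with the downtime once polling has finished."""
    if poll_until(host_is_optimal, **poll_kwargs):
        return "remove downtime"
    return "warn: downtime not removed"
```

Returning a boolean from the poller, rather than raising, lets the caller choose between failing hard and merely warning, as the Icinga step does.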

Virtual hosts

This is all covered in the Ganeti docs.

DHCP Automation

Workflow

  • On each install host the DHCP configuration includes a usually empty file: /etc/dhcp/automation/proxies/opt82-ttyS1-115200.conf
  • The reimage cookbook creates a DHCP snippet file in the /etc/dhcp/automation/opt82-ttyS1-115200 directory with the host{...} block for just the host that is being reimaged, with the configuration required to assign it the primary IPv4 address recorded for it in Netbox.
  • The /usr/local/sbin/dhcpincludes script is then run. It populates the /etc/dhcp/automation/proxies/opt82-ttyS1-115200.conf file with one include per available DHCP snippet file, checks that the whole DHCP configuration is correct, and restarts the DHCP server.
  • Once the host is rebooted into the new OS and the assigned IP has been statically configured in its /etc/network/interfaces, the reimage cookbook deletes the DHCP snippet and re-runs the /usr/local/sbin/dhcpincludes script.
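
The include-generation part of this workflow can be sketched as follows. This is an illustration only and not the actual dhcpincludes script, which also validates the configuration and restarts the DHCP server. The function name and the snippet file names are hypothetical; the directory is the one named above.

```python
from pathlib import PurePosixPath
from typing import Iterable

# Directory holding one snippet file per host being reimaged.
SNIPPET_DIR = PurePosixPath("/etc/dhcp/automation/opt82-ttyS1-115200")


def render_includes(snippet_names: Iterable[str]) -> str:
    """Return the body of the proxy file: one include line per snippet file."""
    lines = [f'include "{SNIPPET_DIR / name}";' for name in sorted(snippet_names)]
    # With no snippets the proxy file stays empty, its usual state.
    return "\n".join(lines) + ("\n" if lines else "")
```

With no snippet files present the rendered body is empty, which matches the "usually empty" state of the proxy file between reimages.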

Key concepts

All switches have the DHCP relay agent information option (option 82) enabled. It injects additional information (switch hostname, interface name and VLAN name) into the DHCP packets destined for a DHCP server.

The DHCP configuration snippets are generated from Netbox, using the same data that the switch will inject, so that the DHCP server can match the request packets with the host and assign the right IP to it. See the example below.

This approach made it possible to remove all the hardcoded MAC addresses from the Puppet repository, and removed the need to keep the DHCP configuration for all physical hosts active all the time. Besides keeping the DHCP configuration much smaller and simpler, one benefit is that a host accidentally rebooted into PXE mode will not get any IP assigned and hence will not wipe its data.

Example snippet

host somehost1001 {
    host-identifier option agent.circuit-id "asw2-a-eqiad:ge-6/0/1.0:private1-a-eqiad";
    fixed-address 10.0.0.1;
    option pxelinux.pathprefix "http://apt.wikimedia.org/tftpboot/bullseye-installer/";
}
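
As a rough illustration of how such a snippet could be assembled from the Netbox data, here is a minimal sketch. It assumes that the circuit ID is the switch hostname, interface name and VLAN name joined by colons, and that the installer path follows the "<os>-installer" pattern; both are inferred from the example above. The function and parameter names are hypothetical and this is not the cookbook's own code.

```python
def render_dhcp_snippet(
    hostname: str,
    switch: str,
    interface: str,
    vlan: str,
    ipv4: str,
    os_codename: str,
) -> str:
    """Render a dhcpd host block that matches on the option 82 circuit ID."""
    # Assumed layout, inferred from the example: switch:interface:vlan
    circuit_id = f"{switch}:{interface}:{vlan}"
    return (
        f"host {hostname} {{\n"
        f'    host-identifier option agent.circuit-id "{circuit_id}";\n'
        f"    fixed-address {ipv4};\n"
        f'    option pxelinux.pathprefix "http://apt.wikimedia.org/tftpboot/{os_codename}-installer/";\n'
        f"}}\n"
    )
```

Called with the values from the example (somehost1001, asw2-a-eqiad, ge-6/0/1.0, private1-a-eqiad, 10.0.0.1, bullseye), it reproduces the snippet shown above.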

What to do if...

The reimage fails

Because the reimage process is idempotent as far as possible, it should be fine to run it again after a failure. If the host has already been removed from PuppetDB, the pre-reimage validation step will fail and tell the user to retry with the --new option. If the reimage fails after the Debian installer has successfully installed the new OS, it can be resumed with the --no-pxe option, which skips the reboot into PXE and the re-installation of the OS.
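
The choice of retry options can be summarised in a small sketch. The helper below is hypothetical and only builds the command line from the options documented on this page. Whether --new and --no-pxe can be combined in a single run is an assumption here, so check the cookbook's help output before relying on it.

```python
from typing import List, Optional


def retry_command(
    host: str,
    os_codename: str,
    task: Optional[str] = None,
    in_puppetdb: bool = True,
    os_installed: bool = False,
) -> List[str]:
    """Build the reimage command to re-run after a failure (illustrative only)."""
    cmd = ["sudo", "cookbook", "sre.hosts.reimage", "--os", os_codename]
    if not in_puppetdb:
        cmd.append("--new")  # host already removed from PuppetDB
    if os_installed:
        cmd.append("--no-pxe")  # new OS already installed: skip PXE and reinstall
    if task:
        cmd += ["-t", task]
    cmd.append(host)
    return cmd
```

For example, retry_command("somehost1001", "bullseye", task="T12345", os_installed=True) yields the earlier generic command with --no-pxe added.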

To check the cookbook logs, see Spicerack/Cookbooks#Logs.

Debian Installer Can't Detect NICs

If the Debian Installer fails to detect the connected NIC (which usually manifests as a DHCP failure), you may need to update your NIC firmware. We have observed this behavior with Broadcom NetXtreme-E NICs. Check the DC Ops page for the latest recommended firmware.

IPMI fails

Follow the steps outlined in the Management Interfaces page to troubleshoot and fix the IPMI connection with the host.

DHCP issues

To troubleshoot DHCP allocation, check the DHCP logs in the /var/log/messages file on the install server host in the same datacenter as the reimaged host. It is also possible to test DHCP provisioning in isolation with the sre.hosts.dhcp cookbook. See Spicerack/Cookbooks#Cookbook_Operations for how to run it.

For any troubleshooting, connect to the host's console to see where the issue occurs.

  • Is tcpdump on the local install server seeing inbound DHCP and TFTP packets for the relevant host?
    • tcpdump -vvv 'udp and (src port 67 or src port 68 or src port 69)'
  • Is the host going past the PXE screen and booting into d-i?
    • If yes:
      • Is the NIC firmware at a version known to work? (check with DC Ops if needed)
    • If not:
      • If using 10G NICs, is the BIOS configured to PXE boot from them?
      • Are there any relevant Netbox report alerts (e.g. incorrect network info)?

Icinga configuration correctness alert fires

This alert fires if the Icinga configuration is not valid. It should not normally fire during a reimage, but it can if Icinga services defined outside the host reference the reimaged host by hostname. One notable case is a service in the hiera service::catalog key in hieradata/common/service.yaml that hardcodes hostnames, one of which is the host being reimaged. The Icinga check for that service is then still part of the configuration while the host itself has been removed, which makes the Icinga configuration invalid and prevents any reload for the duration of the reimage.

More precisely, the alert fires at the first Puppet run on the Icinga host after the cookbook has removed the reimaged host from PuppetDB (towards the start of the reimage), and it recovers when the reimage cookbook forces a Puppet run on the Icinga host after the new definitions for the reimaged host have been exported.