MariaDB/Provisioning a host

From Wikitech
Jump to navigation Jump to search

Host preparation

Downtime the host in Icinga

Semi-automatically:

cookbook sre.hosts.downtime DBNAME -D1 -t TXXXXXX -r "provisioning - TXXXXXX"

Manually:

  1. Open https://icinga.wikimedia.org/cgi-bin/icinga/status.cgi?search_string=DBNAME.
  2. Select "Schedule downtime for checked hosts" from the top right menu.
  3. TBD: next steps

Adding a new host

  1. TBD

Replacing a failed host

  1. Identify an available host in the same DC
    1. TBD: how?
    2. TBD: What to do if there aren't any?
  2. Match the OS version the replaced host was running
    1. Reimage the host to Stretch if needed, following the instructions in Server_Lifecycle/Reimage

Section and replication configuration

Add the host to a section:

  1. In manifests/site.pp:
    1. Add the DBNAME to the appropriate regexps
    2. Remove the insetup role from DBNAME
  2. In hieradata/hosts/DBNAME.yaml add
    1. For single-instance hosts: mariadb::shard: 'SECTIONNAME'
    2. For multi-instance hosts:

Configure the correct replication mode:

  1. In hieradata/hosts/DBNAME.yaml add: mariadb::binlog_format: 'BINLOG_FORMAT', where BINLOG_FORMAT is:
    1. (TBD: when?) STATEMENT
    2. (TBD: when?)ROW

Add the host to dbctl config (example)

  1. In conftool-data/dbconfig-instance/instances.yaml add - DBNAME in the appropriate location

Add the host to Mediawiki's database loadbalancer configuration (example)

  1. Log on to one of the cluster management hosts (cumin1001.eqiad.wmnet, cumin2002.codfw.wmnet)
  2. Run sudo dbctl --scope DCNAME instance DBNAME edit, where DCNAME is eqiad or codfw
  3. Fill the template:
    1. Set pooled: false
    2. Replacing a failed host?
      1. Yes: mirror the replaced host configuration
      2. No: For now, just ask (TBD: guidelines?)
  4. Run dbctl config commit -m "COMMIT_MESSAGE"

Data population

Cloning

  1. Clone (TBD: How?)
  2. Start MariaDB
  3. Wait until replication has caught up (all green on Icinga)

Restoring from backup

  1. MariaDB/Backups#Provision_a_precompressed_and_prepared_snapshot_(preferred)

Replication from master

Enabling notifications (example):

  1. In hieradata/hosts/DBNAME.yaml delete profile::base::notifications: disabled

Next steps

  1. Ask other team members about schema changes that were executed since the timestamp of the backup you restored the host from and apply them (if needed)
  2. Pool the host in (TBD: link needed)



This page is a part of the SRE Data Persistence technical documentation
(go here for a list of all our pages)