Dumps/Dumpsdata hosts

From Wikitech

XML Dumpsdata hosts

Hardware

We have two hosts:

  • Dumpsdata1001 in eqiad, production:
    Hardware/OS: PowerEdge R730xd, Debian 8 (jessie), 32GB RAM, one quad-core Xeon E5-2623 CPU, HT enabled
    Disks: twelve 4TB disks in one 12-disk RAID 10 volume; two 1TB disks in RAID 1 for the OS
  • Dumpsdata1002 in eqiad, spare:
    Hardware/OS: PowerEdge R730xd, Debian 8 (jessie), 32GB RAM, one quad-core Xeon E5-2623 CPU, HT enabled
    Disks: twelve 4TB disks in one 12-disk RAID 10 volume; two 1TB disks in RAID 1 for the OS

Services

The production host is NFS-mounted on the snapshot hosts; generated dumps are written there and then rsynced to the web and rsync servers.
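A minimal sketch of that data flow. Only the /data volume and the dumpsdata1001 name come from this page; the NFS domain suffix, mount point, and rsync target are placeholders, and the wrapper prints each command instead of running it.

```shell
# Dry-run wrapper: print commands rather than executing them,
# since the real hosts and paths are site-specific.
run() { printf '+ %s\n' "$*"; }

# On a snapshot host: mount the production dumpsdata host's /data over NFS
# (hostname suffix and mount point are assumptions).
run mount -t nfs dumpsdata1001.eqiad.wmnet:/data /mnt/dumpsdata

# Dump jobs write into that mount; the dumpsdata host then rsyncs
# the results onward to the web and rsync servers (placeholder target).
run rsync -a /data/ dumps-webserver.example.wmnet::dumps/
```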

Deploying a new host

You'll need to set up the RAID arrays by hand. The single LVM volume, mounted on /data, is formatted ext4.
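A rough sketch of that manual setup. Device names, array numbers, and volume-group names below are placeholders, not the actual layout on these hosts; the wrapper only prints each command, so the sketch is safe to run.

```shell
# Dry-run wrapper: print each command instead of executing it,
# since these steps are destructive and host-specific.
run() { printf '+ %s\n' "$*"; }

# Placeholder devices: the twelve data disks in one RAID 10 array.
run mdadm --create /dev/md0 --level=10 --raid-devices=12 /dev/sd[c-n]

# One LVM volume on top of the array, formatted ext4, mounted on /data.
run pvcreate /dev/md0
run vgcreate data /dev/md0
run lvcreate -l 100%FREE -n data data
run mkfs.ext4 /dev/data/data
run mount /dev/data/data /data
```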

Install in the usual way: add the host to puppet, copying a pre-existing production dumpsdata host stanza, then set up everything for PXE boot and go. You will need to add the new host to the dumps_web_rsync_server_clients[ipv4][internal] list in common.yaml. You will also need to add an entry for it in hieradata/hosts: copy the stanza for the primary generator host if the new host is to replace it, or the stanza for a secondary (fallback) host if the new host is to become a fallback.
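The common.yaml change might look roughly like the fragment printed below. The key name comes from this page, but the nesting and the address are guesses; check the existing entries in common.yaml for the real shape.

```shell
# Print a hypothetical common.yaml fragment; the key name is from this
# page, the structure and IP are placeholders to verify against the file.
frag='dumps_web_rsync_server_clients:
  ipv4:
    internal:
      - 10.64.x.x   # address of the new dumpsdata host'
printf '%s\n' "$frag"
```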

Additionally, you must add the host to $peer_hosts in profile::dumps::rsyncer and profile::dumps::rsyncer_peer so that rsyncs can run from this host to the rest of the dumpsdata hosts and the dumps web servers. This must be done for both primary and secondary hosts.

If it is a fallback host, data must be rsynced to it from the primary on a regular basis. To make this happen, add the hostname to the xmlremotedirs and miscremotedirs arguments as passed from profile::dumps::generation::server::primary. You'll need to know the rsync path to the public directory for the dumps on the new host, in case it is set up differently from the rest, and likewise the rsync path to the directory for misc ('other') dumps. Because the initial rsync can be quite time-consuming, it's best to do a manual rsync first from one of the web servers, and only then enable the rolling rsync in puppet.

If it is to become the primary host and the old primary is to go away, then when you are ready to make the switch, change profile::dumps::generation::worker::common so that the dumpsdatamount resource mounts the new server's filesystem on the snapshot hosts instead of the old primary's.
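On each snapshot host the switch effectively boils down to remounting, which puppet performs once the mount resource is updated. A dry-run sketch, with hostname suffixes and the mount point assumed:

```shell
# Dry-run wrapper: print, don't execute.
run() { printf '+ %s\n' "$*"; }

# After the dumpsdatamount resource in
# profile::dumps::generation::worker::common is updated, puppet
# effectively replaces the old primary's mount with the new one's.
run umount /mnt/dumpsdata                                        # old primary
run mount -t nfs dumpsdata1002.eqiad.wmnet:/data /mnt/dumpsdata  # new primary
```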

Space issues

These hosts should hold three sets of XML/SQL dumps at all times, ensuring that there is always one set containing full revision history, usable as prefetch input for the next such run. Running low on space is therefore a serious problem; we need to head it off well before it happens.
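A simple check along these lines could watch usage on the dumps filesystem before it becomes critical. The threshold is illustrative, and the sketch points at / so it can run anywhere; on a dumpsdata host you would point it at /data.

```shell
# Warn when a filesystem passes a usage threshold (GNU df).
# FS and THRESHOLD are illustrative; use /data and a real limit in practice.
FS=/
THRESHOLD=80   # percent used
used=$(df --output=pcent "$FS" | tail -1 | tr -dc '0-9')
if [ "$used" -ge "$THRESHOLD" ]; then
  echo "WARNING: $FS is ${used}% full"
else
  echo "OK: $FS is ${used}% full"
fi
```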