Add a server

From Wikitech
This page may be outdated or contain incorrect details. Please update it if you can.

For any MediaWiki app server:

  • Install Ubuntu
  • Install the wikimedia-task-appserver package
  • Add the hostname to /usr/local/dsh/node_groups/mediawiki-installation on the ssh bastion host
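
The node group step above can be sketched as a small helper that only appends the hostname if it is not already listed, so re-running it is harmless. This is a minimal sketch, not an existing tool: the function name add_to_node_group and the hostname srv999 are made up for illustration, and it assumes the node group file is a plain list with one hostname per line.

```shell
#!/bin/sh
# Append a hostname to a dsh node group file unless it is already listed.
# Usage: add_to_node_group GROUP_FILE HOSTNAME
# (add_to_node_group is a hypothetical helper, not part of dsh.)
add_to_node_group() {
    group_file=$1
    host=$2
    # -x matches the whole line, -F treats the hostname as a fixed string
    if grep -qxF -- "$host" "$group_file" 2>/dev/null; then
        echo "$host already in $group_file"
    else
        echo "$host" >> "$group_file"
        echo "added $host to $group_file"
    fi
}

# Example, run on the ssh bastion host (srv999 is a made-up hostname):
# add_to_node_group /usr/local/dsh/node_groups/mediawiki-installation srv999
```

The same helper would work for the apaches node group used for main pool servers.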

Additionally, for a non-apache batch server:

  • Stop apache by running /etc/init.d/apache2 stop
  • Stop apache from starting at boot by running update-rc.d -f apache2 remove
  • Add to ganglia with:
    apt-get -y --no-remove install gmond ganglia-metrics
    cp -r /home/wikipedia/conf/gmond /etc
    rm /etc/gmond.conf
    ln -s gmond/misc.conf /etc/gmond.conf
    /etc/init.d/gmond restart
    /etc/init.d/gmetricd restart

Additionally, for a main pool apache server:

  • Add the hostname to /usr/local/dsh/node_groups/apaches
  • Add to nagios with cd /home/wikipedia/conf/nagios && ./sync
  • Add to ganglia with:
    apt-get -y --no-remove install gmond ganglia-metrics
    cp -r /home/wikipedia/conf/gmond /etc
    rm /etc/gmond.conf
    /home/wikipedia/conf/gmond/make-apache-symlink
    /etc/init.d/gmond restart
    /etc/init.d/gmetricd restart
  • Add the server to /etc/pybal/apache on the LVS director for the apaches, currently lvs3. The weight should be proportional to the CPU count.
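
As a rough illustration of "proportional to the CPU count", the weight can be derived from the number of CPUs the kernel reports. This is only a sketch under assumptions: suggest_weight is a hypothetical helper, the per-CPU factor of 10 is an arbitrary placeholder, and the right factor is whatever keeps the new server consistent with the entries already in /etc/pybal/apache.

```shell
#!/bin/sh
# Suggest an LVS weight proportional to the CPU count.
# Usage: suggest_weight CPU_COUNT [WEIGHT_PER_CPU]
# WEIGHT_PER_CPU defaults to 10, an arbitrary placeholder; match it to
# the ratio used by the servers already listed in the pybal file.
suggest_weight() {
    cpus=$1
    per_cpu=${2:-10}
    echo $((cpus * per_cpu))
}

# Example on the new server, counting logical CPUs from /proc/cpuinfo:
# cpus=$(grep -c '^processor' /proc/cpuinfo)
# suggest_weight "$cpus"
```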

List of things that will break if you try to install MediaWiki without following this procedure

  • Not in mediawiki-installation node group: the server doesn't get sync-file/scap updates, so it misses database server changes and similar configuration updates. Thus it goes rogue and destroys the cluster.
  • No wikimedia-nis-client: non-root users can't sync. The server misses out on updates and goes rogue.
  • No sudoers (from wikimedia-task-appserver): non-root users can't sync.
  • No upload NFS mounts (from wikimedia-task-appserver): MediaWiki will spew errors when it tries to access uploads, corrupting your data and potentially corrupting the shared caches.
  • No latex, djvulibre, ploticus, etc. (from wikimedia-task-appserver): Bad output, corrupted caches.
  • Apache not in apaches node group: apache-restart-all etc. skip the server, and it goes unmonitored in nagios.
  • Server not in ganglia: by the time you realise you need it, you will have missed hours or days of important performance data.
  • Running apt-get -y unattended without the --no-remove option: due to some subtle error in the repository or sources.list, apt-get declares your entire installation "conflicting", right down to glibc, and removes it, bricking the server.
  • Apache not in nagios: Tim gets upset.
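
As an extra guard against the apt-get failure described above, a simulated run can be inspected before the real one. apt-get -s (simulate) changes nothing and prints one line per planned operation, with removals prefixed by "Remv". The sketch below counts those lines; count_removals is a hypothetical helper name, and this complements --no-remove rather than replacing it.

```shell
#!/bin/sh
# Count the packages a simulated apt-get run would remove.
# Reads "apt-get -s" output on stdin; removals are lines starting with "Remv".
# (count_removals is a hypothetical helper, not part of apt.)
count_removals() {
    # grep -c prints 0 but exits non-zero when nothing matches, so mask the status
    grep -c '^Remv ' || true
}

# Example on the new server (-s installs and removes nothing):
# removed=$(apt-get -s install gmond ganglia-metrics | count_removals)
# [ "$removed" -eq 0 ] || echo "refusing: $removed package(s) would be removed"
```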