Ganeti evaluation

Goal

The goal is to evaluate Ganeti as a virtualization cluster management tool for miscellaneous services.

Quick breakdown of design

  • Two machines, thallium and mercury, connected to access ports in the same VLAN and public IP address space.
  • One shared IP for the cluster (ganeti.wikimedia.org).
  • No shared storage; DRBD is used instead.

Ramifications of the design

VMs have no network access in this setup, which was considered OK for the eval. This can easily be worked around by creating a VLAN and giving the VMs IPs, or by turning the host machines into routers, but that should be discussed after the eval. In general, Ganeti is designed to work in both bridged and routed mode.

Important info

  • ganeti.wikimedia.org is a DNS A/AAAA record that always points to the current master.
  • Ganeti is (or can be run as) a shared-nothing virtualization cluster management suite.
  • The master is switched over manually. Failure of the master does NOT mean disruption of services, only loss of management for the VMs (and, if the master hosts VMs, disruption for those too).
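
Since the master is switched manually, the switchover itself can be sketched as below. `gnt-cluster getmaster` and `gnt-cluster master-failover` are standard Ganeti commands; the `run` wrapper and the `DRY_RUN` variable are hypothetical helpers added here so the steps can be previewed without touching the cluster. This is a sketch, not a tested procedure for this cluster.

```shell
# Hypothetical helper: print the command when DRY_RUN is set, otherwise run it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Run as root on the node that should become the new master
# (mercury, if thallium is the current master).
switch_master() {
    run gnt-cluster getmaster          # show the current master
    run gnt-cluster master-failover    # promote this node to master
    run gnt-cluster getmaster          # confirm the change
}
```

Calling `switch_master` with `DRY_RUN=1` set only prints the three commands. If the old master is unreachable, a two-node cluster cannot reach a voting majority; Ganeti has a `--no-voting` option for `master-failover` for that case, which should be used with care.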

Installation

After Puppet has run for the first time

On both nodes


# aptitude install ganeti drbd8-utils ganeti-htools ganeti-instance-debootstrap qemu-kvm ndisc6
# swapoff -a # Prerequisite for the next command since swap was on LVM
# vgrename thallium-vg ganeti-vg # All nodes need the same LVM VG
# swapon -a
# cat > /etc/modprobe.d/drbd.conf <<EOF
options drbd usermode_helper=/bin/true
EOF
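
The `usermode_helper=/bin/true` option keeps DRBD from calling its own helper program, leaving disk handling to Ganeti. A minimal sketch for checking that the file written above contains the option follows; the function name `drbd_helper_disabled` is hypothetical, and the check only looks at the config file, not at the currently loaded module.

```shell
# Succeeds if the given modprobe config file sets
# drbd's usermode_helper to /bin/true.
drbd_helper_disabled() {
    grep -Eq '^[[:space:]]*options[[:space:]]+drbd[[:space:]].*usermode_helper=/bin/true' "$1"
}

# Usage (on each node):
#   drbd_helper_disabled /etc/modprobe.d/drbd.conf && echo "drbd helper disabled"
```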

On the master only (thallium)


# gnt-cluster init --candidate-pool-size=2 --enabled-hypervisors=kvm --master-netdev=eth0 --vg-name=ganeti-vg ganeti.wikimedia.org
# gnt-cluster modify --nic-parameters=link=100,mode=routed
# gnt-cluster modify -H kvm:kernel_path=/boot/vmlinuz-3.13.0-24-generic # Not really used, but needs to be set to something to squash warnings
# gnt-node add mercury.wikimedia.org # A Ganeti bug/problem with trusty was discovered here and patched on the fly; will probably need newer packages, already talking with the maintainer
# gnt-cluster command cp /root/.ssh/id_dsa.pub /root/.ssh/authorized_keys2 # Some commands need root

We are now good and ready to go.
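
One way to confirm that both nodes actually joined the cluster is to compare the node list against the expected names. The sketch below assumes `gnt-node list --no-headers -o name` prints one node name per line; the function name `nodes_present` is hypothetical, and `thallium.wikimedia.org` is assumed by analogy with `mercury.wikimedia.org`.

```shell
# Reads node names on stdin (one per line) and succeeds only if
# every node given as an argument is among them.
nodes_present() {
    listed=$(tr -d ' \t')          # strip any column padding
    for node in "$@"; do
        if ! printf '%s\n' "$listed" | grep -Fxq -- "$node"; then
            echo "missing node: $node" >&2
            return 1
        fi
    done
    return 0
}

# Usage (on the master):
#   gnt-node list --no-headers -o name | nodes_present thallium.wikimedia.org mercury.wikimedia.org
```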

Operations

Log in to ganeti.wikimedia.org using SSH. You should elevate privileges to root before doing anything else.

Create a new VM

# gnt-instance add -s 10g -t drbd -o debootstrap+default --no-name-check --no-ip-check --net 0:ip=192.168.0.1 testme
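
In the command above, `-s` sets the disk size, `-t drbd` selects the DRBD disk template (the disk is mirrored between the two nodes), `-o` picks the OS definition, and `--no-name-check` / `--no-ip-check` skip the DNS and IP-in-use checks. When creating several VMs, the same command can be wrapped as in the sketch below. `create_vm`, `run` and `DRY_RUN` are hypothetical names; only the flags already shown above are used.

```shell
# Hypothetical helper: print the command when DRY_RUN is set, otherwise run it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# create_vm NAME IP [SIZE]   (SIZE defaults to 10g, as in the example above)
create_vm() {
    if [ "$#" -lt 2 ]; then
        echo "usage: create_vm NAME IP [SIZE]" >&2
        return 2
    fi
    name=$1
    ip=$2
    size=${3:-10g}
    run gnt-instance add -s "$size" -t drbd -o debootstrap+default \
        --no-name-check --no-ip-check --net "0:ip=$ip" "$name"
}
```

With `DRY_RUN=1` set, `create_vm testme 192.168.0.1` prints the exact command shown above instead of running it.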

Log in to the console of a VM

# gnt-instance console testme

Reboot/shutdown/startup a VM

# gnt-instance reboot testme
# gnt-instance shutdown testme # Sends an ACPI shutdown signal; after 2 minutes the VM is forcefully killed
# gnt-instance startup testme

Modify the VM attributes

# gnt-instance modify -B mem=1g testme

Live migrate a VM

# gnt-instance migrate testme

Non-live migration of a VM

# gnt-instance failover testme
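
Live migration needs a running instance, while failover stops the instance (if it is up) and starts it on the other node. A small sketch of choosing between the two from the instance status follows. `relocate_cmd` is a hypothetical name, and treating every status other than `running` as a failover case is a simplification: error states may need a closer look first.

```shell
# Prints the gnt-instance subcommand to use for a given status string.
# "running" is what Ganeti reports for a live instance; anything else
# is treated here as not live.
relocate_cmd() {
    case "$1" in
        running) echo migrate ;;
        *)       echo failover ;;
    esac
}

# Usage (on the master):
#   status=$(gnt-instance list --no-headers -o status testme)
#   gnt-instance "$(relocate_cmd "$status")" testme
```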

Run cluster verification

# gnt-cluster verify
# gnt-cluster verify-disks
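
For periodic checks, both commands can be combined so that a single exit status reports the result. The sketch below assumes that each command exits non-zero when it finds a problem; `verify_all` and the `GNT_CLUSTER` override are hypothetical names, the latter existing only so the function can be exercised without a cluster.

```shell
# Runs both verification steps; returns 0 only if both succeed.
verify_all() {
    gnt=${GNT_CLUSTER:-gnt-cluster}   # hypothetical override for testing
    rc=0
    "$gnt" verify       || rc=1
    "$gnt" verify-disks || rc=1       # still runs if the first step failed
    return "$rc"
}

# Usage (on the master):
#   verify_all || echo "cluster verification reported problems" >&2
```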

Node statistics

# gnt-node list
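
The node list can also be narrowed to specific fields for scripting. The sketch below picks the node with the most free memory, assuming `gnt-node list --no-headers --separator=: -o name,mfree` prints `name:free_memory` lines with the memory as a plain number. `most_free_node` is a hypothetical name.

```shell
# Reads "name:mfree" lines on stdin and prints the name of the node
# with the largest free-memory value.
most_free_node() {
    sort -t: -k2,2nr | head -n 1 | cut -d: -f1
}

# Usage (on the master):
#   gnt-node list --no-headers --separator=: -o name,mfree | most_free_node
```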

There is much more; read the docs at http://docs.ganeti.org/ganeti/current/html/admin.html