From Wikitech

OpenStack is the primary software we're using for Wikimedia Labs. KVM is used as the hypervisor.


Initial configuration

Puppet handles the majority of the installation process. Initial user creation, database creation, and authentication are handled manually.

Install LDAP

This may be on the controller, but it's generally wise to segregate authentication systems. See the extensive LDAP documentation.

On the MediaWiki node install MediaWiki and dependencies

This may be the controller. The prerequisites are handled by puppet. See the MediaWiki documentation for installing MediaWiki.

TODO: MediaWiki installation and configuration should be managed by puppet or git deploy.

On the DNS node(s), install pdns and ldap backend

This cannot be on the same system as the network node, unless you do one of the following:

  • Run pdns on a different port, and add an iptables rule forwarding DNS traffic to pdns
  • Bind pdns to the public network IP address, and ensure dnsmasq binds to the host private network IP address

This is being handled by puppet; please see the repository for installation details.

On the controller, create the databases

First, via MySQL, create a user for nova, and the nova database. Grant the nova user all permissions on the nova database. Next, have nova create its schema:

nova-manage db sync
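
Expressed in SQL, the manual step above looks roughly like this (a sketch only: the user name, host scope, and password placeholder are assumptions, and must match the credentials in your nova configuration):

```sql
-- Sketch of the manual nova database setup; values are illustrative.
CREATE DATABASE nova;
CREATE USER 'nova'@'localhost' IDENTIFIED BY '<password>';
GRANT ALL PRIVILEGES ON nova.* TO 'nova'@'localhost';
FLUSH PRIVILEGES;
```

The glance and keystone databases described below follow the same pattern.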

Next, via MySQL, create a user for glance, and add the glance database. Grant the glance user all permissions on the glance database. Next, have glance create its schema:

glance-manage db_sync

Next, via MySQL, create a user for keystone, and add the keystone database. Grant the keystone user all permissions on the keystone database. Next, have keystone create its schema:

keystone-manage db_sync

On the controller, create the fixed instance network

nova-manage network create 1 255

Now, you'll need to modify the database to fix the bridge and vlan (it uses br100 and no vlan by default):

mysql -uroot
mysql> use nova;
mysql> update networks set bridge='br103';
mysql> update networks set vlan='103';
mysql> exit
  • TODO: check this for accuracy?
  • TODO: update this for neutron
  • TODO: update this for ipv6

Add the novaadmin

The novaadmin user is used for the MediaWiki write user and for keystone. Add the user to LDAP, configure MediaWiki to use it for writes, and log in to MediaWiki as the user (to create the user in the wiki).

You should not make this user an Administrator in MediaWiki, since it's better to limit the user's permissions. In fact, the user should be denied login if possible (using two-factor auth is a good option here).

Give the nova admin access to manage users, groups, hosts and projects in LDAP

The following entries will give novaadmin full access to the people, groups, hosts and projects OUs:

dn: ou=people,dc=labs,dc=wikimedia,dc=org
changetype: modify
replace: aci
aci: (targetattr="*") (version 3.0; acl "novaadmin - people"; allow (all) userdn="ldap:///<novaadmin-dn>";)

dn: ou=groups,dc=labs,dc=wikimedia,dc=org
changetype: modify
replace: aci
aci: (targetattr="*") (version 3.0; acl "novaadmin - groups"; allow (all) userdn="ldap:///<novaadmin-dn>";)

dn: ou=hosts,dc=labs,dc=wikimedia,dc=org
changetype: modify
replace: aci
aci: (targetattr="*") (version 3.0; acl "novaadmin - hosts"; allow (all) userdn="ldap:///<novaadmin-dn>";)

dn: ou=projects,dc=labs,dc=wikimedia,dc=org
changetype: modify
replace: aci
aci: (targetattr="*") (version 3.0; acl "novaadmin - projects"; allow (all) userdn="ldap:///<novaadmin-dn>";)

Create a wiki user for yourself, and promote it using WikiSysop

You'll need to enable '$wgLDAPUseLocal = true;' temporarily to log in as WikiSysop and to create the account. After creating the account, promote it and disable $wgLDAPUseLocal.

Add a new project via the wiki

Add yourself to the default (bastion) project, and give yourself all project roles.

Add your account to the global roles via the wiki

The initial user should have all access.

On the controller, download a UEC image, and add the image using the glance command

glance add --name "ubuntu-<versionnumber>-<versionname>" --is-public true --disk-format "qcow2" --container-format "ovf" < <image>-disk1.img

Access control

Wiki access

User accounts on the wiki (which are also shared credentials for gerrit and nova) are created via self-registration.

Project and role membership

Initial project and role membership must be assigned by a wiki admin (a member of the ops team, or a trusted delegate). Further project and role membership can be handled by project admins.

Group membership

Each user has a default group that is shared by all users. Each project is a posix group (named project-<projectname> to avoid clashes with system-local groups). When a user is added to a project, they are added to that project's posix group.
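
The naming rule can be sketched trivially (hypothetical helper, for illustration only; the real mapping lives in LDAP):

```shell
# Hypothetical helper showing the group naming rule: project posix groups
# are prefixed with "project-" to avoid clashes with system-local groups.
project_group() {
  printf 'project-%s\n' "$1"
}

project_group bastion   # prints: project-bastion
```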

SSH access

SSH access is allowed via SSH keys. Users can add their own keys to their accounts. Access to instances is controlled via access.conf. Users will only be allowed to access instances belonging to projects they are a member of.

access.conf will be controlled via puppet. It will use a template to create the access rules via the instanceproject variable, for example:

- : ALL except root project-(<%= instanceproject %>) : ALL
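
For an instance in the default bastion project, the template above would render as:

```
- : ALL except root project-bastion : ALL
```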

Service groups

Service groups are like unix system accounts, but are available on all instances in a project. Service groups are a project-local user/group combination. Project members and other service groups can be members of service groups. A sudo policy is automatically created for service groups that allows its members to become the user.

Service groups can be created by any project member. Membership is maintained by any member of the service group or by project admins.
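
In sudoers terms, the automatically-created policy is roughly equivalent to the following (a sketch using this document's placeholder style; the real policy is stored and applied via LDAP):

```
# Members of the service group may run any command as the service user.
%<service-group> ALL = (<service-user>) NOPASSWD: ALL
```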

Sudo access

Sudo access is managed via policies created and maintained by project admins. All or select project members and service groups can be added to sudo policies. Sudo policies can be applied to all instances, or to select instances.

Home directories

Each project will have its own set of home directories, shared via NFS. The home directories for a project will only be shared to instances of that project. The shares are created by a script which pulls information from LDAP; the home directories themselves are created by pam_mkhomedir when a user first logs in.

The shares file will be managed via a script that pulls the instance information from LDAP and writes a shares file; for instance:

/home/<project1> <instance1>,<instance2>(rw)
/home/<project2> <instance3>,<instance4>(rw)
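
A minimal sketch of such a generator, assuming the project-to-instance mapping has already been pulled out of LDAP into "project instance instance ..." lines (the input format and function name are invented for illustration):

```shell
# Hypothetical sketch: turn "project instance instance ..." lines into the
# shares-file format shown above; the real script reads this data from LDAP.
emit_shares() {
  while read -r project instances; do
    # join the instance names with commas
    printf '/home/%s %s(rw)\n' "$project" "$(echo "$instances" | tr ' ' ',')"
  done
}

printf '%s\n' 'testproject i-0001 i-0002' | emit_shares
# prints: /home/testproject i-0001,i-0002(rw)
```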

Project storage

Each project has its own project storage, which is shared via Gluster or NFS, and is automounted at /data/project.

The shares are created by a script. The shares are mounted by autofs via an entry in LDAP:

dn: nisMapName=/data,nisMapName=auto.master,dc=wikimedia,dc=org
nisMapName: auto.master
nisMapName: /data
objectClass: top
objectClass: nisObject
cn: /data
dn: cn=*,,dc=wikimedia,dc=org
nisMapName: auto.home
objectClass: top
objectClass: nisObject
nisMapEntry: -fstype=glusterfs ${GSERVNAME}:/${PROJECT}-&
cn: *

The autofs variables above are filled in via puppet, where GSERVNAME is the gluster server name used to mount the share, and PROJECT is the instance's project.

Puppet and Gerrit

Our puppet configuration is stored in the operations/puppet git repository that is managed by Gerrit. All work is done inside of the production branch.

Current Architecture


Network design

  • Tampa
    • Test/dev
      • Private host network
        • Range:
        • Gateway:
        • VLAN: 105
        • Interface: eth0
      • Private guest network
        • Range:
        • Gateway:
        • VLAN: 103
        • Interface: eth1
      • Public Floating IP network
        • Range:
        • Range:
  • Eqiad
      • Public Floating IP network
        • Range:

Common actions

Managing floating IP addresses

List current IP addresses
nova --os-tenant-name <project-name> floating-ip-list
Add a new floating IP address to the pool
nova --os-tenant-name <project-name> floating-ip-create

Managing project quotas

List a project's quota
nova --os-tenant-name <project-name> quota-show
Change a project's floating IP quota
nova quota-update <project-name> --floating-ips <new-number>

Building new images

Wmflabs provides images for the Ubuntu and Debian distributions. The images are generated using different tools, and the sections below give step-by-step instructions for creating new ones.

Building Ubuntu images

We use vmbuilder to build our custom Ubuntu images. The vmbuilder configuration is in puppet in the labs-vmbuilder module. It can be added to a node using role::labs::vmbuilder. Here's a set of steps to build and import the images:

On labs-vmbuilder-trusty:

puppet agent -tv
cd /mnt/vmbuilder
rm -Rf ubuntu-trusty
vmbuilder kvm ubuntu -c /etc/vmbuilder.cfg -d /mnt/vmbuilder/ubuntu-trusty -t /mnt/vmbuilder/tmp --part=/etc/vmbuilder/files/vmbuilder.partition

Note the name of the tmp file generated; for instance: "Converting /tmp/tmpD0yIQa to qcow2, format /mnt/vmbuilder/ubuntu-trusty/tmpD0yIQa.qcow2"

On labs-vmbuilder-precise:

puppet agent -tv
cd /mnt/vmbuilder
rm -Rf ubuntu-precise
vmbuilder kvm ubuntu -c /etc/vmbuilder.cfg -d /mnt/vmbuilder/ubuntu-precise -t /mnt/vmbuilder/tmp --part=/etc/vmbuilder/files/vmbuilder.partition

Note the name of the tmp file generated; for instance: "Converting /tmp/tmpD0yIQa to qcow2, format /mnt/vmbuilder/ubuntu-precise/tmpD0yIQa.qcow2"

Building a Debian image

We build Debian images using bootstrap-vz. The bootstrap-vz config is puppetized in the class labs_bootstrapvz -- we use a custom build of the bootstrap-vz package, documented below.

To build a Debian Jessie image, log in to labs-bootstrapvz-jessie:

sudo su -
cd /target # This is where the image will end up when we finish
rm -f *.raw *.qcow2 # Make space for our new build
bootstrap-vz /etc/bootstrap-vz/manifests/labs-jessie.manifest.yaml
qemu-img convert -f raw -O qcow2 ./<new image name>.raw ./<new image name>.qcow2

You can boot the image locally for testing, like this:

qemu-system-x86_64 ./<new image name>.raw --curses

Unfortunately, qemu's default behavior is to suppress all boot logs, so you'll be looking at a mostly-blank screen for several minutes before getting a login prompt with no working password. Turning on a working login account for test purposes is left as an exercise to the reader -- bootstrap-vz creates one by default (login: root passwd:test) but our config wisely disables it.

Bootstrap-vz uses source files from /etc/bootstrap-vz. These files are puppetized, so you'll want to disable puppet if you change them.

Bootstrap-vz also uses several source files that are standard local config files on the build host. For a complete list of these files, look at the 'file_copy:' section in /etc/bootstrap-vz/manifests/labs-jessie.manifest.yaml
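
As a hypothetical convenience, that list can be pulled out with a few lines of shell (the manifest layout assumed here is a flat 'file_copy:' block of '- src: <path>' entries; check the real manifest before relying on this):

```shell
# Hypothetical sketch: print the src: paths listed under a manifest's
# file_copy: section. Assumes a flat "- src: <path>" entry layout.
list_file_copy() {
  awk '/^ *file_copy:/ {f=1; next}
       /^[a-z_]+:/     {f=0}
       f && /src:/     {print $NF}' "$1"
}
```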

Build the bootstrap-vz package

Andrew built python-bootstrap-vz_0.9wmf-1_all.deb using stdeb. It was built from the 'development' branch on 2014-12-22 with commit 255f0624b49dbcf6cacccd3b2f1fa7c0cc2bcc8d and the patch below. To reproduce:

diff --git a/ b/
index f7b97ac..349cfdc 100644
--- a/
+++ b/
@@ -22,11 +22,8 @@ setup(name='bootstrap-vz',
       install_requires=['termcolor >= 1.1.0',
                         'fysom >= 1.0.15',
                         'jsonschema >= 2.3.0',
-                        'pyyaml >= 3.10',
                         'boto >= 2.14.0',
                         'docopt >= 0.6.1',
-                        'pyrfc3339 >= 1.0',
-                        'requests >= 2.9.1',
       license='Apache License, Version 2.0',
       description='Bootstrap Debian images for virtualized environments',

  • Alter the version tag in bootstrapvz/ as needed
  • Install python-stdeb
  • python --command-packages=stdeb.command bdist_deb
  • ls deb_dist/*.deb

Installing the images

On labcontrol1001:

First, get the new .qcow2 images into /tmp on labcontrol1001, by hook or by crook. Then, also on labcontrol1001...

sudo su -
source ~/ 
cd /tmp
openstack image create --file ubuntu-trusty.qcow2 --disk-format "qcow2" --container-format "ovf" --public "ubuntu-14.04-trusty (testing)"
openstack image create --file debian-jessie.qcow2 --disk-format "qcow2" --container-format "ovf" --public "debian-8.1-jessie (testing)"
# Test the images by booting instances in labs; if they don't work, delete
# the instances, then delete the images (using glance delete), then
# restart the process 
glance index
# find image ids
glance image-update --name "ubuntu-14.04-trusty (deprecated <date>)" <old-trusty-id> --purge-props
glance image-update --name "ubuntu-14.04-trusty" <new-trusty-id> --property show=true --property default=true
glance image-update --name "ubuntu-12.04-precise (deprecated <date>)" <old-precise-id> --purge-props
glance image-update --name "ubuntu-12.04-precise" <new-precise-id> --property show=true

Notice in the above glance image-update commands the use of properties. If default=true the image will be the default image selected in the instance creation interface; purging properties removes the 'default' state.

We used to use the 'show=true' property to ensure that an image appeared on wikitech. Now instead we use the image state, where only images with state=active appear in the gui (both on wikitech and in Horizon). To deactivate your obsolete image:

source /etc/ 
openstack token issue
curl -X POST -H 'X-Auth-Token: <token id>' https://<glance host>/v2/images/<image id>/actions/deactivate

To reactivate an image (because it was deactivated in error, or in order to permit a migration):

source /etc/ 
openstack token issue
curl -X POST -H 'X-Auth-Token: <token id>' https://<glance host>/v2/images/<image id>/actions/reactivate


Reset state of an instance

You might have to do this if the reported state of the instance doesn't correspond to reality (it says REBOOT or SHUTOFF when it isn't, or vice versa), or if nova isn't responding to any commands at all about a particular instance.

nova reset-state --active <uuid>

This changes the state of the instance with the given uuid to 'ACTIVE', and hopefully fixes things (or blows up a baby panda, unsure!)

Rebooting hosts

Compute nodes

  1. Reboot host
  2. Ensure br103 has a proper MAC
    • I've seen a br103 come up with all 0's
  3. Ensure dev eth1 and eth1.103 are set to UP
  4. Restart nova-compute
  5. If you couldn't migrate all instances away, try rebooting a single instance that is downed
    • Ensure you can ping it
  6. Migrate a single instance
    • Ensure you can ping it
  7. Migrate the rest of the instances back

Network node

  1. Reboot host
  2. Ensure dev eth1, eth2, eth3 and bond1.103 are set to UP
  3. Ensure dev bond1 is in br103
  4. Ensure you can ping an instance's public IP

Bringing a dead host back up

  1. Ensure br103 has a proper MAC
    • I've seen a br103 come up with all 0's
  2. Ensure dev eth1 and eth1.103 are set to UP
  3. Restart nova-compute
  4. Reboot a single instance that was downed
    • Ensure you can ping it
  5. Reboot all instances that were downed

Rebooting an instance

You can reboot instances via the labsconsole interface, or the euca-reboot-instances command on virt1.

Block Migration

Because we don't use shared storage for instance volumes, true live-migration is not available. Block migration works pretty well, though -- it causes a brief (minute or two) interruption to an instance but does not register as a reboot, and most running services should survive a block migration without any complaint.

This is useful for rebalancing when a compute node is overloaded, or for evacuating instances from a failing node.

On the nova controller (e.g. virt1000):

   source /root/
   nova live-migration --block-migrate <instanceid> <targethost>

You can check the status of a migrating instance with 'nova show <instanceid>'. Its status will show as 'migrating' until the migration is complete.

NOTE: There is one serious bug in the block-migrate feature in Havana. The migrate process attempts to check quotas on the target node, but ignores overprovision ratios. That means that the nova scheduler will frequently fill a host to the point where it can no longer accept live migrations. Because of this bug it will probably be necessary to keep two empty compute nodes in order to support complete evacuation of any one node.

Fixing a broken live migration

I've run into an issue where live migration failed and didn't properly roll back, either leaving the instance shut down on all hosts, or migrated without the database properly updated. In either case, it's possible to bring the instance up on the original host, then restart the migration. If the instance was migrated but the database was improperly updated, you'll need to stop the instance on the second host, start it on the original, then re-migrate.

To do so, copy the instance's nwfilter and qemu definitions from the second host to the original host, add the nwfilter, then create the domain:

virsh nwfilter-define /etc/libvirt/nwfilter/<nwfilter-instance>.xml
virsh create /etc/libvirt/qemu/<instance>.xml

After doing so, it should be possible to restart the live migration.

You can avoid broken live migrations by only migrating a single instance at a time.

Recompress a live-migrated instance

In Nova icehouse (and possibly later versions) a block migrate removes the copy-on-write elements of the instance, causing it to take up vastly more space on the new host. The instance can be recompressed if you stop it first (at which point you might as well have used cold-migrate in the first place.) Here's an example of recompressing:

 andrew@labvirt1002:~$ sudo su -
 root@labvirt1002:~# cd /var/lib/nova/instances/c9030a35-4475-4581-a84c-1728d27bcf9b
 root@labvirt1002:/var/lib/nova/instances/c9030a35-4475-4581-a84c-1728d27bcf9b# # Make sure that the instance is STOPPED with 'nova stop'
 root@labvirt1002:/var/lib/nova/instances/c9030a35-4475-4581-a84c-1728d27bcf9b# mv disk disk.bak
 root@labvirt1002:/var/lib/nova/instances/c9030a35-4475-4581-a84c-1728d27bcf9b# qemu-img convert -f qcow2 -O qcow2 disk.bak disk
 root@labvirt1002:/var/lib/nova/instances/c9030a35-4475-4581-a84c-1728d27bcf9b# # Restart instance, make sure it is working.
 root@labvirt1002:/var/lib/nova/instances/c9030a35-4475-4581-a84c-1728d27bcf9b# rm disk.bak

Fixing an instance that won't reboot

Occasionally an instance may fail to reboot. You can usually solve this by rebooting via nova, but occasionally that fails as well. You can force a reboot by "destroying" the instance, then telling nova to reboot it, which causes nova to "create" the instance. Of course, "destroy" and "create" really just kill the kvm process and start it again. You should not "delete" or "terminate" the instance.

To force reboot the instance, do the following:

  1. Figure out which host the instance is running on
  2. Destroy the instance (<instance-id> can be found via virsh list):
    virsh destroy <instance-id>
  3. If you see an error like below, then you'll need to restart the libvirt-bin process, then try the destroy
    Timed out during operation: cannot acquire state change lock
  4. Tell nova to reboot the instance via "reboot"

Mounting an instance's disk

Note below that nbd[0-9] means use a number 0-9.

To mount a disk, complete the following steps:

  1. Ensure the instance is not running, unless you want to corrupt the disk
  2. cd /var/lib/nova/<instance>
  3. Mount the disk using qemu-nbd:
    1. If the disk is simply an ext3 image:
      • qemu-nbd -c /dev/nbd[0-9] <disk>
      • mount /dev/nbd[0-9] <mountpoint>
    2. If the disk is a partition inside of an image (TRY THIS FIRST):
      • qemu-nbd --partition=<partition-number> -c /dev/nbd[0-9] <disk>
      • mount /dev/nbd[0-9] <mountpoint>
    3. If the disk is an LVM volume:
      • qemu-nbd -c /dev/nbd[0-9] <disk>
      • vgscan
      • vgchange -ay
      • mount /dev/<volume-group>/<logical-volume> <mountpoint>
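
To pick an unused device for the nbd[0-9] placeholder, something like this hypothetical helper works (it treats a device as free when its kernel size attribute is missing or 0):

```shell
# Hypothetical helper: print the first nbd device with nothing attached.
# A device is considered free when /sys/block/nbdN/size is missing or 0.
free_nbd() {
  for n in 0 1 2 3 4 5 6 7 8 9; do
    if [ "$(cat /sys/block/nbd$n/size 2>/dev/null || echo 0)" = "0" ]; then
      echo "/dev/nbd$n"
      return 0
    fi
  done
  return 1
}
```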

When finished, you should unmount the disk, then disconnect the volume:

  1. If the disk is not an LVM volume:
    • umount <mountpoint>
    • qemu-nbd -d /dev/nbd[0-9]
  2. If the disk is an LVM volume:
    • umount <mountpoint>
    • vgchange -an <volume-group>
    • qemu-nbd -d /dev/nbd[0-9]

Running fsck on an instance's disk

First, you'll need to mount the instance's disk. After doing so, you can simply run fsck against it.

libvirt won't start, complaining about its certificate

Ensure the certificate file doesn't have whitespace.

Trace a vnet device to an instance

VNET=<vnet-device>
for vm in $(virsh list | grep running | awk '{print $2}'); do
  virsh dumpxml $vm | grep -q "$VNET" && echo $vm
done

Get an interactive console to an instance from a host

Note: This is known to not work.

virsh console --devname serial1 <instance id>

Get the live virsh config for an instance

virsh dumpxml <instance id>

Get a screenshot of the instance's "screen"

virsh screenshot <instance id>

Send a keypress to the instance's "keyboard"

virsh send-key <instance id> <keycode>

Where keycode is the linux keycode. Most useful is "28" which is an ENTER.

A list of keycodes can be fetched from;a=blob_plain;f=src/util/keymaps.csv

Get a web-based console and root password

Nova can provide web-based console access to instances using spice-html5. These consoles use a short-lived token (10 minutes, generally) and are somewhat clumsy, but it is possible to log in and look around. Getting a console url and a password looks like this:

andrew@labcontrol1001:~$ sudo su -
root@labcontrol1001:~# source ~/ 
root@labcontrol1001:~# OS_TENANT_NAME=testlabs openstack server list
| ID                                   | Name             | Status | Networks            |
| f509c582-da1c-42c2-abfa-7d484d6ba552 | puppet-testing   | ACTIVE | public= |
| f1925627-7df2-49c8-98bd-1d9f7631eba3 | create-test-101  | ACTIVE | public=  |
| c4bc63f8-cbd7-4384-b349-54b115e91a5c | util-abogott     | ACTIVE | public= |
| 482282c1-2c1d-4063-bd10-3d1babf9585d | relic-relics     | ACTIVE | public= |
root@labcontrol1001:~# openstack console url show --spice-html5 c4bc63f8-cbd7-4384-b349-54b115e91a5c
| Field | Value                                                                                         |
| type  | spice-html5                                                                                   |
| url   |<token> |
root@labcontrol1001:~# cat /var/local/labs-root-passwords/testlabs

Reviving mothballed instances

During the pmtpa->eqiad migration a lot of instances were copied between datacenters without being properly set up in eqiad. If a user requests that one of these instances be revived, the following steps are needed:

  1. reboot the instance, either using the nova commandline or the wikitech interface
  2. Uninstall autofs from the instance: "sudo dpkg --purge autofs5 autofs5-ldap"
  3. Update proxies or floating IPs and DNS as needed
  4. Update puppet. This sometimes just works, but sometimes requires some extra steps:
    1. Edit /etc/puppet/puppet.conf and verify that the puppet master is virt1000
    2. rm -rf /var/lib/puppet/ssl on the instance
    3. re-run puppetd -tv, purge old certs (as directed by the puppet output)
    4. repeat as needed

Maintenance scripts

Puppet installs on the openstack controller. To run nova and glance shell commands without having to add a thousand args to the commandline:

   $ source /root/


The cold-migrate tool will shut down an instance, copy it to the specified target host, and boot it on the new host.

./cold-migrate <args> 7d4a9768-c301-4e95-8bb9-d5aa70e94a64 <destination>

Puppet installs the cold-migrate tool in /root on the nova controller. For ssh reasons, cold-migrate will copy the contents of the VM twice: first from the original host to the nova controller, and then from the controller to the target host. This can take quite a while, so run it in a 'screen' session.


The imagestats script can be run periodically to list which images are currently in use -- it can also answer the question 'what instances use image xxx'? As obsolete images are abandoned they can be deleted from glance to save disk space.

Puppet installs imagestats in /root/novastats on the nova controller. The novastats library is a simple python library which (among other things) creates a dictionary of instance data. It's useful for writing simple one-off scripts during cluster management.

Puppet installs in /root/ on the nova controller. You'll need to source before using any of its functions.

Novastats /should/ use python openstack libraries to talk to nova, but it doesn't. Rather, it uses the openstack commandline tools.

Openstack Upgrade test plan

Upgrading openstack mostly consists of updating config files, changing openstack::version in hiera, and then running puppet a bunch of times. In theory each individual openstack service is compatible with the n+1 and n-1 versions, so the components don't have to be upgraded in a particular order.
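
The version bump itself is a one-line hiera change (the key comes from the text above; the version string here is purely illustrative):

```yaml
# Hypothetical hiera fragment; the version value is illustrative only.
openstack::version: 'liberty'
```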

That said, we have a test cluster, so it's best to run a test upgrade there before rolling things out to prod. Here are things to test:

  • Keystone/Ldap
    • Openstack service list
    • Openstack endpoint list
    • Create new account via wikitech
      • Set up 2fa for new account
      • Verify new user can log in on wikitech
    • Create new project via wikitech
      • Set keystone cmdline auth to new user
      • Verify new user has no access to new project
    • Keystone commandline roles
      • Assign a role to the new account
      • Remove role from new account
    • Wikitech project management
      • Add new user to a project
      • Promote user to projectadmin
      • Verify new user can log in on Horizon
      • Verify new user can view instance page for new project
      • Demote user to normal project member
  • Nova
    • Instance creation
      • verify dns entry created
      • Verify ldap record created
      • ssh access
      • check puppet run output
    • Assignment/Removal of floating IPs
    • Security groups
      • Remove instance from ssh security group, verify ssh is blocked
      • Replace instance in ssh security group, verify ssh works again
      • Add/remove source group and verify that networking between existing and new instances in the same project changes accordingly
    • Instance deletion
      • Verify dns entry cleaned up
      • Verify ldap record removed
  • Glance
    • Openstack image list
    • Create new image
    • Test instance creation with new image
  • Designate
    • Assign/remove dns entry from Horizon
  • Dynamic Proxy
    • Create/delete proxy

Host reboot checklist

  1. 'schedule downtime for this host and all services' in icinga
  2. 'schedule downtime for this host and all services' for in icinga
  3. If DNS will be affected: silence diamond warnings for all labs instances (HOW? Is this even possible?)
  4. If VMs will be affected:
    1. depool all tool exec nodes
    2. failover tools nodes as needed
    3. failover nova-proxy as needed
  5. Reboot host
  6. Wait for host to reboot, verify ssh access still works
  7. If VMs were affected, repool all affected exec nodes

See also