Portal:Cloud VPS/Admin/VM images

From Wikitech

Cloud VPS uses special VM images that contain all the customizations required for our environment.

Builders

There is a builder server for each Linux distribution, installed with the appropriate build tools:

OS              Builder Server                                     Build Tool
Debian Stretch  cloud-bootstrapvz-stretch.openstack.eqiad.wmflabs  bootstrap-vz
Debian Jessie   labs-bootstrapvz-jessie.openstack.eqiad.wmflabs    bootstrap-vz (modified)
Ubuntu Trusty   vmbuilder-trusty.openstack.eqiad.wmflabs           vmbuilder

How To Build

Debian Stretch

  • Log in to cloud-bootstrapvz-stretch.openstack.eqiad.wmflabs
  • Build and convert image
sudo su -
cd /target
bootstrap-vz --pause-on-error /etc/bootstrap-vz/manifests/labs-stretch.manifest.yaml
qemu-img convert -f raw -O qcow2 debian-stretch-amd64-$(date "+%Y%m%d").raw debian-stretch-amd64-$(date "+%Y%m%d").qcow2
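
The commands on this page recompute $(date "+%Y%m%d") each time, so the file names only match if every step runs on the same day as the build. A minimal sketch of capturing the stamp once and reusing it follows. The helper names image_basename and convert_image are hypothetical and not part of bootstrap-vz.

```shell
# Build the image base name from a release and an explicit YYYYMMDD stamp.
# image_basename is a hypothetical helper, not part of bootstrap-vz.
image_basename() {
    printf 'debian-%s-amd64-%s\n' "$1" "$2"
}

# Convert the raw image to qcow2, deriving both file names from one stamp.
# Same qemu-img invocation as above, only the names are computed once.
convert_image() {
    base="$(image_basename "$1" "$2")"
    qemu-img convert -f raw -O qcow2 "${base}.raw" "${base}.qcow2"
}
```

For example, run stamp="$(date "+%Y%m%d")" right after the build, then convert_image stretch "$stamp", and reuse "$stamp" in the later copy and upload steps.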

Debian Jessie

  • Log in to labs-bootstrapvz-jessie.openstack.eqiad.wmflabs
  • Build and convert image
sudo su -
cd /target
bootstrap-vz /etc/bootstrap-vz/manifests/labs-jessie.manifest.yaml
qemu-img convert -f raw -O qcow2 debian-jessie.raw debian-jessie.qcow2

Custom bootstrap-vz package

Building Jessie images required a custom bootstrap-vz package (not needed for Stretch).

Andrew built python-bootstrap-vz_0.9wmf-1_all.deb using stdeb. It was built from the development branch on 2014-12-22 with commit 255f0624b49dbcf6cacccd3b2f1fa7c0cc2bcc8d and the patch below.

diff --git a/setup.py b/setup.py
index f7b97ac..349cfdc 100644
--- a/setup.py
+++ b/setup.py
@@ -22,11 +22,8 @@ setup(name='bootstrap-vz',
       install_requires=['termcolor >= 1.1.0',
                         'fysom >= 1.0.15',
                         'jsonschema >= 2.3.0',
-                        'pyyaml >= 3.10',
                         'boto >= 2.14.0',
                         'docopt >= 0.6.1',
-                        'pyrfc3339 >= 1.0',
-                        'requests >= 2.9.1',
                         ],
       license='Apache License, Version 2.0',
       description='Bootstrap Debian images for virtualized environments',


  • Alter the version tag in bootstrapvz/__init__.py as needed (for example with vi)
  • Install python-stdeb
  • python setup.py --command-packages=stdeb.command bdist_deb
  • ls deb_dist/*.deb

Ubuntu Trusty (Legacy)

We use vmbuilder to build our custom Ubuntu images. The vmbuilder configuration is in puppet in the labs-vmbuilder module. It can be added to a node using role::labs::vmbuilder. Here's a set of steps to build and import the images:

On vmbuilder-trusty.openstack.eqiad.wmflabs:

puppet agent -tv
cd /srv/vmbuilder
rm -Rf ubuntu-trusty
vmbuilder kvm ubuntu -c /etc/vmbuilder.cfg -d /srv/vmbuilder/ubuntu-trusty -t /srv/vmbuilder/tmp --part=/etc/vmbuilder/files/vmbuilder.partition

Note the name of the tmp file generated; for instance: "Converting /tmp/tmpD0yIQa to qcow2, format /mnt/vmbuilder/ubuntu-trusty/tmpD0yIQa.qcow2"
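
Rather than reading the file name off the terminal, the path can be pulled out of saved build output. This is a small sketch that assumes vmbuilder prints the "Converting ... to qcow2, format ..." line exactly as quoted above. The helper name qcow2_path_from_log is hypothetical.

```shell
# Extract the qcow2 path from vmbuilder's "Converting ... to qcow2" log line.
# qcow2_path_from_log is a hypothetical helper; it reads log text on stdin
# and prints the path from the last matching line.
qcow2_path_from_log() {
    sed -n 's/^.*Converting .* to qcow2, format \(.*\.qcow2\).*$/\1/p' | tail -n 1
}
```

One way to use it is to append 2>&1 | tee build.log to the vmbuilder command and then run qcow2_path_from_log < build.log.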

How To Test

You can boot an image locally for testing, like this:

sudo qemu-system-x86_64 -nographic -serial mon:stdio -enable-kvm image_name.raw

If the command above does not work, you can try the following command (beware: boot logs will be suppressed):

qemu-system-x86_64 image_name.raw --curses

Having a working login account for test purposes is left as an exercise to the reader. bootstrap-vz creates one by default (login:root / passwd:test) but our config wisely disables it.
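
If the first qemu command above fails, one possible cause is that KVM is not available on the machine. That is an assumption, not something this page states. The sketch below prints the same command line and adds -enable-kvm only when the KVM device exists. The helper name qemu_boot_cmd and its second argument are hypothetical.

```shell
# Print a qemu command line for a local test boot of a raw image.
# qemu_boot_cmd is a hypothetical helper. The optional second argument is
# the KVM device path; it defaults to /dev/kvm and is a parameter only so
# that the check can be exercised without real KVM.
qemu_boot_cmd() {
    image="$1"
    kvm_dev="${2:-/dev/kvm}"
    if [ -e "$kvm_dev" ]; then
        accel="-enable-kvm "
    else
        accel=""
    fi
    printf 'qemu-system-x86_64 -nographic -serial mon:stdio %s%s\n' "$accel" "$image"
}
```

Running qemu_boot_cmd image_name.raw prints the command for review; it can then be run with sudo as in the original example.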

How To Deploy

Images are deployed to the OpenStack Glance service.

  • Copy the .qcow2 file to /tmp on the cloudcontrol1003.wikimedia.org server

Since the file has to cross the Cloud VPS / Production boundary, you can copy it from the builder server to your laptop (using your Cloud Services root key) and then from your laptop to cloudcontrol1003 (using your production key):

rsync --progress -v -e ssh root@cloud-bootstrapvz-stretch.openstack.eqiad.wmflabs:/target/debian-stretch-amd64-$(date "+%Y%m%d").qcow2 .
rsync --progress -v -e ssh debian-stretch-amd64-$(date "+%Y%m%d").qcow2 cloudcontrol1003.wikimedia.org:/tmp/
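
Because the image crosses two hops, it can be worth confirming that the copy is intact before uploading it. This is an optional check that is not part of the documented procedure. It assumes sha256sum from coreutils is installed, and the helper names sha256_of and same_digest are hypothetical.

```shell
# Print only the SHA-256 hex digest of a file.
# sha256_of is a hypothetical helper around coreutils' sha256sum.
sha256_of() {
    sha256sum "$1" | cut -d ' ' -f 1
}

# Succeed only if two local files have identical digests.
same_digest() {
    [ "$(sha256_of "$1")" = "$(sha256_of "$2")" ]
}
```

In practice sha256_of would be run on the builder and again on cloudcontrol1003, and the two digests compared by eye or pasted into a string comparison.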

Alternatively, you can open a temporary HTTP server to make this transfer:

cloud-bootstrapvz-stretch:~ $ cd /target
cloud-bootstrapvz-stretch:~ $ python3 -m http.server 80

cloudcontrol1003:/tmp$ wget http://185.15.56.45/debian-stretch-amd64-$(date "+%Y%m%d").qcow2
  • Log in to cloudcontrol1003.wikimedia.org
  • Create new image in Glance:
sudo su -
source ~/novaenv.sh 
cd /tmp
openstack image create --file debian-stretch-amd64-$(date "+%Y%m%d").qcow2 --disk-format "qcow2" --container-format "ovf" --public "debian-9.6-stretch"
  • Test new image by booting a new VM with it (if the image is faulty, remember to delete the test VM and the faulty image)
  • Update fullstackd to use this new image (see T218314 for an example).
  • Get a list of existing images
openstack image list
  • Append "deprecated" to old images and remove properties (only if new image is working as expected)
openstack image set --name "debian-9.5-stretch (deprecated <date>)" <old-image-id>
nova image-meta <old-image-id> delete default
nova image-meta <old-image-id> delete show

Passing --purge-props to openstack image set should be enough to clear all properties but it's currently not available in our OpenStack version. The nova image-meta commands serve the same purpose but you have to delete each property individually. This should be reviewed when OpenStack is upgraded.

  • Make the new image the default for new instances
openstack image set --name "debian-9.6-stretch" <new-image-id> --property show=true --property default=true

Note the use of properties in the openstack image set commands above. If default=true, the image will be the default image selected in the instance creation interface. Purging properties removes the 'default' state.
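
The deprecate-then-promote steps above can be sketched as a dry run that only prints the commands for review. It uses the same openstack image set and nova image-meta invocations shown in this section. The helper names deprecated_name and print_rotation are hypothetical, and the date format is an assumption.

```shell
# Build the name an old image gets once it is superseded.
# deprecated_name is a hypothetical helper; the date format is an assumption.
deprecated_name() {
    printf '%s (deprecated %s)\n' "$1" "$2"
}

# Print (not run) the rotation commands so they can be reviewed first.
# Arguments: old image id, old name, new image id, new name, date.
print_rotation() {
    old_id="$1"; old_name="$2"; new_id="$3"; new_name="$4"; day="$5"
    printf 'openstack image set --name "%s" %s\n' "$(deprecated_name "$old_name" "$day")" "$old_id"
    printf 'nova image-meta %s delete default\n' "$old_id"
    printf 'nova image-meta %s delete show\n' "$old_id"
    printf 'openstack image set --name "%s" %s --property show=true --property default=true\n' "$new_name" "$new_id"
}
```

Printing first keeps the step that changes the default image under manual control; the output lines can be pasted into the root shell once the new image has passed its test boot.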

How To Delete

Don't delete Glance images as long as there are VMs running them:

  • Resizing a VM requires access to the base image it was launched with. See how resizing instances works to understand it.
  • We should always be able to identify the OS an instance is running based on the image name.
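
One way to check whether any VM still depends on an image is to list instances filtered by that image. This is a sketch that assumes the installed openstack client supports the --all-projects and --image filters on server list, and that admin credentials are already loaded. The helper name image_in_use is hypothetical.

```shell
# Succeed if at least one instance in any project was launched from the image.
# image_in_use is a hypothetical helper. It returns 0 when instances exist,
# 1 when none are listed, and 2 when the openstack call itself fails.
image_in_use() {
    servers="$(openstack server list --all-projects --image "$1" -f value -c ID)" || return 2
    [ -n "$servers" ]
}
```

For example: image_in_use <image-id> && echo "still in use, do not delete".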

TODO: fill me.

Private Images

It's possible to restrict an image to a single project or set of projects. This is useful when e.g. gradually deprecating use of an OS.

First, mark the image as private:

# source ~/novaenv.sh
# openstack image set --private <image-id>

At this point only the 'admin' project can view the image. To extend permissions to another project:

root@cloudcontrol1003:~# glance member-create <image-id> <project-id>
+--------------------------------------+-----------+---------+
| Image ID                             | Member ID | Status  |
+--------------------------------------+-----------+---------+
| 957af495-1244-4b05-9f4f-cf68dccb1422 | testlabs  | pending |
+--------------------------------------+-----------+---------+
root@cloudcontrol1003:~# glance member-update <image-id> <project-id> accepted
+--------------------------------------+-----------+----------+
| Image ID                             | Member ID | Status   |
+--------------------------------------+-----------+----------+
| 957af495-1244-4b05-9f4f-cf68dccb1422 | testlabs  | accepted |
+--------------------------------------+-----------+----------+

It's pretty much always a good idea to extend access to the 'observer' project for private images so that read-only tools like openstack-browser can still see what's going on. The novaobserver can use this to view deprecated images but does not have the rights to actually use them (through other restrictions elsewhere).
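
When an image has to be shared with several projects, the two glance commands above can be wrapped in a loop. This is a minimal sketch; the helper name share_image is hypothetical, and it stops at the first failing command.

```shell
# Share a private image with each listed project and accept the membership,
# mirroring the glance member-create / member-update pair shown above.
# share_image is a hypothetical helper.
share_image() {
    image_id="$1"
    shift
    for project in "$@"; do
        glance member-create "$image_id" "$project" || return 1
        glance member-update "$image_id" "$project" accepted || return 1
    done
}
```

For example, share_image <image-id> observer <project-id> covers the 'observer' project and one other project in a single call.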

How To Deactivate Obsolete Images

We used to use the 'show=true' property to ensure that an image appeared on Wikitech. Now we use the image state instead: only images with state=active appear in the GUI (both on Wikitech and in Horizon). To deactivate your obsolete image:

source /etc/novaenv.sh 
openstack image set --deactivate <image-id>

If you need to reactivate it for some reason:

source /etc/novaenv.sh 
openstack image set --activate <image-id>

Please note that we usually just "deprecate" images by changing their names. Deactivating an image is a more extreme step to be used when you do not want any users to have access to it.
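
To confirm that a deactivation or reactivation took effect, the image status can be read back. This sketch assumes the openstack client's standard -f value -c status output options and that Glance reports "active" for usable images. The helper names image_status and is_image_active are hypothetical.

```shell
# Print the Glance status of an image (for example "active").
# image_status is a hypothetical helper; it needs credentials loaded.
image_status() {
    openstack image show "$1" -f value -c status
}

# Succeed only if the image is currently active.
is_image_active() {
    [ "$(image_status "$1")" = "active" ]
}
```

For example: is_image_active <image-id> || echo "image is not active".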

Internals

bootstrap-vz configuration files

Bootstrap-vz uses source files from /etc/bootstrap-vz. These files are puppetized, so you'll want to disable Puppet if you change them.

OS              Configuration File
Debian Stretch  /etc/bootstrap-vz/manifests/labs-stretch.manifest.yaml
Debian Jessie   /etc/bootstrap-vz/manifests/labs-jessie.manifest.yaml


Bootstrap-vz also uses several source files that are standard local config files on the build host. For a complete list of these files, look at the 'file_copy:' section in /etc/bootstrap-vz/manifests/labs-{stretch,jessie}.manifest.yaml
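
A quick way to see which files a manifest copies is to print the src and dst entries that follow the 'file_copy:' key. This is a rough text filter, not a YAML parser. It assumes the entries are written as "src:" and "dst:" keys, which may not match the actual manifests, and the helper name list_file_copy_entries is hypothetical.

```shell
# Print the src:/dst: lines that appear after the file_copy: key.
# list_file_copy_entries is a hypothetical helper. It is a plain text
# filter: it prints matches from file_copy: to the end of the file and
# does not stop at the next plugin section.
list_file_copy_entries() {
    sed -n '/^[[:space:]]*file_copy:/,$p' "$1" |
        grep -E '^[[:space:]]*(-[[:space:]]+)?(src|dst):' |
        sed -E 's/^[[:space:]]*(-[[:space:]]+)?//'
}
```

For example: list_file_copy_entries /etc/bootstrap-vz/manifests/labs-stretch.manifest.yaml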

First Boot

The first boot of the VM image is a key moment in the setup of an instance in Cloud VPS.

This is usually handled by the /root/firstboot.sh script, which is called from /etc/rc.local.

The script performs the following steps:

  • some LVM configuration
  • run DHCP request for configuration
  • name/domain resolution to autoconfigure the VM
  • initial puppet autoconfiguration (cert request, etc)
  • initial configuration of nscd/nslcd
  • initial apt updates
  • NFS mounts if required
  • final puppet run to fetch all remaining configuration (ssh keys, packages, etc)

Until the last point, the instance may have limited connectivity or usability.

Troubleshooting

This troubleshooting section is specific to VM images (in Glance) and is generally only useful when dealing with new VM images.

Common Issues

The issues below come up often with new VM images, although the details may vary from deployment to deployment.

  • Image does not have the puppet master CA, so it fails to fetch catalog (see phab:T181523)
  • Image does not have the puppet master CRL, so it fails to fetch catalog (see phab:T181523)
  • Image doesn't correctly resolve the hostname/domain name (so it fails to fetch its own puppet catalog)
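
The three issues above can be checked quickly from inside a freshly booted instance (or against a mounted image). This is a sketch only: the default ssl directory /var/lib/puppet/ssl and the file names certs/ca.pem and crl.pem are assumptions about the Puppet layout, and the helper names check_puppet_ssl and has_domain are hypothetical.

```shell
# Report which Puppet SSL files are missing or empty under an ssl directory.
# check_puppet_ssl is a hypothetical helper; the default directory and the
# file names are assumptions about the Puppet agent layout.
check_puppet_ssl() {
    ssl_dir="${1:-/var/lib/puppet/ssl}"
    rc=0
    for f in certs/ca.pem crl.pem; do
        if [ ! -s "$ssl_dir/$f" ]; then
            echo "missing or empty: $ssl_dir/$f"
            rc=1
        fi
    done
    return "$rc"
}

# Succeed if the given host name contains a domain part.
has_domain() {
    case "$1" in
        *.*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, check_puppet_ssl followed by has_domain "$(hostname -f)" || echo "hostname has no domain part" gives a first indication of which of the three problems applies.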

How To Inspect Disk Contents

If you want to explore and edit the disk image of a live instance, read the docs at Cloud VPS troubleshooting, mounting an instance disk.

How to fix VM disk corruption

Please read Cloud VPS troubleshooting, fixing VM disk corruption (fsck).

See Also