Nodepool

A Wikimedia Release Engineering project integrating the Nodepool service from OpenStack infrastructure.

Upstream doc: http://docs.openstack.org/infra/nodepool/
Diskimage doc: http://docs.openstack.org/developer/diskimage-builder/
Server: labnodepool1001.eqiad.wmnet
Logs: /var/log/nodepool/
Monitoring: Load for ci-jessie-wikimedia label

Overview

Zuul and the Jenkins Gearman plugin can take a node offline as soon as a build has completed. That gives us disposable machines to run tests on. Nodepool is a Python daemon that replenishes the pool by spawning new instances and garbage collecting the ones that have been used.

The service talks to the wmflabs OpenStack API as user nodepoolmanager and spawns instances in the contintcloud project. Once an instance is ready, Nodepool adds it to the Jenkins master so that it can be allocated a build. On completion of a job, if Zuul has injected the parameter NODE_OFFLINE_WHEN_COMPLETE, the Jenkins Gearman plugin marks the node offline. Nodepool garbage collects the offline nodes it spawned.

The images are built with the OpenStack utility diskimage-builder. The configuration files and build wrappers are in integration/config.git under /dib/. The result is a qcow2 image of Debian Jessie suitable for CI needs. The image is manually uploaded to labnodepool1001 and then to the labs project.

Using that reference image built with dib, Nodepool spawns an instance and then executes a 'setup script' (in integration/config.git /nodepool/scripts). Once the setup has completed, Nodepool asks OpenStack to snapshot the instance.

When adding new instances to the pool, Nodepool boots them from the snapshot and then executes a 'ready script' (in integration/config.git /nodepool/scripts). On success, the instance is marked as ready and added to the Jenkins master as a slave.

Service

The daemon is managed by systemd. Members of the contint-admins group can use sudo to interact with nodepool:

sudo service nodepool status

sudo service nodepool start

sudo service nodepool stop

sudo journalctl -u nodepool

One can also start the service in non-daemon mode as user nodepool with nodepoold -d. Make sure the nodepool service is stopped first.
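
The precaution above can be wrapped in a small helper. This is only a sketch: the function name run_nodepoold_foreground is hypothetical, and it assumes the systemd unit is named nodepool and that the helper is run as the nodepool user.

```shell
# Run nodepoold in the foreground, refusing to start while the systemd
# unit is still active. The function name is hypothetical.
run_nodepoold_foreground() {
  if systemctl is-active --quiet nodepool; then
    echo "nodepool service is running; stop it first: sudo service nodepool stop" >&2
    return 1
  fi
  # -d keeps nodepoold attached to the terminal (non-daemon mode)
  nodepoold -d
}
```

Calling run_nodepoold_foreground returns a non-zero status without starting anything when the service is still active.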

Job migration

To switch the jobs to Nodepool instances, we need to:

  1. Change their Jenkins label to one defined in Nodepool (as of Sept 2015: ci-jessie-wikimedia)
  2. Have the job emit zeromq events (in JJB: properties: - zeromqevent)
  3. In Zuul layout apply the parameter function which injects NODE_OFFLINE_WHEN_COMPLETE (parameter-function: offline_when_complete)

An example is Gerrit change Migrate tox*jessie jobs to Nodepool instances.

Attention: in Zuul only one parameter-function can be applied.

Notes

Configuration is managed in Puppet and generated at /etc/nodepool/nodepool.yaml.

The client used to interact with the database and to issue informative and administrative commands is nodepool.

Administration

Login as nodepool user

Switch to the nodepool user using a login shell (to source ~/.profile, which contains the OpenStack credentials):

user$ become-nodepool
nodepool$

Nodepool CLI

The nodepool command line interacts with the Nodepool database and with the OpenStack API. Hence you will want to run it as the nodepool user to have the OpenStack credentials loaded.

The command operates on spawned instances (nodes) and snapshots (images).

Command                  Description
nodepool list            Spawned instances
nodepool delete <#id>    Delete an instance and depool it from Jenkins
nodepool hold <#id>      Prevent a node from being automatically deleted
nodepool image-list      List snapshots created by Nodepool from the base image
nodepool image-delete    Delete a snapshot
nodepool image-update    Generate a new snapshot from a base image

To list both images and snapshots specific to the contintcloud project, use the openstack CLI:

$ openstack image list --private
+--------------------------------------+-------------------------------+
| ID                                   | Name                          |
+--------------------------------------+-------------------------------+
| eb7d9baa-7d10-4706-995b-c38a8a601e1c | snapshot-ci-trusty-1486399717 |
| f346821e-2833-4564-8a80-3330f65a2644 | image-ci-trusty               |
| 9b6fb7dc-8538-4a77-89b1-8f1d36b4817a | snapshot-ci-jessie-1486397249 |
| 53d55207-5549-417b-81f7-ad1e73ab0797 | image-ci-jessie               |
+--------------------------------------+-------------------------------+

Pass --long to get more details such as the image properties.
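
The snapshot names in the listing above appear to end with a Unix timestamp. Assuming that holds, the newest snapshot for a given prefix can be picked out of the table output with a small filter. The function name latest_snapshot is hypothetical and the sketch only parses text read on stdin.

```shell
# Print the newest snapshot name for a given prefix (e.g. snapshot-ci-jessie)
# from `openstack image list` table output read on stdin.
# Assumption: snapshot names end with "-<unix timestamp>".
latest_snapshot() {
  prefix=$1
  awk -F'|' -v prefix="$prefix-" '
    /^\|/ {
      name = $3                          # columns: ID | Name
      gsub(/^ +| +$/, "", name)          # trim padding
      if (index(name, prefix) == 1) {
        ts = substr(name, length(prefix) + 1)
        if (ts ~ /^[0-9]+$/ && ts + 0 > best) { best = ts + 0; found = name }
      }
    }
    END { if (found != "") print found }
  '
}
```

Usage, as the nodepool user: openstack image list --private | latest_snapshot snapshot-ci-jessie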

Deploy building scripts

The setup and ready scripts are held in integration/config.git in /nodepool/scripts and must be manually deployed on labnodepool1001.eqiad.wmnet:

   $ become-nodepool
   $ cd /etc/nodepool/wikimedia
   $ git pull

Nodepool injects the scripts from there when it generates a new snapshot or spawns an instance.

Diskimage

Creation

Building a diskimage requires root access, hence images are built manually on a labs instance, copied to labnodepool1001 where the appropriate credentials are held, and then uploaded through the OpenStack API.

On a labs instance:

git clone https://gerrit.wikimedia.org/r/p/integration/config.git
cd config/dib
./build_image.sh debian

On a later run you can take advantage of caches by setting DIB_OFFLINE, or just use the debootstrap cache with DIB_DEBIAN_USE_DEBOOTSTRAP_CACHE=1 DIB_DEBOOTSTRAP_CACHE=1:

DIB_OFFLINE=1 ./build_image.sh debian
DIB_DEBIAN_USE_DEBOOTSTRAP_CACHE=1 DIB_DEBOOTSTRAP_CACHE=1 ./build_image.sh debian

For debugging purposes, set DIB_DEBUG_TRACE=1.

Pass env variable break=after-error to get a shell on error.

The script generates a .qcow2 image. Publish it on the instance, for example in /var/www/html/, then connect to labnodepool1001.eqiad.wmnet and retrieve the file.

Hacking

diskimage-builder and our element use the local copy of integration/config.git, hence you can hack the puppet manifest locally to see the result.

Publish on labs

This is done from labnodepool1001.eqiad.wmnet as the nodepool user which has credentials in its environment (via .profile). You first need to rename the obsolete image-ci-jessie image, then upload the newly created one which will effectively replace it.

First list the current images attached to the project for later reference:

$ become-nodepool
$ openstack image list --private
+--------------------------------------+-----------------+
| ID                                   | Name            |
+--------------------------------------+-----------------+
| 46150ea9-3a0f-4157-9179-239f54df8117 | image-ci-jessie |
+--------------------------------------+-----------------+

Rename the current image and upload the new one:

$ openstack image set --name image-ci-jessie_old_20170207 image-ci-jessie
$ openstack image create --file <NAME OF IMAGE>.qcow2 image-ci-jessie --disk-format qcow2

Confirm with openstack image list --private by looking at the ID:

$ openstack image list --private
+--------------------------------------+------------------------------+
| ID                                   | Name                         |
+--------------------------------------+------------------------------+
| 9299db8b-6125-4129-8c90-1de1bb1dfbc7 | image-ci-jessie              |
| 46150ea9-3a0f-4157-9179-239f54df8117 | image-ci-jessie_old_20170207 |
+--------------------------------------+------------------------------+

Important: the name image-ci-jessie must be the same as the image name configured in /etc/nodepool/nodepool.yaml (nodepool image-list shows it as well).
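
The rename and upload steps above can be sketched as a helper that prints the two commands for review instead of running them. The function names old_image_name and publish_image_commands are hypothetical, and the _old_YYYYMMDD suffix simply follows the naming seen in the listing.

```shell
# Build the name given to the image being retired, e.g.
# image-ci-jessie_old_20170207. The date defaults to today (YYYYMMDD).
old_image_name() {
  image=$1
  stamp=${2:-$(date +%Y%m%d)}
  printf '%s_old_%s\n' "$image" "$stamp"
}

# Print the rename and upload commands; nothing is executed here.
publish_image_commands() {
  image=$1
  file=$2
  stamp=${3:-$(date +%Y%m%d)}
  printf 'openstack image set --name %s %s\n' \
    "$(old_image_name "$image" "$stamp")" "$image"
  printf 'openstack image create --file %s %s --disk-format qcow2\n' \
    "$file" "$image"
}
```

Printing the commands first leaves room to check the image name against /etc/nodepool/nodepool.yaml before anything is changed in OpenStack.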

Next, you will want to update the snapshot used to boot instances from. See next section.

Manually generate a new snapshot

Nodepool automatically generates new snapshots. After updating the image, puppet or the scripts, you will want to manually force a run. Nodepool will:

  • boot the base image (ci-jessie-wikimedia)
  • copy the setup scripts from /etc/nodepool/wikimedia/nodepool/scripts
  • ssh as the debian user and execute setup_node.sh
  • take a snapshot of the instance

New instances will be spawned out of that new snapshot.

Make sure the deployed scripts are up to date, then run nodepool image-update:

user$ become-nodepool
nodepool$ git -C /etc/nodepool/wikimedia/ pull
...
nodepool$ nodepool image-update wmflabs-eqiad snapshot-ci-jessie

In case the new snapshot causes issues, delete it with nodepool image-delete. Nodepool will then spawn new instances from the previous snapshot.

Retrieve a snapshot for inspection

Sometimes you will want to inspect the generated snapshot. Use the OpenStack CLI to get the image and save it locally:

user$ become-nodepool
nodepool$ openstack image save --file snapshot-ci-jessie-1486397249.qcow2 snapshot-ci-jessie-1486397249
nodepool$ 

You can then inspect it locally, e.g.:

virt-ls -R -a snapshot-ci-jessie-1486397249.qcow2 /etc/default

Connect to an instance

The image we create is not connected with the wmflabs LDAP server (see Provision CI:admins ssh public key in Nodepool instances). As of Sept 2015, one way to connect is to use the nodepool ssh credentials:

   user$ become-nodepool
   nodepool$ openstack server list
   nodepool$ ssh jenkins@<IP OF INSTANCE>

For debugging purposes, one can spawn an instance with an ssh key pair attached on boot, which grants root access.

Generate an ssh key pair as nodepool:

   user$ become-nodepool
   nodepool$ ssh-keygen -t rsa -f somename.key
   nodepool$ chmod go-r somename.key*

Get the content of somename.key.pub.

In Horizon, head to the contintcloud project and, under Instances, use the Launch Instance button at the top right above the instance list.

A modal window will open; fill in the various fields. You will need the flavor m1.medium; as boot source select Boot from snapshot and pick the instance snapshot you want to start. Then head to the Access & Security tab, which lists the available key pairs; the + button lets you add the public key generated previously.

Once the instance has spawned, cloud-init will retrieve the public key and inject it in the instance. For a Jessie image that will let you ssh as the debian user which has sudo access:

   nodepool$ ssh -i somename.key debian@<IP OF INSTANCE>
   debian@instance$ sudo su -
   root@instance# 

Aliens

Due to bugs or crashes, Nodepool might lose track of snapshots or instances that still exist in OpenStack. When it encounters unaccounted material referenced in the OpenStack API, Nodepool flags it as alien. One can get a list via nodepool alien-list and nodepool alien-image-list. Example:

$ nodepool alien-list
+---------------+---------------------------+--------------------------------------+--------------+
| Provider      | Hostname                  | Server ID                            | IP           |
+---------------+---------------------------+--------------------------------------+--------------+
| wmflabs-eqiad | ci-jessie-wikimedia-22712 | 1efc7468-1e37-405b-b5f0-ea5c0dfe76e5 | None         |
| wmflabs-eqiad | ci-jessie-wikimedia-18903 | 72e43729-6467-4e27-a3c1-1f0a5bbba1bd | 10.68.16.240 |
+---------------+---------------------------+--------------------------------------+--------------+

Such instances must be deleted via the OpenStack system:

$ openstack server delete ci-jessie-wikimedia-18903
$ openstack server delete ci-jessie-wikimedia-22712
$ nodepool alien-list
+----------+----------+-----------+----+
| Provider | Hostname | Server ID | IP |
+----------+----------+-----------+----+
+----------+----------+-----------+----+
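
When many aliens pile up, the delete commands can be generated from the alien-list table. The sketch below assumes the column layout shown above (Provider, Hostname, Server ID, IP); the function names are hypothetical, and the commands are only printed so they can be reviewed before being run.

```shell
# Print the Hostname column of `nodepool alien-list` output read on stdin.
alien_hostnames() {
  awk -F'|' '
    /^\|/ {
      host = $3                          # Provider | Hostname | Server ID | IP
      gsub(/^ +| +$/, "", host)          # trim padding
      if (host != "" && host != "Hostname") print host
    }
  '
}

# Print (rather than run) one delete command per alien instance.
alien_delete_commands() {
  alien_hostnames | while read -r host; do
    printf 'openstack server delete %s\n' "$host"
  done
}
```

Usage, as the nodepool user: nodepool alien-list | alien_delete_commands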

What's in the nodepool instance queue?

$ nodepool list | grep -e ^\| | grep -v State | cut -d\| -f 11 | sort | uniq -c
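
The one-liner above depends on the position of the State column. A variant that looks the column up by its header name could look like the sketch below; the function name count_node_states is hypothetical and the only assumption is that the table has a column titled State.

```shell
# Count nodes per state from `nodepool list` table output read on stdin.
# The State column is located by its header name, not by a fixed index.
count_node_states() {
  awk -F'|' '
    /^\|/ && col == 0 {                  # header row: find the State column
      for (i = 1; i <= NF; i++) {
        name = $i
        gsub(/^ +| +$/, "", name)
        if (name == "State") col = i
      }
      next
    }
    /^\|/ && col > 0 {                   # data rows: tally the state
      state = $col
      gsub(/^ +| +$/, "", state)
      count[state]++
    }
    END {
      for (s in count) printf "%d %s\n", count[s], s
    }
  ' | sort -k2
}
```

Usage: nodepool list | count_node_states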

Debugging diskimages

Inspection with libguestfs-tools

libguestfs-tools offers a set of command line tools to interact with virtual machine images. Install it with sudo apt-get install libguestfs-tools.

The diagnostic command is: sudo libguestfs-test-tool

You might need to update the supermin base image: sudo update-guestfs-appliance

Boot the image locally

First, get the system partition:

$ virt-rescue --add image-jessie.qcow2 --suggest
...
mount /dev/sda1 /sysroot/
...

Hook in the image in readonly mode:

host$ virt-rescue --ro --add image-jessie.qcow2
<rescue> mount /dev/sda1 /sysroot/
<rescue> chroot /sysroot
bash
root@(none):/# 


List files inside an image:

   $ virt-ls --add ci-dib-jessie-wikimedia-1436281442.qcow2 --recursive /etc/network
   ...
   $

Dump content of a file:

   $ virt-cat -a ci-dib-jessie-wikimedia-1436281442.qcow2 \
              /etc/network/interfaces.d/eth0
   auto eth0
   iface eth0 inet dhcp
   $

virt-inspector prints XML listing the OS metadata and installed packages:

   $ virt-inspector ci-dib-jessie-wikimedia-1436281442.qcow2
   <?xml version="1.0"?>
   <operatingsystems>
     <operatingsystem>
       <root>/dev/sda1</root>
       <name>linux</name>
       <arch>x86_64</arch>
       <distro>debian</distro>
  ....

See dpkg -L libguestfs-tools|grep /bin/ for more.

Mount a qcow2 image

guestfs

Never mount a running VM read/write, that will corrupt it. Nodepool disk images should be fine though.

mkdir "$HOME/mount"
guestmount  --inspector --add ci-dib-jessie-wikimedia-1436281442.qcow2 --ro "$HOME/mount"
ls -l "$HOME/mount"

Then:

guestunmount "$HOME/mount"
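
To avoid leaving a stale mount behind, the mount, inspect and unmount steps above can be combined. This is a sketch only: the function name with_image_mounted is hypothetical, and it uses a temporary directory from mktemp instead of "$HOME/mount".

```shell
# Mount an image read-only, run a command inside the mount point, then
# always unmount and remove the temporary directory.
with_image_mounted() {
  image=$1
  shift
  mnt=$(mktemp -d) || return 1
  if ! guestmount --inspector --add "$image" --ro "$mnt"; then
    rmdir "$mnt"
    return 1
  fi
  status=0
  # Run the given command from inside the mounted filesystem
  ( cd "$mnt" && "$@" ) || status=$?
  guestunmount "$mnt"
  rmdir "$mnt"
  return "$status"
}
```

Usage: with_image_mounted ci-dib-jessie-wikimedia-1436281442.qcow2 ls -l etc/network

The exit status of the command is passed through, so the helper can be used in scripts that check for the presence of a file in the image.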

qemu (needs root)

You might want to inspect a dib image manually. To do so, as root:

modprobe nbd max_part=16
qemu-nbd -c /dev/nbd0 /srv/dib/images/XXXXXX.qcow2
partprobe /dev/nbd0
mkdir "/tmp/$USER-image-inspect"
mount /dev/nbd0p1 "/tmp/$USER-image-inspect"

Once done:

umount "/tmp/$USER-image-inspect" && rmdir "/tmp/$USER-image-inspect"
qemu-nbd --disconnect /dev/nbd0

Nodepoold diagnostic

Log files

File Description
/var/log/nodepool/nodepool.log Nodepoold INFO+
Nodepoold DEBUG+
Snapshot refresh

Stack dump

When nodepoold receives the SIGUSR2 signal, it emits a stack trace for each of its threads at DEBUG level on the nodepool.stack_dump channel, which is logged in /var/log/nodepool/debug.log. Edited example:

2016-01-04 19:51:16,056 DEBUG nodepool.stack_dump:
Thread: wmflabs-eqiad (140297227978496)
Thread: DiskImageBuilder queue (140297245812480)
Thread: MainThread (140297871116032)
  File "/usr/lib/python2.7/dist-packages/nodepool/cmd/nodepoold.py", line 52, in stack_dump_handler
    log_str += "".join(traceback.format_stack(stack_frame))
Thread: gallium.wikimedia.org (140297641219840)
Thread: NodePool (140297732646656)
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/dist-packages/nodepool/nodepool.py", line 1814, in run
    time.sleep(self.watermark_sleep)
Thread: APScheduler (140297254205184)
  File "/usr/lib/python2.7/threading.py", line 359, in wait
    _sleep(delay)
Thread: Gearman client connect (140297598207744)
Thread: Gearman client poll (140297606600448)
  File "/usr/lib/python2.7/dist-packages/gear/__init__.py", line 788, in _pollLoop
    ret = poll.poll()
Thread: NodeUpdateListener (140297614993152)
  File "/usr/lib/python2.7/dist-packages/nodepool/nodepool.py", line 172, in run
    m = self.socket.recv().decode('utf-8')

This helps in finding potential deadlocks.
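
Requesting such a dump can be sketched as follows. The function names are hypothetical, and the sketch assumes that nodepoold is the main process of the nodepool systemd unit and that the caller is allowed to signal it (for example the nodepool user or root).

```shell
# Print the main PID of the nodepool systemd unit (0 when not running).
nodepool_main_pid() {
  systemctl show -p MainPID nodepool | cut -d= -f2
}

# Ask nodepoold for a stack dump, then show the end of the debug log.
nodepool_stack_dump() {
  pid=$(nodepool_main_pid)
  if [ -z "$pid" ] || [ "$pid" = 0 ]; then
    echo "nodepool does not appear to be running" >&2
    return 1
  fi
  kill -USR2 "$pid" || return 1
  sleep 1                                # give the daemon time to log
  tail -n 50 /var/log/nodepool/debug.log
}
```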

OpenStack API logs

The OpenStack API logs can be checked directly on the labnet* servers. You will need to be a member of the labnet-users Unix group. Then:

   ssh labnet1002.eqiad.wmnet tail -F /var/log/nova/*.log

Snapshot refresh not working

Although the nodepool CLI accepts a --debug parameter, it does not show much. To capture the actual HTTP requests and responses exchanged with the OpenStack API, rely on strace, for example:

strace -f -e recvfrom,sendto -s 1024 nodepool image-update wmflabs-eqiad snapshot-ci-jessie

For details: https://phabricator.wikimedia.org/T127755#2053150

See also