Monitoring/puppet checkpuppetrun

From Wikitech

This Icinga alert says a puppet agent failed to run.

Some things you can do are:

  • SSH to the affected host and run Puppet manually with "sudo puppet agent -tv" to see verbose output.
  • Check the output for errors. Sometimes issues are transient or caused by the puppetmaster. If you do see an error, try to find out which change caused it.
  • Leave a comment on the change that broke it and/or ping the user who merged it on IRC. If appropriate, upload a fix yourself and merge it.
  • If that fails, or if the Icinga web UI shows that this alert has been triggering for a long time, create a ticket about it.

If you have found a fix and want to run a command only on the hosts where the last Puppet run failed, you can use the following hack. Pay close attention to the cumin host selection and to the command you are running, so that you don't run it on the wrong systems. The following example just runs hostname on all servers with a failed Puppet run:

$ sudo cumin '*' "bash -c '. /usr/local/share/bash/puppet-common.sh; last_run_success' || hostname"
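
The hack relies on the shell's `||` operator: the command on the right runs only when the command on the left exits with a non-zero status. Below is a minimal local sketch of that pattern. It uses a hypothetical stand-in for last_run_success, because the real function comes from puppet-common.sh and its internals are not shown here. The FAKE_LAST_RUN variable and the run_if_puppet_failed helper are also made up for illustration.

```shell
#!/bin/bash
# Hypothetical stand-in for last_run_success from puppet-common.sh.
# Assumption: the real function returns 0 when the last Puppet run
# succeeded and non-zero otherwise. Here the result is driven by a variable.
last_run_success() {
    [ "${FAKE_LAST_RUN:-ok}" = "ok" ]
}

# Run the given command only if the (stubbed) last run failed.
run_if_puppet_failed() {
    last_run_success || "$@"
}

FAKE_LAST_RUN=ok
run_if_puppet_failed echo "would run on this host"   # prints nothing

FAKE_LAST_RUN=failed
run_if_puppet_failed echo "would run on this host"   # prints the message
```

In the cumin one-liner, hostname plays the role of the command on the right. Running it first is a cheap way to confirm which hosts are selected before substituting the command you actually want to run.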