Nova Resource:Nagios/SAL
January 15
- 08:18 andrewbogott: rebooted nagios-dev
September 23
- 15:58 mutante: restarting nagios3 on nagios-main (which is icinga.wmflabs, was down per bug 52560)
June 11
- 18:23 wm-bot: petrb: restarting nagios
May 30
- 14:02 wm-bot: petrb: fixed nlogin o
May 22
- 15:05 wm-bot: petrb: restarted nagios irc bot
May 10
- 15:05 labs-logs-bottie: petrb: restarting nagios bot
March 22
- 22:33 Damianz: Made adding role checks simple - see https://gerrit.wikimedia.org/r/#/c/55424/ for basics
March 19
- 15:03 labs-logs-bottie: petrb: restarted ircecho
March 5
- 19:56 Ryan_Lane: restarted ircecho on nagios-main
February 21
- 09:11 petan: moved bot to nagios channel
February 18
- 11:45 labs-logs-bottie: petrb: restarting feed
February 6
- 21:21 Ryan_Lane: scratch that. it doesn't need to reboot
- 21:21 Ryan_Lane: rebooting nagios-main
January 30
- 09:21 petan: ignoring all swift-be* instances - no one cares about them and they are spamming channel
December 19
- 08:34 petan: rebooting nagios
October 8
- 15:58 labs-logs-bottie: petrb: restarting nagios server because it really needs it
October 6
- 06:55 Damianz: Fixed permissions on rw dir so snmptt can submit trap results.
October 3
- 17:57 Damianz: Implemented ignoring hosts in the rebuild script + restarted puppet/ircecho
October 1
- 08:40 Damianz: Stopped puppet/ircecho again, commented out in crontab this time
September 13
- 15:53 Damianz: fixed parser fetch so it rebuilds when the old file is missing
- 15:42 Damianz: fixed snmp trap config
- 15:15 Damianz: reset rw ownership to snmptt so the puppet check can be entered - probably should switch this to a setuid'd binary or set up group memberships properly.
August 30
- 08:39 Damianz: changing puppet-FAIL command to `echo "Puppet has not run in the last 10 hours" && exit 2` from `/usr/share/nagios3/puppet_check.sh $HOSTADDRESS$`
August 29
- 01:12 Damianz: Free ram plugin seems to be working, 27 hosts it's not working on - probably puppetmaster::self/broken puppet
August 28
- 23:24 Damianz: free ram check merged in, un-commenting service. Not reloading; it should reload on the next parser run, giving time for puppet to run on the instances.
- 22:45 Damianz: Commented out free_ram check for now.
- 22:26 Damianz: Pending change to fix the Free ram checks (21822)
- 22:25 Damianz: Changed snmp host for trap in base::puppet - that should fix Puppet freshness checks on anything not running puppetmaster::self
- 20:58 Damianz: re-setup snmpd/snmptt for puppet freshness checks. Used config from puppet, plugin in /usr/lib/nagios/plugins/eventhandlers
- 18:38 Damianz: parser re-breaks configs, removing -a $ARG2$ from check_nrpe now as nothing gets an arg passed anyway. This should be fixed in a better way so we /can/ use args later.
- 18:20 Damianz: Copied /etc/nagios3/conf.d to /etc/nagios3/conf.d.backup and sed -i 's/check_nrpe/check_nrpe_1arg/g' /etc/nagios3/conf.d/* to fix nrpe checks, need to check the parser.
- 18:19 Damianz: chmod 644 /etc/nagios3/resource.cfg so nagios can read it on reload.
June 20
- 13:15 labs-logs-bottie: root: disabled user check for bastion
June 3
- 13:24 mutante: starting snmptrapd
June 1
- 13:54 labs-logs-bottie: petrb: fixed nagios
May 17
- 12:15 mutante: - started snmptrapd
May 3
- 09:23 mutante: starting snmptrapd
- 07:43 labs-logs-bottie: root: aptitude upgrade
May 2
- 07:01 petan|wk: rebooting
March 20
- 05:18 mutante: - put all the hosts currently down into scheduled downtimes for the next 3 days with manual bash commands
- 04:18 mutante: - temp. changed permissions on external command file per Nagios FAQ, added group "nagiocmd" to see if that allows me to schedule downtimes, it does (independently from the host command perms), but took permissions back due to security concerns
- 03:36 mutante: even though listed in all authorized_for_* commands in cgi.cfg i get denied to execute any by web ui. guess related to the Apache LDAP auth / auto-login
- 03:27 mutante: puppet broken due to "Could not find class misc::apache2"
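
The 05:18 entry above does not record the actual bash commands used. As a rough illustration only, downtimes are usually scheduled by writing `SCHEDULE_HOST_DOWNTIME` lines to the Nagios external command file. The sketch below just builds such lines for a 3-day fixed window. The host names, author, and comment are hypothetical, and the command file path is not given in the log, so the lines are printed rather than written anywhere.

```python
import time


def schedule_host_downtime_line(host, start, duration_s, author, comment, now=None):
    """Build one SCHEDULE_HOST_DOWNTIME line in Nagios external-command format.

    Format: [time] SCHEDULE_HOST_DOWNTIME;host;start;end;fixed;trigger_id;duration;author;comment
    """
    now = int(time.time()) if now is None else int(now)
    end = start + duration_s
    # fixed=1 means a fixed window; trigger_id=0 means not triggered by another downtime.
    return (
        f"[{now}] SCHEDULE_HOST_DOWNTIME;{host};{start};{end};"
        f"1;0;{duration_s};{author};{comment}"
    )


if __name__ == "__main__":
    start = int(time.time())
    three_days = 3 * 24 * 3600
    # Hypothetical host names; the log does not list which hosts were down.
    for host in ["example-host-1", "example-host-2"]:
        print(schedule_host_downtime_line(host, start, three_days, "admin", "host down"))
```

Each printed line would then be appended to the external command file by a user with write access to it, which is the permission issue described in the 04:18 entry.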
February 24
- 08:50 petan|wk: fixed irc