Check ganglia
check_ganglia
This script has been archived. See task T343707.
check_ganglia is a Nagios/Icinga plugin script that generates alerts based on metric values in Ganglia. It queries gmond instances or the main gmetad via their XML query interface.
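As a rough illustration of what that XML interface looks like from a client's point of view, here is a minimal Python sketch. It assumes a stock gmond, which writes its full XML state to any client that connects to its TCP accept channel and then closes the connection. Port 8649 is the Ganglia default and may not match our configuration. The function names and the SAMPLE document are invented for this example, and this is not the code check_ganglia itself uses.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host, port=8649, timeout=5.0):
    """Read the XML dump gmond sends on connect (8649 is an assumed default)."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # the server closes the connection after the dump
                break
            chunks.append(data)
    return b"".join(chunks)


def metric_value(xml_data, host, metric):
    """Return the VAL attribute of `metric` on `host`, or None if absent."""
    root = ET.fromstring(xml_data)
    for host_el in root.iter("HOST"):
        if host_el.get("NAME") != host:
            continue
        for metric_el in host_el.iter("METRIC"):
            if metric_el.get("NAME") == metric:
                return metric_el.get("VAL")
    return None


# Hand-written, simplified stand-in for a gmond dump (the value is made up).
SAMPLE = b"""<GANGLIA_XML VERSION="3.3.8" SOURCE="gmond">
<CLUSTER NAME="example">
<HOST NAME="analytics1010.eqiad.wmnet">
<METRIC NAME="Hadoop.NameNode.FSNamesystem.CapacityRemainingGB" VAL="1234.5" TYPE="float" UNITS=""/>
</HOST>
</CLUSTER>
</GANGLIA_XML>"""

value = metric_value(
    SAMPLE,
    "analytics1010.eqiad.wmnet",
    "Hadoop.NameNode.FSNamesystem.CapacityRemainingGB",
)
print(value)
```

Against a live aggregator, the SAMPLE bytes would be replaced by the result of fetch_gmond_xml("analytics1010.eqiad.wmnet"). VAL is returned as a string because Ganglia metrics are not always numeric.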
check_ganglia has been imported into our gerrit and debianized: https://gerrit.wikimedia.org/r/#/admin/projects/operations/debs/check_ganglia. It was originally imported from https://github.com/larsks/check_ganglia.
See also: https://rt.wikimedia.org/Ticket/Display.html?id=6602
Puppet Usage
To install a new icinga alert based on a Ganglia metric, use the monitor_ganglia define.
# Set up Icinga monitoring of Kafka broker messages in per second.
# If this drops too low, trigger an alert.
# These thresholds have to be set manually;
# adjust them if you add or remove data from Kafka topics.
monitor_ganglia { 'kafka-broker-MessagesIn':
    description => 'Kafka Broker Messages In',
    metric      => 'kafka.server.BrokerTopicMetrics.AllTopicsMessagesInPerSec.FifteenMinuteRate',
    warning     => ':1500.0',
    critical    => ':1000.0',
    require     => Class['::kafka::server::jmxtrans'],
}
This is just a wrapper for our monitor_service define. See the monitor_ganglia documentation for up-to-date usage.
CLI usage
In addition to driving Nagios/Icinga alerts, check_ganglia is also a useful command-line tool for querying Ganglia directly. It can query either gmond aggregators or gmetad itself.
Querying gmond
Querying gmond aggregator instances is slightly faster than querying gmetad, but requires knowledge of which aggregator instance has data for which nodes.
# analytics1010.eqiad.wmnet is a gmond aggregator.
check_ganglia -g analytics1010.eqiad.wmnet -H analytics1010.eqiad.wmnet -m Hadoop.NameNode.FSNamesystem.CapacityRemainingGB
# or you can even list all ganglia metrics for a given node
check_ganglia -g analytics1010.eqiad.wmnet -H analytics1010.eqiad.wmnet --list
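If you do not know which aggregator holds the data for a node, one option is to probe candidate aggregators until one of them lists the node. The Python sketch below shows that idea under stated assumptions: each aggregator's dump is standard Ganglia XML with HOST elements, and the caller supplies a `fetch` callable that returns the XML bytes for an aggregator. The aggregator names and the FAKE_DUMPS data are hypothetical placeholders, not real hosts.

```python
import xml.etree.ElementTree as ET


def hosts_in_dump(xml_data):
    """Return the set of HOST names found in one Ganglia XML dump."""
    return {h.get("NAME") for h in ET.fromstring(xml_data).iter("HOST")}


def find_aggregator(node, aggregators, fetch):
    """Return the first aggregator whose dump lists `node`, else None.

    `fetch` is any callable mapping an aggregator name to XML bytes,
    for example a function that reads the dump over TCP.
    """
    for aggregator in aggregators:
        try:
            xml_data = fetch(aggregator)
        except OSError:
            continue  # unreachable aggregator: try the next one
        if node in hosts_in_dump(xml_data):
            return aggregator
    return None


# Hypothetical dumps keyed by made-up aggregator names, for demonstration only.
FAKE_DUMPS = {
    "aggregator-a.example": b'<GANGLIA_XML><CLUSTER NAME="a">'
                            b'<HOST NAME="node1.example"/></CLUSTER></GANGLIA_XML>',
    "aggregator-b.example": b'<GANGLIA_XML><CLUSTER NAME="b">'
                            b'<HOST NAME="analytics1010.eqiad.wmnet"/></CLUSTER></GANGLIA_XML>',
}

found = find_aggregator(
    "analytics1010.eqiad.wmnet",
    ["aggregator-a.example", "aggregator-b.example"],
    FAKE_DUMPS.__getitem__,
)
print(found)
```

Passing the fetcher in as a parameter keeps the lookup logic testable without network access. The name it returns is the value to give to -g.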
Querying gmetad
Querying gmetad is slightly easier, because the metrics for all nodes should be available there.
# The -q flag tells check_ganglia to query gmetad (which is on port 8654 for us).
check_ganglia -q -g nickel.wikimedia.org -p 8654 -H analytics1010.eqiad.wmnet --list
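For context, stock gmetad offers an interactive query port that accepts a path of the form /cluster/host/metric, optionally followed by ?filter=summary, and answers with the matching XML subtree. The Python sketch below builds such a path and sends it. It is an assumption that port 8654 on our gmetad is this interactive port and that check_ganglia -q talks to it in this way. The function names are invented for illustration.

```python
import socket


def gmetad_query_path(cluster=None, host=None, metric=None, summary=False):
    """Build a gmetad interactive-port query such as '/cluster/host'."""
    parts = []
    for part in (cluster, host, metric):
        if part is None:
            break  # a deeper level only makes sense below the ones before it
        parts.append(part)
    path = "/" + "/".join(parts)
    if summary:
        path += "?filter=summary"
    return path


def query_gmetad(server, port, path, timeout=5.0):
    """Send one query line to gmetad and return the raw XML reply."""
    chunks = []
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(path.encode("ascii") + b"\n")
        while True:
            data = sock.recv(65536)
            if not data:  # the server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks)


print(gmetad_query_path())              # whole tree
print(gmetad_query_path(summary=True))  # summary view of the whole tree
```

A call such as query_gmetad("nickel.wikimedia.org", 8654, gmetad_query_path("SomeCluster", "analytics1010.eqiad.wmnet")) would request a single host's subtree, where "SomeCluster" is a placeholder for the real Ganglia cluster name.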