Fundraising Monitoring

From Wikitech

This page documents the existing monitoring infrastructure for fundraising and tracks the monitoring functionality we would still like to have.

Existing monitoring infrastructure

Log monitoring

Minfraud log

We currently monitor the 'minfraud' log, which is aggregated from the payments cluster to Loudon.

Monitoring wishlist

See RT ticket #405

  • Hudson
    • Nagios check for alive-ness
    • Nagios check for failed builds
      • Note: some scripts run by Hudson need to be modified to return a non-zero exit status when they don't complete properly (e.g. the send/receive mail scripts for civimail); otherwise Hudson records the build as successful and the check has nothing to detect
    • Nagios check for too many files in build folders (if the limit of 63999 gets hit, builds will fail)
  • ActiveMQ
    • Nagios check for queues filling up too fast
  • Service communication times
    • Nagios checks for timeouts/unacceptably high communication times
  • 3rd party service accessibility from payments cluster
    • Nagios check for communications access to MaxMind/PayPal
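
As an illustration of what one of the wishlist items could look like, here is a minimal sketch of a check for too many files in a Hudson build folder. It follows the standard Nagios plugin convention of exit codes 0 (OK), 1 (WARNING), 2 (CRITICAL) and 3 (UNKNOWN) plus a one-line status message. The function names, the "BUILDS" label and the 80%/95% thresholds are assumptions made for this example, not an existing plugin; only the 63999 limit comes from the list above.

```python
import os

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Limit quoted in the wishlist; builds fail once it is reached.
ENTRY_LIMIT = 63999


def check_entry_count(path, warn_ratio=0.80, crit_ratio=0.95, limit=ENTRY_LIMIT):
    """Return (exit_code, message) for the number of entries in `path`.

    The ratios are hypothetical defaults: warn at 80% of the limit,
    go critical at 95%.
    """
    try:
        count = len(os.listdir(path))
    except OSError as exc:
        # Missing or unreadable folder: the check cannot give an answer.
        return UNKNOWN, f"BUILDS UNKNOWN - cannot read {path}: {exc}"

    if count >= limit * crit_ratio:
        return CRITICAL, f"BUILDS CRITICAL - {count}/{limit} entries in {path}"
    if count >= limit * warn_ratio:
        return WARNING, f"BUILDS WARNING - {count}/{limit} entries in {path}"
    return OK, f"BUILDS OK - {count}/{limit} entries in {path}"


def main(argv):
    """Plugin entry point: print one status line and return the exit code."""
    if len(argv) != 2:
        print("BUILDS UNKNOWN - usage: check_build_entries.py <build_dir>")
        return UNKNOWN
    code, message = check_entry_count(argv[1])
    print(message)
    return code
```

To run this as a plugin, the script would end with a call such as sys.exit(main(sys.argv)), and Nagios would be given the path of the build folder to watch as the only argument. Warning well before the limit is the point of the check: once the limit is reached, builds are already failing.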