Incident documentation meeting/QR201407/group1/notes
20140318-EventLogging
- migrated to m2 shard, shouldn't have too many load issues in future
- analytics is responsible for responding to alerts that come from analytics
- ops is responsible for generic-looking database alerts
- EL can be down or lagging for up to 48 hours (weekends) - "Tier 2" support
20140328-DB-Queries
- would have been good to have Ariel on the call
- Greg to follow up on explicit next steps with Bryan and Reedy
- Add to next group's list
20140509-EventLogging
- all green :)
- seems all bases are covered here, any disagreement? :)
20140526-m1
- blog work, loop back with RobH re future of that box? HA? etc?
- how far away to get rid of blog?
20140607-Elasticsearch
- still need to create reproducible steps for this to be reported upstream
- still need to manually remove a sick node (on purpose)
20140613-Videoscalers
20140619-parsercache
- MediaWiki failed to stop trying to use the bogged down machine
- Greg: need to get this diagnosed and tracked
- HHVM's impact here?
- proposal 4 related to Rashomon?
20140622-es1006
20140622-imagescaler
20140625-CirrusSearch
- Still have the feature request for scap here