Incident documentation/20150320-EventLogging

Summary

A storm of WikiEditor Edit schema events (several hundred per second) caused EventLogging to occasionally drop client-side events.

Server-side events were not affected, but they only constitute about 30% of the pipeline. Client-side events were not entered in the database at all during this outage.

Timeline

Figure: EventLogging inflow of events, wikitext bugs
2015-03-19
The new WikiEditor EventLogging instrumentation was deployed in the wmf21 train.
The Editing Team had not anticipated some use cases; for example, some special wikis were loading the wikitext editor in a non-standard way.
Changesets that solved issues related to wikitext and instrumentation:

Main Phabricator ticket: phab:T93242

2015-03-21
After the changesets were deployed, event inflow diminished and the system returned to normal.

Conclusions

  • We need throttling in EventLogging so that in cases like this we can "shed" a stream of events. This item has been in our backlog for a while: phab:T69470
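
To illustrate what such throttling could look like, here is a minimal sketch of a per-schema token bucket. It is not EventLogging's implementation and does not reflect what phab:T69470 proposes; the class name SchemaThrottle, its parameters (rate, burst, clock) and the schema names used below are hypothetical and chosen only for illustration. The idea is that each schema gets its own budget, so a flood from one schema is shed without starving the others.

```python
import time
from collections import defaultdict


class SchemaThrottle:
    """Hypothetical per-schema token bucket (illustration only)."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = float(rate)    # tokens refilled per second, per schema
        self.burst = float(burst)  # maximum tokens a schema can accumulate
        self.clock = clock         # injectable so the limiter can be tested
        self.tokens = defaultdict(lambda: self.burst)  # buckets start full
        self.last_seen = {}        # schema -> time of the previous event
        self.dropped = defaultdict(int)  # schema -> count of shed events

    def allow(self, schema):
        """Return True if the event may pass, False if it should be shed."""
        now = self.clock()
        elapsed = now - self.last_seen.get(schema, now)
        self.last_seen[schema] = now
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens[schema] = min(
            self.burst, self.tokens[schema] + elapsed * self.rate
        )
        if self.tokens[schema] >= 1.0:
            self.tokens[schema] -= 1.0
            return True
        self.dropped[schema] += 1
        return False
```

A consumer could call allow(schema) for each incoming event and skip the database write when it returns False, while reporting the dropped counters so that shed traffic stays visible. Suitable values for rate and burst would depend on what the pipeline can sustain, which this report does not state.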

Actionables