Incidents/20150320-EventLogging
Summary
A storm of WikiEditor Edit schema events (several hundred per second) caused EventLogging to occasionally drop client-side events.
Server-side events were not affected, but they only constitute about 30% of the pipeline. Client-side events were not entered in the database at all during this outage.
Timeline
- 2015-03-19
- The new WikiEditor EventLogging instrumentation was deployed in the wmf21 train.
- There were use cases that were not anticipated by the Editing Team; for example, some special wikis were loading the wikitext editor in a non-standard way.
- Changesets that solved issues related to wikitext and instrumentation:
- gerrit:197810
- gerrit:198095 (backported and deployed to wmf21, backported but not yet deployed to wmf22)
- gerrit:198134
Main Phabricator ticket: phab:T93242
- 2015-03-21
- After the changesets were deployed, event inflow diminished and the system returned to normal.
Conclusions
- We need throttling in EventLogging so that in cases like this we can "shed" a stream of events. This item has been in our backlog for a while: phab:T69470
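
As a rough illustration of what such throttling could look like, here is a minimal sketch of a per-schema token bucket. It is not EventLogging code: the class name `SchemaThrottle`, its parameters, and the idea of keying the limit on the schema name are all assumptions made for this example. Each schema may emit up to `rate` events per second with short bursts of up to `burst`; anything beyond that is shed and counted.

```python
import time
from collections import defaultdict


class SchemaThrottle:
    """Hypothetical per-schema token bucket (illustrative only)."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = float(rate)    # tokens added per second, per schema
        self.burst = float(burst)  # maximum tokens a bucket can hold
        self.clock = clock         # injectable clock, useful for testing
        self._buckets = {}         # schema -> (tokens, last_refill_time)
        self.dropped = defaultdict(int)  # schema -> number of shed events

    def allow(self, schema):
        """Return True if an event for `schema` may pass, False to shed it."""
        now = self.clock()
        # A schema seen for the first time starts with a full bucket.
        tokens, last = self._buckets.get(schema, (self.burst, now))
        # Refill in proportion to elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self._buckets[schema] = (tokens - 1.0, now)
            return True
        self._buckets[schema] = (tokens, now)
        self.dropped[schema] += 1
        return False
```

A consumer could call `throttle.allow(schema_name)` before processing each event and skip the event when it returns `False`. Keeping a separate bucket per schema means that one noisy schema, such as Edit in this incident, would be shed without affecting events from other schemas.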
Actionables
- Backfill the data: phab:T93602 (Status: Done)