Incidents/2018-03-02 Train
Summary
Train for 1.31.0-wmf.23 was rolled back on two occasions:
- 2018-02-28 05:43:37, because deletion logs on MediaWiki recorded the wrong user as performing deletions (task T188479)
- 2018-02-28 22:11:xx, because of noisy notices in the logs and because every page listed in Special:Newpages showed the current date and time (task T188555)
Timeline
T188479
- 05:21 Stemoc reported in #mediawiki that the deletion logs (Special:Log/delete) were showing the wrong user
- legoktm investigated and created a task; the log entry below attributes a deletion made by legoktm to a different user:
05:29, 28 February 2018 Dharmadeepa V (talk | contribs | block) deleted page User:Dharmadeepa V (spam (this is legoktm)) (view/restore)
2018-02-28 05:39:05 <legoktm> I'd suggest reverting the train, like immediately
05:43:37 +logmsgbot | !log demon@tin rebuilt and synchronized wikiversions files: (no justification provided)
T188555
- 1.31.0-wmf.23 was rolled out to group1 wikis:
2018-02-28 21:56 <thcipriani@tin> rebuilt and synchronized wikiversions files: Group1 to 1.31.0-wmf.23
- Wed, Feb 28, 21:59: thcipriani noticed an increased error rate, with notices pointing to stdclass::$rc_timestamp, and created task T188555
- thcipriani rolled group1 back to 1.31.0-wmf.22:
2018-02-28 22:11 <thcipriani@tin> rebuilt and synchronized wikiversions files: Group1 back to 1.31.0-wmf.22 T188555
- Overnight the problem was resolved and a patch was merged into master; that change was deployed after the train the following day:
[2018-03-01T20:15:45Z] <thcipriani@tin> Synchronized php-1.31.0-wmf.23/includes/specials/pagers/NewPagesPager.php:
SWAT: NewPagesPages: Use array_merge rather than + for RC query info fields T188555 (duration: 01m 14s)
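The fix message mentions replacing + with array_merge. As a rough illustration of why that matters, the sketch below models the two PHP operations in Python. In PHP, `$a + $b` keeps the left-hand value whenever both arrays share a key, including the integer keys 0, 1, 2... of plain lists, while `array_merge()` renumbers integer keys and appends. The field lists here are assumptions chosen for illustration (only rc_timestamp appears in this report), and the helper names are hypothetical; this is not the actual NewPagesPager code.

```python
# A Python model of two PHP array operations. PHP lists are modeled as
# dicts with integer keys. Field names other than rc_timestamp are
# illustrative assumptions, not taken from NewPagesPager.php.

def php_list(*values):
    """Build a PHP-style list: values keyed 0, 1, 2, ..."""
    return dict(enumerate(values))


def php_union(left, right):
    """Model `$left + $right`: on a key collision the left value wins."""
    result = dict(left)
    for key, value in right.items():
        if key not in result:
            result[key] = value
    return result


def php_array_merge(left, right):
    """Model array_merge(): integer keys are renumbered and appended,
    string keys from the right overwrite those from the left."""
    result = {}
    next_index = 0
    for array in (left, right):
        for key, value in array.items():
            if isinstance(key, int):
                result[next_index] = value
                next_index += 1
            else:
                result[key] = value
    return result


# Hypothetical field lists for the query.
extra_fields = php_list("page_namespace", "page_title")
rc_fields = php_list("rc_id", "rc_timestamp", "rc_user")

union_fields = php_union(extra_fields, rc_fields)
merged_fields = php_array_merge(extra_fields, rc_fields)

print("union:", list(union_fields.values()))
print("merge:", list(merged_fields.values()))
```

In this model the union drops rc_id and rc_timestamp because their integer keys collide with keys already present on the left, whereas the merge keeps all five fields. A field missing from the query in this way would be consistent with the notices about stdclass::$rc_timestamp, though the report does not spell out the exact mechanism.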
Conclusions
- A test case probably should have caught the first problem that led to an emergency rollback.
- The second problem seems like something automated browser tests or manual testing could have caught.
Actionables
- phab:T188773 - Test to validate deletion log entries and ArticleDeleteComplete hook performers