Incident documentation/20131030-Wikidata-ZeroRatedMobileAccess

From Wikitech

Background

ZeroRatedMobileAccess has always depended on MobileFrontend (MF) and used it liberally, including direct calls to its classes. However, those calls were made from hooks invoked by MF, so in the absence of MF Zero simply did nothing. This changed in [1], where Zero started using a ResourceLoader module from MF.

What happened

At 23:02 UTC, after Zero extension updates were deployed, fatalmonitor was flooded with:

Fatal error: Class 'MFResourceLoaderModule' not found in
/usr/local/apache/common-local/php-1.23wmf1/includes/resourceloader/ResourceLoader.php on line 408

The issue was tracked down to Wikidata having MobileFrontend disabled while ZeroRatedMobileAccess was enabled. Page views were not affected directly; however, every load.php call that requested the startup module caused a fatal error because it attempted to instantiate the MFResourceLoaderModule class and could not find it. As a consequence, people might have seen pages without styles or scripts.

A number of people (MaxSem, Reedy, Roan, Greg, and possibly others) gave great assistance in tracking down the issue and rapidly disabling the ZeroRatedMobileAccess extension on Wikidata. Furthermore, the mobile configuration change [2] will add an additional guard against loading ZeroRatedMobileAccess.php unless it is explicitly within the context of MF.

Thank you to everyone!!!

Timeline

All times in UTC

  • 22:48 Zero 1.22wmf22 deployed, no errors
  • 23:02 Zero 1.23wmf1 deployed, first errors appear - initially unnoticed
  • 23:08 A small MobileFrontend change deployed
  • 23:09 Errors noticed, initially linked with MobileFrontend push
  • 23:17 Max reverts his MobileFrontend changes, errors don't go away
  • 23:22 Problem narrowed down
  • 23:27 Fix deployed

Recommendations

  • Allow a bit more time between deployments and observe fatalmonitor before and after
  • Ensure the Zero extension checks that the MobileFrontend extension is loaded before enabling itself if it relies on MFResourceLoaderModule.
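
The second recommendation could take roughly the following shape. This is only a minimal sketch under stated assumptions, not the guard that was actually deployed: the module name 'zero.example' is hypothetical, and the real Zero module definitions are not reproduced here. It relies only on PHP's class_exists(), which triggers autoloading, and on MediaWiki's $wgResourceModules registry.

```php
<?php
// Sketch only: 'zero.example' is a hypothetical module name, not the actual Zero code.
// Register the MF-based ResourceLoader module only when its class can be loaded.
// class_exists() triggers autoloading, so it returns false on a wiki where
// MobileFrontend is not enabled (as was the case on Wikidata).
if ( class_exists( 'MFResourceLoaderModule' ) ) {
	$wgResourceModules['zero.example'] = array(
		'class' => 'MFResourceLoaderModule',
	);
}
```

With a check of this kind, a wiki without MobileFrontend never registers a module whose class ResourceLoader cannot instantiate, so load.php requests for the startup module would not hit the fatal error shown above.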

Follow-up

  • In general, deployment calendar enforcement appears to have become stricter.
  • Guard rail code was put in place.