Incident documentation/20160918-Wikitech2fa

From Wikitech

Timeline

  • 2016-09-17 While resetting Greg's 2FA on wikitech, Reedy notices that the oathauth_users table is due for a schema update. Reedy also files a security issue in phab:T145915. He runs OATHAuthHooks::schemaUpdateOldUsers() via eval.php and then drops the columns secret_reset, scratch_tokens_reset and is_validated from the wikitech database.
  • 2016-09-20 19:30 valhallasw reports that his 2FA credentials have been disabled -- phab:T145973
  • 2016-09-21 00:00 Reedy diagnoses the problem. Andrew recovers a backup of the labswiki database from earlier on the 16th. Reedy restores the old oathauth_users table into a temporary table, then copies across the records that are present in the older table but missing from the current one.
    • Candidates were identified via SELECT id, user_name FROM oathauth_users_restore INNER JOIN user ON id = user_id WHERE id NOT IN (SELECT id FROM oathauth_users)
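
The recovery step above (restore the backup into a side table, find the rows missing from the live table, copy them across) can be sketched as follows. This is a minimal illustration using an in-memory SQLite database, not the commands actually run on wikitech. The column names secret and scratch_tokens, the demo rows, and the helper names build_demo and restore_missing_rows are assumptions made for the example.

```python
import sqlite3


def build_demo():
    """Create a toy database: a live table missing two rows, plus a restored backup."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE "user" (user_id INTEGER PRIMARY KEY, user_name TEXT);
        CREATE TABLE oathauth_users (
            id INTEGER PRIMARY KEY, secret TEXT, scratch_tokens TEXT);
        CREATE TABLE oathauth_users_restore (
            id INTEGER PRIMARY KEY, secret TEXT, scratch_tokens TEXT);
        INSERT INTO "user" VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
        -- Live table: only user 1 survived, with a newer secret than the backup.
        INSERT INTO oathauth_users VALUES (1, 's1-new', 't1');
        -- Backup table: all three users as they were before the deletion.
        INSERT INTO oathauth_users_restore VALUES
            (1, 's1-old', 't1'), (2, 's2', 't2'), (3, 's3', 't3');
    """)
    return conn


def restore_missing_rows(conn):
    """Copy rows that exist in the backup but not in the live table.

    Returns the (id, user_name) candidates so they can be reviewed or logged.
    Rows already present in the live table are left untouched.
    """
    candidates = conn.execute(
        'SELECT r.id, u.user_name FROM oathauth_users_restore AS r '
        'INNER JOIN "user" AS u ON r.id = u.user_id '
        'WHERE r.id NOT IN (SELECT id FROM oathauth_users) '
        'ORDER BY r.id'
    ).fetchall()
    conn.execute(
        'INSERT INTO oathauth_users (id, secret, scratch_tokens) '
        'SELECT id, secret, scratch_tokens FROM oathauth_users_restore '
        'WHERE id NOT IN (SELECT id FROM oathauth_users)'
    )
    conn.commit()
    return candidates


if __name__ == "__main__":
    conn = build_demo()
    print(restore_missing_rows(conn))
```

Filtering on NOT IN means a user who still has a live row keeps their current secret, so the backup never overwrites credentials that changed after it was taken.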

Casualties

  • Any users who enabled 2FA for the first time, or who disabled and re-enabled it, since [1] was merged. For these users the is_validated column was not populated (taking the default of 0), which made them eligible for deletion in OATHAuthHooks::schemaUpdateOldUsers().
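
The failure mode described above can be reproduced in a toy model. The sketch below is not the OATHAuth code. It assumes, purely for illustration, that the cleanup removes every row whose is_validated is 0 and that older enrolments had the flag set to 1; the function name simulate_cleanup and the sample rows are hypothetical.

```python
import sqlite3


def simulate_cleanup():
    """Show how a column left at its default makes new rows look stale."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE oathauth_users (
            id INTEGER PRIMARY KEY,
            secret TEXT,
            is_validated INTEGER NOT NULL DEFAULT 0
        );
        -- Assumed: enrolled under the old code, which wrote the flag.
        INSERT INTO oathauth_users (id, secret, is_validated)
            VALUES (1, 'old-user', 1);
        -- Enrolled under the new code, which no longer writes the column,
        -- so the row silently takes the default of 0.
        INSERT INTO oathauth_users (id, secret) VALUES (2, 'new-user');
    """)
    # Hypothetical stand-in for the cleanup step of the migration.
    conn.execute("DELETE FROM oathauth_users WHERE is_validated = 0")
    return [row[0] for row in
            conn.execute("SELECT id FROM oathauth_users ORDER BY id")]


if __name__ == "__main__":
    print(simulate_cleanup())
```

In this model the recently enrolled user (id 2) is the one removed, which matches the pattern of casualties listed above.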

Conclusions

  • Migrations, once started, should be seen through across the whole cluster. It is unknown whether wikitech's oathauth_users table had previously been migrated. If the DB columns had already been dropped (as would have happened had the migration been run via update.php), running OATHAuthHooks::schemaUpdateOldUsers() would have been a no-op and no data loss would have occurred. However, running it so long after the code went live would generally cause the same problem. It seems the columns had already been dropped on centralauth.oathauth_users.