Switch Datacenter/planned db maintenance
2020 Switch Datacenter
Important dates
- Switchover date: Tuesday, September 1st 2020: 14:00 UTC
- Switchback date: TBD
Failover schedule
- 27th Aug - LAST DB maintenance day in all DCs
- 27th Aug - Enable replication codfw -> eqiad
- 27th & 28th Aug - Review codfw weights. In particular make sure section distribution and loads are ok
- 27th, 28th, 29th, 31st Aug - Definitive Warmup (pc, es, maybe some big tables on big wikis)
- 1st Sept - 14:00 UTC Switchover to codfw support and monitoring
- 3rd Sept - Disconnect codfw -> eqiad replication
- 3rd Sept - Maintenance can start on eqiad
- TBD - Review eqiad weights
- TBD - Enable eqiad -> codfw replication
- TBD - 14:00 UTC Switchover support and monitoring
Tasks
- META task for all the tasks: https://phabricator.wikimedia.org/T243316
- Schema change on production to increase the size of wbt_text_in_lang.wbxl_language https://phabricator.wikimedia.org/T237120
- Switchover s8 primary database master db1109 -> db1104 https://phabricator.wikimedia.org/T239238
- Compress new Wikibase tables https://phabricator.wikimedia.org/T232446
- Normalise MW Core database language fields length https://phabricator.wikimedia.org/T253276
- Compress enwiki InnoDB tables https://phabricator.wikimedia.org/T254462
- Apply updates for MCR https://phabricator.wikimedia.org/T238966
- pl_namespace index on pagelinks is unique only in s8 https://phabricator.wikimedia.org/T256685
- Schema change to make change_tag.ct_rc_id https://phabricator.wikimedia.org/T259831
- Host that needs rebooting for kernel upgrade https://phabricator.wikimedia.org/T261389
- Failover DB masters in row D https://phabricator.wikimedia.org/T186188
2018 Switch Datacenter
Important dates
- Switchover date: Wednesday, September 12th 2018: 14:00 UTC
- Switchback date: Wednesday, October 10th 2018: 14:00 UTC
Failover schedule
- 5th Sept - LAST DB maintenance day in all DCs
- 6th Sept - Enable replication codfw -> eqiad
- 6th Sept - Review db-codfw.php. In particular make sure section distribution and loads are ok
- 11th Sept - Definitive Warmup (pc, es, maybe some big tables on big wikis)
- 12th Sept - 14:00 UTC Switchover to codfw support and monitoring
- 4th Oct - LAST DB maintenance day on eqiad
- 5th Oct - Review db-eqiad.php
- 8th Oct - Enable eqiad -> codfw replication
- 10th Oct - 14:00 UTC Switchover support and monitoring
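The spacing of the 2018 pre-switchover steps can be expressed as day offsets from the switchover date. The following is only a sketch: the offsets are read off the schedule above, while `OFFSETS_2018` and `build_schedule` are hypothetical names introduced for illustration, and other years used different spacing.

```python
from datetime import date, timedelta

# Day offsets relative to the switchover, read off the 2018 schedule above.
# Illustrative only: the 2017 and 2020 schedules are spaced differently.
OFFSETS_2018 = [
    (-7, "LAST DB maintenance day in all DCs"),
    (-6, "Enable replication codfw -> eqiad"),
    (-6, "Review db-codfw.php"),
    (-1, "Definitive Warmup"),
    (0, "14:00 UTC Switchover to codfw support and monitoring"),
]


def build_schedule(switchover, offsets=OFFSETS_2018):
    """Return (date, step) pairs for a given switchover date."""
    return [(switchover + timedelta(days=days), step) for days, step in offsets]


schedule = build_schedule(date(2018, 9, 12))
for day, step in schedule:
    print(f"{day:%d %b} - {step}")
```

Passing a different switchover date gives a first draft of the dates, which would still need checking against weekends and other maintenance windows.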
Tasks
- META task for all the tasks: https://phabricator.wikimedia.org/T189107
- UPGRADE MARIADB on eqiad masters: https://phabricator.wikimedia.org/T204311
- SCHEMA CHANGE (only 6 masters pending): Fix WMF schemas to not break when comment store goes WRITE_NEW https://phabricator.wikimedia.org/T187089
- UPGRADE MARIADB: Upgrade db1062 and db1075 MARIADB: https://phabricator.wikimedia.org/T181777
- UPGRADE SOCKET LOCATION: Upgrade db1062 and db1075 socket location https://phabricator.wikimedia.org/T148507
  - Pending only db1073 (m5 master) - stalled waiting for the network maintenance
- UPGRADE KERNELS: Upgrade kernels in eqiad masters (or "difficult hosts"):
  - https://phabricator.wikimedia.org/P7510
  - https://phabricator.wikimedia.org/P6592
  - https://phabricator.wikimedia.org/T184267
- SCHEMA CHANGE (only masters pending): Make several mediawiki table fields unsigned ints on wmf databases https://phabricator.wikimedia.org/T89737
- SCHEMA CHANGE (Only s4 master pending): Drop eu_touched in production https://phabricator.wikimedia.org/T144010
- SCHEMA CHANGE: Deploy schema change for adding numeric primary key to wbqc_constraints table https://phabricator.wikimedia.org/T189101
- SCHEMA CHANGE: Remove partitioning from metawiki.pagelinks from s7 masters (eqiad and codfw): https://phabricator.wikimedia.org/T203548
- RECLONE HOST: Reclone db1114 (s1 api): https://phabricator.wikimedia.org/T203565
- RECLONE HOST: Reclone db2054 and db2068 (s7): https://phabricator.wikimedia.org/T204127
- (DECIDED NOT TO BE DONE) Failover masters on row D (stalled task as per the network issues): https://phabricator.wikimedia.org/T186188
2017 Switch Datacenter
Important dates
- Switchover date: Wednesday April 19th, 14:00 UTC
- Switchback date: Wednesday May 3rd, 14:00 UTC
Failover schedule
- 7th Apr - Warm up test for es
- 12th Apr - LAST DB maintenance day in all DCs
- 12th Apr - Review db-codfw.php
- 18th Apr - Definitive Warmup
- 19th Apr - 14:00 UTC Switchover to codfw support and monitoring
- 28th Apr - Review db-eqiad.php
- 28th Apr - LAST DB maintenance day on eqiad
- 3rd May - 14:00 UTC Switchover support and monitoring
Tasks
- ALTER eqiad shards: https://phabricator.wikimedia.org/T130067
  ALTER TABLE watchlist ADD COLUMN wl_id int unsigned NOT NULL AUTO_INCREMENT FIRST, ADD PRIMARY KEY (wl_id);
- ALTER eqiad shards: https://phabricator.wikimedia.org/T147166
  ALTER TABLE tag_summary ADD COLUMN ts_id INT UNSIGNED NOT NULL AUTO_INCREMENT FIRST, ADD PRIMARY KEY (ts_id);
  ALTER TABLE change_tag ADD COLUMN ct_id INT UNSIGNED NOT NULL AUTO_INCREMENT FIRST, ADD PRIMARY KEY (ct_id);
- ALTER eqiad enwiki master: https://phabricator.wikimedia.org/T132416
  ./osc_host.sh --host=db1052.eqiad.wmnet --port=3306 --db=enwiki --table=revision --method=ddl --no-replicate "drop key rev_id, drop primary key, add primary key (rev_id), add key rev_page_id (rev_page,rev_id)"
- ALTER eqiad s4 master: https://phabricator.wikimedia.org/T73563
  ALTER TABLE filearchive MODIFY COLUMN fa_minor_mime varbinary(100) default "unknown";
  ALTER TABLE image MODIFY COLUMN img_minor_mime varbinary(100) NOT NULL default "unknown";
  ALTER TABLE oldimage MODIFY COLUMN oi_minor_mime varbinary(100) NOT NULL default "unknown";
- ALTER s5 wikidatawiki wb_terms table on eqiad hosts (and on codfw ones once codfw is back in standby): https://phabricator.wikimedia.org/T162539 https://phabricator.wikimedia.org/T163548
- Remove partitions from s7 metawiki.pagelinks eqiad master: https://phabricator.wikimedia.org/T153300
  osc_host.sh --host=db1041 --port=3306 --db=metawiki --table=pagelinks --method=ddl --no-replicate "remove partitioning" (several)
- Extra PRIMARY key additions (TBD) https://phabricator.wikimedia.org/T17441
- Master failovers: https://phabricator.wikimedia.org/T162133
- Spread new masters in different racks: https://phabricator.wikimedia.org/T163895
- There are a few jessie-based mysql master servers still on Linux 3.19: es1011, es1014. It would be great to reboot these to Linux 4.4 (or ideally 4.9 right away)
- Upgraded pending 10.0.22 es hosts (es1011, es1014 and es1015) to mariadb 10.0.28
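The primary-key additions in the task list above all follow the same pattern, so the statements can be generated rather than typed by hand. This is a minimal sketch under stated assumptions: `add_pk_statement` and `local_only` are hypothetical helpers, and prefixing the session with `SET SESSION sql_log_bin = 0` is an assumption about how a host-by-host, non-replicated change could be applied, not a description of what osc_host.sh does internally.

```python
def add_pk_statement(table, column):
    """Build the ALTER pattern used above to prepend an auto-increment primary key."""
    return (
        f"ALTER TABLE {table} "
        f"ADD COLUMN {column} INT UNSIGNED NOT NULL AUTO_INCREMENT FIRST, "
        f"ADD PRIMARY KEY ({column});"
    )


def local_only(statements):
    """Prefix statements so the session does not write them to the binary log.

    Assumption: disabling sql_log_bin for the session keeps the change
    on the local host only; this is not taken from osc_host.sh itself.
    """
    return ["SET SESSION sql_log_bin = 0;", *statements]


# Table and column names as they appear in the tasks above.
tables = [("watchlist", "wl_id"), ("tag_summary", "ts_id"), ("change_tag", "ct_id")]
script = local_only([add_pk_statement(table, column) for table, column in tables])
print("\n".join(script))
```

The output is plain SQL text, so it can be reviewed before being run against any host.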