User:Ladsgroup/Schema change lifecycle policy

From Wikitech

UNDER CONSTRUCTION

This page outlines the policy for implementing changes to MediaWiki databases, including creating new tables, dropping old ones, and altering existing schemas.

Lifecycle of a change to database schema

  • First, a need for such a change arises, for example to unblock a new feature or to make a table more efficient (such as through normalization).
  • A ticket tagged with schema-change and schema-change-lifecycle should be created, outlining what the desired change is and why it is needed.
  • DBAs and SMEs should be invited to review the idea and give feedback. Bigger changes should gather feedback from more stakeholders.
  • At least one DBA must sign off on the change. For new tables, this is the approval of the new schema (one more sign-off will be needed, see below). Once that is done, the ticket should be moved to the "Approved" column on the schema-change-lifecycle board.
    • The DBA sign off should evaluate the change on the following aspects:
      • Impact on reliability of production.
      • SQL best practices (e.g. normalization, avoiding ENUMs, following MediaWiki coding conventions).
      • Long-term sustainability of the change.
    • DBAs may request or suggest changes. DBAs may veto a change if they conclude it can put production reliability at risk.
  • If the schema change will require changes to queries (e.g. in wikireplicas, analytics pipelines, or MediaWiki), it must be communicated via the appropriate channels (cloud-l, wikitech-l, Tech news) to the people affected by the change. A grace period of at least two weeks between the announcement and deployment of the change is required. Changes to widely used tables (such as revision) should be given a longer grace period.
    • Examples of schema changes that won't need an announcement include adding/removing/changing indexes, minor data type changes (binary -> varbinary and vice versa) and so on.
    • Schema changes to private tables won't need announcements to cloud-l or wikitech-l, but the impact on analytics pipelines must be assessed. If there will be an impact, it must be communicated to Data_Platform_Engineering and the grace period must be respected. Otherwise, this step can be skipped entirely.
  • After the feedback is gathered, a patch is written to apply the change in the software. See mw:Manual:Schema changes for how to do that.
  • Once the patch is made, SMEs with +2 rights can review and merge the patch.
  • It is the responsibility of the driver of the change to make sure that MediaWiki core and the extensions deployed at Wikimedia are updated and won't break after the change is deployed.
  • Then depending on the type of the change:
    • For new tables: a ticket tagged with DBA is created and a second sign-off by DBAs is needed. See article 5 of Creating new tables. DBAs will do the necessary work of filtering the table before signing it off. The table should also be added to Wikimedia's tables catalog.
      • Once it's signed off, a developer can create the table.
    • For general schema changes: a ticket to deploy the change needs to be created and tagged with DBA and schema-change-in-production.
      • DBAs triage and apply the schema change. Depending on the complexity of the schema change, this can take anywhere from one day to several months.
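
The grace-period rule in the list above can be sketched in a few lines of Python. This is only an illustration, not tooling that exists: the function names and the `extra` parameter are hypothetical, and the only value taken from this policy is the two-week minimum between announcement and deployment.

```python
from datetime import date, timedelta

# Minimum grace period stated in this policy: at least two weeks.
MIN_GRACE_PERIOD = timedelta(weeks=2)


def earliest_deployment(announced_on: date,
                        extra: timedelta = timedelta(0)) -> date:
    """Return the first date on which the change may be deployed.

    `extra` is a hypothetical knob for widely used tables (such as
    revision); the policy asks for more time there without fixing a number.
    """
    if extra < timedelta(0):
        raise ValueError("extra grace period cannot be negative")
    return announced_on + MIN_GRACE_PERIOD + extra


def may_deploy(announced_on: date, today: date,
               extra: timedelta = timedelta(0)) -> bool:
    """True once the grace period since the announcement has elapsed."""
    return today >= earliest_deployment(announced_on, extra)
```

For example, a change announced on 1 January could be deployed on 15 January at the earliest. Changes that need no announcement (such as index changes) would not go through this check at all.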

Exceptions

  • This policy does not apply to MediaWiki extensions that are not deployed in Wikimedia production. However, new extensions that are planned for deployment soon must still be reviewed and signed off by DBAs.
  • In case of major emergencies such as long outages, this policy may be bypassed by DBAs.

Other policies

Other policies that must be followed include:
