Auto schema


This is the documentation for auto_schema, a set of Python scripts that reduce the toil of schema changes for DBAs. The tool is a library rather than a script to run: you adapt it to your needs, and you can use it for more than schema changes (for example, automated complex fixes across the fleet; see the example below).

You must submit your schema change patches to sre/schema-changes and get them reviewed before running them in production. That repository also contains lots of examples of schema change files.

Schema changes

An example schema change file looks like this (it needs to be added to the root of auto_schema):

from auto_schema.schema_change import SchemaChange

section = 's1'
downtime_hours = 4
should_downtime = True
ticket = 'T297189'

# Don't add set session sql_log_bin=0;
command = """ALTER TABLE flaggedtemplates DROP PRIMARY KEY, ADD PRIMARY KEY (ft_rev_id, ft_tmp_rev_id), DROP COLUMN ft_title, DROP COLUMN ft_namespace;"""

# Set this to false if you don't want to run on all dbs
# In that case, you have to specify the db in the command.
all_dbs = True

# DO NOT FORGET to set the right port if it's not 3306
# Use None instead of [] to get all pooled replicas
replicas = None


# Should return True if the schema change is already applied (or not needed)
def check(db):
    columns = db.get_columns('flaggedtemplates')
    if not columns:
        return True
    return 'ft_title' not in columns


schema_change = SchemaChange(
    replicas=replicas,
    section=section,
    all_dbs=all_dbs,
    check=check,
    command=command,
    ticket=ticket,
    downtime_hours=downtime_hours
)
schema_change.run()

Explanation of each variable:

  • section is the name of the database section. Make sure it is actually the section the replicas belong to.
  • downtime_hours is how many hours the hosts should be downtimed. Keep in mind that if there are several layers of replication, it will take a multiple of that time for replication to catch up. As a rule of thumb, set it to three times the time the ALTER TABLE itself would take.
    • Set it to 0 to disable downtiming.
  • ticket is the Phabricator ticket; it will be used in SAL messages and cookbook runs.
  • command is the ALTER TABLE command. Don't specify the database if all_dbs is set to True; otherwise you must specify it or the schema change will fail.
    • Note: DO NOT add set session sql_log_bin=0; to the command. The script automatically adds it when needed.
  • all_dbs is a boolean controlling whether the command should run on every database of the section or on a single one. For example, if you need to run it only on centralauth, set it to False (see the single-database sketch after this list), but if you need to run it on all wikis, set it to True. If you need to run it on only some wikis (e.g. wikis that have flaggedrevs enabled), set it to run on all wikis and handle the filtering in the check function.
  • replicas is either None or a list of replicas. Some example values:
    • None: all direct replicas of the master of that section in the active dc. For example, with s1 and eqiad as the active dc, you get every host that replicates directly from the s1 master in eqiad (including the s1 master in codfw, pooled replicas of s1 in eqiad, backup sources of s1, and the analytics db of s1).
    • ['db1135', 'db1163'] runs the schema change on those two hosts on port 3306.
    • ['db1140:3311', 'db1135'] runs the schema change on each host with its given port.
    • Note: DO NOT mix hosts from different sections. You can mix hosts from different dcs (but in the same section) though.
    • Note: Make sure you set the correct port when it's not the default 3306.
  • check is the function you need to write and pass to the class (without calling it; note the missing parentheses). See below for what you can do with the db object. The function must return a boolean: True if the schema change is not needed or is already done, False if it is needed. In the example above, it first checks whether the wiki has the table at all (flaggedrevs is enabled only on a subset of wikis) and bails out if not; then it checks whether the ft_title column still exists, returning False if it does (so the schema change can go ahead).
    • Note: DO NOT put write queries or slow queries in check. Otherwise, all hell will break loose.
    • Note: The check is run twice on each host/db: first to see whether the change is needed (skipping the host/db if not), and a second time to verify that the schema change has been applied correctly. If the check fails on the second run, the script stops at that host and won't repool it.
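
For instance, a change that runs only on a single database might look like the following sketch. This is illustrative only: the ticket number, the gu_example column, and the downtime value are made up; centralauth (in s7) is used as the example database.

from auto_schema.schema_change import SchemaChange

section = 's7'
ticket = 'T000000'  # hypothetical ticket

# all_dbs is False, so the database must be named in the command itself
command = """ALTER TABLE centralauth.globaluser ADD COLUMN gu_example INT UNSIGNED DEFAULT NULL;"""
all_dbs = False


# Should return True if the schema change is already applied
def check(db):
    # db_name is mandatory here because all_dbs is False
    columns = db.get_columns('globaluser', 'centralauth')
    return 'gu_example' in columns


schema_change = SchemaChange(
    replicas=None,
    section=section,
    all_dbs=all_dbs,
    check=check,
    command=command,
    ticket=ticket,
    downtime_hours=2
)
schema_change.run()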

Running

Once the script has been written and reviewed, you can run it (on a cumin host, as root) with:

python3 run_schema_change.py --run --dc codfw
  • You might want to run with --check first to see where the schema change is still needed.
  • Without --run it runs in dry mode: it takes no action against production but prints which commands it would run and where. Run every script in dry mode first and check the output (see the typical sequence after this list).
  • Note: DO NOT run this in parallel within one section; it will depool too many replicas.
    • The exception is different dcs: you can run it in parallel with --dc set to eqiad and to codfw to make it finish faster.
  • Note: Given that repooling each host takes around 45 minutes, schema change runs must happen in screen/tmux; otherwise the script exits with an error.
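
A typical sequence (using only the flags described above; eqiad picked as an example dc) is to dry-run, then check, then run inside screen/tmux:

python3 run_schema_change.py --dc eqiad
python3 run_schema_change.py --check --dc eqiad
python3 run_schema_change.py --run --dc eqiad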

If the schema change can be run live on the dc masters (small changes that take at most a minute or so), use:

python3 run_schema_change.py --run --dc-masters --dc $DC

It will discover the dc masters automatically, run the change on them, and will not stop replication.

If the schema change is too large, you need to do a master switchover and then treat the old master as another replica.

Logging

The log of its actions can be found in the logs/ directory under the name of the ticket, e.g. T297189.log. Note that dry runs also log to that file.

Logic

Depooling

  • Replicas that are in the active dc and are pooled will be depooled. This happens regardless of whether they are masters themselves (e.g. sanitarium masters can get traffic too).
  • Masters of each dc won't be depooled, for obvious reasons, and the schema change will run live on them.
  • Replicas in the passive dc won't be depooled (you shouldn't list them anyway; just list the master of the passive dc).
  • After depooling, the script waits for user traffic to drain.
    • Sometimes this can take a very long time (e.g. due to mediawiki maintenance scripts keeping connections open).
    • There is a timeout of one hour: if the traffic doesn't drain within an hour, the script repools the host and moves on to the next one.
  • Before repooling, the script waits for replication to catch up. The overall per-host flow is sketched below.
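
Roughly, the per-host flow looks like this (an illustrative paraphrase of the rules above, not auto_schema's actual code; all the names are hypothetical stand-ins):

# Illustrative only: depool(), wait_for_traffic_to_drain() etc. are
# hypothetical stand-ins for what the tool does internally.
def process_host(host):
    depooled = False
    if host.is_pooled and host.in_active_dc:
        depool(host)
        depooled = True
        if not wait_for_traffic_to_drain(host, timeout_hours=1):
            repool(host)
            return  # give up on this host and move on to the next one
    run_schema_change(host)
    wait_for_replication_to_catch_up(host)
    if depooled:
        repool(host)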

With or without replication

  • If the host doesn't have replicas itself, the change is run without replication.
  • If the host is a master:
    • If it's not the master of the active dc (meaning anything from a sanitarium master to the master of the passive dc), it will get the change with replication.
    • If it's the master of the active dc, the schema change will run without replication.
    • The master of the passive dc gets the change with replication, unless one of its replicas there has to be skipped, in which case it gets the change without replication (and all of its replicas are done one by one). A sketch of this decision follows.
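
In other words (an illustrative paraphrase with hypothetical method names, not the tool's actual code):

# Illustrative only.
def runs_with_replication(host):
    if not host.has_replicas():
        return False  # leaf host: nothing to replicate to
    if host.is_master_of_active_dc():
        return False  # the active dc master always gets it without replication
    if host.is_master_of_passive_dc() and host.has_replicas_to_skip():
        return False  # its replicas will then be done one by one
    return True  # e.g. sanitarium masters, the passive dc master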

Downtiming

  • Any host the schema change is going to run on gets downtimed (if the downtime value is set).
  • If the host has any replicas, those and replicas of replicas (and so on) will also be downtimed before the schema change starts.
    • The exception is the master of the active dc, which downtimes only itself (and not its replicas), given that the schema change won't run with replication on that host.
    • Replicas (that get the change through replication, not directly) are downtimed for twice as long as the master. See the sketch below.
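
Roughly (again an illustrative paraphrase with hypothetical names, not the actual implementation):

# Illustrative only.
def downtime_for_change(target, downtime_hours):
    if downtime_hours == 0:
        return  # downtiming disabled
    downtime(target, hours=downtime_hours)
    if target.is_master_of_active_dc():
        return  # runs without replication, so replicas are unaffected
    for replica in target.all_descendant_replicas():
        # these hosts get the change via replication, so they lag longer
        downtime(replica, hours=downtime_hours * 2)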

Check function

The check function is given a db object on which you can make three calls:

  • run_sql('sql'): Runs raw SQL if not in a dry run. Its use is discouraged.
  • get_columns('table_name', 'db_name'): db_name is mandatory if all_dbs is set to False, otherwise it is not needed. It returns a dictionary mapping column names to column info (another dictionary, whose keys are the column names of the information_schema.columns table). Here is an example of what it returns for the user_properties table:
{
"up_user": {"TABLE_CATALOG": "def", "TABLE_SCHEMA": "viwiki", "TABLE_NAME": "user_properties", "COLUMN_NAME": "up_user", "ORDINAL_POSITION": 1, "COLUMN_DEFAULT": None, "IS_NULLABLE": "NO", "DATA_TYPE": "int", "CHARACTER_MAXIMUM_LENGTH": None, "CHARACTER_OCTET_LENGTH": None, "NUMERIC_PRECISION": 10, "NUMERIC_SCALE": 0, "DATETIME_PRECISION": None, "CHARACTER_SET_NAME": None, "COLLATION_NAME": None, "COLUMN_TYPE": "int(10) unsigned", "COLUMN_KEY": "PRI", "EXTRA": "", "PRIVILEGES": "select,insert,update,references", "COLUMN_COMMENT": "", "IS_GENERATED": "NEVER", "GENERATION_EXPRESSION": None},
"up_property": {"TABLE_CATALOG": "def", "TABLE_SCHEMA": "viwiki", "TABLE_NAME": "user_properties", "COLUMN_NAME": "up_property", "ORDINAL_POSITION": 2, "COLUMN_DEFAULT": "''", "IS_NULLABLE": "NO", "DATA_TYPE": "varbinary", "CHARACTER_MAXIMUM_LENGTH": 255, "CHARACTER_OCTET_LENGTH": 255, "NUMERIC_PRECISION": None, "NUMERIC_SCALE": None, "DATETIME_PRECISION": None, "CHARACTER_SET_NAME": None, "COLLATION_NAME": None, "COLUMN_TYPE": "varbinary(255)", "COLUMN_KEY": "PRI", "EXTRA": "", "PRIVILEGES": "select,insert,update,references", "COLUMN_COMMENT": "", "IS_GENERATED": "NEVER", "GENERATION_EXPRESSION": None},
"up_value": {"TABLE_CATALOG": "def", "TABLE_SCHEMA": "viwiki", "TABLE_NAME": "user_properties", "COLUMN_NAME": "up_value", "ORDINAL_POSITION": 3, "COLUMN_DEFAULT": "NULL", "IS_NULLABLE": "YES", "DATA_TYPE": "blob", "CHARACTER_MAXIMUM_LENGTH": 65535, "CHARACTER_OCTET_LENGTH": 65535, "NUMERIC_PRECISION": None, "NUMERIC_SCALE": None, "DATETIME_PRECISION": None, "CHARACTER_SET_NAME": None, "COLLATION_NAME": None, "COLUMN_TYPE": "blob", "COLUMN_KEY": "", "EXTRA": "", "PRIVILEGES": "select,insert,update,references", "COLUMN_COMMENT": "", "IS_GENERATED": "NEVER", "GENERATION_EXPRESSION": None}
}
  • get_indexes('table_name', 'db_name'): db_name is mandatory if all_dbs is set to False, otherwise it is not needed. Returns a dictionary mapping index names to index information (a combined usage sketch follows the example output). Here is an example of what it returns when calling get_indexes('category'):
{"PRIMARY": {"columns": ["cat_id"], "unique": True}, "cat_title": {"columns": ["cat_title"], "unique": True}, "cat_pages": {"columns": ["cat_pages"], "unique": False}}

For some examples, see Auto schema/examples

More

You can write checks that would not be straightforward in bash and run them across the fleet. For example, here is a fix for heartbeat GRANTs on a set of hosts:

from auto_schema.host import Host

# (host, section) pairs; non-default ports are given explicitly
hosts = [('db2103', 's1'),
         ('db1116:3318', 's8'),
         ('db1114', 's8'),
         ('db1111', 's8'),
         ('db1109', 's8'),
         ('db1104', 's8')]

for host_name, section in hosts:
    host = Host(host_name, section)
    # Strip backticks and quotes so the substring checks below are simpler
    grants = host.run_sql('show grants for `wikiuser`@`10.64.%`').replace('`', '').replace('\'', '')
    if 'GRANT SELECT ON heartbeat.* TO wikiuser@10.64.%' in grants:
        if 'GRANT SELECT ON heartbeat.heartbeat TO wikiuser@10.64.%' in grants:
            # The narrow grant is already there; only the broad one must go
            print('needs removing the grant')
            host.run_sql('set session sql_log_bin=0; REVOKE SELECT ON `heartbeat`.* FROM `wikiuser`@`10.64.%`')
        else:
            # Add the narrow grant first, then drop the broad one
            print('needs fixing the grant')
            host.run_sql('set session sql_log_bin=0; GRANT SELECT ON `heartbeat`.`heartbeat` TO `wikiuser`@`10.64.%`; REVOKE SELECT ON `heartbeat`.* FROM `wikiuser`@`10.64.%`')

Future

In the future, we can slowly start using this to automate MySQL and kernel upgrades across the fleet (phab:T239814). Parts of this code should also be ported to Spicerack and become a cookbook in their own right.