Query killer

The query killer on production databases is meant to stop queries on replicas that run for more than 60 seconds. It is implemented natively in MySQL; see events_coredb_slave.sql. Under high load it may not function properly; see db-kill for how to kill slow queries manually in an emergency.

Historically, this was implemented as a 60s pt-kill job: when the number of slow queries on a replica grew beyond a threshold, the slowest one above 60s was killed to keep the box alive. It was introduced after query spikes caused outages in November 2013.
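
The selection rule of the historical job can be illustrated with a short Python sketch. This is not the pt-kill configuration that was actually deployed: the `Query` type, the `pick_victim` function and the `BUSY_THRESHOLD` value are hypothetical, and the sketch assumes that "slow" means "running for more than 60 seconds", which the text above does not state explicitly.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

SLOW_SECONDS = 60    # from the text: only queries above 60s are candidates
BUSY_THRESHOLD = 10  # hypothetical value; the real threshold is not documented here


@dataclass
class Query:
    thread_id: int
    seconds: int  # how long the query has been running


def pick_victim(
    queries: Iterable[Query],
    slow_seconds: int = SLOW_SECONDS,
    busy_threshold: int = BUSY_THRESHOLD,
) -> Optional[Query]:
    """Return the single query to kill, or None if the replica is not overloaded."""
    # Assumption: a query counts as "slow" once it has run longer than slow_seconds.
    slow = [q for q in queries if q.seconds > slow_seconds]
    # Nothing is killed until the number of slow queries exceeds the threshold.
    if len(slow) <= busy_threshold:
        return None
    # Only the slowest query is killed per run.
    return max(slow, key=lambda q: q.seconds)
```

Killing only one query per run keeps the intervention minimal: the replica sheds its most expensive query and the rest are left to finish unless the backlog persists until the next run.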

Statistics

Each killed query is recorded in the ops.event_log table. Events are removed after 24 hours.

MariaDB [(none)]> use ops
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MariaDB [ops]> select * from event_log where event like "wmf_slave_wikiuser_slow%" limit 5;
+-----------+---------------------+-------------------------------+-----------------------------------------------------------------------------------------------------------------------+
| server_id | stamp               | event                         | content                                                                                                               |
+-----------+---------------------+-------------------------------+-----------------------------------------------------------------------------------------------------------------------+
| 171974878 | 2021-10-14 20:49:03 | wmf_slave_wikiuser_slow (>60) | kill 2459844921; SELECT /* SpecialRecentChangesLinked::doMainQuery  */  rc_id,rc_timestamp,rc_namespace,rc_title,rc_m |
| 171974878 | 2021-10-14 20:49:03 | wmf_slave_wikiuser_slow (>60) | kill 2459844921; SELECT /* SpecialRecentChangesLinked::doMainQuery  */  rc_id,rc_timestamp,rc_namespace,rc_title,rc_m |
| 171974878 | 2021-10-14 20:50:03 | wmf_slave_wikiuser_slow (>60) | kill 2459862622; SELECT /* SpecialRecentChanges::doMainQuery  */  /*! STRAIGHT_JOIN */ rc_id,rc_timestamp,rc_namespac |
| 171974878 | 2021-10-14 20:51:03 | wmf_slave_wikiuser_slow (>60) | kill 2459882167; SELECT /* SpecialRecentChanges::doMainQuery  */  /*! STRAIGHT_JOIN */ rc_id,rc_timestamp,rc_namespac |
| 171974878 | 2021-10-14 20:52:03 | wmf_slave_wikiuser_slow (>60) | kill 2459901127; SELECT /* SpecialRecentChanges::doMainQuery  */  /*! STRAIGHT_JOIN */ rc_id,rc_timestamp,rc_namespac |
+-----------+---------------------+-------------------------------+-----------------------------------------------------------------------------------------------------------------------+
5 rows in set (0.000 sec)
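
To see which callers are killed most often, the content column can be summarised outside the database. The following Python sketch assumes the `kill <thread id>; <statement>` layout seen in the sample output above and that the caller appears as the first plain `/* ... */` comment in the statement; `parse_content` and `kills_by_caller` are hypothetical helper names, not part of any existing tool.

```python
import re
from collections import Counter

# Layout inferred from the sample rows: "kill <thread id>; <statement>".
CONTENT_RE = re.compile(r"kill (\d+); (.*)", re.DOTALL)
# First plain comment in the statement; executable comments such as
# "/*! STRAIGHT_JOIN */" are skipped because they start with "!".
CALLER_RE = re.compile(r"/\*\s*([^*!\s][^*]*?)\s*\*/")


def parse_content(content):
    """Return (thread_id, caller) for one content value, or None if it does not match."""
    match = CONTENT_RE.match(content)
    if match is None:
        return None
    caller = CALLER_RE.search(match.group(2))
    return int(match.group(1)), caller.group(1) if caller else None


def kills_by_caller(contents):
    """Count killed threads per caller, counting each thread id only once."""
    seen = set()
    counts = Counter()
    for content in contents:
        parsed = parse_content(content)
        if parsed is None:
            continue
        thread_id, caller = parsed
        if thread_id in seen:
            # The same thread can be logged more than once, as in the sample above.
            continue
        seen.add(thread_id)
        counts[caller or "(no caller comment)"] += 1
    return counts
```

Passing the content values of the rows returned by the query above to `kills_by_caller` gives a per-caller tally. Because events are removed after 24 hours, the tally only ever covers the last day.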