MariaDB/pt-heartbeat
pt-heartbeat is a service that runs on MariaDB servers acting as masters (replication sources). On the source server, a process writes a timestamped row to the heartbeat table every second. On a replica, a process compares the most recent entry in the table with the host clock to estimate the replication lag. The MediaWiki load balancer avoids replicas that do not have an up-to-date heartbeat entry.
The process writes to a database called heartbeat, which contains a table also called heartbeat with the following structure:
CREATE TABLE `heartbeat` (
  `ts` varbinary(26) NOT NULL,
  `server_id` int(10) unsigned NOT NULL,
  `file` varbinary(255) DEFAULT NULL,
  `position` bigint(20) unsigned DEFAULT NULL,
  `relay_master_log_file` varbinary(255) DEFAULT NULL,
  `exec_master_log_pos` bigint(20) unsigned DEFAULT NULL,
  `shard` varbinary(10) DEFAULT NULL,
  `datacenter` binary(5) DEFAULT NULL,
  PRIMARY KEY (`server_id`)
) ENGINE=InnoDB DEFAULT CHARSET=binary
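The lag comparison described above (the latest `ts` value against the host clock) can be sketched as follows. This is a minimal illustration, not the code that pt-heartbeat-wikimedia runs. It assumes that `ts` holds an ISO 8601 UTC timestamp with microseconds, such as `2024-01-01T12:00:00.000500`. The function name `heartbeat_lag_seconds` is hypothetical.

```python
from datetime import datetime, timezone


def heartbeat_lag_seconds(ts, now=None):
    """Return the seconds elapsed between a heartbeat `ts` value and `now`.

    Assumption: `ts` is an ISO 8601 timestamp in UTC without an offset
    suffix, e.g. "2024-01-01T12:00:00.000500". The real column format
    may differ.
    """
    # The column is varbinary, so a database driver may hand back bytes.
    if isinstance(ts, bytes):
        ts = ts.decode("ascii")

    # Parse the timestamp and mark it explicitly as UTC.
    beat = datetime.fromisoformat(ts).replace(tzinfo=timezone.utc)

    # Default to the host clock, which is what a replica compares against.
    if now is None:
        now = datetime.now(timezone.utc)

    # Clamp at zero so small clock skew never reports a negative lag.
    return max((now - beat).total_seconds(), 0.0)
```

Passing `now` explicitly makes the calculation reproducible. Leaving it out uses the local clock, so the result is only as accurate as the clock synchronisation between the source and the replica.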
The process is controlled by the pt-heartbeat-wikimedia.service systemd unit and can be started and stopped as follows:
$ systemctl start pt-heartbeat-wikimedia.service
$ systemctl stop pt-heartbeat-wikimedia.service
The upstream version is developed by Percona; we use a modified version (pt-heartbeat-wikimedia).
See also
- replag.toolforge.org
- [Labs-l] Lag reporting on lab db replicas
- profile/files/wmcs/db/wikireplicas/views/heartbeat-views.sql
This page is a part of the SRE Data Persistence technical documentation