Portal:Cloud VPS/Admin/Runbooks/Check unit status of maintain dbusers

The procedures in this runbook require admin permissions to complete.

Error / Incident

The maintain-dbusers service is failing.

Debugging

To gather logs, SSH to the affected host and check the status of the failing unit. The example below shows the backup_vms.service unit; for this runbook, substitute the maintain-dbusers unit:

root@cloudvirt1021:~# systemctl status backup_vms
● backup_vms.service - Backup vms assigned to this host
   Loaded: loaded (/lib/systemd/system/backup_vms.service; static)
   Active: failed (Result: exit-code) since Mon 2022-03-21 20:01:53 UTC; 17h ago
TriggeredBy: ● backup_vms.timer
   Process: 4099425 ExecStart=/usr/local/sbin/wmcs-backup instances backup-assigned-vms (code=exited, status=1/FAILURE)
  Main PID: 4099425 (code=exited, status=1/FAILURE)

Mar 21 20:01:52 cloudvirt1021 wmcs-backup[4099425]:     backup_entries = get_backups()
Mar 21 20:01:52 cloudvirt1021 wmcs-backup[4099425]:   File "/usr/lib/python3/dist-packages/rbd2backy2.py", line 600, in get_backups
Mar 21 20:01:52 cloudvirt1021 wmcs-backup[4099425]:     backup = subprocess.check_output([BACKY, "-ms", "ls"])
Mar 21 20:01:52 cloudvirt1021 wmcs-backup[4099425]:   File "/usr/lib/python3.9/subprocess.py", line 424, in check_output
Mar 21 20:01:52 cloudvirt1021 wmcs-backup[4099425]:     return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
Mar 21 20:01:52 cloudvirt1021 wmcs-backup[4099425]:   File "/usr/lib/python3.9/subprocess.py", line 528, in run
Mar 21 20:01:52 cloudvirt1021 wmcs-backup[4099425]:     raise CalledProcessError(retcode, process.args,
Mar 21 20:01:52 cloudvirt1021 wmcs-backup[4099425]: subprocess.CalledProcessError: Command '['/usr/bin/backy2', '-ms', 'ls']' returned non-zero exit status 100.
Mar 21 20:01:53 cloudvirt1021 systemd[1]: backup_vms.service: Main process exited, code=exited, status=1/FAILURE
Mar 21 20:01:53 cloudvirt1021 systemd[1]: backup_vms.service: Failed with result 'exit-code'.


You can also check the logs for a longer view:

root@cloudvirt1021:~# journalctl -u backup_vms
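
The journal for a long-running unit can be large, so it can help to narrow it down to the lines that usually matter during triage. The following is a minimal sketch, not part of the original runbook: filter_failures is a hypothetical helper name, and the keyword list is an assumption based on the sample output above.

```shell
# Hypothetical helper: keep only journal lines that typically indicate a failure.
# The keyword list is an assumption; adjust it to the unit being debugged.
filter_failures() {
    grep -E 'Traceback|Error|Failed|FAILURE'
}

# Example usage on the affected host (replace <unit> with the failing unit):
#   journalctl -u <unit> --since "1 hour ago" --no-pager | filter_failures
```

The --since and --no-pager options are standard journalctl flags; the one-hour window is only an example.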


Common issues

Connectivity issues

The maintain-dbusers daemon connects to several services: the uwsgi-toolsdb-replica-cnf-web API on each NFS server (to create the files in the NFS filesystem), the toolsdb database, each of the wikireplicas, and the accounts database. A connectivity failure to any of these can make the service fail.

For the full list of databases and credentials to use when debugging, check the file /etc/dbusers.yaml.
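
To rule out a plain network problem, one option is to test TCP reachability from the affected host to each endpoint. The sketch below is an illustration under stated assumptions: check_tcp is a hypothetical helper, it relies on bash's /dev/tcp redirection and the coreutils timeout command being available, and the host and port must be taken from /etc/dbusers.yaml (none are assumed here).

```shell
# Hypothetical helper: report whether a TCP connection to host:port can be opened.
# Assumes bash and coreutils `timeout` are installed on the host.
check_tcp() {
    host="$1"
    port="$2"
    # bash opens a TCP socket when redirecting to /dev/tcp/<host>/<port>;
    # give up after 3 seconds so a filtered port does not hang the check.
    if timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null; then
        echo "OK   $host:$port"
    else
        echo "FAIL $host:$port"
        return 1
    fi
}

# Example usage (placeholders, fill in from /etc/dbusers.yaml):
#   check_tcp <database-host> <port>
```

A FAIL result only shows that the TCP connection could not be opened within the timeout; it does not distinguish between a firewall rule, a stopped service, or a DNS problem.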

Related information

Old incidents

  • task T304408 cloudvirt1021 - CRITICAL: Status of the systemd unit backup_vms
  • task T303870 cloudvirt1022/Check unit status of backup_vms is CRITICAL

Contacts

Communication and support

Support and administration of the WMCS resources is provided by the Wikimedia Foundation Cloud Services team and Wikimedia movement volunteers. Please reach out with questions and join the conversation:

  • Discuss and receive general support
  • Stay aware of critical changes and plans
  • Track work tasks and report bugs: use a subproject of the #Cloud-Services Phabricator project to track confirmed bug reports and feature requests about the Cloud Services infrastructure itself
  • Read stories and WMCS blog posts: read the Cloud Services Blog (for the broader Wikimedia movement, see the Wikimedia Technical Blog)