OTRS

OTRS is installed on mendelevium.wikimedia.org.

  • URL is https://ticket.wikimedia.org/otrs/index.pl
  • The root user/pass is in the ops password repo
  • We use mod_perl with ModPerl::Registry, so whenever you change a file, you need to run /etc/init.d/apache2 reload
  • The "News" messages on the main OTRS login screen can be edited by modifying /opt/otrs/Kernel/Output/HTML/Templates/Standard/Motd.dtl.

There is no need to update config files to add email addresses to OTRS; Inbound MX servers will automatically see that the queue exists or has disappeared. However it is possible (due to negative caching at the secondary mail exchangers) that new addresses will take up to two hours to begin working.

Configuration

The OTRS source is in /opt/otrs

The primary configuration file is /opt/otrs/Kernel/Config.pm

Configuration can be done by examining Kernel/Config/Files/ZZZAuto.pm and Kernel/Config/Files/ZZZAAuto.pm for default values, and then making a corresponding addition to Kernel/Config.pm. When you change the config, reload apache to clear the mod_perl cache.

OTRS configuration is meant to be primarily done via the Kernel/Config/Files/*.xml files. These are XML files edited by the SysConfig module. Viewing the package manager or running bin/otrs.RebuildConfig.pl will read the XML files and then regenerate the ZZZ*.pm files. Then at runtime, the ZZZ*.pm files are read, followed by the Config.pm overrides. So far so good.

The problem is that the XML files are a mixture of user-level configuration, technical data such as core module registrations, and interface text. They are distributed with OTRS, and trying to use old XML files with new OTRS versions will break horribly, because there is no other registry of defaults (except for a few special cases in the code), so vital interface text and module registrations will be missing.

Diffing and merging is difficult due to the lack of comments and the spurious changes introduced when changing things in the web interface.

So save yourself the hassle and edit Config.pm.

Local Patches

The codebase has some local customizations, which were applied as OTRS packages. The packages are stored in the database in otrs.package_repository and can be reapplied after update via the web UI. The packages can also be downloaded from within the OTRS web interface, when logged in as an admin user:

Admin-->Package Manager-->{packages listed under Local Repository}-->Download

Database backend

The primary database is on the m2 shard, database named 'otrs'.

Troubleshooting

Mail delivery

  • When the user/group for the exim pipe were incorrect, otrs.PostMaster.pl logged permission errors to /var/log/mail.log about failed attempts to write to /opt/otrs/var/tmp/CacheFileStorable. We fixed this by configuring the exim pipe to use group=www-data.
  • Mail hosts need mysql access to the otrs database. If MX IP addresses change or the database is inaccessible, mail defers on whichever MX is trying to do an address lookup. When we saw this happen, exim wasn't very informative about why; when in doubt, double-check mysql access from the MX's command line.
  • Spamassassin runs locally, and logs in /var/log/mail.log.

Apache permissions errors

  • as user otrs run:
/opt/otrs/bin/otrs.SetPermissions.pl --otrs-user=otrs --otrs-group=otrs --web-user=www-data --web-group=www-data /opt/otrs

SpamAssassin stops reporting Bayes results

  • This happened on 2014-04-24, 2016-08-06, and 2016-12-21; each time we discovered SpamAssassin was unhappy about the Bayes database.
  • /var/log/syslog was full of this:
bayes db version 0 is not able to be used, aborting! at /usr/share/perl5/Mail/SpamAssassin/BayesStore/DBM.pm line 203, <GEN88>
  • We tried backing up and restoring the database (the verify step failed); the database shrank from ~24M to ~14M and SA stopped complaining. But SA continued to pass mail through with no Bayes results (the BAYES_XX header normally added to the message was missing).
  • So we moved the old database aside, modified otrs.TicketExport2Mbox.pl not to skip previously-seen messages, and created one-time GenericAgent jobs [within OTRS] to re-export a couple of days' worth of ham/spam. Then we ran train_spamassassin manually to train on all this data. Note that otrs.TicketExport2Mbox now has a --rebuild mode to support this process.
  • During the 2016-08-06 incident, the log statement below was found in the logs:
  Aug  6 09:56:59.752 [1619] dbg: bayes: not available for scanning, only 126 ham(s) in bayes DB < 200

Running

 sudo -u debian-spamd spamassassin -D bayes < /tmp/sample_email.eml

and

 sudo -u debian-spamd sa-learn --dump magic

confirmed it.

The fix was re-exporting and training SpamAssassin as described above. Take extra care to ensure that the exported spam and ham messages each number above 200.

  • During the 2016-12-21 incident, both the hams and the spams in the database were below 200. That condition was not logged like the ham message above, which led the investigation off track for a while. Exporting quite a few messages and training SpamAssassin on them as above fixed the issue. In this case the database was NOT marked as corrupted by db_verify, but it was manually truncated in the end for good measure.
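The 200-message floor mentioned above is SpamAssassin's stock behaviour: Bayes classification stays disabled until both the learned-ham and learned-spam counts reach 200 (the default bayes_min_ham_num / bayes_min_spam_num). A minimal sketch of that check:

```shell
# Sketch of SpamAssassin's Bayes-availability rule: scanning is skipped
# until both ham and spam counts reach 200 (the stock
# bayes_min_ham_num / bayes_min_spam_num defaults).
bayes_ready() {
    nham="$1"; nspam="$2"
    if [ "$nham" -ge 200 ] && [ "$nspam" -ge 200 ]; then
        echo "bayes available"
    else
        echo "not available for scanning"
    fi
}
bayes_ready 126 500    # the 2016-08-06 state: hams below 200
bayes_ready 300 300
```

Both counts appear in the output of `sa-learn --dump magic` (the nham/nspam lines), which is how the incidents above were confirmed.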

Mail setup

E-mail is sent and received through a special Exim instance on the hosting server. Its configuration follows the setup described in Mail; OTRS-specific configuration is listed below.

Spam and Malware scanning

SpamAssassin and ClamAV are used for spam/malware scanning, in an Exim ACL run at the DATA phase of the SMTP transaction. Should SpamAssassin fail for some reason, mail is let through.

acl_check_data:
    # skip spam-check for locally-submitted messages
    accept hosts = +relay_from_hosts
        set acl_m0 = trusted relay

    # skip if message is too large (>4M)
    accept condition = ${if >{$message_size}{4M}}
        set acl_m0 = n/a
        set acl_m1 = skipped, message too large

    # skip if whitelisted in exim
    accept condition = ${if eq{$acl_m2}{skip_spamd}}
        set acl_m0 = n/a
        set acl_m1 = skipped, exim whitelist

    # add spam headers...
    warn spam = nonexistent:true
        set acl_m0 = $spam_score ($spam_bar)
        set acl_m1 = $spam_report
        set acl_m3 = $spam_score_int

    # silently drop spam at high scores (> 12)
    discard log_message = spam detected ($spam_score)
        condition = ${if >{$spam_score_int}{120}{1}{0}}

    # silently discard messages with malware attached
    discard log_message = malware detected ($malware_name)
        demime = *
        malware = *

    accept
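In the discard condition above, $spam_score_int is Exim's integer representation of the SpamAssassin score, i.e. the score multiplied by ten; that is why the 12-point cutoff appears as 120. A quick illustration of the arithmetic:

```shell
# $spam_score_int is the SpamAssassin score times ten, so comparing it
# against 120 implements a 12.0 cutoff.
score_int() { awk -v s="$1" 'BEGIN { printf "%d\n", s * 10 }'; }
score_int 12.3    # 123 -> above 120, message is discarded
score_int 3.5     # 35  -> kept (though tagged as spam per required_score)
```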

Message tagging

We use Exim filters to tag messages with headers that OTRS can match for automatic queue routing. The Exim filters are in /etc/exim4/system_filter (see the inline comments):

# Exim filter

if first_delivery then
    # Remove headers that control OTRS - we don't want these
    headers remove X-OTRS-Priority:X-OTRS-Queue:X-OTRS-Lock:X-OTRS-Ignore:X-OTRS-State
    if $acl_m0 is not "trusted relay" then
        # Remove any SpamAssassin headers and add local ones
        headers remove X-Spam-Score:X-Spam-Report:X-Spam-Checker-Version:X-Spam-Status:X-Spam-Level:X-Spam-Flag
    endif
    if $acl_m0 is not "" and $acl_m0 is not "trusted relay" then
        headers add "X-Spam-Score: $acl_m0"
        headers add "X-Spam-Report: $acl_m1"
        # Add header for OTRS filters
        if $acl_m1 is not "" and $acl_m1 begins "yes" then
            headers add "X-Spam-Flag: YES"
        # overload X-Spam-Flag since OTRS doesn't do numeric comparison
        elif $acl_m3 is not "" and $acl_m3 is above 20 then
            headers add "X-Spam-Flag: MAYBE"
        else
            headers add "X-Spam-Flag: NO"
        endif
        # add a hook for OTRS to filter list mail
        if
            ($message_headers contains "\nList-Id:" or
            $message_headers contains "\nList-Help:" or
            $message_headers contains "\nList-Subscribe:" or
            $message_headers contains "\nList-Unsubscribe:" or
            $message_headers contains "\nList-Post:" or
            $message_headers contains "\nList-Owner:" or
            $message_headers contains "\nList-Archive:") and
            $header_precedence: does not match "^(bulk|junk|list)"
        then
            headers remove Precedence
            headers add "Precedence: bulk"
        endif
    endif
endif
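The X-Spam-Flag decision in the filter above can be restated compactly. This is a hedged shell re-statement, assuming $acl_m1 begins with the Yes/No of the SpamAssassin report (per the local.cf template later on this page) and $acl_m3 holds the score times ten:

```shell
# Re-statement of the filter's X-Spam-Flag logic: YES when the SA
# report says yes, MAYBE when the integer score exceeds 20 (i.e. a
# score above 2.0, below the 3.5 spam threshold), NO otherwise.
spam_flag() {
    report="$1"; score_int="$2"
    case "$report" in
        [Yy]es*) echo "YES"; return ;;
    esac
    if [ -n "$score_int" ] && [ "$score_int" -gt 20 ]; then
        echo "MAYBE"
    else
        echo "NO"
    fi
}
spam_flag "Yes, score=5.1" 51    # YES
spam_flag "No, score=2.4" 24     # MAYBE: not spam, but above 2.0
spam_flag "No, score=1.0" 10     # NO
```

The MAYBE overload exists, as the comment in the filter says, because OTRS filters cannot do numeric comparisons on headers.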

OTRS mail routing

Mail destined for OTRS is served by a simple accept router otrs, which does a MySQL database query to determine the validity of the recipient address being routed, similar to the check done earlier by mchenry.

# Mail destined for OTRS

otrs:
        driver = accept
        condition = ${lookup mysql{SELECT value0 FROM system_address WHERE value0='${quote_mysql:$local_part@$domain}'}{true}fail}
        transport = otrs
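The condition above means a recipient is only routable if its full address exists in OTRS's system_address table; when the lookup finds nothing, the fail branch forces the expansion to fail and the router declines. A toy simulation, with a hard-coded list standing in for the MySQL lookup (addresses are examples):

```shell
# Simulated router condition: accept a recipient only when the full
# address appears in the (here hard-coded) system_address data.
otrs_addresses="info-en@wikimedia.org press@wikimedia.org"    # example rows
route() {
    for addr in $otrs_addresses; do
        [ "$addr" = "$1" ] && { echo "accepted: otrs transport"; return; }
    done
    echo "declined: unknown recipient"
}
route "info-en@wikimedia.org"
route "nosuch@wikimedia.org"
```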

On success, the message is handed over to the otrs pipe transport:

# OTRS pipe transport

otrs:
        driver = pipe
        command = OTRS_POSTMASTER
        current_directory = OTRS_HOME
        home_directory = OTRS_HOME
        user = OTRS_USER
        group = OTRS_GROUP
        freeze_exec_fail
        log_fail_output
        timeout = 1m
        timeout_defer

This transport pipes the full contents of the message to the command/path specified in the macro OTRS_POSTMASTER (defined at the top of the file). A current and home directory will be set as specified, and the command will be run as the otrs user and group. If the actual execution/invocation fails for some reason, the message will be frozen on the queue with a warning message sent to root. If the command invocation succeeds, but the return code is EX_TEMPFAIL (e.g. when OTRS cannot access the database), the message is deferred/queued, and will be retried later. Any output will be logged.
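The deferral behaviour hinges on the pipe command's exit code: EX_TEMPFAIL is 75 (from sysexits.h), and a command exiting with it is retried rather than bounced. A toy stand-in for the OTRS_POSTMASTER command illustrating the two outcomes:

```shell
# Toy pipe command: exit 0 on success, 75 (EX_TEMPFAIL, from
# sysexits.h) when e.g. the database is unreachable; exit code 75
# makes exim defer the message and retry later instead of bouncing.
fake_postmaster() {
    if [ "$1" = "db-up" ]; then
        return 0       # message delivered to OTRS
    fi
    return 75          # EX_TEMPFAIL -> message deferred/queued
}
rc=0; fake_postmaster db-down || rc=$?
echo "exit=$rc"        # exit=75
```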

Outbound mail

Any mail destined for an address that is not an OTRS address, e.g. mail submitted by OTRS itself, will be forwarded to an outbound MX.

ClamAV

The server runs its own ClamAV instance, using the stock clamav-daemon package. The daemon runs as user clamav which has read access to the mail queue via membership in group Debian-exim. Per the stock config, the freshclam daemon is used to update virus definitions.

Exim accesses ClamAV via unix socket at /var/run/clamav/clamd.ctl and silently drops and logs messages containing an infected attachment.

SpamAssassin

The server runs its own SpamAssassin instance. The stock spamassassin package is used, with daily updates enabled. Stock rules/scores are kept and we make a few local modifications which are listed below.

Multiple user profiles are not used; SpamAssassin reads the global configuration settings and runs as user otrs. Training databases are stored in that user's home directory.

/etc/default/spamassassin

# Change to one to enable spamd
ENABLED=1

# Options
# See man spamd for possible options. The -d option is automatically added.

# SpamAssassin uses a preforking model, so be careful! You need to
# make sure --max-children is not set to anything higher than 5,
# unless you know what you're doing.

OPTIONS="--max-children 8 --nouser-config --listen-ip=127.0.0.1 -u otrs -g otrs"

# Pid file
# Where should spamd write its PID to file? If you use the -u or
# --username option above, this needs to be writable by that user.
# Otherwise, the init script will not be able to shut spamd down.
PIDFILE="/var/run/spamd.pid"

# Set nice level of spamd
NICE="--nicelevel 10"

# Cronjob
# Set to anything but 0 to enable the cron job to automatically update
# spamassassin's rules on a nightly basis
CRON=1

/etc/spamassassin/local.cf

Non-stock sections are shown here:

#   Set which networks or hosts are considered 'trusted' by your mail
#   server (i.e. not spammers)
#
trusted_networks 91.198.174.0/24 208.80.152.0/22 2620:0:860::/46 10.0.0.0/8

# short-format report template, starting with Yes/No, used for OTRS filters
clear_report_template
report _YESNO_, score=_SCORE_ | host: _HOSTNAME_ | scores: _TESTSSCORES(,)_ | autolearn=_AUTOLEARN_

#   Set file-locking method (flock is not safe over NFS, but is faster)
#
lock_method flock

#   Set the threshold at which a message is considered spam (default: 5.0)
#
required_score 3.5
score RP_MATCHES_RCVD -0.500
score RCVD_IN_RP_SAFE 2.000
score RCVD_IN_RP_CERTIFIED 2.000
score SPF_SOFTFAIL 2.000
score SUSPICIOUS_RECIPS 2.000
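For reference, the short-format report template above yields a one-line header of the following shape (values here are illustrative); the leading Yes/No is what the Exim filter's X-Spam-Flag check keys on:

```shell
# Illustrative rendering of the one-line report template; the real
# values come from SpamAssassin's _YESNO_, _SCORE_, _HOSTNAME_,
# _TESTSSCORES_ and _AUTOLEARN_ macros.
yesno="No"; score="2.4"; host="mendelevium"
tests="BAYES_00=-1.9,SPF_PASS=-0.0"; autolearn="ham"    # example data
printf '%s, score=%s | host: %s | scores: %s | autolearn=%s\n' \
    "$yesno" "$score" "$host" "$tests" "$autolearn"
```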

SpamAssassin Training

There are a few steps to spam training:

  1. user moves spammy messages to the Junk queue
  2. the OTRS Generic Agent job "Export_Spam" runs nightly, filtering for tickets which are not in state "Closed successful" and passing their MessageIDs to otrs.TicketExport2Mbox.pl
  3. otrs.TicketExport2Mbox.pl writes the messages to /var/spool/spam/spam, and changes the ticket's state to "Closed successful"
  4. /usr/local/bin/train_spamassassin picks up /var/spool/spam/spam and feeds it to sa-learn as spam

Ham training is similar:

  1. the OTRS Generic Agent job "Export_Ham" runs nightly, filtering for tickets in non-Junk queues which are in state "Open" or "Closed successful", and feeding those TicketIDs to otrs.TicketExport2Mbox.pl
  2. otrs.TicketExport2Mbox.pl writes the messages to /var/spool/spam/ham
  3. /usr/local/bin/train_spamassassin picks up /var/spool/spam/ham and feeds it to sa-learn as ham

The two scripts mentioned above are custom and are installed by puppet from operations/puppet/files/otrs/* in the git repository.
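The nightly flow above boils down to: append exported messages to an mbox spool, then hand that spool to sa-learn (presumably via sa-learn's standard --mbox mode, as the custom train_spamassassin script does). A toy version that just builds a spool and counts its messages:

```shell
# Toy spool: mbox messages are delimited by "From " lines, so counting
# those approximates how many messages the trainer would learn from.
spool="$(mktemp -d)"
printf 'From a@example.org Mon Jan  1 00:00:00 2024\nSubject: one\n\nbody\n\n' >> "$spool/spam"
printf 'From b@example.org Mon Jan  1 00:00:00 2024\nSubject: two\n\nbody\n\n' >> "$spool/spam"
grep -c '^From ' "$spool/spam"    # 2 messages queued for "sa-learn --spam"
```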

Upgrading

This section is general guidance for a patchlevel update only. Upgrades can be complicated by database schema changes and other issues. There's really no way around reading the upgrade documentation, and testing the updates on a real system based on our existing configuration and database.

  1. fetch new otrs, install to /opt/otrs-X.Y.Z
  2. stop puppet, apache, exim, and otrs cronjobs
    •  :~# puppet agent --disable
    •  :~# service apache2 stop
    •  :~# service exim4 stop
    •  :~# service cron stop
  3. create backups of existing code and database
    •  :/opt# tar czf backup/otrs-PREVIOUS_VERSION.tgz otrs-PREVIOUS_VERSION
    • stop a db slave and start dumping the otrs db there
  4. switch symlink, copy config into new code tree, fix permissions
    •  :/opt# rm otrs && ln -s otrs-VERSION otrs
    •  :/opt# cp otrs-PREVIOUS_VERSION/Kernel/Config.pm otrs/Kernel/
    •  :/opt# ./otrs/bin/otrs.SetPermissions.pl --otrs-user=otrs --otrs-group=otrs --web-user=www-data --web-group=www-data /opt/otrs
  5. check DB schema and upgrade as necessary
    •  :/opt# ./otrs/bin/otrs.CheckDB.pl
    • follow database upgrade instructions in UPGRADING.md
  6. restart apache, log in as an admin to the web interface and reinstall all addon packages
    •  :/opt# service apache2 start
    • ADMIN -> System Administration -> Package Manager
    • Reinstall will be under the ACTION column for each package
  7. recover old sysconfig settings
    •  :/opt# cp otrs-PREVIOUS_VERSION/Kernel/Config/Files/ZZZA* otrs/Kernel/Config/Files/
    •  :/opt# ./otrs/bin/otrs.SetPermissions.pl --otrs-user=otrs --otrs-group=otrs --web-user=www-data --web-group=www-data /opt/otrs
    •  :/opt# service apache2 restart
  8. test functionality
  9. restart exim, manually run puppet (which installs some scripts and reenables cron jobs)
    •  :~# service exim4 start
    •  :~# service cron start
    •  :~# puppetd -tv
    • send a mail to e.g. info-en and check that it shows up in OTRS
  10. restart slave database
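The code swap in step 4 is just a symlink flip; here it is rehearsed in a scratch directory (version numbers are examples):

```shell
# Rehearse the step-4 symlink switch in a throwaway directory.
cd "$(mktemp -d)"
mkdir otrs-3.2.9 otrs-3.2.10
ln -s otrs-3.2.9 otrs                # the currently-deployed tree
rm otrs && ln -s otrs-3.2.10 otrs    # step 4's switch to the new tree
readlink otrs                        # otrs-3.2.10
```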

Backups

Our OTRS installation is almost entirely database-backed: all data is stored in a mysql database (m2 shard), and only configuration and code are stored locally on the hosting server. OTRS is open source, so the code itself is effectively impossible to lose, and most of the configuration is stored in puppet. A few configuration items are still stored locally on the server, but that is temporary; they will eventually be moved into puppet as well. Hence we only care about keeping the database data safe and backed up.

The database is backed up regularly, once per week (currently on Wednesday). The infrastructure used is Bacula and most documentation from that page applies. The code doing the pre-dump is in https://phabricator.wikimedia.org/diffusion/OPUP/browse/production/modules/role/templates/mariadb/backups/dumps-otrs.sh.erb, and bacula just backs up the resulting file.

Restoring at a previous point in time is quite easy and all it takes is restore the dump from bacula (covered in Bacula) and applying it to the db server via the mysql command. Restoring individual items (like an article being deleted) is possible but quite complicated and difficult and requires a DBA to help isolate the specific transaction and avoid replaying it while replaying logs between the last backup and the time of the incident. It has never been done, nor required up to now.