The name of the host still refers to the old software, but should probably be changed to vrt1001 or ticket1001 on the next iteration.
The entire infrastructure, as well as the team handling it, is being renamed at the time of this writing (2021-09-10). As a result, you will see several interchangeable or closely related terms in this document, namely OTRS, Znuny and VRT.
- VRT stands for Volunteer Response Team.
- VRTS stands for Volunteer Response Team System. We currently use software by Znuny GmbH.
- VRTS admins, volunteers who administer the queues and the agents. (~6 people)
- VRTS agents, individuals of that team. (~400+ people)
- Public documentation for the general public interacting with VRT members: https://meta.wikimedia.org/wiki/Volunteer_Response_Team
- URL is https://ticket.wikimedia.org
- The root user/pass is in the ops password repo
- We use mod_perl with ModPerl::Registry, so whenever a file is changed, an apache2 reload is required.
- The "News" messages on the main Znuny login screen can be edited by modifying /opt/otrs/Kernel/Output/HTML/Templates/Standard/Motd.tt.
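The reload requirement noted in the list above can be sketched as a small helper that lists files changed since a given time and reloads Apache only if it finds any. This is a minimal sketch, not the team's actual tooling: the function names are hypothetical, and the use of `systemctl reload apache2` as the reload command is an assumption.

```python
import os
import subprocess


def files_changed_since(root, since_epoch):
    """Return sorted paths under root whose mtime is later than since_epoch."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.getmtime(path) > since_epoch:
                    changed.append(path)
            except OSError:
                # File vanished or is unreadable; skip it.
                continue
    return sorted(changed)


def reload_apache_if_needed(root, since_epoch, run=subprocess.run):
    """Reload apache2 if anything under root changed; return the changed paths.

    The reload command is an assumption; `run` is injectable so the
    helper can be exercised without touching a real service.
    """
    changed = files_changed_since(root, since_epoch)
    if changed:
        run(["systemctl", "reload", "apache2"], check=True)
    return changed
```

For example, `reload_apache_if_needed("/opt/otrs", last_reload_epoch)` would reload Apache after an edit to Motd.tt, where `last_reload_epoch` is a timestamp you track yourself.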
There is no need to update config files to add email addresses to the system; the inbound MX servers will automatically see that a queue exists or has disappeared. However, due to negative caching at the secondary mail exchangers, new addresses can take up to two hours to begin working.
Our Znuny installation is almost entirely database-backed. That is, all data is stored in a MySQL database (m2 shard), and only configuration and code are stored locally on the hosting server. Znuny is open source, so the code is almost impossible to lose, and most of the configuration is stored in puppet. A few configuration items are stored only locally on the server, but this is temporary and they are eventually transferred to puppet. Hence we only care about the database data being safe and backed up.
The database is backed up once per week (currently on Wednesdays). The infrastructure used is Bacula, and most of the documentation on that page applies. The code doing the pre-dump is in https://phabricator.wikimedia.org/diffusion/OPUP/browse/production/modules/role/templates/mariadb/backups/dumps-otrs.sh.erb, and Bacula just backs up the resulting file.
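To illustrate the weekly schedule, the following minimal sketch computes the most recent scheduled backup day and checks whether a given dump is at least that new. The function names are hypothetical, and the Wednesday default simply mirrors the current schedule described above; it is not read from any real configuration.

```python
from datetime import date, timedelta

WEDNESDAY = 2  # date.weekday() counts Monday as 0


def most_recent_backup_day(today, backup_weekday=WEDNESDAY):
    """Return the latest date on or before `today` that falls on backup_weekday."""
    days_back = (today.weekday() - backup_weekday) % 7
    return today - timedelta(days=days_back)


def backup_is_current(dump_date, today, backup_weekday=WEDNESDAY):
    """True if the dump is at least as new as the most recent scheduled day."""
    return dump_date >= most_recent_backup_day(today, backup_weekday)
```

Note that on the backup day itself this check reports a dump from the previous week as stale, even if that day's backup has not run yet.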
It is important to know that this design choice, while increasing the reliability and recoverability of the system, degrades performance and makes DBAs unhappy as they have to manage a very large database. See https://phabricator.wikimedia.org/T138915
Restoring to a previous point in time is quite easy: all it takes is restoring the dump from Bacula (covered in Bacula) and applying it to the db server via the mysql command. Restoring individual items (such as a deleted article) is possible but complicated: it requires a DBA to help isolate the specific transaction and avoid replaying it while replaying the logs between the last backup and the time of the incident. This has never been done, nor been required, up to now.
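The full restore amounts to streaming the dump file into the `mysql` client. A minimal sketch follows, under several assumptions: the dump is a plain or gzip-compressed SQL file already restored from Bacula, credentials come from the client's option file, and the host and database names are supplied by the operator. The function names are hypothetical and this is not the procedure's official tooling.

```python
import gzip
import shutil
import subprocess


def build_mysql_command(host, database):
    """Build the mysql client invocation; credentials are assumed to come
    from the client's option file rather than the command line."""
    return ["mysql", f"--host={host}", database]


def restore_dump(dump_path, host, database, popen=subprocess.Popen):
    """Stream a SQL dump into the mysql client and return its exit code.

    Gzip compression is assumed only when the file name ends in .gz.
    `popen` is injectable so the function can be tested without a server.
    """
    opener = gzip.open if dump_path.endswith(".gz") else open
    proc = popen(build_mysql_command(host, database), stdin=subprocess.PIPE)
    with opener(dump_path, "rb") as dump:
        # Copy in chunks so a very large dump is never held in memory.
        shutil.copyfileobj(dump, proc.stdin)
    proc.stdin.close()
    return proc.wait()
```

A non-zero return value means the client reported an error, in which case the restore should be treated as incomplete.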