SRE/Service Operations

From Wikitech

Service Operations (aka "serviceops") is a subteam of the Wikimedia Foundation's SRE team. The Service Operations team takes care of public and “user-visible” services alongside Technology and Product teams. This means, for example, our MediaWiki platform, but also the newer (micro)services that comprise our stack. The team is responsible for our new production service infrastructure based on Kubernetes, plus a subset of the caching and storage services: memcached and Redis. It also attends to the infrastructure of the supporting services we rely upon and their components (think GitLab, Phabricator, mailing list systems, OTRS, etc.). In addition, we coordinate the efforts around the implementation of Service Level Objectives within the Foundation.

Architecture: We are happy to talk to you about the operational architecture of your system and to discuss implementation options such as database, caching, shelling out, or the operational aspects of running on Kubernetes. The best way to start that conversation is to open a Phabricator task. You can use this template, New Task Example, but set the title to "Architecture Conversation Project ABC" if your project is named ABC. In addition, tag the team "serviceops". See [1] for a past task for the Maps 2.0 project.

Implementation: Once you are ready to implement your system, the best way to restart the conversation about your project named "ABC" is to open a "New Service Request ABC" Phabricator task and tag the team "serviceops". You can start from the New Task Example template.

Contact: Phabricator is the best way to reach us; tell us what you need and how urgent it is. For quick questions, we can be reached in the #wikimedia-serviceops channel.

Major Projects Status Overview:

  • MediaWiki on Kubernetes: planned for Q3 21/22; current: Q1 21/22 functional and performance testing
  • Service Level Objectives: first user-facing SLO in FY 21/22; current: finalizing 4 technical SLOs

Further Information:

See mw:Wikimedia Site Reliability Engineering#Service Operations for more.