SRE/Observability/OKR

FY2023/2024 Hypothesis

"If we convert MediaWiki statistics end-to-end to a Prometheus-capable metrics library and improve the MediaWiki metrics pipeline by updating and supporting current metrics subsystems, then all MediaWiki statistics will be aggregated and stored in a single modern system, supported by the observability team. This work would effectively improve the developer workflow and assist with troubleshooting by reducing tool fragmentation. The work would entail going through all integration points in MediaWiki where metrics are produced or the metrics library is imported, creating patches, and exporting Prometheus metrics in MediaWiki. This project will involve coordinating with all necessary teams and navigating the deployment process. We will measure success by ensuring that all metrics produced by Mediawiki are exported to Prometheus and supported by the observability team."

Project Objective

This project aims to convert MediaWiki statistics end-to-end to a Prometheus-capable metrics library and port/migrate all existing metrics to this new implementation.

The hypothesis driving this project is that by aggregating all MediaWiki statistics in a single modern system, we can improve the developer workflow and assist with troubleshooting by reducing tool fragmentation. Success will be measured by ensuring that all metrics produced by MediaWiki are exported to Prometheus and supported by the observability team.

Background

This project was initiated in response to an RFC (https://phabricator.wikimedia.org/T249164) and a desire to sunset our previous-generation metrics collection stack, Graphite (see the Graphite page on Wikitech and T228380, "Tech debt: sunsetting of Graphite (part 1)").

The transition from Graphite to Prometheus metrics offers several advantages. Metrics can be tagged with labels, which makes them multidimensional, and we stay on a currently supported system while decommissioning older technology.
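To illustrate what labels add, here is a minimal sketch in Python that contrasts a Graphite-style dotted name with a Prometheus-style labeled sample. The metric name `mediawiki_edit_total`, the labels `wiki` and `result`, and both helper functions are hypothetical and chosen only for illustration. They are not the actual MediaWiki metrics or the metrics library's API.

```python
# Illustrative only: metric names, labels, and helpers are hypothetical,
# not actual MediaWiki metrics or library calls.

def graphite_name(prefix, *parts):
    """Graphite/StatsD style: every dimension is baked into one dotted path."""
    return ".".join((prefix,) + parts)


def prometheus_sample(name, labels, value):
    """Prometheus text style: one metric name plus key="value" labels.

    Simplified: label values are not escaped, so they must not contain
    quotes, backslashes, or newlines.
    """
    rendered = ",".join(f'{key}="{val}"' for key, val in sorted(labels.items()))
    return f"{name}{{{rendered}}} {value}"


# The same measurement expressed both ways.
graphite_line = graphite_name("MediaWiki", "edit", "success", "enwiki")
prometheus_line = prometheus_sample(
    "mediawiki_edit_total",
    {"wiki": "enwiki", "result": "success"},
    42,
)

print(graphite_line)
print(prometheus_line)
```

In the dotted form, the position of each segment carries its meaning, so adding a dimension changes the metric path. In the labeled form, each dimension is named and can be filtered or aggregated independently at query time.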

The parent task in Phabricator is: https://phabricator.wikimedia.org/T240685.

Asana Project: https://app.asana.com/share/wikimedia-foundation/hypothesis-we31/3758245663860/09b3c36be71ab8058ea820e7a7887d5a

POP (project one-pager), work in progress: https://docs.google.com/spreadsheets/d/1MHYN7vXTaYh93mFXh-FLNIU6fDkFeoIvVfigi9XClG8/edit#gid=0 - This document is primarily a dashboard of open tasks, showing how they map to milestones and their status. The core of the work will be tracked in Phabricator, but this view will be kept up to date to make tracking easier.