The Metrics Platform is a suite of services, standard libraries, and APIs for producing and consuming data of all kinds from Wikimedia Foundation products. For data producers, it aims to simplify the task of designing, implementing, and maintaining data-producing code (also called instrumentation), while offering better guarantees of quality, rigor, and safety. For data consumers, it aims to streamline the process of defining a new dataset and offer a rich set of tools to answer questions and generate insights with data.
The Metrics Platform:
- specifies how clients work with the Event Platform
- provides standardized algorithms, behaviors, and basic necessities for web and app tracking, including:
  - standardized session ID generation, consistent across MediaWiki, Android, and iOS
  - attaching the necessary metadata to logged events, such as a client-side timestamp recording when the event was generated
  - determining which events are in-sample or out-of-sample based on a specific identifier (pageview, session, device)
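
The three behaviors listed above can be illustrated with a minimal Python sketch. It is not the Metrics Platform's actual algorithm: the 80-bit hex session ID format, the SHA-256 bucketing scheme, and the names `generate_session_id`, `is_in_sample`, `with_metadata`, `dt`, and `session_id` are all assumptions made for illustration. The point it shows is that hashing an identifier gives a deterministic sampling decision, so every event carrying the same pageview, session, or device identifier is consistently in-sample or out-of-sample.

```python
import hashlib
import secrets
from datetime import datetime, timezone


def generate_session_id() -> str:
    """Return a random identifier as 20 hex characters (assumed format)."""
    return secrets.token_hex(10)


def is_in_sample(identifier: str, rate: float) -> bool:
    """Deterministically decide whether an identifier falls inside the sample.

    The same identifier always yields the same answer for a given rate.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    # Map the first 4 bytes of the hash to a float in [0, 1).
    bucket = int.from_bytes(digest[:4], "big") / 2**32
    return bucket < rate


def with_metadata(event: dict, session_id: str) -> dict:
    """Return a copy of the event with a client-side timestamp and session ID.

    The field names "dt" and "session_id" are hypothetical.
    """
    enriched = dict(event)
    enriched["dt"] = datetime.now(timezone.utc).isoformat()
    enriched["session_id"] = session_id
    return enriched
```

Under these assumptions, a client would call `generate_session_id()` once per session, pass each event through `with_metadata`, and submit the event only when `is_in_sample(session_id, rate)` returns true for the rate configured for that stream.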
- Metrics Platform/Client/Implementations - Differences between the various client implementations
- Metrics Platform/Client - Specification of interfaces, data structures, algorithms for conforming clients
- Metrics Platform/Event - Specification of format and properties of an event
- Metrics Platform/Event Context Attributes - Specification of context attributes
- Metrics Platform/Event Schema - Specification of compatible schema
- Metrics Platform/Event Stream Configuration - Specification of stream configuration
- Metrics Platform/Demos - Biweekly demo recordings
|Version||Release Date||MediaWiki JS||MediaWiki PHP||Wikipedia Android||Wikipedia iOS||Wikipedia KaiOS|
|1.0 (in development)||05-30-2021|
History and Rationale
Previously, different teams implemented their own EventLogging-based analytics solutions, isolated from each other. The Metrics Platform is an effort to unify that previous work and to establish consistency across platforms. That uniformity and consistency make it possible to combine data from multiple platforms and yield insights into how our users use our whole ecosystem of products together.
It also makes analysts more portable: they can support teams other than their primary ones. The legacy system, in which every instrumentation has its own quirks and naming is inconsistent, places a heavy burden on each analyst to learn and remember the specifics of their assigned teams' data. If another analyst had to step in as backup, they too would need to learn those specifics.