Metrics Platform/How to/Getting Started

Overview

Metrics Platform provides a standardized toolkit for creating app and web instruments. The toolkit is made up of the following resources, libraries, and extensions:

  • MediaWiki: At the moment, this article only covers the MediaWiki web environment.
  • MediaWiki Extensions: The extensions you may need to work on Metrics Platform.
  • Client Libraries: A set of common libraries that allow your instrument to produce events through the EventLogging extension.
  • Schemas: A set of common data contracts used to validate the events that instruments produce.
  • Stream configuration: A common place to configure the streams that instruments use to produce their events.

This page describes each of these tools from a technical point of view and serves as an entry point for setting them up and using them properly.

MediaWiki

TBD

Event Platform

TBD Event Platform

MediaWiki Extensions

EventLogging

The Metrics Platform (MP) client library is embedded in this extension, so you must install it before you can create or modify any instrument or work with the Metrics Platform client libraries.

EventBus

The EventBus extension propagates state changes (edit, move, delete, revision visibility, etc.) to an EventGate instance, giving consumers of the service a way to track changes to MediaWiki content.

EventStreamConfig

The EventStreamConfig extension is a utility extension that provides library functions and an API endpoint for exporting event stream configuration, using the variable $wgEventStreams. It does not directly provide any user functionality; rather, it provides code used by other extensions and services. It is needed for EventLogging and EventBus to work.
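
As a rough illustration of the API endpoint mentioned above, the sketch below builds a request URL for the streamconfigs API module. The helper name streamconfigs_url, the local wiki address http://localhost:8080, the /w/api.php path, and the stream name are all assumptions made for this example; adjust them to your own installation.

```shell
#!/bin/sh
# Build the URL of the EventStreamConfig API module (action=streamconfigs).
# streamconfigs_url is a hypothetical helper, not part of any extension.
streamconfigs_url() {
    base_url="$1"   # base address of the wiki (assumed, see below)
    stream="$2"     # name of the stream whose configuration is requested
    printf '%s/w/api.php?action=streamconfigs&format=json&streams=%s\n' "$base_url" "$stream"
}

# Assumed local wiki address and a made-up stream name.
url="$(streamconfigs_url 'http://localhost:8080' 'my.example.stream')"
echo "$url"

# With the local wiki running, the configuration could then be fetched with:
#   curl -fsS "$url"
```

If the stream is declared in $wgEventStreams, the response should contain its settings; an empty result usually means the stream name does not match any configured stream.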

WikimediaEvents

WikimediaEvents is the extension where almost all instruments reside. The exception is the Wikilambda instrument, which has its own extension (see below). You only need to install WikimediaEvents if you want to create a new instrument or modify an existing one. If you only need to work on the client libraries, you don’t need to install this extension.

Wikilambda

The Wikilambda extension contains the instrument for Abstract Wikipedia. You only need to install it if you are going to work on this instrument; it is not needed in any other case.

Metrics Platform Client Libraries

The Metrics Platform client libraries are a set of standardized client libraries for analytics instrumentation at Wikimedia. Currently supported languages include Java, PHP, JavaScript, and Swift. You don’t need to install them separately if you are just creating instruments, because they can be used through the EventLogging extension.

Schemas and data contract

There is a set of predefined schemas to validate the events that instruments produce. All of them are available at https://schema.wikimedia.org but, for Metrics Platform, we only need to work with the ones included in https://gerrit.wikimedia.org/g/schemas/event/secondary.

Stream Configuration

Every time we create a custom schema to produce events for a custom stream, we need to configure that stream. The configuration identifies the stream and the schema it uses, and includes other settings such as the sampling rate and which fields from the common schema are added to the produced events.

Setup

Requirements

There are no operating system requirements. You only need Docker to run MediaWiki and all Metrics Platform dependencies as containers.
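
Before going further, it can help to confirm that Docker is actually usable. The following is a minimal sketch, not part of any Wikimedia tooling: docker_status is a hypothetical helper name, and the check only relies on the docker CLI being on PATH and on docker info succeeding when the daemon is reachable.

```shell
#!/bin/sh
# Report whether Docker is usable on this machine.
# docker_status is a hypothetical helper name used only in this example.
docker_status() {
    if ! command -v docker >/dev/null 2>&1; then
        echo "missing"        # the docker CLI is not on PATH
    elif ! docker info >/dev/null 2>&1; then
        echo "not-running"    # CLI found, but the daemon could not be reached
    else
        echo "ok"
    fi
}

# Turn the status into a human-readable message.
case "$(docker_status)" in
    ok)          echo "Docker is installed and the daemon is reachable." ;;
    not-running) echo "Docker is installed, but the daemon is not reachable. Start Docker and retry." ;;
    missing)     echo "Docker was not found on PATH. Install it before continuing." ;;
esac
```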

MediaWiki Docker Environment for Metrics Platform

There is a dedicated article, Setup MediaWiki for Metrics Platform, that explains how to set up the environment. It also shows how to produce events using the JavaScript implementation of Metrics Platform through the EventLogging extension, so you can check that everything works and that events reach a local EventGate.

Specific environments per language

Whether you are creating an instrument or working on a specific Metrics Platform client library, you need to set up the right environment for the language or platform you are going to use. That way you can update dependencies and run tests with the same configuration used in the production environment. At the moment, the only environment available is the JavaScript/Node.js one, provided through fresh-node, a script that opens a bash session inside a container with the same configuration (OS, Node.js version, etc.) as the production environment. We are working on something similar for PHP. Nothing similar is needed for Java or Swift: installing the right versions of the recommended tools and libraries is enough.

JavaScript

For the JavaScript library, we provide an environment via Fresh.

A Fresh environment is a fast, ready-to-use Docker container with various developer tools pre-installed, including Node.js and headless browsers. It aims to help you run npm packages on your machine without putting your personal data at risk.

To install:

curl -fsS 'https://gerrit.wikimedia.org/g/fresh/+/23.08.1/bin/fresh-install?format=TEXT' | base64 --decode | python3

Once fresh-node is installed, you can launch your Node.js environment by running fresh-node. Your bash prompt changes because you are now inside the new environment. From there you can start installing dependencies, for example:

santi@mylaptop ~ % fresh-node
# 🌱 Fresh 23.08.1 ░ Node.js 18 ░ npm 9 ░ Firefox 102 ░ Chromium 115 ░ Debian 11 Bullseye
# image: docker-registry.wikimedia.org/releng/node18-test-browser:0.0.1
# mount: /santi      ➟ /Users/santi      (read-write)

nobody@e99bb96f8bd0:/santi$ npm install
. . .
. . .

Note: Work on schemas and fragments can also be done in this environment, because we use npm to build them.

PHP

We are currently working on a dedicated PHP development environment.

In the meantime, you can run a bash session directly in the buster-php74 image to install dependencies or run tests.

For example:

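# Start an interactive shell in the PHP 7.4 image, mounting the current directory at /work
docker run -it -v "$(pwd)":/work docker-registry.wikimedia.org/dev/buster-php74:1.1.0-s2 /bin/bash
# Inside the container: switch to the mounted project
cd /work
# Remove previously installed dependencies and the lock file so they can be reinstalled cleanly
rm -rf vendor
rm -f composer.lock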

Java

Swift

TBD

Recommended IDEs per language

  • Visual Studio Code [https://code.visualstudio.com]: This IDE is free to use and supports multiple languages, so it can be used for everything related to Metrics Platform: instrumentation, schemas, stream configuration, and client libraries
  • PHPStorm [https://www.jetbrains.com/phpstorm]: This IDE can be used to develop PHP/JavaScript instrumentation and to work on the Metrics Platform PHP Client Library
  • WebStorm [https://www.jetbrains.com/webstorm]: This IDE can be used to develop JavaScript instrumentation and to work on the Metrics Platform JavaScript Client Library
  • IntelliJ IDEA [https://www.jetbrains.com/idea/download]: This IDE can be used to develop PHP instrumentation and to work on the Metrics Platform Java Client Library
  • Xcode: This IDE can be used for Swift (VS Code is also a valid choice)
  • Android Studio [https://developer.android.com/studio]: This IDE can be used to work on the Android integration

Testing

Unit tests

TBD

End-to-end testing against a local EventGate

If you have already installed everything you need to work on Metrics Platform and want to run an end-to-end test against a local EventGate to check that your installation works, take a look at Setup MediaWiki for Metrics Platform#Metrics Platform.
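
To give an idea of what such a test sends over the wire, here is a minimal sketch that builds a JSON event and shows how it could be posted with curl. Everything specific in it is an assumption for illustration only: the helper names build_test_event and post_test_event, the endpoint http://localhost:8192/v1/events, the stream name, and the schema path are placeholders. A real event must use a stream declared in your stream configuration and satisfy its schema, otherwise EventGate is expected to reject it.

```shell
#!/bin/sh
# Build a minimal JSON event body (hypothetical helper; real schemas need more fields).
build_test_event() {
    stream="$1"   # placeholder stream name
    schema="$2"   # placeholder schema path
    dt="$(date -u +%Y-%m-%dT%H:%M:%SZ)"   # current UTC time in ISO 8601 format
    printf '{"$schema":"%s","meta":{"stream":"%s"},"dt":"%s"}\n' "$schema" "$stream" "$dt"
}

# Post an event body to an EventGate endpoint (hypothetical helper, not called here).
post_test_event() {
    endpoint="$1"
    payload="$2"
    curl -sS -X POST -H 'Content-Type: application/json' -d "$payload" "$endpoint"
}

# Print the payload only; no network request is made by this sketch.
payload="$(build_test_event 'my.example.stream' '/analytics/example/1.0.0')"
echo "$payload"

# With a local EventGate running (assumed address), the event could be sent with:
#   post_test_event 'http://localhost:8192/v1/events' "$payload"
```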


-> Metrics Platform/How_To/Create An Instrument