Test Kitchen/Create an instrument

From Wikitech

An instrument is code added to an application to monitor its behavior, measure its performance, diagnose errors, and collect user interaction data to answer questions about product experiences.

Instrumentation, the process of adding instruments to an application, helps make application behavior and performance observable in production environments.

This guide helps you create a Test Kitchen instrument from start to finish, including:

  1. Plan
    1. Measurement plan
    2. Instrumentation specification
    3. Data collection guidelines
  2. Configuration
    1. Test Kitchen
      1. Access
      2. Create an instrument
  3. Code
    1. Server-side instrumentation
    2. Client-side instrumentation
    3. Using a custom schema and/or stream
      1. Considerations
      2. Stream configuration
      3. Instrumentation
  4. Test
    1. Local
  5. Launch
    1. Deploy feature code to production
    2. Sample rates
    3. Turn on an instrument
    4. Emergency shutdown
  6. Document instrument
  7. Review data
    1. Data in Hive
    2. Product health monitoring
  8. Decommission an instrument

Plan

Measurement plan

Before you start collecting data, create a measurement plan (template) that documents what questions you want to answer, what data you plan to collect that enables you to answer those questions, and how you plan to analyze the collected data. This helps minimize data over-collection.

You can write your measurement plan in a document or on a Phabricator task, depending on the scale of the project. For examples of measurement plans, see the folder on Google Drive.

Instrumentation specification

Once you have a measurement plan, the next step is to create an instrumentation specification ("spec") (template). The instrumentation spec defines all the data you'll collect with your instrument. The spec is also a useful tool for engineers to ensure that all events are being produced and received correctly. For a template and examples of instrumentation specs, see the folder on Google Drive.

Key information to include:

Schema Type

  • A schema defines an event’s data model and is used for validation upon receipt of events, as well as for data storage integration. An event schema is just like a programming data type.
  • In the context of Test Kitchen, you can choose a base schema (managed by the Experiment Platform team) designed to fit the needs of most instruments.
  • Choose a custom schema if base schemas don’t capture your data needs.

Stream Name

Contextual Attributes

  • Contextual attributes are fields in the event data that provide additional information about the performer who triggered the event and the wiki where the event occurred. The values of contextual attributes included in the stream configuration are populated automatically by Test Kitchen when the event is generated.

Sampling

Data collection guidelines

We've designed the contextual attributes set for the `product_metrics.web_base` stream so that collecting them is a low-risk data collection activity. If your instrument uses the `product_metrics.web_base` stream, select "low risk".

If you are using a custom stream for your instrument, or selecting contextual attributes that could reveal PII (personally identifiable information), assess the risk level according to the data collection guidelines.

All data collection activities must follow the data collection guidelines.

In the Test Kitchen UI, the Regulation section determines the risk level based on the selected contextual attributes. Certain combinations of contextual attributes raise privacy considerations and therefore prompt security and legal review. Validation rules help instrument owners assess their data collection risk and understand which actions would raise or lower the risk level.

Configuration

Test Kitchen

Test Kitchen is a platform, or suite of tools, that helps Wikimedia teams make data-informed decisions about product experiences. You can create and manage instruments and their configuration in the Test Kitchen UI, formerly known as Experimentation Lab (xLab) and as MPIC (Metrics Platform Instrument Configurator).

Access

The Test Kitchen UI authenticates users via CAS-SSO, which requires a Wikimedia Developer Account and membership in the nda group.

Create an instrument

In the Test Kitchen UI, click "New Instrument" (or navigate to https://mpic.wikimedia.org/create-instrument). The form is designed to match your instrumentation spec. Provide the key information (schema type, stream name, contextual attributes, sampling, etc.) and save your configuration.

Note that your instrument’s machine-readable name must match its reference in your corresponding feature code.

Code

You can write your instrument code in the WikimediaEvents extension or in your product codebase. The SDK docs provide details on how to code an instrument both server-side and client-side.

Server-side instrumentation

Currently the PHP metrics client is provided by the EventLogging extension.

The PHP SDK documentation provides more details about using the PHP metrics client to submit events:

$interactionData = [
    'action_source' => 'value for action source',
    'action_context' => 'value for action context'
];
$streamName = 'mediawiki.product_metrics.example_stream';

// submitClick()
EventLogging::getMetricsPlatformClient()->submitClick( $streamName, $interactionData );

// submitInteraction()
$action = 'hover';
$schemaID = '/analytics/product_metrics/web/base/1.4.3';

EventLogging::getMetricsPlatformClient()->submitInteraction(
    $streamName, $schemaID, $action, $interactionData
);

Example usage: SuggestedInvestigationsInstrumentationClient

Client-side instrumentation

While the JavaScript metrics client is currently provided by the EventLogging extension, the Experiment Platform team is working to move this functionality out of EventLogging and into the (to be renamed) MetricsPlatform extension.

If your instrument is configured in Test Kitchen, you can retrieve its configuration in your feature code and use it to submit events:

const instrument = mw.xLab.getInstrument('my-machine-readable-instrument-name');
instrument.submitInteraction('action');

Example patch: gerrit:1190370
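
As a rough illustration of the two calls above in feature code, the sketch below submits one interaction event per click on an element. The helper name trackClicks, the 'click' action name, and the element are hypothetical; the mwLike parameter stands in for the global mw object so that the function can also run outside a wiki page.

```javascript
// Sketch only: `trackClicks`, the 'click' action name, and the element are
// hypothetical. `mwLike` stands in for MediaWiki's global `mw` object.
function trackClicks( mwLike, element, instrumentName ) {
	// Look up the instrument configured in Test Kitchen by its machine-readable name.
	const instrument = mwLike.xLab.getInstrument( instrumentName );

	// Submit one interaction event each time the element is clicked.
	element.addEventListener( 'click', () => {
		instrument.submitInteraction( 'click' );
	} );

	return instrument;
}

// In feature code this might be called as follows (hypothetical selector):
// trackClicks( mw, document.querySelector( '.my-feature-button' ), 'my-machine-readable-instrument-name' );
```

Passing the mw object in as a parameter is only a convenience for testing; feature code can call mw.xLab.getInstrument() directly, as in the example above.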

Using a custom schema and/or stream

By default, events produced by instruments use the analytics/product_metrics/web/base schema and flow into the product_metrics.web_base stream. The base stream's current configuration can be viewed at any time via the Stream Configs API and its entry in Event Streams config.

If your instrument requires a different set of contextual attributes to be collected than the ones collected by the base stream, you will need to configure a custom stream – but you can still use the base schema.

If your instrument requires collecting data that is not supported by the base schema, you will need to create a custom schema and a custom stream; refer to this guide on custom schemas for instructions.

Note that custom schemas and streams are currently supported only by the JS SDK for client-side instrumentation. The base schema and stream cannot be overridden when using the PHP SDK for server-side instrumentation.

Considerations

  • Preferably, opt the custom stream out of User-Agent string collection using this mechanism.
  • The resulting table in Hive will not be allowlisted for event sanitization and is therefore subject to 90-day data retention.

Stream configuration

You will need to deploy stream configuration to declare a custom stream. Once deployed, your custom stream will be available in the select dropdown for the Stream name field in the Test Kitchen UI when you select Custom for your Schema type.

An example custom stream configuration:

'mediawiki.product_metrics.your_custom_stream_name' => [
    'schema_title' => 'analytics/product_metrics/web/translation',
    'destination_event_service' => 'eventgate-analytics-external',
    'producers' => [
        'eventgate' => [
            'enrich_fields_from_http_headers' => [
                'http.request_headers.user-agent' => false,
            ],
        ],
        'metrics_platform_client' => [
            'provide_values' => [
                'performer_is_logged_in',
                'performer_is_temp',
                'performer_pageview_id',
                'mediawiki_database',
            ],
        ],
    ],
],

The stream will also need to be added to the $wgMetricsPlatformExperimentStreamNames configuration variable in mediawiki-config:

'wgMetricsPlatformExperimentStreamNames' => [
	'default' => [
		'product_metrics.web_base',
		'mediawiki.product_metrics.reading_list',
		'mediawiki.product_metrics.readerexperiments_imagebrowsing',
		'mediawiki.product_metrics.your_custom_stream_name',
	],
],

Instrumentation

Next, override the default schema ID when you initialize the instrument:

const instrument = mw.xLab.getInstrument( INSTRUMENT_NAME );
instrument.setSchemaID( SCHEMA_ID );

const interactionData = {
	action_context: 'pageview'
};
instrument.submitInteraction( 'page-load', interactionData );

When instrument.submitInteraction() is called, the events will declare our custom schema and be produced to our custom stream.

Test

This section covers testing of instruments. Refer to Test_Kitchen/Stream_configuration for general instructions for testing analytics instrumentation.

Local

To test locally, you need a Test Kitchen instance in which to configure your instrument. In the MetricsPlatform extension, you can run a local instance of the Test Kitchen UI (formerly known as xLab) and EventGate by following the instructions in the MediaWiki Docker configuration recipes.

If you have a local EventGate instance, you should be able to see validated events in the devserver logging stream when you trigger the targeted event in your local MediaWiki instance on your feature branch. You can also observe the event POST in your browser’s console.

Example of a local EventGate stream

For instructions on how to set up a local EventGate instance, see the EventLogging Devserver docs.

Launch

Deploy feature code to production

Feature code for your instrument must be deployed to production before you turn on your instrument in order for events to be sent using the instrument configuration fetched by the Test Kitchen extension.

Sample rates

When you first go live with your instrument on real wikis, start with a small sampling rate: 10% of sessions, or even 1% of sessions if the instrument is expected to produce many events per session (e.g. sessionTick). This keeps the instrument owner from accidentally DDoSing WMF.
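
To get a feel for how the sampling rate affects event volume, a back-of-the-envelope estimate can help. The sketch below is only arithmetic, not how Test Kitchen samples traffic; the function name and the traffic figures are hypothetical.

```javascript
// Sketch only: a rough volume estimate, not Test Kitchen's sampling logic.
// The function name and all traffic figures below are hypothetical.
function estimateDailyEvents( sessionsPerDay, eventsPerSession, sampleRate ) {
	if ( sampleRate < 0 || sampleRate > 1 ) {
		throw new RangeError( 'sampleRate must be between 0 and 1' );
	}
	// Sampled sessions multiplied by the events each session is expected to produce.
	return Math.round( sessionsPerDay * eventsPerSession * sampleRate );
}

// Hypothetical wiki: one million sessions per day, 20 events per session.
const sessionsPerDay = 1000000;
const eventsPerSession = 20;

console.log( estimateDailyEvents( sessionsPerDay, eventsPerSession, 0.1 ) ); // 2000000
console.log( estimateDailyEvents( sessionsPerDay, eventsPerSession, 0.01 ) ); // 200000
```

With these made-up figures, dropping from 10% to 1% of sessions reduces the estimate from two million to two hundred thousand events per day.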

Turn on an instrument

In the action menu of the catalog/list view of the Test Kitchen UI, you must turn on your instrument to begin collecting data.

Example of the Test Kitchen UI Catalog/List view

You must complete this activation step before the start date. It makes the instrument configuration available to the Test Kitchen extension, which houses the JavaScript and PHP SDKs that provide APIs for feature code to create instruments and submit events.

Once you turn on your instrument, it should appear in the Test Kitchen API response, even if the start date is still in the future: https://mpic.wikimedia.org/api/v1/instruments
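
One way to script this check is sketched below. It assumes that the endpoint is reachable from where the script runs, that it returns JSON text, and that an instrument's machine-readable name appears in that text as a quoted string. Both function names are hypothetical, and the sketch makes no other assumption about the shape of the response.

```javascript
// Sketch only: assumes the endpoint is reachable and returns JSON text in
// which the instrument's machine-readable name appears as a quoted string.
// Both function names are hypothetical.
const INSTRUMENTS_URL = 'https://mpic.wikimedia.org/api/v1/instruments';

// Pure helper: does the raw response body mention the quoted name?
function bodyMentionsInstrument( body, instrumentName ) {
	return body.includes( JSON.stringify( instrumentName ) );
}

// Fetch the instrument list and look for the given name in it.
async function isInstrumentListed( instrumentName ) {
	const response = await fetch( INSTRUMENTS_URL );
	if ( !response.ok ) {
		throw new Error( `Request failed with status ${ response.status }` );
	}
	return bodyMentionsInstrument( await response.text(), instrumentName );
}

// Example call:
// isInstrumentListed( 'my-machine-readable-instrument-name' ).then( console.log );
```

Searching the raw text is deliberately crude: it confirms only that the name occurs somewhere in the response, so inspect the response itself if you need to check the rest of the configuration.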

Once your feature code is deployed to production and your instrument is turned on, data collection will begin on the start date that you specify in the Test Kitchen UI.

Emergency shutdown

If something is wrong, you can turn the instrument off in the Test Kitchen UI using the same menu you used to turn the instrument on.

Document instrument

Now that your instrument is active, complete these steps to document your instrument:

  • Update the instrument list: Add your instrument to the instrument list with links to documentation.
  • Make your instrument documentation discoverable: Your measurement plan and instrumentation spec are important resources for data analysts and code maintainers. To help people find these documents, link to them from your codebase wiki page, README, project page, DataHub page, or other frequently used documents. Duplicated documentation is more likely to be outdated, so always try to maintain a single source of information.

Review data

Within a few hours of your activated instrument's start date, you should be able to query the product_metrics_web_base table in Hive (if you are using the web base stream), filtering on your instrument name and the first hour of data collection.

If you are using a custom schema, query the table that corresponds to the stream name instead.

The Event Platform (upon which Test Kitchen is built) has documented the process for viewing and querying events in both Beta and Production environments. This is the best current method to check that your instrument is generating events as expected.

Data in Hive

If you are working in a production environment, your event data will be available in a Hive table with the same name as the event stream, with dots and dashes replaced with underscores.

According to Data_Platform/Systems/Spark#Command-line_interfaces, spark3-sql is the command-line interface for querying Hive tables from any stat machine.

For example, to see how many events were produced on a specific day by instruments that use the web base schema, you can run something like the following:

$ spark3-sql
-- Then, at the Spark SQL prompt:
SELECT COUNT(*) FROM event.product_metrics_web_base WHERE year = 2025 AND month = 11 AND day = 5;
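
The naming rule and the query above can be combined in a small helper that derives the table name from a stream name and builds the per-day count query. This is a sketch with hypothetical function names. It assumes that the table lives in the event database and is partitioned by year, month, and day, as in the example above.

```javascript
// Sketch only: function names are hypothetical. Assumes tables live in the
// `event` database and are partitioned by year, month, and day.

// Derive the Hive table name: dots and dashes become underscores.
function hiveTableForStream( streamName ) {
	return 'event.' + streamName.replace( /[.-]/g, '_' );
}

// Build a query that counts the events produced on a single day.
function dailyCountQuery( streamName, year, month, day ) {
	for ( const part of [ year, month, day ] ) {
		if ( !Number.isInteger( part ) ) {
			throw new TypeError( 'year, month, and day must be integers' );
		}
	}
	return `SELECT COUNT(*) FROM ${ hiveTableForStream( streamName ) } ` +
		`WHERE year = ${ year } AND month = ${ month } AND day = ${ day };`;
}

console.log( dailyCountQuery( 'product_metrics.web_base', 2025, 11, 5 ) );
// SELECT COUNT(*) FROM event.product_metrics_web_base WHERE year = 2025 AND month = 11 AND day = 5;
```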

Product health monitoring

Once your data is available in Hive, you can set up dashboards and visualizations that monitor your collected data as metric measurements (e.g. in Superset).

See Test Kitchen/Automated analysis of experiments/Converting queries for product health monitoring

Note that we do not currently have an automated analytics system for product health monitoring like the one we have for experiments. We are exploring options in this problem space as part of the SDS 2 objective in FY25/26.

Decommission an instrument

Instruments that measure product health may be long-lived. If you no longer want to collect data, disable the instrument. See Decommission an instrument.