Metrics Platform/Conduct an experiment

An experiment is a test of a hypothesis designed to provide trustworthy and generalizable data. It imposes an intervention on subjects with the intention of observing what outcome that intervention leads to.[1] This page describes how to create an experiment using the Metrics Platform.

This guide describes the process for creating an experiment using the Metrics Platform Experimentation Lab (also known as MPIC). For the manual process for creating an instrument, see Create an instrument. If you're unsure which process to use, contact Experiment Platform.

Plan

Experimentation scorecard

The experimentation scorecard (currently in beta, with restricted access) provides a template for creating an experiment.

Measurement plan

Instruments collect data about user interactions so that we can answer questions about product experiences. Before you can start collecting data, create a measurement plan (template) that documents what data you plan to collect, why, and how you plan to analyze the data.

You can write your measurement plan in a document or on a Phabricator task, depending on the scale of the project. For examples of measurement plans, see the folder on Google Drive.

Instrumentation spec

Once you have a measurement plan, the next step is to create an instrumentation specification (template). The instrumentation spec defines all the data you'll collect for your instrument. The spec is also a useful tool for engineers to ensure that all events are being produced and received correctly. For a template and examples of instrumentation specs, see the folder on Google Drive. For more information about designing an instrumentation spec, see the instrument guide.

Data collection guidelines

All data collection activities must follow the data collection guidelines. Once you've identified the applicable risk tier, you can use your measurement plan and instrumentation spec to complete the steps in the guidelines under "What should WMF teams do next?".

Code

You can write your instrument code in the WikimediaEvents extension or in your product codebase. See the API docs to learn how to code an instrument.
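
As a rough sketch only, a MediaWiki instrument that records clicks might look like the following. The stream name, schema ID, selector, and event field below are placeholders; take the real values from your instrumentation spec, and refer to the API docs for the authoritative client API:

// Rough sketch of an instrument module. The stream name, schema ID, selector,
// and event field below are placeholders for the values defined in your
// instrumentation spec.
const streamName = 'product_metrics.web_base';
const schemaID = '/analytics/product_metrics/web/base/1.0.0';

function init() {
	document.querySelectorAll( '.example-call-to-action' ).forEach( ( element ) => {
		element.addEventListener( 'click', () => {
			// Submit an interaction event via the Metrics Platform client API.
			mw.eventLog.submitInteraction( streamName, schemaID, 'click', {
				element_id: 'example-call-to-action'
			} );
		} );
	} );
}

module.exports = { init };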

Experiment enrolment sampling

Experiment enrolment sampling is the act of enrolling users into experiments and consistently assigning an enrolled user a variant of the feature that is being experimented on.[2]

An experiment enrolment sampling algorithm, therefore, is a function, method, or process that accepts some inputs and returns a variant, i.e.

module Experiments {
    enroll( user: User, experiment: Experiment ): Variant;
}

Where:

  • user is a token that represents the user for at least the duration of the experiment; and
  • experiment is one or more constants that define the experiment, e.g. name, sample rate, and variants

Properties

Such an algorithm must:

  • Ensure consistency of assignment within an experiment. For example, if there are two experiments running, each with two variants, then the same user should always be assigned the same variant within each experiment, while assignments across experiments remain independent, so that the following combinations are equally likely:

    Experiment 1 | Experiment 2
    -------------|-------------
    Variant 1    | Variant 1
    Variant 1    | Variant 2
    Variant 2    | Variant 1
    Variant 2    | Variant 2

and not:

    Experiment 1 | Experiment 2
    -------------|-------------
    Variant 1    | Variant 1
    Variant 2    | Variant 2
  • Be able to sample on a variety of levels, e.g. page, session, user, application install
  • Sample when needed, e.g. only sample when a user visits a specific page, so that users who never visit that page are never assigned to groups
  • Not require a backing store

Caveats

In order for the first and last properties mentioned above to hold, any system using such an algorithm must lock or freeze the inputs to the algorithm for the duration of the experiment. However, it should be OK to extend the end date of an in-progress experiment.
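
For illustration only, the following hypothetical sketch satisfies these properties by deriving the variant deterministically from a hash of the user token and the experiment name: the same inputs always produce the same variant, no backing store is needed, and different experiment names hash independently, which yields the equally likely cross-experiment combinations shown above. It is not the Metrics Platform implementation.

// Hypothetical sketch of a stateless enrolment function.
// experiment is assumed to have { name, sampleRate, variants } fields.
function enroll( userToken, experiment ) {
	const input = userToken + ':' + experiment.name;

	// Simple 32-bit FNV-1a string hash; any uniformly distributed hash would do.
	let hash = 0x811c9dc5;
	for ( let i = 0; i < input.length; i++ ) {
		hash ^= input.charCodeAt( i );
		hash = Math.imul( hash, 0x01000193 ) >>> 0;
	}

	// Map the hash to a uniform value in [0, 1).
	const bucket = hash / 0x100000000;
	if ( bucket >= experiment.sampleRate ) {
		return null; // Not enrolled.
	}

	// Map the enrolled portion of the range onto the variants.
	const index = Math.floor( ( bucket / experiment.sampleRate ) * experiment.variants.length );
	return experiment.variants[ index ];
}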

Launch

Once your instrument code has been deployed to production, complete the launch-your-experiment form to configure your experiment and start collecting data. Note that, once the experiment is registered, it won't start collecting data until you turn it on.

Monitor

You can use the experimentation directory to monitor the progress of your experiment. Your event data will be available in a Hive table with the same name as the event stream, with dots and dashes replaced with underscores.
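
For example, the naming rule maps the web base stream onto the table queried later in this guide (a hypothetical illustration of the rule, not code you need to run):

// Illustration of the stream-name-to-table-name rule: dots and dashes
// in the stream name become underscores in the Hive table name.
const streamName = 'product_metrics.web_base';
const tableName = streamName.replace( /[.-]/g, '_' );
// tableName === 'product_metrics_web_base'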

Manage

The experimentation directory provides options to edit and turn off your experiment using the Actions menu.

Decommissioning

Instruments that capture metrics related to an experiment should be disabled once the experiment is complete. To decommission an experiment:

  1. Turn off the experiment using the Actions menu.
  2. Remove the instrument code that calls a Metrics Platform client API.

Example

This section shows how one can use the standard CTR instrument to run a simple A/B test in MediaWiki.

Note that in the case of MediaWiki, the WikimediaEvents extension contains modules that can be used for adding instrumentation and experimentation code.

Once you've gone through the preliminary steps of designing your experiment, you can use EPIC (formerly known as MPIC) to manage your experiment configuration. This example assumes that the experiment is named Experimentation Lab Test 1 Experiment, that its feature is named experimentation_lab_test_1_feature, and that it uses the web base schema. The experiment would need to be activated in EPIC once you are ready to start it.

The following JavaScript file (with matching references to the corresponding experiment slug in EPIC/MPIC) would be named ExLabTest1.js and placed in the /ext.wikimediaEvents.exLab directory. Note that it uses the main menu unpin button in the Vector skin as the selector for the CTR instrument:

const userExperiments = mw.config.get( 'wgMetricsPlatformUserExperiments' );
const selector = '[data-pinnable-element-id="vector-main-menu"] .vector-pinnable-header-unpin-button';
const friendlyName = 'pinnable-header.vector-main-menu.unpin';
const experimentName = 'experimentation-lab-test-1-experiment';
const featureName = 'experimentation_lab_test_1_feature';

/**
 * Experimentation Lab's first test module
 *
 * This module includes the ClickThroughRateInstrument for testing A/B test enrollment.
 *
 * Note this is temporary code - it will be removed once we validate data collection (T383801).
 */
const ExLabTest1 = {

	init() {
		if ( !mw.eventLog.isCurrentUserEnrolled( experimentName ) ||
			!( featureName in userExperiments.assigned ) ) {
			return;
		}
		const ClickThroughRateInstrument = require( './ClickThroughRateInstrument.js' );

		if ( userExperiments.assigned[ featureName ] === 'true' ) {
			ClickThroughRateInstrument.start( selector, friendlyName );
		}
	}
};

module.exports = ExLabTest1;

The above code would need to be initialized in /ext.wikimediaEvents.exLab's index.js file:

module.exports = {
	ClickThroughRateInstrument: require( './ClickThroughRateInstrument.js' )
};

// ---

// Experimentation Lab Test 1
// ==========================

// This part of the file initializes the first end-to-end test of the Experimentation Lab (ExLab).
//
// Note that this is temporary code - it will be removed once Experiment Platform validate data
// collection (T383801).

const exLabTest1Enabled = require( './config.json' ).exLabTest1Enabled;

if ( exLabTest1Enabled ) {
	const ExperimentationLabTest1 = require( './ExLabTest1.js' );
	mw.requestIdleCallback( () => {
		ExperimentationLabTest1.init();
	} );
}

The following configuration would be needed in the wiki in which the experiment is being run (in this case testwiki):

'wgMetricsPlatformEnableExperiments' => [
	'default' => false,
	'testwiki' => true,
],

'wgMetricsPlatformEnableStreamConfigsFetching' => [
	'default' => false,
	'testwiki' => true,
],

'wgMetricsPlatformEnableStreamConfigsMerging' => [
	'default' => false,
	'testwiki' => true,
],

// Enables a test module for verifying experiment enrollment.
// This config var is temporary - it will be removed in T383801
'wgWMEExLabTest1Enabled' => [
	'default' => false,
	'testwiki' => true,
],

The name of the experiment JavaScript file would need to be included in the ResourceLoader module ext.wikimediaEvents.exLab in WME's extension.json file (note that you also need to add the config var that will be used to load the Experimentation Lab module):

"ext.wikimediaEvents.exLab": {
	"localBasePath": "modules/ext.wikimediaEvents.exLab",
	"remoteExtPath": "WikimediaEvents/modules/ext.wikimediaEvents.exLab",
	"packageFiles": [
		"index.js",
		"ClickThroughRateInstrument.js",
		"ExLabTest1.js",
		{
			"name": "config.json",
			"config": {
				"exLabTest1Enabled": "WMEExLabTest1Enabled"
			}
		}
	],
	"dependencies": [
		"ext.eventLogging"
	]
}

You will also need to ensure that the WME hook handler (WikimediaEventsHooks.php) is updated with a conditional check on the config variable before loading the module; in this case the module is added in the BeforePageDisplay hook:

if ( ExtensionRegistry::getInstance()->isLoaded( 'MetricsPlatform' ) &&
	$this->config->get( 'WMEExLabTest1Enabled' ) ) {
		$out->addModules( 'ext.wikimediaEvents.exLab' );
}

Once the experiment code and configuration are in production and the experiment is launched in EPIC/MPIC, you should start seeing events populate the corresponding table in Hive (if using the web base stream, the table to query is product_metrics_web_base).

References

  1. mw:Product_Analytics/Glossary#Experiment
  2. This section is based on the discussion in T372108: Document desired properties of an enrolment sampling algorithm.