Test Kitchen/Experiment exposure logging

From Wikitech

Experiment exposure events improve experiment results by improving data quality and the flexibility of analysis. They do this in three ways:

  • They include a curated set of contextual attributes that provide dimensions for viewing experiment results (e.g. comparing results between logged-out, temporary, and registered users).
  • They enable us to exclude data collected from a subject prior to their first exposure.
  • They help us assess the experiment's health by checking that no subject is exposed to more than one variation during the experiment.
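
As a rough illustration of the last two points, the sketch below shows how an analysis might use exposure events. The event shape (subjectId, type, group, timestamp) and the function names are hypothetical and chosen only for this example; they are not the Test Kitchen event schema.

```javascript
// Hypothetical event shape: { subjectId, type, group, timestamp }.
// These field names are assumptions for illustration, not a real schema.

// Map each subject to the timestamp of their earliest exposure event.
function firstExposureBySubject( events ) {
    const first = new Map();
    for ( const e of events ) {
        if ( e.type !== 'exposure' ) {
            continue;
        }
        const seen = first.get( e.subjectId );
        if ( seen === undefined || e.timestamp < seen ) {
            first.set( e.subjectId, e.timestamp );
        }
    }
    return first;
}

// Keep only events at or after the subject's first exposure.
// Subjects that were never exposed contribute no events at all.
function dropPreExposureEvents( events ) {
    const first = firstExposureBySubject( events );
    return events.filter( ( e ) => {
        const t = first.get( e.subjectId );
        return t !== undefined && e.timestamp >= t;
    } );
}

// Health check: subjects whose exposure events name more than one group.
function subjectsWithMultipleVariations( events ) {
    const groups = new Map();
    for ( const e of events ) {
        if ( e.type === 'exposure' ) {
            if ( !groups.has( e.subjectId ) ) {
                groups.set( e.subjectId, new Set() );
            }
            groups.get( e.subjectId ).add( e.group );
        }
    }
    return [ ...groups ]
        .filter( ( [ , seenGroups ] ) => seenGroups.size > 1 )
        .map( ( [ subjectId ] ) => subjectId );
}
```

A non-empty result from subjectsWithMultipleVariations() would suggest a problem with enrollment or instrumentation that is worth investigating before trusting the results.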

This guide includes scenarios, code examples, best practices, and recommended patterns for logging an experiment subject's exposure to the experiment. It is intended to help developers:

  • maximize true positives (logging exposure when exposure actually took place)
  • minimize false positives (logging exposure when there wasn't any exposure) and false negatives (failing to log exposure when exposure should have been logged)

For example: in the Minerva skin (mobile view), the Donate button is hidden inside a navigation drawer. If you are A/B testing an appearance change for the Donate button, an exposure should only be logged after the reader taps the hamburger button and opens the navigation drawer.

Exposure should be logged at the point of user experience divergence. Ask yourself: when does the user experience actually diverge between the control and treatment groups? If the user experience is different at initial page load, then the page load is the exposure moment. If the user experience is the same until the user interacts with a particular UI element, at which point the experience diverges, then that interaction is the exposure moment.
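
For the interaction case, such as the Donate button in the navigation drawer, a minimal sketch might look like the following. The helper sendExposureOnFirst() is hypothetical and not part of the Test Kitchen SDK; it assumes only that the experiment object has a sendExposure() method and that the target supports addEventListener().

```javascript
// Hypothetical helper (not part of the Test Kitchen SDK): log exposure the
// first time `eventName` fires on `target`, i.e. at the moment of divergence.
function sendExposureOnFirst( experiment, target, eventName ) {
    target.addEventListener(
        eventName,
        () => experiment.sendExposure(),
        // `once` removes the listener after it has run a single time.
        { once: true }
    );
}
```

In a browser this could be called as sendExposureOnFirst( experiment, drawerToggle, 'click' ), where drawerToggle is whichever element opens the drawer (a placeholder here, since the selector depends on the skin). It should be attached for every enrolled subject, whether they are in the control or the treatment group.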

Prerequisites

Before reading this guide, please review the guide on conducting experiments, particularly these sections:

  • Experiment design: identifier type
  • Code

This guide assumes you are familiar with the differences between identifier types, client-side vs server-side instrumentation, and instrumentation for feature toggling vs analytics.

Types of experiments

A key thing to remember in all of these scenarios is that when a subject in the control group does not get the treatment that is being tested, that is still an exposure – it is an exposure to the control variation of the experiment.

Logged-in experiments

If you are not varying the DOM of the page server-side but are conducting a logged-in experiment and are only varying appearance using Test Kitchen's CSS classes, please refer to the "Special case: CSS-only experiments" section below.

For experiments which use the mw-user identifier for enrollment, exposure can be logged in the same context where the variation happens, because the PHP SDK can be used both for executing different code based on membership (not enrolled, enrolled and assigned to control, enrolled and assigned to treatment) and for sending events:

Good exposure logging in PHP
if ( $experiment->isAssignedGroup( 'treatment' ) ) {
  // Code that executes only for subjects in the "treatment" group
}
$experiment->sendExposure();

This would record when the enrolled subject – assigned to either control or treatment – was exposed to their respective variation.

What we don't want to do is limit exposure logging only to those in the treatment group:

Bad exposure logging in PHP
if ( $experiment->isAssignedGroup( 'treatment' ) ) {
  // Code that executes only for subjects in the "treatment" group
  $experiment->sendExposure();
}
// What about the control group??? They've been exposed to the experiment too!

The Experiment#sendExposure() method has been implemented to minimize the volume of events being sent from the same context. Suppose you have a PHP file like:

Multiple calls in PHP
. . .
// Experiment-specific code:
if ( $experiment->isAssignedGroup( 'treatment' ) ) {
  // Code that executes only for subjects in the "treatment" group
}
$experiment->sendExposure();
. . .
// Non-experiment code
. . .
// Experiment-specific code:
if ( $experiment->isAssignedGroup( 'treatment' ) ) {
  // More code that executes only for subjects in the "treatment" group
}
$experiment->sendExposure();

When that code executes for a single page request, even though there are 2 calls to sendExposure(), there would only be 1 experiment exposure event sent by that instrumentation.

All user traffic ("everyone") experiments

If you are not varying the DOM of the page server-side but are conducting an all user traffic experiment and are only varying appearance using Test Kitchen's CSS classes, please refer to the "Special case: CSS-only experiments" section below.

For experiments which use the edge-unique identifier for enrollment, exposure events – like all other events – have to be sent client-side.

Client-side feature toggling

As with logged-in experiments, exposure can be logged in the same context where the variation happens – just with the JS SDK instead of the PHP SDK:

Good exposure logging in JS
if ( experiment.isAssignedGroup( 'treatment' ) ) {
  // Code that executes only for subjects in the "treatment" group
}
experiment.sendExposure();

Likewise, this would record when the enrolled subject – assigned to either control or treatment – was exposed to their respective variation.

What we don't want to do is limit exposure logging only to those in the treatment group:

Bad exposure logging in JS
if ( experiment.isAssignedGroup( 'treatment' ) ) {
  // Code that executes only for subjects in the "treatment" group
  experiment.sendExposure();
}
// What about the control group??? They've been exposed to the experiment too!

Do not be afraid to call sendExposure() wherever your code varies its behavior based on the group the client was assigned to. The Experiment#sendExposure() method has been implemented to minimize the volume of events being sent from the same context. In PHP that context is a single request; in JS, events are deduplicated not only per page view but also per session. See T414738 for more technical details.

Multiple calls in JS
. . .
// Experiment-specific code:
if ( experiment.isAssignedGroup( 'treatment' ) ) {
  // Code that executes only for subjects in the "treatment" group
}
experiment.sendExposure();
. . .
// Non-experiment code
. . .
// Experiment-specific code:
if ( experiment.isAssignedGroup( 'treatment' ) ) {
  // Code that executes only for subjects in the "treatment" group
}
experiment.sendExposure();

Those 2 calls to sendExposure() would produce at most 1 experiment exposure event per page view. Moreover, if the enrolled subject visited multiple pages where that code executed in the same session, all those calls to sendExposure() together would produce at most 1 experiment exposure event per session.
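
To make the "at most 1 per page view, at most 1 per session" behavior concrete, here is a conceptual model of it. This is not the SDK's implementation (see T414738 for that); createExposureSender(), sessionStore, and emit are hypothetical names used only for this sketch.

```javascript
// Conceptual model only: NOT the SDK's implementation. `sessionStore` is a
// hypothetical stand-in for per-session storage; it only needs has() and
// add(), so a plain Set works here. `emit` stands in for sending the event.
function createExposureSender( experimentName, sessionStore, emit ) {
    // Per-page-view flag: each page view creates a fresh sender.
    let sentOnThisPage = false;

    return function sendExposure() {
        if ( sentOnThisPage || sessionStore.has( experimentName ) ) {
            // Already logged on this page or in this session: send nothing.
            return false;
        }
        sentOnThisPage = true;
        sessionStore.add( experimentName );
        emit( experimentName );
        return true;
    };
}
```

In this model, creating one sender per page view and sharing one sessionStore across the page views of a session yields a single emitted event, no matter how many times sendExposure() is called.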

Server-side feature toggling

Since experiments that use the edge-unique identifier for enrollment can only send events client-side, this PHP code would not work in an experiment using edge-unique identifiers:

Exposure logging in PHP
if ( $experiment->isAssignedGroup( 'treatment' ) ) {
  // Code that executes only for subjects in the "treatment" group
}
$experiment->sendExposure();

We need a way to log exposure on the client-side with the JS SDK, but only when exposure to a variation actually happened.

Reminder: Both control and treatment are variations. If the enrolled subject is assigned to the control group, they are given the control variation of the experiment.

Our recommendation is to:

  • Put client-side instrumentation into a ResourceLoader module
  • Log exposure first thing, with the rest of the instrumentation handling interactions with the feature
  • Load that module conditionally server-side for subjects enrolled in the experiment

On the server-side (e.g. includes/Experiments/SomeExperiment.php):

public function onBeforePageDisplay( $out, $skin ): void {

    // if ( some targeting/eligibility criteria, e.g. using minerva skin ) {

    $experiment = $this->experimentManager->getExperiment( 'some-experiment' );
    $assignedGroup = $experiment->getAssignedGroup();

    if ( $assignedGroup === 'treatment' ) {
        // alter page output in some treatment-specific way
    }
    
    // since we varied the page output, we should log exposure (on the client)
    if ( $assignedGroup !== null ) {
        // load instrumentation which includes a call to sendExposure():
        $out->addModules( 'ext.wikimediaEvents.someExperiment' );
    }
    
    // }
}

On the client-side (e.g. modules/ext.wikimediaEvents/someExperiment.js):

mw.loader.using( 'ext.testKitchen' ).then( () => {
    const experiment = mw.testKitchen.getExperiment( 'some-experiment' );

    experiment.sendExposure();

    // rest of the instrumentation which produces the events needed to calculate
    // desired metrics for the experiment
} );

If you are collecting any data that does not depend on exposure to a variation (e.g. page_visit event to measure retention rate, DAU), you should put that instrumentation into a separate module that is loaded unconditionally.

Until T419679 (Make WikimediaEvents depend on TestKitchen) is resolved, you need to wrap your experiment's client-side instrumentation in a promise callback (mw.loader.using( 'ext.testKitchen' ).then( () => { ... } );) if you are using WikimediaEvents. Alternatively, if your experiment lives in your own extension (e.g. GrowthExperiments, ReaderExperiments), you can declare a dependency on TestKitchen and use mw.testKitchen without promises.

Finally, register your experiment code in extension.json, making sure to add the relevant components to Hooks and ResourceModules.

Special case: CSS-only experiments

For experiments that are implemented with Test Kitchen's CSS classes exclusively – varying appearance of existing DOM elements, not modifying the DOM with server-side feature toggling – logging exposure can be tricky.

It is going to be highly dependent on which elements' appearance you are varying and when those elements are actually shown to the user, and only you – the owner of the experiment – will know those details, so the best we can do here is offer some guidance. We would love to include an example here, so please reach out to us in #talk-to-experiment-platform on Slack if you have a CSS-only experiment.

Suppose we have an experiment called larger-footer and a variation x-large. If we already had a CSS file that we were loading for everyone and we did not want to conditionally load a CSS file just for visitors enrolled in the experiment, then we could have the following:

.test-kitchen-experiment-larger-footer-x-large .mw-footer-container {
    font-size: var( --font-size-x-large );
}

We know that the footer is rendered on every page, but it's not visible on every page. This means that exposure only happens when the footer becomes visible in the viewport (often requiring the user to scroll down far enough), so our instrumentation should take that into account. Suppose that even half of the footer being visible is enough for us to consider that the user was exposed to the footer. Given that, our instrumentation might look like:

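// Same loading pattern as the WikimediaEvents example earlier in this guide.
mw.loader.using( 'ext.testKitchen' ).then( () => {
    const experiment = mw.testKitchen.getExperiment( 'larger-footer' );
    const footerContainer = document.querySelector( '.mw-footer-container' );

    if ( !footerContainer ) {
        // No footer on this page, so there is nothing to be exposed to.
        return;
    }

    const observer = new IntersectionObserver(
        ( entries, obs ) => {
            entries.forEach( ( entry ) => {
                if ( entry.isIntersecting && entry.intersectionRatio >= 0.5 ) {
                    // At least 50% of the footer is visible -> exposure!
                    experiment.sendExposure();
                    // We can stop checking for visibility now.
                    // unobserve() expects the element, not the entry.
                    obs.unobserve( entry.target );
                }
            } );
        },
        // Run the callback when visibility crosses 50%. With the default
        // threshold (0) it would only run when the first pixel appears.
        { threshold: 0.5 }
    );

    observer.observe( footerContainer );
} );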

Scenario: Special page

What if we had an element that was always visible right away (without the user having to scroll down to see it), but only on one page, say the statistics table on Special:Statistics? Suppose we wanted to experiment with the appearance of that table.

In that case, we would want to log exposure only when the user visited that page:

modules/ext.wikimediaEvents/specialStatisticsTableExperiment.js
const specialPageName = mw.config.get( 'wgCanonicalSpecialPageName' );

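if ( specialPageName === 'Statistics' ) {
	// Same loading pattern as the WikimediaEvents example earlier in this guide.
	mw.loader.using( 'ext.testKitchen' ).then( () => {
		const experiment = mw.testKitchen.getExperiment( 'special-statistics-table-formatting' );
		experiment.sendExposure();
	} );
}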