Test Kitchen
Test Kitchen (previously Metrics Platform, Experimentation Lab, xLab) is a platform that helps Wikimedia teams make data-driven decisions about product experiences through instrumentation and A/B testing.
Choose a workflow:
- Run A/B tests to compare variants and measure the impact of changes
- Track metrics and user behaviour without testing variants
Is Test Kitchen right for you?
✓ Good fit if you're working on:
- MediaWiki-based products (Wikipedia, mobile apps)
- New features without existing Event Platform instrumentation
- Products where you need standardized measurement
✗ Use Event Platform instead if you have:
- Browser extensions, backend services outside of MediaWiki, or programming languages not covered by the SDK
- Websites outside of MediaWiki (although these sites can still use the Test Kitchen schema conventions)
Guides
- Create an instrument
- Measuring clickthrough rates
- Local development setup
- Decommission an instrument
- Automated analysis of experiments
- Converting queries for product health monitoring
- Regulation section
Templates
- Measurement plan
- Instrumentation spec (for non-experiment instrumentation)
- Instrumentation spec for experiments
- Experimentation scorecard
Reference
- App schema
- Web schema
- Custom schemas
- Contextual attributes
- Interaction data
- Stream configuration
- Analytics sampling
- Feature availability
Meta
Test Kitchen maintainer documentation
- Maintainer introduction
- Architecture
- Deploy client libraries
- Validate events
- Test Kitchen UI administration
- Custom Data Monitor
- Test Kitchen UI Design Document
- Documentation maintenance
- Incident reports
- SLOs
Support
If you have questions that these pages do not cover, or you need additional guidance, reach out using the Experiment Platform intake process.
Contributing
Contributions to the Test Kitchen client libraries are most welcome and appreciated. Learn more about contributing to Test Kitchen development.
About
Built on the Event Platform, Test Kitchen provides standardized algorithms, behaviors, and basic necessities for web and app instrumentation, including:
- a specialized Event Platform client designed to require less work in creating instrumentation
- a predefined core interaction schema covering the most common data fields
- a library of event schemas designed for use across a wide range of projects
- a library of already existing instruments to be reused directly or as a starting point
- easy means to build new schemas to further enrich events with contextual data
- the ability to mix in different schemas depending on your needs
- standardized session ID generation, consistent across MediaWiki, Android, and iOS
- standardized session expiry
- determining which events are in-sample or out-of-sample based on a specific identifier (currently: pageview, session, or app install ID).
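The last item in this list, deciding whether an event is in-sample from an identifier, can be illustrated with a minimal sketch. This is not the Test Kitchen client's actual algorithm: the function name `is_in_sample`, the choice of SHA-256, and the 32-bit bucket are all assumptions made for illustration. It only shows the general idea that hashing a stable identifier (such as a pageview, session, or app install ID) yields the same sampling decision every time that identifier is seen.

```python
import hashlib


def is_in_sample(identifier: str, rate: float) -> bool:
    """Decide deterministically whether `identifier` is in-sample.

    Hypothetical helper, not part of any Test Kitchen SDK.
    `rate` is the sampling rate as a fraction between 0.0 and 1.0.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0.0 and 1.0")

    # Hash the identifier so that the decision depends only on its value.
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()

    # Map the first 4 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:4], "big") / 2**32

    # The identifier is in-sample when its bucket falls below the rate.
    return bucket < rate


# The same session ID always gets the same decision at a given rate.
session_id = "example-session-id"  # illustrative value
print(is_in_sample(session_id, 0.1) == is_in_sample(session_id, 0.1))
```

Because the decision is a pure function of the identifier and the rate, all events that share an identifier are kept or dropped together, which is what makes sampling by pageview, session, or install meaningful.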
To learn more about Test Kitchen and which components are available to use, see the FAQ.
Background
Previously, different teams implemented their own analytics solutions in isolation from one another. Those solutions were typically based on the Legacy EventLogging pipeline and, more recently, the Event Platform. Test Kitchen is an effort to unify that previous work and to establish uniformity and consistency across platforms. Its objectives include:
- making it easier to implement and maintain instruments. Test Kitchen provides the SDKs and protocols that MediaWiki developers need to create sophisticated instruments in as few lines of code as possible while maintaining quality, rigor, and safety.
- making it easier for analysts to support teams that are not their primary teams. Previously, every instrument had its own quirks and conventions, which every analyst working with the instrument's data had to remember, including analysts providing temporary support.
- making it easier to leverage data from multiple platforms to yield insights into how our users use our whole ecosystem of products in unison.
All pages
- Analytics/Fragments
- Analytics sampling
- App schema
- Architecture
- Architecture/GrowthBook JS SDK Analysis
- Automated analysis of experiments
- Automated analysis of experiments/Converting queries for product health monitoring
- Automated analysis of experiments/Prepared metrics
- Conduct an experiment
- Contextual attributes
- Contextual attributes/Java
- Contextual attributes/JavaScript
- Contextual attributes/PHP
- Contextual attributes/Swift
- Contribute
- Core properties
- Create an instrument
- Custom Data Monitor
- Custom schemas
- Decision Records
- Decision Records/Changing Sampling During Experiment
- Decision Records/Deprioritize Custom Data
- Decision Records/Experiment end behaviour
- Decision Records/Increase action context limit
- Decision Records/Remove 24h requirement
- Decision Records/Single Table Per Base Schema
- Decisions informed by experiments
- Decommission an instrument
- Demos
- Deploy client libraries
- Documentation maintenance
- Experiment Notes
- FAQ
- Feature availability
- Glossary
- Incident reports
- Incident reports/2028-09-17 MinT for Readers AA test missing subject IDs
- Incident reports/2028-12-18 Incorrectly calculated frequentist statistics
- Instrument list
- Interaction data
- List of active and archived experiments
- Local development setup
- Maintainer introduction
- Measure product health
- Measuring clickthrough rates
- Overriding experiment enrollment
- Privacy considerations
- Proposals
- Regulation section
- SDK
- SDK/JavaScript SDK
- SDK/Java SDK
- SDK/PHP SDK
- SDK/Swift SDK
- Sampling
- Schemas
- Stream configuration
- Test Kitchen UI
- Test Kitchen UI/Administration
- Troubleshooting
- Validate events
- Web schema