Metrics Platform/Analytics/Fragments
For information about creating schema fragments for Metrics Platform, see Metrics Platform/How to/Create a Custom Schema.
Schema Fragments
Fragment | $ref
---|---
Common | /fragment/analytics/common/1.0.0#
App Identifiers | /fragment/analytics/app_identifiers/1.0.0#
Web Identifiers | /fragment/analytics/web_identifiers/1.0.0#
In the schema, reference the fragment(s) you wish to use and list which fields are required in every event.
Note: You must always reference the /fragment/analytics/common fragment in your schema. This fragment provides the client_dt field, which is assigned by EventLogging on the web and by the Event Platform Client on iOS (and soon Android). Including client_dt in the required list is recommended.
Fragment | $ref
---|---
Activity sequencing (WIP) | /fragment/analytics/activity_seq/1.0.0#
MediaWiki Common (WIP) | /fragment/analytics/mediawiki_common/1.0.0#
User (WIP) | /fragment/analytics/mediawiki_user/1.0.0#
Page (WIP) | /fragment/analytics/mediawiki_page/1.0.0#
User Interface (UI) (WIP) | /fragment/analytics/ui/1.0.0#
A/B Testing (WIP) | /fragment/analytics/ab_testing/1.0.0#
Campaign attribution (UTM parameters, WIP) | /fragment/analytics/utm_parameters/1.0.0#
Example 1
Suppose we're running an A/B test on a new default skin for anonymous users and we are interested in measuring session length and average number of visited articles per session.
The schema would use the following fragments: common, web identifiers, page, UI, and A/B testing, via:
allOf:
- $ref: /fragment/analytics/common/1.0.0#
- $ref: /fragment/analytics/web_identifiers/1.0.0#
- $ref: /fragment/analytics/mediawiki_page/1.0.0#
- $ref: /fragment/analytics/ui/1.0.0#
- $ref: /fragment/analytics/ab_testing/1.0.0#
And the following fields would need to be included in the one (1) event logged by the instrument on every page load:
required:
- client_dt
- web_session_id
- web_pageview_id
- page_ns
- ui_screen
- tests
The remainder of this section describes these fields and others in those fragments.
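As a rough illustration of what the required list implies for the instrument, the sketch below checks an event object against a few of the fields from the list above. The helper name missingRequiredFields and all event values are hypothetical; this is not how the platform itself validates events.

```javascript
// Hypothetical helper: returns the names in `required` that `event` lacks.
function missingRequiredFields(event, required) {
  return required.filter(
    (name) => !(name in event) || event[name] === undefined
  );
}

// A subset of the required list from Example 1.
const required = ["client_dt", "web_session_id", "web_pageview_id", "ui_screen"];

// Illustrative event; every value here is made up.
const event = {
  client_dt: new Date().toISOString(),
  web_session_id: "a1b2c3d4e5f6a7b8c9d0",
  web_pageview_id: "0f9e8d7c6b5a49382716",
  ui_screen: { width_px: 1440, height_px: 900 },
};

console.log(missingRequiredFields(event, required)); // []
```

An event that omitted web_pageview_id, for instance, would come back with that name in the returned array.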
Identifiers
App Identifiers
Use the following to include the app_identifiers fragment in your schema for Android and iOS mobile apps:
allOf:
- $ref: /fragment/analytics/app_identifiers/1.0.0#
app_install_id (string): Identifies an install of the app and persists across all sessions. When the user uninstalls and reinstalls the app, a new app install ID is randomly generated.
app_session_id (string): Identifies an app session: a cluster of actions taken by the user in the app within a limited period of time. A session ID is generated the first time it is requested by the instrumentation code, which will usually be soon after the user launches the app. A new session ID is generated any time the app has been inactive (that is, in the background state) for at least 15 minutes or has been forcibly stopped by the OS or the user.
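The session rules above (lazy generation, expiry after 15 minutes in the background) can be sketched roughly as follows. This is written in JavaScript purely for illustration; the apps would implement the equivalent natively, and the class name AppSession, the randomId helper, and the 20-character ID format are all assumptions.

```javascript
const SESSION_TIMEOUT_MS = 15 * 60 * 1000; // 15 minutes, per the description above

// Illustration only: Math.random is not suitable for production identifiers.
function randomId() {
  let id = "";
  for (let i = 0; i < 20; i++) {
    id += Math.floor(Math.random() * 16).toString(16);
  }
  return id;
}

class AppSession {
  // `now` is injectable so the timeout can be exercised without waiting.
  constructor(now = () => Date.now()) {
    this.now = now;
    this.id = null; // generated lazily, on first request
    this.backgroundedAt = null;
  }

  onBackground() {
    this.backgroundedAt = this.now();
  }

  onForeground() {
    const inactive =
      this.backgroundedAt !== null &&
      this.now() - this.backgroundedAt >= SESSION_TIMEOUT_MS;
    if (inactive) {
      this.id = null; // expired; the next getId() call generates a new one
    }
    this.backgroundedAt = null;
  }

  getId() {
    if (this.id === null) {
      this.id = randomId();
    }
    return this.id;
  }
}
```

In this sketch a force-stop is modeled implicitly: the in-memory object is discarded, so the next launch starts with no ID and generates a fresh one on first request.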
Web Identifiers
Use the following to include the web_identifiers fragment in your schema for MediaWiki-based desktop and mobile websites:
allOf:
- $ref: /fragment/analytics/web_identifiers/1.0.0#
web_session_id (string): Identifies a web session: a cluster of actions taken by the user on a website within a limited period of time. A session ID is generated the first time it is requested by the instrumentation code, which is usually the first time the user visits the website. In the current implementation, this ID is shared across windows, tabs, and page views in the same browser. The ID is normally regenerated after the browser is shut down; however, if the browser's "restore previous session" feature is used when it restarts, the previous ID is retained. Interactions across multiple pages in the same web session may be linked together via this identifier.
web_pageview_id (string): Identifies a single web page view (visit). This identifier is randomly generated the first time it is requested by the instrumentation code on any page view and persists for the lifetime of the page. When the user navigates to another page or refreshes/reloads the page, this identifier is discarded and a new one is generated when next needed. Different visits to the same page will yield different pageview IDs (also called tokens). Interactions with multiple features (instrumented separately) on the same web page may be linked together via this identifier.
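The difference in scope between the two identifiers can be sketched as below. A Map stands in for whatever browser-wide storage the real client uses, and the names createPageContext and randomId are hypothetical; this is an illustration of the scoping, not the actual client code.

```javascript
// Illustration only: Math.random is not suitable for production identifiers.
function randomId() {
  let id = "";
  for (let i = 0; i < 20; i++) {
    id += Math.floor(Math.random() * 16).toString(16);
  }
  return id;
}

// `sharedStore` models storage shared by all tabs and page views in a browser.
function createPageContext(sharedStore) {
  let pageviewId = null; // lives only as long as this page
  return {
    getSessionId() {
      if (!sharedStore.has("session_id")) {
        sharedStore.set("session_id", randomId());
      }
      return sharedStore.get("session_id");
    },
    getPageviewId() {
      if (pageviewId === null) {
        pageviewId = randomId();
      }
      return pageviewId;
    },
  };
}

const browser = new Map();
const firstPage = createPageContext(browser);
const secondPage = createPageContext(browser); // e.g. after navigating or reloading

console.log(firstPage.getSessionId() === secondPage.getSessionId()); // true
console.log(firstPage.getPageviewId() === secondPage.getPageviewId()); // false
```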
Sequences
Use the following to include the activity sequencing fragment in your schema:
allOf:
- $ref: /fragment/analytics/activity_seq/1.0.0#
Activity sequencing is used for reconstructing sequences of events.
activity_id (string): Identifies a sequence of actions in the same context or funnel. In the past, teams have used terms like "session ID" and "sub-session ID" to refer to a set of connected events, such as interactions with a widget. This identifier is useful for grouping impressions with their corresponding clicks, and for grouping steps in a process such as making an edit. The activity identifier can be randomly generated or a counter.
sequence_id (integer): A counter, starting at 1, for reconstructing the order of events in the same activity. For a variety of reasons, neither the timestamp of receipt nor the client-side timestamp of when the event was generated can be trusted for putting events in order. Where the exact sequence of events needs to be established, this identifier records which event happened 1st, which happened 2nd, and so on.
For example, suppose the user is making an edit. We group the actions performed in this activity with activity_id. Previously, this would have been a feature-specific field such as "editing_session_id". As the user interacts with various (instrumented) features and elements in the editor, previews the edit, continues editing, and finally publishes the edit, specific data about each interaction can be tracked in schema-specific fields, while the order in which the interactions happen is recorded in sequence_id.
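The editing example can be sketched as a small helper that stamps each event with the activity's ID and the next counter value. The createActivity helper, the "edit-7f3a" identifier, and the schema-specific action field are all hypothetical.

```javascript
// Returns a function that tags events for one activity, counting from 1.
function createActivity(activityId) {
  let nextSequenceId = 1;
  return function tag(eventData) {
    return {
      ...eventData,
      activity_id: activityId,
      sequence_id: nextSequenceId++,
    };
  };
}

const tagEdit = createActivity("edit-7f3a");
const sent = [
  tagEdit({ action: "open_editor" }),
  tagEdit({ action: "preview" }),
  tagEdit({ action: "publish" }),
];

// Events may be received out of order; sequence_id restores the original order.
const received = [sent[2], sent[0], sent[1]];
const ordered = [...received].sort((a, b) => a.sequence_id - b.sequence_id);

console.log(ordered.map((e) => e.action).join(" > "));
// open_editor > preview > publish
```

Each new activity gets its own counter, so sequence_id values are only comparable between events that share an activity_id.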
Data
User
Use the following to include this fragment in your schema:
allOf:
- $ref: /fragment/analytics/mediawiki_user/1.0.0#
Information about the user generating the event.
is_anon (boolean): Whether the user is anonymous (true) or logged in (false)
user_id (integer): User's MediaWiki user ID; 0 if the user is anonymous. The user ID is specific to the wiki that the event came from.
user_name (string): Cross-wiki username
user_edit_count (integer): The total number of edits by the user at the time of the event. The Growth team retrieves this with mw.config.get( 'wgUserEditCount' ) to record it for their experiments. May be useful as a proxy for experience at the time of the event.
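One way these fields might be filled on the web is from mw.config, as in the sketch below. The userFields helper is hypothetical, and the mapping assumes that wgUserId and wgUserName are null for anonymous users; a Map stands in for mw.config so the example runs outside MediaWiki.

```javascript
// `config` is anything with a get(key) method, such as mw.config in MediaWiki.
function userFields(config) {
  const userId = config.get("wgUserId");
  const isAnon = userId === null || userId === undefined;
  // user_id is 0 for anonymous users, per the field description above.
  const fields = { is_anon: isAnon, user_id: isAnon ? 0 : userId };
  if (!isAnon) {
    fields.user_name = config.get("wgUserName");
    fields.user_edit_count = config.get("wgUserEditCount");
  }
  return fields;
}

// Stand-ins for mw.config; all values are made up.
const loggedIn = new Map([
  ["wgUserId", 42],
  ["wgUserName", "ExampleUser"],
  ["wgUserEditCount", 1234],
]);
const anonymous = new Map([
  ["wgUserId", null],
  ["wgUserName", null],
  ["wgUserEditCount", null],
]);

console.log(userFields(loggedIn).user_edit_count); // 1234
console.log(userFields(anonymous).user_id); // 0
```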
Page
Use the following to include this fragment in your schema:
allOf:
- $ref: /fragment/analytics/mediawiki_page/1.0.0#
Information about the page the event was generated on.
wiki_db (string): Database name of the wiki (e.g. "enwiki", "commonswiki")
page_id (integer): Page's numeric ID in MediaWiki
page_ns (integer): Page's namespace code in MediaWiki (e.g. 0 for Main/Article, -1 for Special)
page_title (string): Title of the page
page_is_redirect (boolean): Whether the page is a redirect or not at the time of the event
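On the web, the page fields could plausibly be read from mw.config as sketched below. The pageFields helper and the choice of configuration keys (in particular wgTitle for page_title) are assumptions rather than the fragment's documented mapping; a Map stands in for mw.config.

```javascript
// `config` is anything with a get(key) method, such as mw.config in MediaWiki.
function pageFields(config) {
  return {
    wiki_db: config.get("wgDBname"),
    page_id: config.get("wgArticleId"),
    page_ns: config.get("wgNamespaceNumber"),
    page_title: config.get("wgTitle"),
    page_is_redirect: config.get("wgIsRedirect"),
  };
}

// Stand-in for mw.config on an article page; all values are made up.
const articleConfig = new Map([
  ["wgDBname", "enwiki"],
  ["wgArticleId", 12345],
  ["wgNamespaceNumber", 0],
  ["wgTitle", "Example"],
  ["wgIsRedirect", false],
]);

console.log(pageFields(articleConfig).page_ns); // 0
```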
User Interface
Use the following to include this fragment in your schema:
allOf:
- $ref: /fragment/analytics/ui/1.0.0#
Information about the interface the user saw when the event was generated.
ui_mw_skin (string): MediaWiki skin name (e.g. "Vector", "MinervaNeue", "Modern") at the time of the event. Only applicable on MediaWiki, not on mobile apps.
ui_color_mode (string, enum): Color mode at the time of the event. Currently only applicable on mobile apps, but Web is experimenting with it for MediaWiki.[1] One of: "light", "sepia", "dark", "black", "night".
ui_text_scale (integer): Only applicable on mobile apps, where the user chooses from predefined text scales. 0 is the middle (application default), -1 is the smaller size, and 1 is the larger size. The actual size in points or pixels varies by app and device, so a relative scale is recorded.
ui_screen (object): Information about the screen, such as dimensions, detailed below:
ui_screen.width_px (integer): Width of the screen in pixels
ui_screen.height_px (integer): Height of the screen in pixels
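To make the shape of these fields concrete, the sketch below assembles them from plain inputs, including the nested ui_screen object and the ui_color_mode enum. The uiFields helper and its parameter names are hypothetical.

```javascript
const COLOR_MODES = ["light", "sepia", "dark", "black", "night"];

function uiFields({ skin, colorMode, textScale, widthPx, heightPx }) {
  const fields = { ui_screen: { width_px: widthPx, height_px: heightPx } };
  if (skin !== undefined) {
    fields.ui_mw_skin = skin; // web (MediaWiki) only
  }
  if (colorMode !== undefined) {
    if (!COLOR_MODES.includes(colorMode)) {
      throw new RangeError(`Unknown color mode: ${colorMode}`);
    }
    fields.ui_color_mode = colorMode;
  }
  if (textScale !== undefined) {
    if (!Number.isInteger(textScale)) {
      throw new TypeError("ui_text_scale must be an integer");
    }
    fields.ui_text_scale = textScale; // relative: 0 is the app default
  }
  return fields;
}

// A web event records the skin; an app event records color mode and text scale.
const webUi = uiFields({ skin: "Vector", widthPx: 1440, heightPx: 900 });
const appUi = uiFields({ colorMode: "sepia", textScale: -1, widthPx: 390, heightPx: 844 });

console.log(webUi.ui_screen.width_px); // 1440
console.log(appUi.ui_text_scale); // -1
```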
A/B Testing
Use the following to include this fragment in your schema:
allOf:
- $ref: /fragment/analytics/ab_testing/1.0.0#
Information about the experiment the user was enrolled in when the event was generated.
tests (array): Any and all A/B tests the user is enrolled in at the time of the event. If the array is empty, the user was not in any A/B tests. If there is only one item, the user was in exactly one A/B test. If there are two or more items, the user was in several A/B tests.
Each item in the tests array is an object identifying enrollment in a single A/B test, with the following fields:
name (string): Name of the A/B test the user is enrolled in (e.g. "Desktop Redesign (Phase 3)" or "desktop-redesign-3")
group (string): Name of the group (sometimes called "bucket") the user was randomly assigned to, e.g. "control", "variant-a", "variant-b", "variant-c"
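When analyzing events, a common need is to find which group a user was in for one particular test. A minimal sketch, assuming events shaped as described above; the groupFor helper is hypothetical and the sample event is made up.

```javascript
// Returns the group for `testName`, or null if the user was not enrolled.
function groupFor(event, testName) {
  const enrollment = (event.tests || []).find((t) => t.name === testName);
  return enrollment ? enrollment.group : null;
}

const sampleEvent = {
  tests: [
    { name: "growth-homepage", group: "variant-1" },
    { name: "growth-help-panel", group: "variant-2" },
  ],
};

console.log(groupFor(sampleEvent, "growth-help-panel")); // variant-2
console.log(groupFor(sampleEvent, "desktop-redesign-3")); // null
console.log(groupFor({ tests: [] }, "growth-homepage")); // null
```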
Examples:
"tests": []
"tests": [ { "name": "growth-homepage", "group": "control" } ]
"tests": [ { "name": "growth-homepage", "group": "variant-1" }, { "name": "growth-help-panel", "group": "variant-2" } ]
Campaign Attribution
Use the following to include this fragment in your schema:
allOf:
- $ref: /fragment/analytics/utm_parameters/1.0.0#
Information about where the user came from.
utm_source: Identifies which site sent the traffic. This is a required parameter. For example: "Wikipedia", "Twitter", "Facebook"
utm_medium: Identifies what type of link was used, such as "socialmedia" or "email"
utm_campaign: Identifies a specific product promotion or strategic campaign. For example: "app_marketing_20200704" or "india_awareness_2017"
utm_term: Identifies search terms (e.g. "mobile+app")
utm_content: Identifies what specifically was clicked to bring the user to the site, such as a banner ad, a text link, or a sidebar button. It is often used for A/B testing and content-targeted ads.
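UTM parameters normally arrive in the query string of the landing URL, so one way to populate these fields is to read them from there. The sketch below uses the standard URL API; the utmFields helper and the example URL are hypothetical.

```javascript
const UTM_KEYS = [
  "utm_source",
  "utm_medium",
  "utm_campaign",
  "utm_term",
  "utm_content",
];

// Extracts whichever UTM parameters are present in the URL's query string.
function utmFields(url) {
  const params = new URL(url).searchParams;
  const fields = {};
  for (const key of UTM_KEYS) {
    const value = params.get(key);
    if (value !== null) {
      fields[key] = value;
    }
  }
  return fields;
}

const landing =
  "https://example.org/wiki/Main_Page" +
  "?utm_source=Twitter&utm_medium=socialmedia&utm_term=mobile+app";
const utm = utmFields(landing);

console.log(utm.utm_source); // Twitter
console.log(utm.utm_term); // mobile app
```

Note that the URL API decodes "+" in a query string as a space, so "mobile+app" is read back as "mobile app".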