Analytics/AQS/Wikistats 2/Metrics Definition
This page defines the metrics presented in Wikistats 2 and its API.
Dimensions
Time boundaries - [start; end [
The time interval over which a metric is computed: the start is included and the end is excluded. Every metric defined below requires it.
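Reading the notation [start; end [ as a half-open interval, a day belongs to the computation window when it falls on or after the start and strictly before the end. A minimal Python sketch of that check follows; the function name in_time_boundaries is hypothetical and not part of Wikistats or its API.

```python
from datetime import date


def in_time_boundaries(day: date, start: date, end: date) -> bool:
    """Return True when `day` lies in the half-open interval [start, end)."""
    # The start is inclusive, the end is exclusive.
    return start <= day < end


january = (date(2020, 1, 1), date(2020, 2, 1))
first_day_included = in_time_boundaries(date(2020, 1, 1), *january)
end_day_excluded = not in_time_boundaries(date(2020, 2, 1), *january)
```

With half-open intervals, consecutive periods can be requested back to back without counting any day twice.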
Time granularity - {daily, monthly}
Aggregation period for the metric. Daily and monthly aggregation periods are provided for almost all metrics.
Project
Filter for the wiki project you're interested in, represented as its main internet domain: en.wikipedia.org or www.wikidata.org for instance. The metrics cover almost all wiki projects (see Analytics/AQS/Wikistats 2/Data Quality#Other things).
For metrics that make sense over multiple projects, you can use all-projects to get the value across all projects. Project families can be used for all metrics except number of edited pages and number of editors (for technical reasons; we plan to overcome that limitation in the future). You can access statistics aggregated over all Wikipedias using all-wikipedia-projects, while all-wikivoyage-projects, for instance, gives you statistics aggregated over the Wikivoyage project family. Note: Wikidata is not a project family but a project on its own, so you shouldn't use all-wikidata-projects.
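The rules for project values can be sketched as a small validation helper. This is only an illustration under stated assumptions: the metric labels "edited-pages" and "editors", and the function names, are hypothetical and chosen for this sketch; the project values themselves are the ones described above.

```python
# Hypothetical labels for the two metrics that do not accept project families.
METRICS_WITHOUT_FAMILY_SUPPORT = {"edited-pages", "editors"}


def is_project_family(project: str) -> bool:
    """True for values shaped like all-<family>-projects, e.g. all-wikivoyage-projects."""
    return (
        project.startswith("all-")
        and project.endswith("-projects")
        and project != "all-projects"  # all-projects spans every project, not one family
    )


def validate_project(project: str, metric: str) -> str:
    """Return `project` unchanged, or raise ValueError when the combination is not allowed."""
    if project == "all-wikidata-projects":
        raise ValueError("Wikidata is a single project; use www.wikidata.org")
    if is_project_family(project) and metric in METRICS_WITHOUT_FAMILY_SUPPORT:
        raise ValueError(f"{metric} cannot be computed over a project family")
    return project
```

For example, validate_project("all-wikivoyage-projects", "edits") returns the value unchanged, while the same family with "editors" raises an error.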
Editor-type - {anonymous, group-bot, name-bot, user, all-editor-types}
Filter for the type of editors who performed the related action. Can be anonymous for users not logged in, group-bot for logged-in users that are part of the bot group, name-bot for logged-in users whose name contains `bot` (a high probability of being a bot, even if counterexamples exist), and user for logged-in users in neither the group-bot nor the name-bot set. Finally, you can use all-editor-types to not filter by editor type.
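The editor-type rules above can be read as a small decision procedure. The sketch below is illustrative only: the function name and its parameters are hypothetical, and two details are assumptions not stated on this page, namely that group membership is checked before the name, and that the name match is a case-insensitive substring test.

```python
from typing import Iterable


def classify_editor(is_logged_in: bool, groups: Iterable[str], name: str) -> str:
    """Map an editor to one of: anonymous, group-bot, name-bot, user."""
    if not is_logged_in:
        return "anonymous"
    # Assumption: membership in the bot group takes precedence over the name test.
    if "bot" in set(groups):
        return "group-bot"
    # Assumption: case-insensitive substring match on the user name.
    if "bot" in name.lower():
        return "name-bot"
    return "user"


# A human called "Abbott" is classified as name-bot: one of the
# counterexamples the definition warns about.
counterexample = classify_editor(True, [], "Abbott")
```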
Page-type - {content, non-content, all-page-types}
Filter for the type of page on which the action is performed. Can be content for pages belonging to content namespaces. This page type is also referred to as articles, and for most wikis includes pages in namespace 0 only. Can also be non-content for pages in namespaces not considered content (talk pages, user pages, etc.). Finally, you can use all-page-types to not filter by page type.
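As a rough illustration of the page-type split, the sketch below classifies a page from its namespace number. It assumes the common case described above, where namespace 0 is the only content namespace; the names classify_page and DEFAULT_CONTENT_NAMESPACES are hypothetical, and a wiki with additional content namespaces would pass its own set.

```python
from typing import AbstractSet

# Assumption: the common case where only namespace 0 holds content pages.
DEFAULT_CONTENT_NAMESPACES = frozenset({0})


def classify_page(
    namespace: int,
    content_namespaces: AbstractSet[int] = DEFAULT_CONTENT_NAMESPACES,
) -> str:
    """Return "content" or "non-content" for a page in the given namespace."""
    return "content" if namespace in content_namespaces else "non-content"


article = classify_page(0)    # main namespace
talk_page = classify_page(1)  # Talk namespace
```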
Activity-level - {1..4-edits, 5..24-edits, 25..99-edits, 100..-edits, all-activity-levels}
Filter for the activity level of editors or pages, meaning their number of edits over the aggregation time period of the metric. Values can be 1..4-edits, 5..24-edits, 25..99-edits or 100..-edits. As for other filtering dimensions, you can use all-activity-levels to not filter by activity level.
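The mapping from an edit count to an activity level can be sketched as follows. The function name activity_level is hypothetical; the bucket boundaries are read directly from the value names above.

```python
def activity_level(edit_count: int) -> str:
    """Return the activity-level value for a number of edits in one aggregation period."""
    if edit_count < 1:
        # Editors or pages without any edit in the period have no activity level.
        raise ValueError("edit_count must be at least 1")
    if edit_count <= 4:
        return "1..4-edits"
    if edit_count <= 24:
        return "5..24-edits"
    if edit_count <= 99:
        return "25..99-edits"
    return "100..-edits"
```

Because the count is taken over the aggregation period, the same editor or page can fall in different levels depending on whether the granularity is daily or monthly.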
Metrics
Number of edits
The count of edits (or revisions), including edits on redirects. Dimensions available for this metric are time boundaries, time granularity, project, editor-type and page-type.
Sum of net bytes difference
The sum of the difference in bytes made by each edit (or revision), including edits on redirects. Note that this difference can be positive when content is added, or negative when content is removed. Dimensions available for this metric are time boundaries, time granularity, project, editor-type and page-type.
Sum of absolute bytes difference
The sum of the absolute difference in bytes made by each edit (or revision), including edits on redirects. Unlike the previous metric, whose value can be positive or negative, this metric uses the absolute value of modified bytes and is therefore never negative. Dimensions available for this metric are time boundaries, time granularity, project, editor-type and page-type.
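A small worked example may help contrast the net and absolute bytes metrics. The per-edit byte differences below are made-up illustration values, not real data.

```python
# Hypothetical per-edit byte differences: two additions and two removals.
byte_diffs = [120, -45, 300, -10]

# Net difference: removals cancel out part of the additions.
net_bytes_difference = sum(byte_diffs)

# Absolute difference: every modified byte counts, whatever the direction.
absolute_bytes_difference = sum(abs(diff) for diff in byte_diffs)

print(net_bytes_difference)       # 365
print(absolute_bytes_difference)  # 475
```

The net value describes how much the content grew or shrank overall, while the absolute value describes how much editing activity took place.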
Number of new edited pages
The count of newly created pages, excluding redirect pages. Dimensions available for this metric are time boundaries, time granularity, project, editor-type and page-type.
Number of edited pages
The count of pages that were edited, broken down by how many edits they received (see activity-level), excluding redirect pages. Dimensions available for this metric are time boundaries, time granularity, project, editor-type, page-type and activity-level.
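To make the role of the activity-level dimension concrete, here is a minimal sketch that counts edited pages per level from a list of edits. It is not the production computation: the function name, the input shape (one page identifier per edit) and the sample data are all assumptions made for illustration.

```python
from bisect import bisect_right
from collections import Counter
from typing import Iterable, Set

LEVELS = ["1..4-edits", "5..24-edits", "25..99-edits", "100..-edits"]
LOWER_BOUNDS = [1, 5, 25, 100]


def edited_pages_by_activity_level(edits: Iterable[str], redirect_pages: Set[str]) -> Counter:
    """Count edited pages per activity level; `edits` holds one page id per edit."""
    # Redirect pages are excluded from this metric.
    edits_per_page = Counter(page for page in edits if page not in redirect_pages)
    # Every count is >= 1 here, so bisect_right always lands on a valid level.
    return Counter(
        LEVELS[bisect_right(LOWER_BOUNDS, count) - 1]
        for count in edits_per_page.values()
    )


# Made-up sample: page A has 6 edits, B has 2, C has 1, and R is a redirect.
sample_edits = ["A"] * 6 + ["B"] * 2 + ["R"] * 3 + ["C"]
by_level = edited_pages_by_activity_level(sample_edits, {"R"})
all_activity_levels = sum(by_level.values())
```

In this sample, two pages fall in 1..4-edits and one in 5..24-edits, so the value for all-activity-levels is 3.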
List of most edited pages
The list of the 100 most edited pages, excluding redirect pages. Dimensions available for this metric are time boundaries, time granularity, project, editor-type and page-type.
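The top lists share the same shape: aggregate a value per page or editor, then keep the 100 highest. A minimal sketch for the most edited pages follows, with a hypothetical function name and made-up sample data; how ties are ordered in the real metric is not specified here, so the tie behaviour of this sketch is an assumption.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple


def most_edited_pages(
    edits: Iterable[str], redirect_pages: Set[str], limit: int = 100
) -> List[Tuple[str, int]]:
    """Return up to `limit` (page, edit count) pairs, most edited first."""
    # Redirect pages are excluded before ranking.
    counts = Counter(page for page in edits if page not in redirect_pages)
    # most_common sorts by descending count; ties keep first-seen order.
    return counts.most_common(limit)


# Made-up sample: R is a redirect and must not appear in the result.
sample_edits = ["A", "B", "A", "R", "R", "R", "C", "A", "B"]
top_two = most_edited_pages(sample_edits, {"R"}, limit=2)
print(top_two)  # [('A', 3), ('B', 2)]
```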
List of pages with biggest net bytes difference
The list of the 100 pages with the highest sum of net bytes difference, excluding redirect pages. Dimensions available for this metric are time boundaries, time granularity, project, editor-type and page-type.
List of pages with biggest absolute bytes difference
The list of the 100 pages with the highest sum of absolute bytes difference, excluding redirect pages. Dimensions available for this metric are time boundaries, time granularity, project, editor-type and page-type.
Number of editors
The count of editors who performed edits, broken down by how many edits they made (see activity-level), including edits on redirect pages. Dimensions available for this metric are time boundaries, time granularity, project, editor-type, page-type and activity-level.
List of editors with biggest number of edits
The list of the 100 editors (identified by user ID, or by IP address if anonymous) who performed the most edits, including edits on redirect pages. Dimensions available for this metric are time boundaries, time granularity, project, editor-type and page-type.
List of editors with biggest net bytes difference
The list of the 100 editors (identified by user ID, or by IP address if anonymous) with the highest sum of net bytes difference, including edits on redirect pages. Dimensions available for this metric are time boundaries, time granularity, project, editor-type and page-type.
List of editors with biggest absolute bytes difference
The list of the 100 editors (identified by user ID, or by IP address if anonymous) with the highest sum of absolute bytes difference, including edits on redirect pages. Dimensions available for this metric are time boundaries, time granularity, project, editor-type and page-type.
Newly Registered Users
The count of users who registered themselves. This metric excludes user accounts created automatically through the auto-login system, and accounts created by peers. Dimensions available for this metric are time boundaries, time granularity and project. Note: Thanks to CentralAuth, we can now be sure that the same username on different wiki projects belongs to the same user. Only the first registration is counted in the Newly Registered Users metric: the system automatically creates user accounts when the user visits other projects, and automatic account creations are not counted. However, for data before CentralAuth, we have no way to know whether two accounts with the same name on different wikis belonged to the same user. The metric we provide for periods before CentralAuth does NOT deduplicate accounts by name across projects, and can therefore overcount somewhat when used with project families.
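The possible overcounting for periods before CentralAuth can be illustrated with a toy example. The registrations below are made-up data, and the variable names are hypothetical; the sketch only shows why summing per-project registrations over a project family can exceed the number of distinct account names.

```python
from collections import Counter

# Made-up registrations before CentralAuth: (project, user name).
registrations = [
    ("en.wikipedia.org", "Alice"),
    ("fr.wikipedia.org", "Alice"),  # same name, possibly the same person
    ("fr.wikipedia.org", "Bob"),
]

# What the metric does for a project family: add up per-project registrations.
per_project = Counter(project for project, _name in registrations)
family_total = sum(per_project.values())

# Distinct names across the family, shown only for comparison.
distinct_names = len({name for _project, name in registrations})

print(family_total)    # 3
print(distinct_names)  # 2
```

Neither number is guaranteed to match the number of people: before CentralAuth, two accounts named Alice on different wikis may or may not have belonged to the same user.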