Analytics/Archive/Pageviews/Aggregation

From Wikitech
This page contains historical information. It may be outdated or unreliable.
Project Animal {wren}
Phabricator Ticket https://phabricator.wikimedia.org/T100027
ETA Q1 FY2016

Value Proposition

For Analysts, Product Managers, Researchers and the Wikimedia Community interested in analytics

Who are dissatisfied with the lack of easy-to-get, granular pageview data

Our Product is a table you can query via SQL

That provides useful breakdowns (dimensions) while maintaining privacy

Unlike our current pageview data dumps.

Deliverable

A table in a database that is queryable using SQL.

Dimension Name   No. of Possible Values   Examples
project          1000                     jpwiki | frwiki
country          300                      Japan | France
access_method    3                        desktop | mobile_web | mobile_app
agent_type       2                        user | spider
referer_class    3
day              ?

Dimension Name
project
country
access_method
agent_type
referer_class
article
day
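
As a rough illustration of how such a table might be queried, the sketch below builds a small in-memory SQLite table with the dimensions from the first listing above and runs a breakdown by country and access method. The table name `pageviews`, the `view_count` column, the column types, the `referer_class` values, and all sample rows are assumptions made for illustration; the actual schema and storage engine are not specified here.

```python
import sqlite3

# Hypothetical schema: the table name, column types and the view_count
# column are assumptions, not the actual deliverable.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE pageviews (
        project       TEXT,
        country       TEXT,
        access_method TEXT,
        agent_type    TEXT,
        referer_class TEXT,
        day           TEXT,
        view_count    INTEGER
    )
    """
)

# Made-up sample rows, not real data. The referer_class values are
# placeholders, since the page does not list them.
rows = [
    ("jpwiki", "Japan", "desktop", "user", "class_a", "2015-07-01", 120),
    ("jpwiki", "Japan", "mobile_web", "user", "class_a", "2015-07-01", 80),
    ("frwiki", "France", "desktop", "user", "class_b", "2015-07-01", 200),
    ("frwiki", "France", "desktop", "spider", "class_c", "2015-07-01", 50),
]
conn.executemany("INSERT INTO pageviews VALUES (?, ?, ?, ?, ?, ?, ?)", rows)


def views_by_country_and_access(connection, agent_type="user"):
    """Sum pageviews per (country, access_method) for one agent type."""
    query = """
        SELECT country, access_method, SUM(view_count)
        FROM pageviews
        WHERE agent_type = ?
        GROUP BY country, access_method
        ORDER BY country, access_method
    """
    return connection.execute(query, (agent_type,)).fetchall()


breakdown = views_by_country_and_access(conn)
for country, access_method, total in breakdown:
    print(country, access_method, total)
```

Filtering on `agent_type` keeps spider traffic out of the totals, which is the kind of breakdown the dimensions are meant to allow.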

Privacy

The pageview count for a particular bucket (one combination of dimension values) may be so small that it could be used to identify someone, because only one person may have produced those pageviews. Therefore, when the pageview count for a bucket is lower than a threshold X, the database will return 0.
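
One way to picture this rule is as a small function applied to each bucket count before it is returned. The sketch below is an assumption about how the masking could work, not the actual implementation. The function name `mask_small_counts`, the bucket keys, and the example threshold standing in for X are all hypothetical, since the value of X is not given here.

```python
def mask_small_counts(counts, threshold):
    """Return a copy of counts where values below threshold are reported as 0."""
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    return {bucket: (n if n >= threshold else 0) for bucket, n in counts.items()}


# Made-up bucket counts keyed by (project, country, access_method).
raw = {
    ("jpwiki", "Japan", "desktop"): 120,
    ("jpwiki", "France", "mobile_app"): 2,
    ("frwiki", "France", "desktop"): 200,
}

EXAMPLE_THRESHOLD = 10  # hypothetical stand-in for X
masked = mask_small_counts(raw, EXAMPLE_THRESHOLD)
```

Totals computed from masked buckets can come out lower than the true totals. That is the trade-off this rule makes for privacy.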

Implementation