Analytics/Data Lake/Edits/Metrics

This table stores daily and monthly metrics computed over the denormalized mediawiki history dataset. It is partitioned by snapshot, metric name, and wiki_db, which makes its data easier to use outside of Hive, namely for display in Dashiki.

Schema


col_name    data_type    comment
dt          string       The date of this measurement, as YYYY-MM-DD
value       bigint       The measurement
snapshot    string       Versioning information to keep multiple datasets (YYYY-MM for regular labs imports)
metric      string       The metric being computed to measure
wiki_db     string       The wiki this measurement pertains to

# Partition Information
# col_name  data_type    comment

snapshot    string       Versioning information to keep multiple datasets (YYYY-MM for regular labs imports)
metric      string       The metric being computed to measure
wiki_db     string       The wiki this measurement pertains to
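
Because snapshot, metric, and wiki_db are partition columns, Hive's default layout keeps each combination in its own key=value directory, which is what lets a consumer outside Hive pick up one metric for one wiki without scanning the rest. The following is a minimal Python sketch of that naming, assuming the default Hive partition layout. The helper name partition_path and the example values (including the wiki enwiki) are hypothetical, and the table's base location is not given on this page, so only the relative path is built.

```python
# Partition columns, in the order listed under "Partition Information".
PARTITION_COLUMNS = ("snapshot", "metric", "wiki_db")


def partition_path(snapshot: str, metric: str, wiki_db: str) -> str:
    """Return the relative key=value directory for one partition.

    Assumes Hive's default partition directory naming. The table's base
    location is left out on purpose because it is not documented here.
    """
    values = {"snapshot": snapshot, "metric": metric, "wiki_db": wiki_db}
    for name, value in values.items():
        # Reject empty values and path separators, which would not
        # map to a single directory level.
        if not value or "/" in value:
            raise ValueError(f"invalid value for {name}: {value!r}")
    return "/".join(f"{name}={values[name]}" for name in PARTITION_COLUMNS)


# Hypothetical example values.
print(partition_path("2017-03", "daily_edits", "enwiki"))
```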

As of March 2017, possible values for metric include:

daily_edits
daily_edits_by_anonymous_users
daily_edits_by_bot_users
daily_edits_by_registered_users
daily_unique_anonymous_editors
daily_unique_bot_editors
daily_unique_editors
daily_unique_page_creators
daily_unique_registered_editors
monthly_new_editors
monthly_new_registered_users
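
To illustrate how these metric names are used when reading the table, the sketch below mirrors the schema in an in-memory SQLite table and fetches one metric for one wiki. It is not the Hive table itself: the table name metrics, the wikis enwiki and dewiki, and every row value are made up for illustration. Against the real table, the same filter on snapshot, metric, and wiki_db would restrict the query to a single partition.

```python
import sqlite3

# In-memory stand-in for the Hive table; same column names as the schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE metrics ("
    " dt TEXT, value INTEGER, snapshot TEXT, metric TEXT, wiki_db TEXT)"
)

# Synthetic rows for illustration only, not real measurements.
conn.executemany(
    "INSERT INTO metrics VALUES (?, ?, ?, ?, ?)",
    [
        ("2017-03-02", 12, "2017-03", "daily_edits", "enwiki"),
        ("2017-03-01", 10, "2017-03", "daily_edits", "enwiki"),
        ("2017-03-01", 3, "2017-03", "daily_edits_by_bot_users", "enwiki"),
        ("2017-03-01", 7, "2017-03", "daily_edits", "dewiki"),
    ],
)


def metric_series(conn, snapshot, metric, wiki_db):
    """Return (dt, value) pairs for one snapshot, metric and wiki, by date."""
    cur = conn.execute(
        "SELECT dt, value FROM metrics"
        " WHERE snapshot = ? AND metric = ? AND wiki_db = ?"
        " ORDER BY dt",
        (snapshot, metric, wiki_db),
    )
    return cur.fetchall()


series = metric_series(conn, "2017-03", "daily_edits", "enwiki")
print(series)
```

Since dt is stored as a YYYY-MM-DD string, ordering by it as text also orders the rows chronologically.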

Definition

The Hive queries that generate these metrics are in wikimedia/analytics-refinery/oozie/mediawiki/history/metrics.