Analytics/Data Lake/Traffic/Interlanguage

From Wikitech

Inter-language Datasets

This page describes datasets that count web requests which stay within a project family but change language.

Interlanguage Navigation

The wmf.interlanguage_navigation table in Hive counts webrequests that move from one language edition of a project to another, for example from de.wikipedia.org to ro.wikipedia.org. For that example, a row would hold the values listed here, followed by the table schema:

  • project_family: wikipedia
  • previous_project: de
  • current_project: ro
  • navigation_count: the number of webrequests that made this jump on this specific date
  • date: 2017-11-01

    `project_family`    string  COMMENT 'The project family to aggregate on',
    `previous_project`  string  COMMENT 'The project (language) found in the referers of this group of requests',
    `current_project`   string  COMMENT 'The project (language) of this group of requests',
    `navigation_count`  bigint  COMMENT 'The number of times a user navigated from the previous to the current project',
    `date`              string  COMMENT 'The date in YYYY-MM-DD format'
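
As a rough illustration of how these columns can be queried, the sketch below builds an in-memory SQLite table with the same column names and sums navigation_count per destination language for one source language and date. SQLite is only a stand-in for Hive here. The table contents are invented sample rows, not real measurements, and the helper top_destinations is a hypothetical name introduced for this example.

```python
import sqlite3

# In-memory SQLite stands in for Hive; column names follow the schema above.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE interlanguage_navigation (
        project_family   TEXT,
        previous_project TEXT,
        current_project  TEXT,
        navigation_count INTEGER,
        `date`           TEXT
    )
""")

# Invented sample rows for illustration only; these are not real measurements.
sample_rows = [
    ("wikipedia", "de", "ro", 120, "2017-11-01"),
    ("wikipedia", "de", "en", 900, "2017-11-01"),
    ("wikipedia", "en", "de", 700, "2017-11-01"),
    ("wikipedia", "de", "ro", 80, "2017-11-02"),
    ("wiktionary", "de", "ro", 5, "2017-11-01"),
]
conn.executemany(
    "INSERT INTO interlanguage_navigation VALUES (?, ?, ?, ?, ?)", sample_rows
)


def top_destinations(conn, family, source, day):
    """Return (current_project, total) pairs for one source language and day,
    ordered by total navigation count, largest first."""
    query = """
        SELECT current_project, SUM(navigation_count) AS total
        FROM interlanguage_navigation
        WHERE project_family = ?
          AND previous_project = ?
          AND `date` = ?
        GROUP BY current_project
        ORDER BY total DESC
    """
    return conn.execute(query, (family, source, day)).fetchall()


result = top_destinations(conn, "wikipedia", "de", "2017-11-01")
print(result)
```

The SELECT statement should carry over to Hive with little change, assuming the table is referenced as wmf.interlanguage_navigation and the ? placeholders are replaced with literal values. The backticks around date mirror the schema listing.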

See also