Data Platform/Data Lake/Edits/Mediawiki project namespace map

From Wikitech

The wmf_raw.mediawiki_project_namespace_map table contains project and namespace data for every project referenced in the Wikimedia sitematrix.

It is generated by bin/download-project-namespace-map, whose main steps are to query the SiteMatrix API for the list of projects and then query each project's Siteinfo API. The script runs automatically on the first of each month.
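
The two-step lookup described above can be sketched roughly as follows. This is an illustrative sketch, not the actual bin/download-project-namespace-map script: the function names, the output row layout, the User-Agent string, and the choice of meta.wikimedia.org as the SiteMatrix entry point are all assumptions. It also assumes the default JSON response format (formatversion 1), where a namespace's local name is stored under "*" and content namespaces carry a "content" key.

```python
import json
import urllib.parse
import urllib.request

# Assumed entry point for the SiteMatrix API; the real script may use another host.
SITEMATRIX_API = "https://meta.wikimedia.org/w/api.php"


def api_get(api_url, params):
    """Send a GET request to a MediaWiki Action API endpoint and return parsed JSON."""
    query = urllib.parse.urlencode({**params, "format": "json"})
    request = urllib.request.Request(
        f"{api_url}?{query}",
        headers={"User-Agent": "namespace-map-sketch/0.1 (hypothetical)"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def extract_sites(sitematrix_response):
    """Return (dbname, url) pairs from an action=sitematrix response."""
    sites = []
    for key, group in sitematrix_response["sitematrix"].items():
        if key == "count":
            continue  # total number of sites, not a site group
        # "specials" is a flat list of sites; language groups nest them under "site".
        entries = group if key == "specials" else group.get("site", [])
        for site in entries:
            sites.append((site["dbname"], site["url"]))
    return sites


def extract_namespaces(siteinfo_response):
    """Return (namespace id, local name, is_content) tuples from a siteinfo response."""
    namespaces = siteinfo_response["query"]["namespaces"]
    return [(ns["id"], ns.get("*", ""), "content" in ns) for ns in namespaces.values()]


def build_namespace_map():
    """Query SiteMatrix, then each project's Siteinfo API (hypothetical row layout)."""
    rows = []
    matrix = api_get(SITEMATRIX_API, {"action": "sitematrix"})
    for dbname, url in extract_sites(matrix):
        # Private or closed wikis may reject this request; error handling is omitted.
        info = api_get(
            f"{url}/w/api.php",
            {"action": "query", "meta": "siteinfo", "siprop": "namespaces"},
        )
        for ns_id, name, is_content in extract_namespaces(info):
            rows.append((dbname, ns_id, name, is_content))
    return rows
```

The parsing functions are kept separate from the network calls so they can be checked against saved API responses without hitting the live APIs.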

TODO: Merge this dataset into the canonical_data hive database.

Schema

See the DataHub entry.

Changes and known problems since 2019-08

Date from | Task | Details
2017-?? | ?? | Table is created with first automated snapshots (doc created on 2020-02)