User:Triciaburmeister/Sandbox/Data platform

From Wikitech
This page contains historical information. It may be outdated or unreliable.

Wikimedia's data platform is a collection of systems and services that enable data producers and consumers to collect, discover, and use trustworthy data to derive insights, conduct research, and build new data products. The data platform is maintained by the Data Platform Engineering team. To contact the team, please use its intake process.

Get started

Find datasets and documentation for WMF private data sources.

Use SQL query engines, Jupyter notebooks, libraries, and compute resources to explore and analyze data.

Define and schedule jobs to transform existing data. Share data artifacts, reports, and dashboards.

Add new instrumentation and analytics data sources to the data platform.

Data platform infrastructure

Data systems

Lists of data platform systems and links to their docs are currently at:

Data pipelines