Category:Data pipelines

Documentation of data ingestion and processing pipelines.

  • Includes documentation describing how specific datasets are derived or computed, for example: MediaWiki history computation (ingestion from the database, history rebuilding, computation of metrics, export to other systems, ad-hoc querying).
  • Does not include documentation for the data platform infrastructure or system components that implement a given data pipeline, for example: Airflow, Gobblin.