Category:Data pipelines
Documentation of data ingestion and processing pipelines.
- Includes documentation describing how specific datasets are derived or computed, for example MediaWiki history computation (ingestion from the database, history rebuilding, computation of metrics, extraction to other systems, ad-hoc querying).
- Does not include documentation for the data platform infrastructure or system components that implement a given data pipeline, for example Airflow or Gobblin.
Pages in category "Data pipelines"
The following 9 pages are in this category, out of 9 total.
A
- Analytics/Cluster/Edit data loading
- Analytics/Cluster/Edit history administration
- Analytics/Cluster/Edit serving layer
- Analytics/Cluster/Mediawiki history reduced algorithm
- Analytics/Cluster/Mediawiki History Snapshot Check
- Analytics/Cluster/Page and user history reconstruction
- Analytics/Cluster/Page and user history reconstruction algorithm
- Analytics/Cluster/Revision augmentation and denormalization