Data Engineering/Systems
These subpages explain in technical detail the systems that process data for Analytics at the Wikimedia Foundation. They include information about setup, maintenance, architecture, and more.
Child Pages of Data Engineering/Systems
AQS · Airflow · Analytics Meta · Archiva · Ceph · Conda · DB Replica · Dashiki · DataHub · Dealing with data loss alarms · Druid · Event Data retention · Hadoop Event Ingestion Lifecycle · Java · Jupyter · Kerberos · Maintenance Schedule · Managing systemd timers · Matomo · Reportupdater · Varnishkafka · Wikistats · Wikistats 2
All Subpages of Data Engineering/Systems
- AQS
- AQS/Scaling
- AQS/Scaling/2016/Hardware Refresh
- AQS/Scaling/2017/Cluster Expansion
- AQS/Scaling/2020/Cluster Expansion
- AQS/Scaling/LoadTesting
- Airflow
- Airflow/Airflow testing instance tutorial
- Airflow/Developer guide
- Airflow/Developer guide/Python Job Repos
- Airflow/Instances
- Airflow/Upgrading
- Analytics Meta
- Archiva
- Ceph
- Cluster/Geotagging
- Cluster/Hadoop/Load
- Conda
- DB Replica
- Dashiki
- Dashiki/Configuration
- DataHub
- DataHub/Data Catalog Documentation Guide
- DataHub/Upgrading
- Dealing with data loss alarms
- Druid
- Druid/Alerts
- Druid/Load test
- Event Data retention
- Event Data retention/AppInstallId
- Hadoop Event Ingestion Lifecycle
- Java
- Jupyter
- Jupyter/Administration
- Kerberos
- Kerberos/Administration
- Maintenance Schedule
- Managing systemd timers
- Matomo
- Reportupdater
- Varnishkafka
- Wikistats
- Wikistats/Traffic
- Wikistats 2