SLO/Global Editor Metrics API

From Wikitech
Status: draft

Organizational

Service

Global Editor Metrics API

HTTP API endpoints backed by AQS / Data Gateway.

T403660 Global Editor Metrics for YiR, Apps Activity Tab, and Growth Impact Module

Teams

Team | Components
Data Engineering | Data Lake (upstream data dependencies), Airflow
Data Persistence | Data Gateway (AQS Cassandra cluster and internal API)
Search Platform | Wiki search indices updates
Growth | Client API via GrowthExperiments Extension
Mobile Apps | Client APIs via App Activity Tab on Android and iOS

Architectural

Environmental dependencies

Service dependencies

Hard Dependencies:

  • The mediawiki.page_change.v1 stream is the data source for computing ongoing daily editor metrics.
  • The mediawiki_history dataset is used to backfill metrics and to determine which pages an editor has ever edited, so that daily pageviews to those pages can be calculated.
  • The pageviews dataset is used to calculate pageviews to an editor's edited pages.
  • Airflow schedules the Spark jobs that regularly compute daily metrics into Hive tables and then write them to Cassandra.
  • Data Gateway / AQS / Cassandra serve metric data for real user requests.
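
The "pageviews to an editor's edited pages" metric described above can be sketched as a join between the set of pages each editor has edited and the daily pageview counts. The following is only an in-memory Python illustration of that logic, not the actual Spark job. The function name, the tuple shapes, and the use of a single opaque page key are assumptions made for this example.

```python
from collections import defaultdict


def pageviews_to_edited_pages(edits, daily_pageviews):
    """Sum daily pageviews over the pages each editor has ever edited.

    edits: iterable of (editor, page) pairs (hypothetical shape, e.g.
        derived from an edit history dataset).
    daily_pageviews: iterable of (day, page, views) triples (hypothetical
        shape, e.g. derived from a pageviews dataset).
    Returns a dict mapping (editor, day) to the total views that day.
    """
    # Collect the distinct pages each editor has edited.
    pages_by_editor = defaultdict(set)
    for editor, page in edits:
        pages_by_editor[editor].add(page)

    # Aggregate views per (day, page) in case of duplicate rows.
    views_by_day_page = defaultdict(int)
    for day, page, views in daily_pageviews:
        views_by_day_page[(day, page)] += views

    days = sorted({day for day, _ in views_by_day_page})

    # Join: for each editor and day, add up the views of their edited pages.
    totals = {}
    for editor, pages in pages_by_editor.items():
        for day in days:
            totals[(editor, day)] = sum(
                views_by_day_page.get((day, page), 0) for page in pages
            )
    return totals
```

In a real pipeline the same join would run over distributed tables and would also need to key pages by wiki. Both details are omitted here for brevity.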

Client-facing

Clients

Service Level Indicators (SLIs)

  • TODO: SLI for the Data Gateway / AQS / editor metrics API service.
  • Freshness SLI for Data Pipeline: time from the end of a UTC day until that day's updated daily metrics are available in the serving layer.
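
As a worked illustration of the freshness SLI, the delay for one day of metrics is the gap between the end of that UTC day (midnight UTC starting the following day) and the moment the metrics become available in the serving layer. The sketch below assumes the availability timestamp is already known; how it is observed (for example from a pipeline completion event) is not specified on this page, and the function names are hypothetical.

```python
from datetime import date, datetime, time, timedelta, timezone


def end_of_utc_day(day: date) -> datetime:
    """Return the instant the given UTC day ends (00:00 UTC of the next day)."""
    return datetime.combine(day + timedelta(days=1), time.min, tzinfo=timezone.utc)


def freshness(day: date, available_at: datetime) -> timedelta:
    """Delay between the end of `day` (UTC) and metrics availability.

    `available_at` is an assumed input: the time at which the metrics for
    `day` became readable from the serving layer.
    """
    if available_at.tzinfo is None:
        raise ValueError("available_at must be timezone-aware")
    return available_at.astimezone(timezone.utc) - end_of_utc_day(day)
```

For example, metrics for 2025-01-01 that become available at 01:30 UTC on 2025-01-02 have a freshness delay of 1 hour 30 minutes.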

Operational

Monitoring

How is the service monitored?

Troubleshooting

How complex is the service to troubleshoot?

Deployment

Service Level Objectives

Realistic targets

  • TODO: realistic target for the Data Gateway / AQS / editor metrics API service SLI.
  • Freshness SLI for Data Pipeline: < 1 day

Ideal targets

  • TODO: ideal target for the Data Gateway / AQS / editor metrics API service SLI.
  • Freshness SLI for Data Pipeline: < 2 hours
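
To help compare the two freshness targets (under 1 day realistic, under 2 hours ideal), one option is to measure what fraction of past daily runs would have met each threshold. The sketch below is only an illustration: the helper name and the list of observed delays are hypothetical, and this page does not yet state what percentage of days must meet a target.

```python
from datetime import timedelta

# Thresholds taken from the realistic and ideal freshness targets.
REALISTIC_TARGET = timedelta(days=1)
IDEAL_TARGET = timedelta(hours=2)


def fraction_within(delays, target):
    """Return the fraction of observed freshness delays strictly below target."""
    delays = list(delays)
    if not delays:
        raise ValueError("at least one observed delay is required")
    return sum(1 for delay in delays if delay < target) / len(delays)


# Hypothetical observed delays for four daily runs.
observed = [
    timedelta(hours=1),
    timedelta(hours=3),
    timedelta(hours=26),
    timedelta(minutes=90),
]
realistic_ratio = fraction_within(observed, REALISTIC_TARGET)
ideal_ratio = fraction_within(observed, IDEAL_TARGET)
```

Comparing the two ratios over a representative window of runs is one possible input to the reconciliation step.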

Reconciliation

Reconcile the realistic vs. ideal targets, documenting any decisions made along the way.

Once the SLO is final, consider collapsing the above three sections.

What are the agreed-upon SLOs, for each SLI and each request class?

Each SLO should be defined in Pyrra; include links here.