Wikifeeds

From Wikitech
Wikifeeds simplified request flow

Wikifeeds is an API service served via API Gateway. Its functionality was previously part of Mobile content services (MCS).

Contacts

The service is maintained by the Content Transform Team, who can be reached on Slack in #content-transformers.

Primary contacts:

  • Yiannis Giannelos (nemo-yiannis)
  • Mateus Santos (mateusbs17)

If an emergency deployment or rollback is needed, the serviceops team should be able to handle it (IRC: #wikimedia-operations, #wikimedia-serviceops).

Overview

Repository

Wikifeeds is a Node.js web service supporting the Explore feeds in the official Wikipedia Android and iOS apps.

It creates content that is served via the public REST API feed content endpoints (served by API Gateway):

  • /api/rest_v1/feed/featured/{yyyy}/{mm}/{dd}
  • /api/rest_v1/feed/announcements
  • /api/rest_v1/feed/onthisday/{type}/{mm}/{dd}
  • /api/rest_v1/page/random/{format}

See docs on the English Wikipedia: https://en.wikipedia.org/api/rest_v1/#/Feed and https://en.wikipedia.org/api/rest_v1/#/Page%20content/get_page_random__format_.
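
As a rough illustration of how a client might build these public URLs, the sketch below fills in the path templates listed above. The en.wikipedia.org host is taken from the docs links; the helper names (featured_feed_url, onthisday_url, random_page_url) are hypothetical and not part of any Wikifeeds client library.

```python
from datetime import date

# Assumption: en.wikipedia.org as the host, as in the docs links above.
BASE = "https://en.wikipedia.org/api/rest_v1"


def featured_feed_url(day: date, base: str = BASE) -> str:
    """Fill in /feed/featured/{yyyy}/{mm}/{dd} with zero-padded date parts."""
    return f"{base}/feed/featured/{day:%Y/%m/%d}"


def onthisday_url(kind: str, month: int, day: int, base: str = BASE) -> str:
    """Fill in /feed/onthisday/{type}/{mm}/{dd}."""
    return f"{base}/feed/onthisday/{kind}/{month:02d}/{day:02d}"


def random_page_url(fmt: str, base: str = BASE) -> str:
    """Fill in /page/random/{format}."""
    return f"{base}/page/random/{fmt}"


if __name__ == "__main__":
    # Print one example URL per helper.
    print(featured_feed_url(date(2023, 9, 29)))
    print(onthisday_url("events", 9, 29))
    print(random_page_url("summary"))
```

The helpers only build strings; fetching the URL is left to whatever HTTP client the caller already uses.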

Technology

Wikifeeds was split off from the main mobileapps/Mobile content service and is therefore a Node.js service. It is a service-runner compatible service based on the Node.js service template.

The service is deployed on the WMF services Kubernetes cluster using Helm, so it is packaged as a Docker image. The image is built by the Deployment pipeline.

Deployment

The images used in production can be found on the WMF Docker registry. The deployment pipeline builds new images automatically after code is merged to the master branch.

The production clusters are managed using Kubernetes and Helm, which are also used for a staging instance. The configuration for these can be found in the operations/deployment-charts repo. Details on applying configuration changes to the production clusters can be found in the Kubernetes deployments docs.

Internal endpoints

After each deployment it's useful to test Wikifeeds on its internal endpoints before it serves public traffic. For example, on staging (from the deployment node):

  • GET featured feed responses:
    curl -v https://staging.svc.eqiad.wmnet:4101/en.wikipedia.org/v1/aggregated/featured/2023/09/29
  • GET announcements:
    curl -v https://staging.svc.eqiad.wmnet:4101/en.wikipedia.org/v1/announcements

For production, the hostnames of the internal service endpoints are:

  • https://wikifeeds.svc.codfw.wmnet:4101
  • https://wikifeeds.svc.eqiad.wmnet:4101
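
The staging checks can be repeated against each production host. The sketch below only builds the URLs to pass to curl. It assumes, without confirmation from this page, that the production hosts accept the same /{domain}/v1/... paths as staging; internal_url and smoke_test_urls are hypothetical helper names.

```python
# Internal service hosts listed on this page.
STAGING = "https://staging.svc.eqiad.wmnet:4101"
PRODUCTION = [
    "https://wikifeeds.svc.codfw.wmnet:4101",
    "https://wikifeeds.svc.eqiad.wmnet:4101",
]


def internal_url(host: str, domain: str, path: str) -> str:
    """Join host, wiki domain and a /v1 path such as 'announcements'."""
    return f"{host}/{domain}/v1/{path.lstrip('/')}"


def smoke_test_urls(hosts, domain="en.wikipedia.org", day="2023/09/29"):
    """Return the two URLs from the staging example, once per host."""
    # Assumption: production uses the same paths as the staging example.
    paths = [f"aggregated/featured/{day}", "announcements"]
    return [internal_url(h, domain, p) for h in hosts for p in paths]


if __name__ == "__main__":
    # Print ready-to-run curl commands for both production hosts.
    for url in smoke_test_urls(PRODUCTION):
        print(f"curl -v {url}")
```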

Architecture

In a nutshell, the service gathers information about featured content from the templates used to display it on various Wikipedias, and provides it in a standard, structured format.

The aggregated feed endpoints correspond to the following component endpoints:

  • For /feed/featured/{yyyy}/{mm}/{dd}:
    • /page/featured/{yyyy}/{mm}/{dd}
    • /media/image/featured/{yyyy}/{mm}/{dd}
    • /page/most-read/{yyyy}/{mm}/{dd}
    • /page/news
    • /feed/onthisday/selected/{mm}/{dd}
  • For /feed/onthisday/all/{mm}/{dd}:
    • /feed/onthisday/births/{mm}/{dd}
    • /feed/onthisday/deaths/{mm}/{dd}
    • /feed/onthisday/events/{mm}/{dd}
    • /feed/onthisday/holidays/{mm}/{dd}
    • /feed/onthisday/selected/{mm}/{dd}
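
One way to picture the relationship in the list above is as a mapping from each aggregated endpoint to its component endpoints. This is a sketch of the idea only, not the service's actual code: COMPONENTS, aggregate, and the fetch callback are hypothetical names, and keying the result by component path is a choice made for the example, not the real response schema.

```python
from typing import Callable, Dict, List

# Aggregated endpoint -> component endpoints, as listed above.
COMPONENTS: Dict[str, List[str]] = {
    "/feed/featured/{yyyy}/{mm}/{dd}": [
        "/page/featured/{yyyy}/{mm}/{dd}",
        "/media/image/featured/{yyyy}/{mm}/{dd}",
        "/page/most-read/{yyyy}/{mm}/{dd}",
        "/page/news",
        "/feed/onthisday/selected/{mm}/{dd}",
    ],
    "/feed/onthisday/all/{mm}/{dd}": [
        "/feed/onthisday/births/{mm}/{dd}",
        "/feed/onthisday/deaths/{mm}/{dd}",
        "/feed/onthisday/events/{mm}/{dd}",
        "/feed/onthisday/holidays/{mm}/{dd}",
        "/feed/onthisday/selected/{mm}/{dd}",
    ],
}


def aggregate(template: str, params: Dict[str, str],
              fetch: Callable[[str], object]) -> Dict[str, object]:
    """Fetch every component of `template`, keyed by concrete path.

    `fetch` is a caller-supplied (hypothetical) function that takes a
    concrete path and returns that endpoint's response.
    """
    result: Dict[str, object] = {}
    for component in COMPONENTS[template]:
        # Unused parameters (e.g. yyyy for /page/news) are simply ignored.
        path = component.format(**params)
        result[path] = fetch(path)
    return result
```

For instance, calling aggregate with the featured template and the parameters yyyy=2023, mm=09, dd=29 invokes fetch once for each of the five component paths.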

The service also provides a /feed/announcements endpoint that serves special announcements for things like campaigns and fundraising. The content of these announcements is currently defined entirely within the service code.

The service also provides a /page/random/title endpoint, which API Gateway exposes as /page/random/{format}. {format} can be a value such as title, summary, or html, so a client can get either just the title or the associated page content.

Metrics

The service reports response times via the metrics API provided by service-runner (as is the standard behavior of services based on the Node.js service template).

Service level indicators/objectives (SLIs/SLOs)

SLI: Traffic

  Endpoint                  SLO: Requests per second
  /page/random              25 ≤ RPS ≤ 100
  All others                1 ≤ RPS ≤ 10

SLI: Latency

  Endpoint                  SLO: Max latency (p50)
  /feed/onthisday/events    500 ms
  /media/image/featured     250 ms
  /page/most-read           200 ms
  All others                100 ms

SLI: Server errors (5xx)

  Endpoint                  SLO: Max 5xx error rate
  All endpoints             1/1000

Table patterns from [1].
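
To make the thresholds concrete, the sketch below encodes the tables as data and checks a set of measurements against them. The function and variable names are hypothetical, and matching endpoints by path prefix is an assumption made for the example; this page does not say how the SLOs are evaluated in practice.

```python
# Thresholds copied from the SLO tables above.
RPS_RANGE = {"/page/random": (25, 100)}
DEFAULT_RPS_RANGE = (1, 10)
P50_LATENCY_MS = {
    "/feed/onthisday/events": 500,
    "/media/image/featured": 250,
    "/page/most-read": 200,
}
DEFAULT_P50_LATENCY_MS = 100
MAX_5XX_RATE = 1 / 1000


def lookup(table, endpoint, default):
    """Return the value whose key is a prefix of `endpoint`, else `default`."""
    for prefix, value in table.items():
        if endpoint.startswith(prefix):
            return value
    return default


def slo_violations(endpoint, rps, p50_ms, error_rate):
    """List which SLOs the given measurements violate (empty if none)."""
    violations = []
    low, high = lookup(RPS_RANGE, endpoint, DEFAULT_RPS_RANGE)
    if not low <= rps <= high:
        violations.append("traffic")
    if p50_ms > lookup(P50_LATENCY_MS, endpoint, DEFAULT_P50_LATENCY_MS):
        violations.append("latency")
    if error_rate > MAX_5XX_RATE:
        violations.append("errors")
    return violations


if __name__ == "__main__":
    # Example measurements (made up for illustration).
    print(slo_violations("/page/most-read/2023/09/29", 5, 250, 0.0))
```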

See also

  1. https://landing.google.com/sre/sre-book/chapters/service-level-objectives/