RESTBase

From Wikitech
This page is currently a draft.
More information and discussion about changes to this draft on the talk page.
FIXME: This document needs expansion

RESTBase is an API proxy serving the REST API at /api/rest_v1/. It uses Cassandra as a storage backend.

It is currently running on hosts with the profile::restbase class.

Deployment and config changes

RESTBase is deployed by Scap.

What to check after a deploy

Deploys do not always go according to plan, and regressions are not always obvious. Here is a list of things you should check after each deploy:

Other considerations

Be sure to log all actions ahead of time in #wikimedia-operations. Don't be shy about including details.

Administration

Adding a new RESTBase host

Before following these instructions, ensure you follow the provisioning documentation for a new Cassandra node.

  • Add hosts to the deployment list in the Restbase deploy repo
  • If the restbase service has changed since you applied the correct roles to the host (the latest deployed version should be pulled via Puppet during the first Puppet runs), deploy restbase to the hosts. From deployment.eqiad.wmnet, run cd /srv/deployment/restbase/deploy/, then git pull, and then scap deploy -f -l restbaseNNNN.DC.wmnet "First deploy to restbaseNNNN"
  • Add the hosts to conftool-data
  • If the hosts are healthy in Icinga at this point and you consider it safe to proceed (for example, with respect to deployment timing), pool the hosts:
    • sudo confctl select name=restbaseNNNN.DC.wmnet set/pooled=yes:weight=10
  • Verify that the hosts have been added and are healthy via the pybal API
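The deploy and pooling commands in the steps above can be generated per host with a short helper. This is a minimal sketch and not part of the RESTBase tooling: the function names are hypothetical, the host name is the placeholder used above, and the helper only builds the command strings without running them.

```python
import shlex


def deploy_command(host):
    """Build the first-deploy scap command for one host (string only, not executed)."""
    # Hypothetical helper: derives the short name (restbaseNNNN) from the FQDN.
    short_name = host.split(".")[0]
    return shlex.join(
        ["scap", "deploy", "-f", "-l", host, f"First deploy to {short_name}"]
    )


def pool_command(host, weight=10):
    """Build the confctl command that pools one host (string only, not executed)."""
    return shlex.join(
        ["sudo", "confctl", "select", f"name={host}", f"set/pooled=yes:weight={weight}"]
    )


if __name__ == "__main__":
    # Placeholder host name; replace NNNN and DC with real values.
    for host in ["restbaseNNNN.DC.wmnet"]:
        print(deploy_command(host))
        print(pool_command(host))
```

Printing the commands rather than executing them keeps a human in the loop, which fits the advice to check Icinga and deployment timing before pooling.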

Renewing expired certificates

Every now and again, Cassandra certificates will come close to expiry (for example: SSL WARNING - Certificate restbase2016-a valid until 2020-11-29 09:26:14 +0000 (expires in 53 days)). Certificates need to be deleted and recreated in the Puppet secrets directory; see the Cassandra documentation for details.
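To see how much time is left from a warning like the one above, the timestamp can be parsed directly. The following is a minimal sketch that assumes the message always contains a "valid until" timestamp in the format shown; the function name and the regular expression are illustrative and not part of any monitoring tooling.

```python
import re
from datetime import datetime, timezone

# Matches the "valid until <timestamp>" part of the warning text shown above.
VALID_UNTIL = re.compile(
    r"valid until (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} [+-]\d{4})"
)


def days_until_expiry(message, now=None):
    """Return the number of whole days between `now` and the certificate expiry."""
    match = VALID_UNTIL.search(message)
    if match is None:
        raise ValueError("no 'valid until' timestamp found in message")
    expires = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S %z")
    # Default to the current time in UTC when no reference time is given.
    if now is None:
        now = datetime.now(timezone.utc)
    return (expires - now).days
```

Passing an explicit `now` makes the calculation reproducible, for example when checking an old alert.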

Monitoring

instance-data

In production, the instance-data path is usually a RAID array. It is used for hints, commitlogs and caches, all of which are vital to the stable operation of the Cassandra instances. Under unusual circumstances (a large rebalancing, an instance behaving erroneously, etc.) this mount can fill up quickly, and some free space is sometimes required to back out of this condition. For this reason, we set a lower threshold for disk free on this path than for other disks.
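A free-space check of this kind can be sketched with the Python standard library. This is only an illustration of the idea, not the production alert: the function names are hypothetical, and both the path and the threshold are supplied by the caller because this page does not state the actual values.

```python
import shutil


def free_fraction(path):
    """Return the fraction (0.0 to 1.0) of the filesystem at `path` that is free."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Guard against pseudo-filesystems that report a size of zero.
        return 0.0
    return usage.free / usage.total


def is_low_on_space(path, min_free_fraction):
    """Return True when free space at `path` is below `min_free_fraction`.

    `min_free_fraction` is an assumed, caller-chosen threshold; the real
    threshold for the instance-data mount is not given in this document.
    """
    return free_fraction(path) < min_free_fraction
```

For example, `is_low_on_space(instance_data_path, 0.2)` would flag the mount once less than 20% is free, where both arguments are placeholders to be replaced with the real mount point and threshold.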

Debugging

To temporarily switch to local logging for debugging, you can change the config.yaml log stanza like this:

logging:
  name: restbase
  streams:
    # level can be trace, debug, info, warn, error
    - level: info
      path: /tmp/debug.log

Alternatively, you can log to stdout by commenting out the streams sub-object. This is useful for debugging startup failures, for example by starting the server manually:

cd /srv/deployment/restbase/deploy/
sudo -u restbase node restbase/server.js -c /etc/restbase/config.yaml -n 0

The -n 0 parameter avoids forking off any workers, which reduces log noise. Instead, a single worker is started up right in the master process.

Analytics and metrics

Hive query comparing Action API and REST API request counts:

use wmf;

SELECT
  SUM(IF (uri_path LIKE '/api/rest_v1/%', 1, 0)) as count_rest,
  SUM(IF (uri_path LIKE '/w/api.php%', 1, 0)) as count_action
FROM wmf.webrequest
WHERE webrequest_source = 'text'
  AND year = 2017
  AND month = 9
  AND (uri_path LIKE '/api/rest_v1/%' OR uri_path LIKE '/w/api.php%');

Notes on purging

RESTBase uses cache-control headers to handle caching during the request/response cycle. To purge a URL, try one of the following:

  • Run a purge on wiki
    • Changeprop will eventually pregenerate the content on RESTBase
  • Remove the stored entry from the Cassandra table
  • Run a manual HTTP request to purge

To purge the Cassandra content for a given URL, e.g. /en.wikipedia.org/v1/page/mobile-html/Dog, run the following request:

curl restbase.svc.codfw.wmnet:7233/en.wikipedia.org/v1/page/mobile-html/Dog -H "cache-control: no-cache"

Script

import mpire
import pandas
import requests

# Local RESTBase endpoint that receives the purge requests.
BASE_PATH = "http://127.0.0.1:7231"
# Number of parallel worker processes.
WORKER_POOL_SIZE = 6


def read_input(path, column):
    """Read the CSV at `path` and return the unique values of `column` as a "path" column."""
    df = pandas.read_csv(path)
    df = df[[column]]
    df = df.drop_duplicates(subset=[column])
    return df.rename(columns={column: "path"})


def purge_url(url):
    """Request `url` with a no-cache header to purge the stored content."""
    rest_url = f"{BASE_PATH}{url}"
    return requests.get(rest_url, headers={"cache-control": "no-cache"})


def purge_urls(urls):
    """Purge all `urls` in parallel and return the list of responses."""
    with mpire.WorkerPool(n_jobs=WORKER_POOL_SIZE) as pool:
        return pool.map(purge_url, urls, progress_bar=True)


if __name__ == "__main__":
    df = read_input("<CSV_PATH>", "<COLUMN>")
    purge_urls(df["path"].tolist())