AQS 2.0


Analytics Query Service (AQS) is the software behind the /metrics family of endpoints in RESTBase. It is a read-only HTTP proxy to results served from Cassandra and Druid. It is currently based on a very outdated fork of RESTBase and has received few updates over the years. As part of the goal to sunset RESTBase, AQS 2.0 is a project to migrate the /metrics endpoints to a set of services exposed via the API Gateway.

Services

Device Analytics
Repository: generated-data-platform/aqs/device-analytics
API specification: swagger.json
Edit Analytics
Repository: generated-data-platform/aqs/edit-analytics
API specification:
Editor Analytics
Repository: generated-data-platform/aqs/editor-analytics
API specification:
Geo Analytics
Repository: generated-data-platform/aqs/geo-analytics
API specification:
Media Analytics
Repository: generated-data-platform/aqs/media-analytics
API specification:
Page Analytics
Repository: generated-data-platform/aqs/page-analytics
API specification: swagger.json
The AQS 2.0 repositories on GitLab are deprecated.

Project overview

Epic task in Phabricator | API Platform Team workboard

  1. Implement the new, stand-alone AQS service(s) (in progress)
  2. Deploy to k8s
  3. Expose the /metrics hierarchy from the new service(s) using the API Gateway
  4. Switch RESTBase from proxying requests to the old AQS service to proxying them to the new k8s-based one
  5. Deprecate the http://{project}/api/rest_v1/metrics resources
  6. Eventually phase out the RESTBase /metrics hierarchy

Proposal

From phab:T263489

We propose to break down the rewrite along dataset boundaries — similar to the module structure in RESTBase — with a separate project used to implement each.

  • Device Analytics
  • Page Analytics
  • Media Analytics
  • Geo Analytics
  • Edit Analytics
  • Editor Analytics

The resulting services will be proxied by RESTBase and/or the API Gateway (the former to eventually be deprecated in favor of the latter) in order to maintain complete compatibility with the existing API.

The target language for these implementations is Go. While a complete comparison of JavaScript/Node.js and Go is out of scope for this issue, the (simplified) rationale is:

  • Strong, static typing; Statically typed languages eliminate entire classes of bugs common to dynamic languages, improve security, and make code easier to reason about
  • Ease of use; Go is more obvious, more explicit, and easier to understand. Complicated concepts like concurrency are easier to get right
  • Performance; Service latency can be expected to be both lower and, more importantly, more predictable with Go

Developer guide

Getting started

AQS 2.0 consists of several repositories. Some correspond to individual services that expose APIs. Others correspond to cross-service common functionality or test environments. These repositories are mostly stored in WMF's GitLab, but speculative/formative repositories may be stored elsewhere for now.

You will need:

Go (aka "golang") is an opinionated language in various ways. Among these is that you're probably much better off keeping your Go code under your "GOPATH" rather than wherever you may be used to keeping code. (There are, of course, always ways for savvy developers to cheat the system. If you choose to do that, any consequences are on you.) On my Mac, I cloned all the AQS 2.0 repositories under ~/go/src/.

The current list of repositories is:

Cassandra-backed services:

  • Page Analytics service
  • Device Analytics service
  • Media Analytics service
  • Geo Analytics service

Druid-backed services:

  • Edit Analytics
  • Editor Analytics

Common functionality:

Test environments:

It is possible that one or more additional services may be required for new production endpoints that are being discussed, but for which we do not yet have details (or data to serve).

Running a service

The various service README files contain details about running each particular service. In summary, you'll need to open several command line (aka "terminal") windows/tabs and run commands in each. The following describes how to run the "pageviews" service. Other services operate similarly.

  • In one terminal, navigate to <GOPATH>/aqs-docker-test-env
  • Run "make startup", wait for it to say "Startup complete", then leave it running
  • In another terminal, also in <GOPATH>/aqs-docker-test-env, run "make bootstrap" and wait for it to complete
  • Navigate (either in that terminal or a different one) to <GOPATH>/pageviews
  • Run "make"
  • Run "./pageviews" (and leave it running)
  • In another terminal, navigate to <GOPATH>/pageviews and run "make test"
  • In your browser, visit http://localhost:8080/metrics/pageviews/per-article/en.wikipedia/all-access/all-agents/Banana/daily/20190101/20190102
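As an alternative to the browser step, the same request can be made from a small Go program. This is a minimal sketch, assuming the service is listening on localhost:8080 as in the steps above. The function name perArticleURL and its parameter names are illustrative and do not come from the pageviews repository.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
)

// perArticleURL builds a per-article pageviews URL with the same path layout
// as the example URL in the steps above. The function and parameter names are
// hypothetical; only the path segments are taken from that example.
func perArticleURL(base, project, access, agent, article, granularity, start, end string) string {
	return fmt.Sprintf("%s/metrics/pageviews/per-article/%s/%s/%s/%s/%s/%s/%s",
		base, project, access, agent, url.PathEscape(article), granularity, start, end)
}

func main() {
	u := perArticleURL("http://localhost:8080", "en.wikipedia", "all-access",
		"all-agents", "Banana", "daily", "20190101", "20190102")
	fmt.Println("GET", u)

	resp, err := http.Get(u)
	if err != nil {
		// Most likely the local service is not running yet.
		fmt.Println("request failed (is ./pageviews running?):", err)
		return
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Println("could not read response:", err)
		return
	}
	fmt.Println(resp.Status)
	fmt.Println(string(body))
}
```

If the test environment and the service are both running, this should print the HTTP status followed by the response body. Otherwise it reports the connection error and exits.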

We haven't started the Druid-based endpoint(s) yet, but the process will likely be similar, with perhaps some differences in how to launch the test environment.

Tips and troubleshooting

Because Go is an opinionated language, it may refuse to run over seemingly small things, such as whitespace. If you see something like this:

   goimports: format errors detected

You can execute this to see what Go is unhappy about:

   goimports -d *.go

And this to automatically fix it:

   goimports -w *.go

Our services depend on several packages, including our own “aqsassist”, which is in active development. This means you may sometimes need to update dependencies for your local service to run. You can update all dependencies via:

   go get .

or update specific dependencies via something like:

   go get gitlab.wikimedia.org/frankie/aqsassist

API documentation

To create API reference docs that are reliable and easy to update, AQS 2.0 services use swag to generate an OpenAPI specification based on a mix of code annotations and the code itself. For more information about swag, see the API documentation page on mediawiki.org.

Updating the docs

Install swag:

go install github.com/swaggo/swag/cmd/swag@latest

Generate the spec:

swag init --markdownFiles .

Swag outputs the spec in YAML and JSON formats to a /docs directory.

Reading the specification

You can view the spec using the API spec reader.

Integrating with swag

To set up API docs with swag:

  1. Annotate main.go (example): Anywhere in main.go, add annotations to document general information about the API.
  2. Create api.md (example): Write the API description in a separate markdown file indicated in the @description.markdown annotation in main.go.
  3. Annotate handler (example): Add annotations to any code file to document an endpoint. Endpoint annotations should be stored as close as possible to the code they describe. The block of endpoint annotations must end on a line immediately preceding a function.
  4. Annotate entity (example): Swag automatically gets information about the response format from the struct. To complete the schema in the docs, add these elements to the struct definition:
    1. an example value within the JSON encoding definition using the syntax example:"example value"
    2. a description of the attribute as an inline comment
  5. Generate the spec: Run swag to generate the spec.

Here's an example of a patch adding swag docs to a service.
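The steps above can be sketched in a single file to show where each kind of annotation goes. This is a hypothetical, self-contained example and is not copied from an AQS 2.0 service: the API title, route, ExampleResponse struct, and handler are made up for illustration. In a real service the general annotations live in main.go and the endpoint and entity annotations live next to the code they describe.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// ExampleResponse is a hypothetical entity (step 4). Each field carries an
// example value in its tag and a description as an inline comment.
type ExampleResponse struct {
	Project string `json:"project" example:"en.wikipedia"` // Wikimedia project domain
	Views   int    `json:"views" example:"1234"`           // Number of views in the period
}

// buildResponse returns fixed placeholder data so the sketch runs without a
// database.
func buildResponse(project string) ExampleResponse {
	return ExampleResponse{Project: project, Views: 1234}
}

// Endpoint annotations (step 3). The block ends on the line immediately
// preceding the function.
//
// @Summary Get example view counts for a project
// @Produce json
// @Param project path string true "Project domain, for example en.wikipedia"
// @Success 200 {object} ExampleResponse
// @Router /example/{project} [get]
func exampleHandler(w http.ResponseWriter, r *http.Request) {
	project := strings.TrimPrefix(r.URL.Path, "/metrics/example/")
	w.Header().Set("Content-Type", "application/json")
	if err := json.NewEncoder(w).Encode(buildResponse(project)); err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
	}
}

// General API annotations (step 1). The description itself is read from
// api.md (step 2).
//
// @title Example Analytics API
// @version 1
// @description.markdown
// @BasePath /metrics
func main() {
	// Call the handler in-process instead of starting a server, so the
	// example terminates on its own.
	req := httptest.NewRequest(http.MethodGet, "/metrics/example/en.wikipedia", nil)
	rec := httptest.NewRecorder()
	exampleHandler(rec, req)
	fmt.Print(rec.Body.String())
}
```

With annotations like these in place, running swag as described under "Updating the docs" would be expected to produce a spec containing one GET endpoint whose response schema includes the example values and field descriptions.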

See also