Ratelimit

ratelimit is a service running on the wikikube clusters. It consists of the envoy ratelimit service connected to redis misc via nutcracker.

It can be used to rate limit internal requests within our service mesh on an opt-in basis. Limits are enforced per datacenter (i.e. the counters are shared within one DC, not across all DCs).

ratelimit can be configured with one or more rate limit domains. Each domain can have one or more descriptors, which define the actual rules to apply. See https://github.com/envoyproxy/ratelimit/tree/main#configuration for full documentation.

The clients (envoy service mesh members calling another service) send the domain, together with additional identifiers (like the user-agent or client IP), to the ratelimit service via gRPC. The ratelimit service then decides whether to allow or deny the request.
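
As an illustration of what such a check carries, the following Python sketch builds a request body with a domain and one descriptor. It uses the JSON shape accepted by the service's HTTP test endpoint, not the gRPC encoding that envoy sends, and the helper name build_ratelimit_request is made up for this example.

```python
import json


def build_ratelimit_request(domain, entries):
    """Build the JSON body for a single rate limit check.

    `entries` is a list of (key, value) pairs that together form one descriptor.
    This helper is hypothetical and only mirrors the JSON test endpoint format.
    """
    return {
        "domain": domain,
        "descriptors": [
            {"entries": [{"key": key, "value": value} for key, value in entries]}
        ],
    }


payload = build_ratelimit_request("mw-api-int", [("user-agent", "curl test")])
print(json.dumps(payload))
```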

Components

Configuration

As of now there is only one rate limit domain (mw-api-int) configured, which grants 1000 requests per second per user-agent. Keys other than user-agent are currently not supported (these would have to be implemented in the mesh.configuration helm chart module).
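
To illustrate what a limit of 1000 requests per second per user-agent means, here is a simplified in-memory model in Python. It is only a sketch: it assumes fixed-window counting, the class name FixedWindowLimiter and its parameters are made up for this example, and the real service keeps its counters in redis rather than in process memory.

```python
from collections import defaultdict


class FixedWindowLimiter:
    """Toy model: one counter per (domain, user-agent, time window)."""

    def __init__(self, requests_per_unit=1000, unit_seconds=1):
        self.limit = requests_per_unit
        self.unit = unit_seconds
        self.counters = defaultdict(int)

    def check(self, domain, user_agent, now):
        """Count one request at time `now` (seconds) and return True if allowed."""
        window = int(now // self.unit)
        key = (domain, user_agent, window)
        self.counters[key] += 1
        return self.counters[key] <= self.limit


# Small limit so the effect is visible; the configured domain uses 1000.
limiter = FixedWindowLimiter(requests_per_unit=3)
results = [limiter.check("mw-api-int", "curl test", now=10.0) for _ in range(4)]
print(results)
```

Each user-agent gets its own counter, and the counter starts fresh in the next window, so a client that was denied in one second is allowed again in the following one.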

The ratelimit service is configured via its values.yaml files. Each configured domain ends up as a separate file in the container. For complete documentation of the configuration syntax, please see https://github.com/envoyproxy/ratelimit/tree/main#configuration

On the caller side (envoy), the rate limit service is configured as a regular cluster called ratelimit. An envoy.filters.http.ratelimit configuration defines the domain and several other settings, such as the timeout (20ms by default; it can be configured on the caller side via mesh.ratelimit.timeout: 0.02s). If the ratelimit service does not answer within the timeout, envoy considers the rate limit check to have failed and allows the request, as we fail open by default. The ratelimit filter is glued into the request flow via a ratelimit action in the envoy.filters.http.router config.
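
The fail-open behaviour can be sketched in Python as follows. This is not how envoy implements it; allow_request and the check callable are hypothetical names used only to show that a timeout or an error leads to the request being allowed.

```python
import concurrent.futures
import time


def allow_request(check, timeout_s=0.02):
    """Return True if the request may proceed.

    `check` is a hypothetical callable returning True (allow) or False (deny).
    A timeout or any error counts as "allow", i.e. the caller fails open.
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(check)
    try:
        return bool(future.result(timeout=timeout_s))
    except Exception:
        # Covers the timeout as well as errors raised by the check itself.
        return True
    finally:
        # Do not block the caller on a slow check.
        pool.shutdown(wait=False)


def slow_deny():
    """Simulates a ratelimit service that answers too late with a denial."""
    time.sleep(0.2)
    return False


print(allow_request(slow_deny))
```

Failing open keeps callers working when the ratelimit service is slow or unavailable, at the cost of not enforcing limits during that time.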

Enable/opt in to rate limiting

Services can opt in to being rate limited (since mesh.configuration_1.8.0) by adding a map, keyed by the service listener name, to the discovery.ratelimit_listeners structure. By default (i.e. if the map for that listener is empty), the rate limit domain is set to the name of the service listener. This can be overridden via the domain key.

Example opting in to rate limit the mw-api-int-async-ro listener via the mw-api-int domain:

discovery:
  listeners:
    - mw-api-int-async-ro
  ratelimit_listeners:
    mw-api-int-async-ro:
      domain: mw-api-int
      # by: user-agent # This is the default and currently the only key rate limiting is supported on
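
The defaulting rule for the domain can be sketched in Python like this. The function resolve_domains is hypothetical and not part of the helm chart; it only mirrors the rule that a listener without a domain key falls back to its own name. The listener name some-other-listener is made up for the example.

```python
def resolve_domains(discovery):
    """Map each opted-in listener name to its effective rate limit domain."""
    listeners = discovery.get("ratelimit_listeners") or {}
    return {
        # Fall back to the listener name when no "domain" key is set.
        name: (settings or {}).get("domain", name)
        for name, settings in listeners.items()
    }


discovery = {
    "listeners": ["mw-api-int-async-ro", "some-other-listener"],
    "ratelimit_listeners": {
        "mw-api-int-async-ro": {"domain": "mw-api-int"},
        "some-other-listener": {},  # hypothetical listener using the default
    },
}
print(resolve_domains(discovery))
```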

Test changes

The ratelimit service can be called via HTTP (in addition to gRPC), which allows for easy testing of config changes. The following example requests a rate limit decision in the mw-api-int domain for the user-agent "curl test":

curl -XPOST -H "Content-Type: application/json" -w '\n' \
  -d '{"domain":"mw-api-int","descriptors":[{"entries":[{"key":"user-agent","value":"curl test"}]}]}' \
  POD_IP:8080/json

It should return something like:

{"overallCode":"OK", "statuses":[{"code":"OK", "currentLimit":{"requestsPerUnit":1000, "unit":"SECOND"}, "limitRemaining":999, "durationUntilReset":"1s"}]}
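
When testing from a script, the response can be evaluated with a few lines of Python. The sketch below parses the sample response shown above; the function name summarize is made up, and it assumes that any overallCode other than OK (for example OVER_LIMIT) means the request would be denied.

```python
import json


def summarize(response_text):
    """Return (allowed, remaining) from a JSON response of the ratelimit service."""
    body = json.loads(response_text)
    allowed = body.get("overallCode") == "OK"
    statuses = body.get("statuses") or []
    remaining = [status.get("limitRemaining") for status in statuses]
    return allowed, remaining


sample = (
    '{"overallCode":"OK", "statuses":[{"code":"OK", '
    '"currentLimit":{"requestsPerUnit":1000, "unit":"SECOND"}, '
    '"limitRemaining":999, "durationUntilReset":"1s"}]}'
)
print(summarize(sample))
```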


See also