REST Gateway/Rate limiting
REST Gateway limits are being deployed at the beginning of 2026 (see the timeline) to ensure fair and sustainable access to WMF resources. In particular, the intent is to reduce the amount of unauthenticated non-browser traffic (about 33% at the end of 2025) and to prevent high-volume commercial reusers from putting undue load on the infrastructure.
Ideally, bots used by the Wikimedia communities will be unaffected by this change. They may even benefit from the fact that fewer resources are being used by commercial consumers. It is however possible that bots which operate at a very high rate may get rate limited.
The gateway limits primarily aim at reducing sustained load, and generally allow for short bursts of high activity, as long as large numbers of parallel requests are avoided.
Note: this page is referenced in source code. If it is renamed or restructured these references need to be updated.
Architecture
Rate limiting is implemented using an envoyproxy/ratelimit instance that runs as a side-car of Envoy and uses Redis for storing rate limit counters. Access to Redis is sharded using nutcracker to provide horizontal scalability.
Concepts:
- Each route is associated with a set of rate limit policies. Typically, there is only one active policy, and possibly a second policy in "shadow mode", to collect data on the effect it would have when activated.
- Each request is assigned a rate limit class; see Ratelimit Classes for details.
- For each request, we determine the rate limit key (e.g. the user ID or IP address).
- Each rate limit policy defines limits for each rate limit class. There may be one limit per unit of time (per second, minute, or hour).
- Separate rate limit counters are maintained per key and unit of time. Occasional loss of counters (e.g. during deployment) is acceptable since it only leads to a temporary doubling of the accepted access rate.
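The counter model described above can be sketched as follows. This is a minimal illustration, assuming fixed time windows and an in-memory dictionary standing in for Redis. The limits in `POLICY`, the `RateLimiter` class, and the key format are hypothetical and are not taken from the production configuration.

```python
from collections import defaultdict

# Seconds per supported unit of time.
UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600}

# Hypothetical policy: limits per rate limit class and unit of time.
POLICY = {
    "anon": {"second": 2, "minute": 5},
    "authed-user": {"second": 10, "minute": 100},
}


class RateLimiter:
    def __init__(self, policy):
        self.policy = policy
        # One counter per (key, unit, window index); stands in for Redis.
        self.counters = defaultdict(int)

    def allow(self, rl_class, key, now):
        """Return True if the request fits within every limit of its class."""
        limits = self.policy[rl_class]
        slots = [(key, unit, int(now // UNIT_SECONDS[unit])) for unit in limits]
        # Simplification: check all units first so that a rejected
        # request does not consume quota.
        for slot in slots:
            if self.counters[slot] >= limits[slot[1]]:
                return False
        for slot in slots:
            self.counters[slot] += 1
        return True
```

In this sketch, clearing `counters` in the middle of a window lets a client that had already exhausted its limit send up to one more full quota in that window, which corresponds to the temporary doubling mentioned above.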
Ratelimit Classes
The rate limit for each client is determined by that client’s rate limit class. Below is an overview of the classes and their meaning. The specific limits for each class will be determined based on experimentation and observation at the beginning of 2026, see the timeline on mediawiki.org and the tracking ticket on Phabricator.
Request classification relies heavily on the headers provided by the CDN backend API, namely x-client-ip, x-trusted-request, x-is-browser, and x-provenance.
| Group | Class | Description |
|---|---|---|
| Unidentified | * | Matches any class name not specified in the policy definition. Such classes still show up separately in the ratelimiter metrics. Policies typically assign the same limits to "*" as to "anon". The x-client-ip header is used to construct the rate limit counter key. |
| Unidentified | anon | Requests that have no identifying characteristics other than their IP address. Includes bots that do not authenticate and do not comply with the User-Agent policy. The x-client-ip header is used to construct the rate limit counter key. |
| Unidentified | anon-cgnat | Anonymous client from a network that uses a CG-NAT, so many clients share an IP address. Assigned by matching the x-client-ip header against the anon_class_by_address setting. |
| User-Agent only | unauthed-bot | Unauthenticated requests that provide a User-Agent header that is compliant with the User-Agent policy. The x-ua-contact key is used to construct the rate limit counter key, but that may have to change if this mechanism is abused. |
| User-Agent only | unauthed-mediawiki | Unauthenticated requests coming from a third-party MediaWiki installation, according to the User-Agent header. Includes requests from InstantCommons (ForeignApiRepo) and QuickInstantCommons. |
| Authenticated(*) | authed-user | Request authenticated with a valid JWT, but with no other class specified in the rlc claim. |
| Authenticated(*) | established-user(**) | Established wiki user. MediaWiki assigns this class automatically based on conditions such as global edit count and account age. |
| Authenticated(*) | highlimits-user(**) | High limits user. MediaWiki assigns this class based on global group membership (e.g. for Stewards). |
| Authenticated(*) | approved-bot(**) | Bot approved by the community. MediaWiki assigns this class based on global group membership (specifically global-bot and local-bot). |
| Internal (WMCS) | known-network | Requests from WMCS or another WMF network (x-trusted-request is A). The User-Agent header will be used to construct the rate limit counter key. |
| Known client | known-client | Requests from a client that is well-known to the WMF (x-trusted-request is B). The x-provenance header will be used to construct the rate limit counter key. |
| Pseudo classes | BYPASS | Pseudo-class that can be used to bypass rate limiting entirely. Requests with this class will not even show up in the ratelimit statistics. |
| Pseudo classes | DENY | Pseudo-class that can be used to deny access. Always has a limit of 0. Useful for testing and incident response. |
(*) Authenticated means that the request has a JWT that can be validated by the gateway. The JWT can be supplied as a bearer token, a centralauth token, or in the sessionJwt cookie. We may add support for additional ways to supply the JWT, but the basic mechanism is always the same. Note that expired JWTs from cookies and centralauth tokens are ignored, while expired bearer tokens will cause the request to be rejected with status 401.
(**) These classes are not determined by the REST gateway but by MediaWiki. They are passed to the gateway in the rlc ("rate limit class") claim of the JWT. This is implemented in the WikimediaCustomizations extension, by the RateLimitHookHandler class.
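The JWT rules in the two footnotes above can be summarized in a short sketch. It assumes the token has already been located and its signature validated. The function name, the `source` labels, and the return shape are hypothetical and only illustrate the stated rules, not the actual gateway code.

```python
def classify_jwt(source, expired, claims):
    """Return (reject_status, rate_limit_class) for a request with a JWT.

    source  -- "bearer", "centralauth", or "cookie" (hypothetical labels)
    expired -- whether the token's expiry time has passed
    claims  -- decoded JWT claims as a dict

    reject_status is None unless the request is rejected outright.
    A class of None means the request is treated as unauthenticated.
    """
    if expired:
        if source == "bearer":
            # Expired bearer tokens cause the request to be rejected.
            return 401, None
        # Expired cookie and centralauth tokens are simply ignored.
        return None, None
    # MediaWiki may pass a class in the "rlc" claim; without one the
    # request falls back to the generic authenticated class.
    return None, claims.get("rlc") or "authed-user"
```

For example, `classify_jwt("centralauth", False, {})` yields the `authed-user` class, while an expired bearer token yields a 401 rejection.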
Configuration
Configuration in the service values.yaml file:
- main_app.ratelimiter.policies: a set of possible policies, each defining rate limits for each client class.
- main_app.ratelimiter.anon_class_by_address: override the anon class for certain address ranges, used when no other classification applies. This can be used to apply alternative rate limits for clients behind a CGNAT.
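How an address-based override such as anon_class_by_address could be evaluated is sketched below. The mapping format and the example ranges are assumptions made for illustration (they are shared-address-space and documentation ranges, not real configuration). The actual schema is whatever values.yaml defines.

```python
import ipaddress

# Hypothetical override table: network range -> class name.
ANON_CLASS_BY_ADDRESS = {
    "100.64.0.0/10": "anon-cgnat",       # RFC 6598 shared address space
    "2001:db8:1234::/48": "anon-cgnat",  # IPv6 documentation range
}


def anon_class_for(client_ip, overrides=ANON_CLASS_BY_ADDRESS):
    """Return the override class for client_ip, or "anon" if none matches."""
    addr = ipaddress.ip_address(client_ip)
    for cidr, rl_class in overrides.items():
        network = ipaddress.ip_network(cidr)
        # Only compare addresses of the same IP version.
        if addr.version == network.version and addr in network:
            return rl_class
    return "anon"
```

A client whose x-client-ip value falls into one of the listed ranges would get the override class, and every other anonymous client would keep the default anon class.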
Internal Requests
API gateway rate limits should not apply to internal requests, that is, requests from services (including MediaWiki) to other services (again including MediaWiki) running inside the WMF network. There are three cases to be distinguished:
- Requests made via the service ports. These should go through the service mesh and bypass the gateway entirely. This is the preferred way for services to interact.
- Requests made directly to the gateway, not via the CDN layer. These should be rare except for k8s health checks. These requests have no x-client-ip header set, which causes them to be exempt from rate limiting.
- Internal requests accidentally made via the CDN (using a public URL). These would have the x-client-ip header set and would be subject to rate limiting. They would however have x-trusted-request set to A, which would result in permissive limits.
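The two cases that actually reach the gateway can be illustrated with a small header check. This is a sketch under assumptions: the function name and return values are hypothetical, and the header semantics are taken only from the description on this page.

```python
def internal_request_handling(headers):
    """Return (treatment, counter_key) for a request reaching the gateway.

    headers -- dict mapping lower-cased request header names to values
    """
    if "x-client-ip" not in headers:
        # Direct request to the gateway (not via the CDN), such as a
        # k8s health check: exempt from rate limiting.
        return ("exempt", None)
    if headers.get("x-trusted-request") == "A":
        # Internal request that went through the CDN: permissive
        # known-network limits, keyed by the User-Agent header.
        return ("known-network", headers.get("user-agent", ""))
    # Anything else is classified like an external request.
    return ("external", None)
```

Requests made via the service ports never reach this check, since they bypass the gateway entirely.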
There are currently more internal requests going through the gateway than expected/desired, see phab:T410198.
Configuration and Implementation
Source of truth for the implementation of the above table in WMF production:
- restgw_ratelimits.lua determines the class and key for a given request.
- rest-gateway/values.yaml contains the definition of the rate limit policies, assigning limits per class and time unit.