X-Provenance
X-Provenance header
The X-Provenance HTTP header is used within the Wikimedia CDN and request classification systems to signal the origin or trust level of a request. It provides early, lightweight identification of known traffic sources, helping optimize filtering and rate-limiting decisions.
Purpose
This header is meant to:
- Tag traffic based on its origin before deeper inspection (e.g. session token validation or UA classification)
- Enable fast-path handling (e.g. skip filtering, assign different rate limits)
- Allow Requestctl, HAProxy and Varnish logic to apply differentiated rules based on known provenance
Syntax
The header follows the form:
X-Provenance: label1=value1;labelN=valueN
Each label identifies one aspect of the request's provenance. Examples:
- net: flags internal requests or requests coming from trusted network ranges
- abuser: request coming from a known abuser
- client: request coming from a known client ipblock
- cloud: request coming from a known cloud provider
- isp: ISP data provided by the MaxMind ISP database
- net=unknown: default fallback value
- datacenter=true: indicates the request is coming from a datacenter rather than from an eyeball (end-user) network. The data currently comes from the Spur datacenter feed
- id: request coming from a verified client, for which we have both a matching user agent and a matching provenance expression. For instance, a request with user-agent "Googlebot" coming from the IP ranges of Googlebot.
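As an illustration of the syntax above, a consumer could split the header value into label/value pairs. This is a minimal Python sketch, not the production logic (which lives in HAProxy, Varnish and Requestctl). The function name parse_x_provenance is hypothetical, and the handling of bare labels and repeated labels is an assumption, since the documented form only covers label=value pairs.

```python
def parse_x_provenance(header_value: str) -> dict[str, str]:
    """Parse 'label1=value1;labelN=valueN' into a dict of label -> value."""
    labels: dict[str, str] = {}
    for part in header_value.split(";"):
        part = part.strip()
        if not part:
            # Tolerate empty segments such as a trailing ';'.
            continue
        label, sep, value = part.partition("=")
        if not sep:
            # Assumption: a bare label without '=' is outside the documented
            # form, so it is skipped rather than guessed at.
            continue
        # Assumption: if a label repeats, the last occurrence wins.
        labels[label.strip()] = value.strip()
    return labels


parsed = parse_x_provenance("net=unknown;datacenter=true")
print(parsed)
```

With the example value shown, the result maps net to "unknown" and datacenter to "true".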
Use Cases
- Applied by CDN edge terminators (HAProxy layer)
- Enables bypassing generic rate limits or Requestctl rules for trusted sources
- Can be used as an input to moat-mode rules or future trust scoring systems
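To make the fast-path idea concrete, the following Python sketch maps already-parsed provenance labels to a rate-limit tier. Everything here is assumed for illustration: the tier names ("strict", "bypass", "relaxed", "reduced", "default"), the precedence order, and the function name pick_rate_limit_tier are hypothetical, and the real decisions are expressed as Requestctl rules rather than application code.

```python
def pick_rate_limit_tier(labels: dict[str, str]) -> str:
    """Choose a (hypothetical) rate-limit tier from parsed X-Provenance labels."""
    # Assumption: a known abuser is restricted regardless of other labels.
    if "abuser" in labels:
        return "strict"
    # A verified client (matching user agent and provenance) may skip generic limits.
    if "id" in labels:
        return "bypass"
    # Any net value other than the documented fallback is treated as trusted here.
    if labels.get("net", "unknown") != "unknown":
        return "relaxed"
    # Assumption: datacenter traffic gets a tighter limit than eyeball traffic.
    if labels.get("datacenter") == "true":
        return "reduced"
    return "default"


print(pick_rate_limit_tier({"net": "unknown", "datacenter": "true"}))
```

Checking the abuser label first is a deliberate choice in this sketch: a negative signal should not be overridden by a trust signal on the same request.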
Implementation
Currently implemented in:
- HAProxy sets the header value based on known IP ranges and the MaxMind database
- Requestctl rules consume the header for filtering decisions in both HAProxy and Varnish
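The range-based tagging step can be sketched in Python as follows. This is only an approximation of the idea, since the actual logic runs inside HAProxy. The KNOWN_RANGES table, its label values, and the function name build_x_provenance are hypothetical, and the networks are RFC 5737 documentation ranges rather than real client or cloud ranges. The MaxMind lookup is omitted.

```python
import ipaddress

# Hypothetical table of (network, label, value); the real ranges are maintained elsewhere.
KNOWN_RANGES = [
    (ipaddress.ip_network("192.0.2.0/24"), "client", "example-client"),
    (ipaddress.ip_network("198.51.100.0/24"), "cloud", "example-cloud"),
]


def build_x_provenance(client_ip: str) -> str:
    """Build an X-Provenance value for client_ip from the known ranges."""
    addr = ipaddress.ip_address(client_ip)
    pairs = [
        f"{label}={value}"
        for network, label, value in KNOWN_RANGES
        if addr in network
    ]
    if not pairs:
        # Documented default fallback when nothing matches.
        pairs.append("net=unknown")
    return ";".join(pairs)


print("X-Provenance: " + build_x_provenance("192.0.2.10"))
print("X-Provenance: " + build_x_provenance("203.0.113.5"))
```

An address that matches no range falls back to net=unknown, mirroring the default value described in the Syntax section.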
Future Plans
- Tighter integration with session/token-based identification
- Use in shaping rate-limiting tiers dynamically
- Expanded label taxonomy to support more trusted classes