Distributed tracing/Propagating tracing context
How to make distributed tracing work for your service
If you want to make distributed tracing work for your service, you MUST follow this spec.
General applicability
Any services run in production by Wikimedia SRE which both:
- receive requests in the user-facing query path, and
- submit RPC calls or other RPC-like HTTP requests to other services in order to answer those requests
SHOULD follow this specification.
Context propagation requirements
When we say "propagate" a header, we mean that, if a header with that name is set on an incoming request to the service, then it must also be set on all outgoing RPCs made "on behalf" of that incoming request.
Wikimedia services MUST make outgoing RPCs to other services via the Envoy services proxy running as a sidecar.
Wikimedia services MUST propagate the x-request-id header.[1]
Wikimedia services SHOULD propagate the W3C standard traceparent and tracestate headers.
When a service follows all three of these requirements, distributed tracing works across its service boundaries.
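As a rough illustration of what "propagate" means here, the sketch below selects the three headers named above from an incoming request so that they can be copied onto outgoing RPCs. The function name propagated_headers and the sample header values are made up for this example; this is not an existing Wikimedia library, only a minimal sketch assuming headers are available as a plain mapping.

```python
from typing import Dict, Mapping

# The three headers named in the requirements above.
PROPAGATED_HEADERS = ("x-request-id", "traceparent", "tracestate")


def propagated_headers(incoming: Mapping[str, str]) -> Dict[str, str]:
    """Return the incoming tracing headers that should be set on outgoing RPCs.

    Hypothetical helper for illustration only.
    """
    # HTTP header names are case-insensitive, so normalise before matching.
    lowered = {name.lower(): value for name, value in incoming.items()}
    # Only headers that were actually set on the incoming request are copied.
    return {name: lowered[name] for name in PROPAGATED_HEADERS if name in lowered}


# Illustrative incoming request headers (values are made up).
incoming = {
    "X-Request-Id": "5f2c1a9e-example",
    "traceparent": "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01",
    "Accept": "application/json",
}
outgoing = propagated_headers(incoming)
```

Headers that were not present on the incoming request (tracestate in this sample) are simply not set on the outgoing one, and unrelated headers such as Accept are left out.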
Easy ways to implement
Within MediaWiki core and extensions, use Wikimedia\Http\TelemetryHeadersInterface. When constructing an outgoing request, you can call Telemetry::getInstance()->getRequestHeaders() to obtain the headers to set on it. One of the built-in HTTP clients that you might already be using may do this for you automatically.
Within other microservices, there is not yet an easy, drop-in solution, but we hope to have one soon. Some resources:
- service-template-node shows how to do this, but it does not provide scaffolding or a reusable library
- see also https://phabricator.wikimedia.org/T371120
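Until a drop-in solution exists, one possible pattern for a non-MediaWiki service is to record the tracing headers once per incoming request and merge them into every outgoing request it builds. The sketch below uses only the Python standard library. The names remember_incoming and build_outgoing_request, and the local listener address http://localhost:6500 standing in for the Envoy sidecar, are assumptions made for illustration and do not come from any existing Wikimedia library or configuration.

```python
import contextvars
import urllib.request
from typing import Mapping, Optional

PROPAGATED_HEADERS = ("x-request-id", "traceparent", "tracestate")

# Tracing headers of the request currently being handled (None outside a request).
_tracing_headers = contextvars.ContextVar("tracing_headers", default=None)


def remember_incoming(headers: Mapping[str, str]) -> None:
    """Record tracing headers; call once per incoming request (e.g. in middleware)."""
    lowered = {name.lower(): value for name, value in headers.items()}
    _tracing_headers.set(
        {name: lowered[name] for name in PROPAGATED_HEADERS if name in lowered}
    )


def build_outgoing_request(
    url: str, headers: Optional[Mapping[str, str]] = None
) -> urllib.request.Request:
    """Build an outgoing request carrying the recorded tracing headers.

    Nothing is sent here; the caller still has to submit the request.
    """
    merged = dict(headers or {})
    # Tracing headers from the incoming request take precedence.
    merged.update(_tracing_headers.get() or {})
    return urllib.request.Request(url, headers=merged)


# Example: handle one incoming request, then prepare an RPC on its behalf.
remember_incoming({"X-Request-Id": "abc", "Tracestate": "vendor=1"})
request = build_outgoing_request(
    "http://localhost:6500/v1/thing",  # hypothetical Envoy sidecar listener
    {"Accept": "application/json"},
)
```

A context variable is used so that concurrent requests handled in separate threads or asyncio tasks do not see each other's headers; a module-level global would not give that isolation.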