
Linked Artifacts Cache

From Wikitech

Background

In data parlance, an entity is something that exists as itself. Users, pages, and revisions are all examples of entities. By contrast, an artifact is something conceived, produced, or shaped by craft.

In our infrastructure, we have long produced artifacts that are associated with (linked to) entities in MediaWiki. For example, we use machine-learning models to score paragraphs of article text for potential tone issues. These scores are artifacts, each bound to the revision of the corresponding page. The content transforms served by the Page Content Service (PCS) are likewise artifacts associated with revisions.

While the database is canonical for entities, the canonical source of an artifact is the process used to craft it. Whenever persisted, artifacts are secondary data or, for all intents and purposes, cache. The Linked Artifacts Cache service provides durable storage with caching semantics for these objects.
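The idea that a persisted artifact is only a cached copy of what its producing process can regenerate can be illustrated with a small read-through cache. This is a conceptual sketch only: the `ReadThroughCache` class, the `(kind, rev_id)` key, and the `fake_scorer` producer are all hypothetical and say nothing about the service's actual API or storage schema.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical key: (artifact kind, revision ID). Not the service's real schema.
Key = Tuple[str, int]


@dataclass
class ReadThroughCache:
    """In-memory stand-in for a store with caching semantics."""

    # The canonical source of an artifact: the process that crafts it.
    produce: Callable[[str, int], str]
    _store: Dict[Key, str] = field(default_factory=dict)

    def get(self, kind: str, rev_id: int) -> str:
        key = (kind, rev_id)
        if key not in self._store:
            # Miss: the artifact is secondary data, so rebuild it from its source.
            self._store[key] = self.produce(kind, rev_id)
        return self._store[key]

    def evict(self, kind: str, rev_id: int) -> None:
        # Dropping an entry loses nothing; the next get() regenerates it.
        self._store.pop((kind, rev_id), None)


calls: List[Key] = []


def fake_scorer(kind: str, rev_id: int) -> str:
    """Hypothetical producer; records each invocation for demonstration."""
    calls.append((kind, rev_id))
    return f"{kind}-for-rev-{rev_id}"


cache = ReadThroughCache(produce=fake_scorer)
first = cache.get("tone-score", 12345)   # miss: producer runs
second = cache.get("tone-score", 12345)  # hit: served from the store
cache.evict("tone-score", 12345)
third = cache.get("tone-score", 12345)   # miss again: producer reruns
```

In this sketch the producer runs twice for three reads, and every read returns the same value. That is the property that makes eviction safe for artifacts but not for entities.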

Service endpoints

https://linked-artifacts.discovery.wmnet:30443 (production)

https://linked-artifacts.k8s-staging.discovery.wmnet:30443 (staging)

Deployment

See: Kubernetes/Deployments

Monitoring & debugging

/healthz

The /healthz endpoint is used for Kubernetes readiness checks. It also returns a JSON-encoded object with useful build metadata.

$ curl -D - https://linked-artifacts.k8s-staging.discovery.wmnet:30443/healthz
HTTP/2 200 
content-type: application/json
date: Fri, 24 Apr 2026 19:52:08 GMT
content-length: 131
x-envoy-upstream-service-time: 8
server: main-tls

{
  "version": "v1.1.3",
  "build_date": "2026-04-24T18:56:12:UTC",
  "build_host": "buildkitsandbox",
  "go_version": "go1.24.9"
}
$
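For scripted checks (for example, confirming which version is deployed), the response body can be parsed into a typed record. The following is a minimal sketch that assumes the four fields shown in the sample output above are always present; `BuildInfo` and `parse_healthz` are illustrative names, not part of the service.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class BuildInfo:
    """Fields as they appear in the sample /healthz output."""

    version: str
    build_date: str
    build_host: str
    go_version: str


def parse_healthz(body: str) -> BuildInfo:
    """Parse a /healthz response body.

    Unknown keys are ignored so that new fields do not break callers.
    A missing key raises TypeError.
    """
    data = json.loads(body)
    known = {f.name for f in fields(BuildInfo)}
    return BuildInfo(**{k: str(v) for k, v in data.items() if k in known})


# Body copied from the staging transcript above.
sample = """
{
  "version": "v1.1.3",
  "build_date": "2026-04-24T18:56:12:UTC",
  "build_host": "buildkitsandbox",
  "go_version": "go1.24.9"
}
"""

info = parse_healthz(sample)
print(info.version, info.go_version)
```

The `build_date` value is kept as an opaque string here because the sample does not follow a standard timestamp format.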

/metrics

The /metrics endpoint returns Prometheus metrics.

$ curl -D - https://linked-artifacts.k8s-staging.discovery.wmnet:30443/metrics
HTTP/2 200 
content-type: text/plain; version=0.0.4; charset=utf-8; escaping=underscores
date: Fri, 24 Apr 2026 20:37:56 GMT
x-envoy-upstream-service-time: 2
server: main-tls

# HELP go_gc_duration_seconds A summary of the wall-time pause (stop-the-world) duration in garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 7.6168e-05
go_gc_duration_seconds{quantile="0.25"} 0.000104071
go_gc_duration_seconds{quantile="0.5"} 0.000119226
go_gc_duration_seconds{quantile="0.75"} 0.000138608
go_gc_duration_seconds{quantile="1"} 0.000217679
go_gc_duration_seconds_sum 0.005898937
go_gc_duration_seconds_count 47
[ ... ]
http_request_duration_seconds_bucket{code="200",method="GET",le="0.001"} 0
http_request_duration_seconds_bucket{code="200",method="GET",le="0.0025"} 57
http_request_duration_seconds_bucket{code="200",method="GET",le="0.005"} 92
http_request_duration_seconds_bucket{code="200",method="GET",le="0.01"} 92
http_request_duration_seconds_bucket{code="200",method="GET",le="0.025"} 92
http_request_duration_seconds_bucket{code="200",method="GET",le="0.05"} 92
http_request_duration_seconds_bucket{code="200",method="GET",le="0.1"} 92
http_request_duration_seconds_bucket{code="200",method="GET",le="0.25"} 92
http_request_duration_seconds_bucket{code="200",method="GET",le="0.5"} 92
http_request_duration_seconds_bucket{code="200",method="GET",le="1"} 92
http_request_duration_seconds_bucket{code="200",method="GET",le="+Inf"} 92
http_request_duration_seconds_sum{code="200",method="GET"} 0.20765826500000004
http_request_duration_seconds_count{code="200",method="GET"} 92
$
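When Prometheus or Grafana is not at hand, the histogram in a raw scrape can be summarized directly. The sketch below parses the text exposition format with a deliberately simple regular expression (it assumes label values contain no escaped quotes or closing braces) and merges all label sets of the metric together, which is only appropriate when there is a single label set, as in the sample above. The names `parse_samples` and `summarize` are illustrative.

```python
import re

# name, optional {labels}, then a numeric value
LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$'
)
LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text):
    """Yield (name, labels, value) for each sample line, skipping comments."""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        m = LINE.match(line)
        if m is None:
            continue  # not a sample line, e.g. an elision marker or shell prompt
        labels = dict(LABEL.findall(m.group("labels") or ""))
        yield m.group("name"), labels, float(m.group("value"))


def summarize(text, metric="http_request_duration_seconds"):
    """Return (mean seconds, sorted [(upper bound, cumulative count)])."""
    buckets = []
    total = count = 0.0
    for name, labels, value in parse_samples(text):
        if name == metric + "_bucket":
            buckets.append((float(labels["le"]), value))  # "+Inf" parses to inf
        elif name == metric + "_sum":
            total = value
        elif name == metric + "_count":
            count = value
    buckets.sort()
    mean = total / count if count else float("nan")
    return mean, buckets


# A subset of the staging scrape shown above.
SAMPLE = '''
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds_count 47
[ ... ]
http_request_duration_seconds_bucket{code="200",method="GET",le="0.001"} 0
http_request_duration_seconds_bucket{code="200",method="GET",le="0.0025"} 57
http_request_duration_seconds_bucket{code="200",method="GET",le="0.005"} 92
http_request_duration_seconds_bucket{code="200",method="GET",le="+Inf"} 92
http_request_duration_seconds_sum{code="200",method="GET"} 0.20765826500000004
http_request_duration_seconds_count{code="200",method="GET"} 92
'''

mean, buckets = summarize(SAMPLE)
print(f"mean latency: {mean * 1000:.2f} ms")
for upper, cumulative in buckets:
    print(f"<= {upper} s: {int(cumulative)} requests")
```

For the sample above, the mean works out to roughly 2.3 ms, with 57 of the 92 requests completing within 2.5 ms and all of them within 5 ms.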

Logging

TODO: Do.

See also