Portal:Toolforge/Admin/Logging
This page is a draft: content may be incomplete and is subject to rapid change. More information may be available on the talk page.
Logging on Toolforge, both for user workloads and for the infrastructure, is being moved to a setup based on Grafana Loki.
Overview
Storage
Log storage is handled by Grafana Loki, with persistent storage in the Ceph cluster via the S3 interface. The S3 buckets live in separate projects, tools-logging and toolsbeta-logging, because our RadosGW implementation does not support access control restrictions more granular than per-project. The buckets are created via the tofu-provisioning system.
There are two Loki deployments in each project (tools and toolsbeta):
- tools: Log storage for tool workloads, i.e. everything in an individual tool- namespace.
- infrastructure: Log storage for Toolforge infrastructure. This includes all non-tool- namespaces except ingress-nginx-gen2.
Both instances are installed in the loki Kubernetes namespace.
Collection
Each Kubernetes worker node runs a Grafana Alloy pod that forwards logs from pods running on that node to the appropriate Loki instance.
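The rule for choosing the "appropriate" Loki instance follows from the Storage section and can be sketched in a few lines of Python. This only illustrates the namespace-to-instance mapping and is not the actual Alloy configuration (which lives in toolforge-deploy). The function name and constants are hypothetical, and returning None for ingress-nginx-gen2 is an assumption: this page only says that namespace is excluded from the infrastructure instance.

```python
from typing import Optional

# Prefix shared by all per-tool namespaces, as described under Storage.
TOOL_NAMESPACE_PREFIX = "tool-"

# Namespaces excluded from the infrastructure instance. What happens to
# their logs is not stated on this page, so this sketch returns None.
EXCLUDED_NAMESPACES = frozenset({"ingress-nginx-gen2"})


def loki_instance_for(namespace: str) -> Optional[str]:
    """Return the Loki deployment name for a namespace, or None if excluded."""
    if namespace in EXCLUDED_NAMESPACES:
        return None
    if namespace.startswith(TOOL_NAMESPACE_PREFIX):
        return "tools"
    return "infrastructure"


if __name__ == "__main__":
    # "tool-example" is a made-up namespace used purely for illustration.
    for ns in ("tool-example", "kube-system", "ingress-nginx-gen2"):
        print(ns, "->", loki_instance_for(ns))
```

Because each node runs its own Alloy pod, a decision of this kind is made per pod on the node where the pod runs, based only on the pod's namespace.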
Deployment
The entire logging stack is deployed via the logging component of toolforge-deploy.
Operations
Upgrading Loki or Alloy
Monitoring
A Grafana dashboard is available.