
Data Platform/Systems/Ceph/Troubleshooting

From Wikitech

Common Alerts

CephOSDClusterInWarning

This ticket contains comprehensive troubleshooting for this alert. We'll move the content here soon.

Ceph provides the persistent storage for DPE applications running in the dse-k8s-cluster. As such, problems with Ceph can lead to instability in the k8s applications that depend on it.

For example, this ticket covers a recent alert that was directly caused by Ceph:

FIRING: AirflowDeploymentUnavailable: airflow-scheduler.airflow-search is unavailable on dse-k8s-eqiad

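You may be able to confirm that Ceph is the cause by exec'ing into the affected pod, navigating to its volume mounts, and attempting simple operations such as ls or touch hello.txt. If these operations fail or hang, the underlying Ceph-backed storage is the likely culprit.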
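The check described above can be sketched as a small shell helper. This is a minimal illustration, not an established procedure for this cluster: the function name probe_mount and the scratch file name are hypothetical, and the namespace, pod name, and mount path in the commented kubectl lines are placeholders that you would need to fill in for the affected workload.

```shell
# Hypothetical usage from a host with kubectl access. <namespace>, <pod> and
# <mount-path> are placeholders, not values taken from this page:
#   kubectl get pod <pod> -n <namespace> \
#     -o jsonpath='{.spec.containers[*].volumeMounts[*].mountPath}'
#   kubectl exec -n <namespace> <pod> -- ls <mount-path>
#   kubectl exec -n <namespace> <pod> -- touch <mount-path>/hello.txt

# Probe a directory with the same simple operations: list it, then create
# and remove a small file. Intended to be run inside the pod, against one
# of its volume mount paths.
probe_mount() {
    dir="$1"
    probe_file="$dir/.ceph-probe-$$"   # hypothetical scratch file name

    # A failing listing suggests the volume is not readable.
    if ! ls "$dir" >/dev/null 2>&1; then
        echo "fail: cannot list $dir"
        return 1
    fi

    # A failing write suggests the volume is read-only or unavailable.
    if ! touch "$probe_file" 2>/dev/null; then
        echo "fail: cannot write to $dir"
        return 1
    fi

    # Clean up so the probe leaves nothing behind on the volume.
    rm -f "$probe_file"
    echo "ok: $dir is readable and writable"
}
```

If a command hangs instead of returning an error, that is also a useful signal, since blocked I/O on the storage backend can stall the process rather than fail it. The helper deliberately removes its scratch file so that it does not leave stray data on an application's volume.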