Incidents/2018-11-06 maps


document status: final

Summary

Tilerator failed on maps100[1-3]. Tilerator is a non-public service that prepares vector tiles (data blobs) from the OSM database and stores them in Cassandra. The failure occurred on 6 November 2018; Icinga first reported it around 00:18 UTC.

Timeline

This is a step-by-step outline of what happened to cause the incident and how it was remedied.

00:18 UTC: Icinga reported connection failures on the Tilerator port (6534):

PROBLEM - tilerator on maps1003 is CRITICAL: connect to address 10.64.32.117 and port 6534: Connection refused
1:19 AM PROBLEM - tilerator on maps1002 is CRITICAL: connect to address 10.64.16.42 and port 6534: Connection refused
1:19 AM PROBLEM - tilerator on maps1001 is CRITICAL: connect to address 10.64.0.79 and port 6534: Connection refused
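
The alerts above come from a plain TCP connect to port 6534 being refused. For reproducing that probe by hand, here is a minimal sketch in Python. It is not the Icinga check itself; the function name `check_tcp_port` and the 5-second timeout are assumptions made for illustration, and the host addresses are the ones quoted in the alerts.

```python
import socket
from typing import Tuple


def check_tcp_port(host: str, port: int, timeout: float = 5.0) -> Tuple[bool, str]:
    """Try a plain TCP connect and return (ok, message).

    Hypothetical helper: it only mimics the wording of the alert text,
    it is not the plugin Icinga runs.
    """
    try:
        # create_connection performs the TCP handshake; the socket is
        # closed again as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True, f"TCP OK - connected to {host} port {port}"
    except ConnectionRefusedError:
        # Nothing is listening on the port (the state seen in this incident).
        return False, f"connect to address {host} and port {port}: Connection refused"
    except OSError as exc:
        # Timeouts, unreachable hosts, and other socket errors.
        return False, f"connect to address {host} and port {port}: {exc}"


if __name__ == "__main__":
    # Addresses and port taken from the Icinga alerts quoted above.
    hosts = {
        "maps1001": "10.64.0.79",
        "maps1002": "10.64.16.42",
        "maps1003": "10.64.32.117",
    }
    for name, address in hosts.items():
        ok, message = check_tcp_port(address, 6534)
        state = "OK" if ok else "CRITICAL"
        print(f"tilerator on {name} is {state}: {message}")
```

Run from a host with network access to the maps cluster, a refused connection on all three nodes would match the state reported at 00:18 UTC, and a successful connect would match the recovery after the restart.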

07:25 UTC: The Tilerator service was restarted on maps100[1-3].

07:26 UTC: The Tilerator service came back up.

Conclusions

Links to relevant documentation

Maps Runbook: Maps/RunBook

Actionables

NOTE: Please add the #wikimedia-incident Phabricator project to these follow-up tasks and move them to the "follow-up/actionable" column.