Analytics/Systems/Cluster/Coordinator

The coordinator hosts

The Analytics team relies on two multi-purpose nodes, called "an-coord100[1-2]", to run critical services such as Hive, Presto, Oozie and the Analytics Meta database. In T257412 we started a major effort to reduce SPOFs (single points of failure) in the Analytics infrastructure: at the time there was only one coordinator host (an-coord1001), and its hostname was hardcoded almost everywhere (puppet, refinery, etc.). As a result the failover procedure was complex and slow to execute, and the Analytics team has been working to simplify it. The final goal is a quick and easy failover procedure (on the order of minutes) for all the services, but getting there will take time. This document tracks the status of the failover procedures and gives an overview of how things run on the coordinator nodes.

Services

This is the list of services broken down by node (as of December 2020):

an-coord1001

Hive Metastore (port 9083)

Hive Server 2 (port 10000)

MariaDB Meta db instance

Oozie server

Presto coordinator

an-coord1002

Hive Metastore (port 9083)

Hive Server 2 (port 10000)

MariaDB Meta db instance (replica of the one running on an-coord1001)
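
As a quick way to verify the layout above, the following is a minimal Python sketch that probes the two Hive ports on each coordinator. The fully qualified hostnames (an-coord100[1-2].eqiad.wmnet) and the helper names are assumptions made for illustration; only the short hostnames and the two port numbers come from this page.

```python
import socket

# Ports taken from the service list above; the other services are listed
# without a port on this page, so they are not probed here.
HIVE_PORTS = {"Hive Metastore": 9083, "Hive Server 2": 10000}

# Assumption: the coordinators live under the eqiad.wmnet domain.
COORDINATORS = ["an-coord1001.eqiad.wmnet", "an-coord1002.eqiad.wmnet"]


def port_is_open(host, port, timeout=3.0, connect=socket.create_connection):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        conn = connect((host, port), timeout=timeout)
    except OSError:
        # Refused connection, timeout, DNS failure, etc.
        return False
    conn.close()
    return True


def check_hive_ports(hosts=COORDINATORS, ports=HIVE_PORTS, probe=port_is_open):
    """Map (host, service name) -> True/False for each Hive port."""
    return {
        (host, name): probe(host, port)
        for host in hosts
        for name, port in ports.items()
    }
```

Calling check_hive_ports() from a host that can reach the coordinators returns one entry per host and service. An open port only shows that something is listening; it does not prove that the service is healthy.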

Hive

[Figure: Analytics coordinator hive.png]

At first glance this picture looks complicated, but the setup is actually simple, so please keep reading :)

The Hive server (port 10000) on each coordinator uses its local metastore (port 9083) when needed, which is the standard Hive setup. All the metastores fetch and store Hive sessions/tokens in the same database, the one on an-coord1001, rather than in their local one. The database on an-coord1002 only replicates from its twin on an-coord1001 and is configured as a read-only replica, so it cannot be written to. The idea is that if an-coord1001 fails badly (broken motherboard, CPU on fire, etc.), we can fail over the Analytics Meta database (and all its clients) quickly, without reconfiguring anything or changing settings for db1108 (another database host, which we use for Bacula backups, etc.).

analytics-hive.eqiad.wmnet is a DNS CNAME that points to only one of the coordinators, and it is the hostname configured for all the Hive clients. This means that only one coordinator is active at any given time; the other one is in "standby" state.
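
To see which coordinator is currently active, one can resolve the CNAME and look at the canonical name it points to. The sketch below uses Python's standard resolver call; the fully qualified coordinator names and the function names are assumptions made for illustration, and the canonical name returned depends on how the local resolver reports CNAME targets.

```python
import socket

HIVE_CNAME = "analytics-hive.eqiad.wmnet"

# Assumption: the coordinators live under the eqiad.wmnet domain.
COORDINATORS = ["an-coord1001.eqiad.wmnet", "an-coord1002.eqiad.wmnet"]


def active_coordinator(cname=HIVE_CNAME, resolver=socket.gethostbyname_ex):
    """Return the canonical hostname that the Hive CNAME resolves to."""
    # gethostbyname_ex returns (canonical name, alias list, IP address list).
    canonical, _aliases, _addresses = resolver(cname)
    return canonical


def standby_coordinators(active, hosts=COORDINATORS):
    """Return the coordinators that are not currently serving Hive traffic."""
    return [host for host in hosts if host != active]
```

The resolver argument exists only so the lookup can be replaced in tests or on hosts without access to the internal DNS.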

How does it work from the Kerberos point of view? Hive uses the same principal, hive/analytics-hive.eqiad.wmnet@WIKIMEDIA, and the corresponding keytab on both coordinator nodes, so they share the same credentials and can answer the same queries.
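
Because the principal is tied to the CNAME rather than to a specific host, a client connection string never needs to name an individual coordinator. As an illustration, this Python sketch builds a Hive JDBC URL in the usual jdbc:hive2 form with a Kerberos principal. The "default" database and the function name are assumptions; the hostname, port and principal come from this page.

```python
HIVE_HOST = "analytics-hive.eqiad.wmnet"
HIVE_SERVER_PORT = 10000
KERBEROS_REALM = "WIKIMEDIA"


def hive_jdbc_url(host=HIVE_HOST, port=HIVE_SERVER_PORT,
                  database="default", realm=KERBEROS_REALM):
    """Build a Kerberos-enabled Hive JDBC URL that targets the CNAME."""
    # The service principal uses the CNAME, not the coordinator's own name,
    # so the URL stays valid whichever coordinator is active.
    principal = f"hive/{host}@{realm}"
    return f"jdbc:hive2://{host}:{port}/{database};principal={principal}"


print(hive_jdbc_url())
# prints jdbc:hive2://analytics-hive.eqiad.wmnet:10000/default;principal=hive/analytics-hive.eqiad.wmnet@WIKIMEDIA
```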

With a simple change of the analytics-hive.eqiad.wmnet CNAME in the DNS, it is possible to transparently switch the coordinator that serves Hive traffic (already tested, it works) without changing any job config!

Failover

See Analytics/Systems/Cluster/Mysql_Meta#Failover