Data Engineering/Systems/AQS/Scaling/2017/Cluster Expansion


Goals

Expand the current AQS RESTBase cluster with more hosts to improve fault tolerance and increase the available storage space. The aqs100[789] nodes are added to the current cluster, composed of aqs100[456].

Current cluster

  • aqs100[456].eqiad.wmnet
  • Two Cassandra instances per node.
  • Cassandra 2.2.6

Problems

  • More dedicated Cassandra storage space is needed to avoid having to delete old data (i.e., to avoid setting a retention policy).
  • More Cassandra instances in the cluster improve fault tolerance against hardware failures.

Related tasks

https://phabricator.wikimedia.org/T155654

Remarks

The cluster is expanded with three hosts, each running two Cassandra instances (for a total of six new instances added to the cluster). We decided to use the following configuration for rack_awareness:

  • 1007,1004 -> rack01
  • 1008,1005 -> rack02
  • 1009,1006 -> rack03

This keeps our replication factor of 3 equal to the number of Cassandra logical racks. The major benefit is more predictability when reasoning about the cluster.
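As a reference, here is a minimal sketch of how such a rack mapping could be expressed with Cassandra's GossipingPropertyFileSnitch, together with the keyspace replication it supports. The file path, instance layout, and keyspace name are assumptions for illustration; the actual cluster is configured through Puppet.

  # /etc/cassandra-a/cassandra-rackdc.properties on aqs1007, instance "a"
  # (path and datacenter name are assumptions; Puppet manages the real files)
  dc=eqiad
  rack=rack01

  -- With NetworkTopologyStrategy, the 3 replicas are placed on distinct
  -- logical racks where possible, so each rack ends up with a full copy.
  -- "example_keyspace" is a hypothetical name:
  ALTER KEYSPACE "example_keyspace"
    WITH replication = {'class': 'NetworkTopologyStrategy', 'eqiad': 3};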

Example: bootstrapping all six new instances on the new nodes, streaming from the existing ones on aqs100[456]:

  • 1007-a would stream data from 1004-a and 1004-b
  • 1007-b would stream from 1004-a, 1004-b, and 1007-a (and afterward we would run cleanups on 1004-a, 1004-b, and 1007-a)
  • 1008-a would stream from 1005-a and 1005-b
  • 1008-b would stream from 1005-a, 1005-b, and 1008-a (and afterward we would run cleanups on 1005-a, 1005-b, and 1008-a)
  • 1009-a and 1009-b would follow the same pattern, streaming from 1006-a and 1006-b (with cleanups afterward)

The key to understanding the example above is that Cassandra streams data to a bootstrapping instance from the smallest possible set of source instances, avoiding complex consistency validation. Since the replication factor is 3 and the number of logical racks is 3, each rack holds a full copy of the data (split among its instances), so every new instance can bootstrap entirely from instances in its own rack.
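As a sketch of the corresponding operational steps for one new instance, using the hostnames from the example above. The systemd unit name and per-instance hostnames are assumptions (our multi-instance setup is managed by Puppet), but the nodetool subcommands are standard:

  # On the new host, start the new instance with auto_bootstrap enabled
  # (the default): it joins the ring and streams its token ranges from
  # replicas in its own logical rack.
  sudo systemctl start cassandra-a        # unit name is an assumption

  # Watch bootstrap progress: the instance shows as UJ (Up/Joining)
  # until streaming completes, then UN (Up/Normal).
  nodetool -h aqs1007-a.eqiad.wmnet status

  # After the bootstrap, drop the token ranges the source instances
  # no longer own:
  nodetool -h aqs1004-a.eqiad.wmnet cleanup
  nodetool -h aqs1004-b.eqiad.wmnet cleanup

The cleanups matter because bootstrapping only copies ownership of token ranges; the old replicas keep the data on disk until cleanup removes it, which is exactly the space we are trying to reclaim.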