From Wikitech

etcd is a distributed key/value store.



Labs project: Nova_Resource:Etcd

For documentation on etcd use from clients, see EtcdClients

Use in the WMF

We currently have:

  1. One cluster in eqiad for general use, running with https and client AUTH
  2. One cluster in codfw for general use, running with https and client AUTH
  3. One cluster on ganeti for kubernetes in eqiad, running with https; access is firewall-controlled.


Note: There's no TLS for peer communications yet, so pay close attention to http vs https in the URLs and the port numbers used in various places.

Bootstrapping an etcd cluster

To set up a cluster initially made of one server, do the following:

  1. Assign profiles ::profile::etcd and ::profile::etcd::tlsproxy to your server's role
  2. Define the following variables via hiera:
# Name of the cluster. 
profile::etcd::cluster_name: "cluster-name"
# Set to true when first building the cluster; set it to false when adding or removing members
profile::etcd::cluster_bootstrap: true
# set this to "dns:domain-name" if you want to use dns discovery                                                                      
profile::etcd::discovery: "etcd1001="
# If including the tls proxy (recommended in case of high throughput) set this to true
profile::etcd::use_proxy: true
# Set to true if you want to use client cert auth. Recommended: false. This conflicts with setting use_proxy to true
profile::etcd::use_client_certs: false
profile::etcd::do_backup: false                                                                                                                                               
profile::etcd::allow_from: "$DOMAIN_NETWORKS"
# If you chose to use the TLS proxy, you need the following variables too:                                                                                                                                
# This cert is generated using puppet-ecdsacert, and includes                                                                                                                
# all the hostnames for the etcd machines in the SANs                                                                                                                        
# Will need to be regenerated if we add servers to the cluster.                                                                                                              
profile::etcd::tlsproxy::cert_name: "etcd.%{::domain}"                                                                                                                       
profile::etcd::tlsproxy::acls: { /: ["root"], /conftool: ["root", "conftool"], /eventlogging: []}                                                                            
# This should come from the private hieradata                                                                                                                                                                                                                                                                         

This is just for creating the initial node.
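
The comments in the hiera block above state one hard constraint (client cert auth conflicts with the TLS proxy) and one dependency (the proxy needs its certificate name). A minimal sketch of a pre-flight check for those two rules follows. The function name `check_etcd_hiera` and the idea of loading the hiera keys into a plain dictionary are assumptions made for illustration; this is not part of the puppet module.

```python
def check_etcd_hiera(settings):
    """Return a list of problems found in a dict of hiera keys.

    Hypothetical helper: it only encodes the two rules stated in the
    comments of the hiera example, nothing more.
    """
    problems = []
    use_proxy = settings.get("profile::etcd::use_proxy", False)
    use_client_certs = settings.get("profile::etcd::use_client_certs", False)

    # Rule 1: client cert auth conflicts with the TLS proxy.
    if use_proxy and use_client_certs:
        problems.append("use_proxy and use_client_certs are both true")

    # Rule 2: the TLS proxy needs a certificate name.
    if use_proxy and "profile::etcd::tlsproxy::cert_name" not in settings:
        problems.append("use_proxy is true but tlsproxy::cert_name is unset")

    return problems


if __name__ == "__main__":
    initial_node = {
        "profile::etcd::cluster_bootstrap": True,
        "profile::etcd::use_proxy": True,
        "profile::etcd::use_client_certs": False,
        "profile::etcd::tlsproxy::cert_name": "etcd.%{::domain}",
    }
    for problem in check_etcd_hiera(initial_node):
        print(problem)
```

An empty result only means these two rules hold; it says nothing about the rest of the configuration.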

If you are using SRV records for server discovery, add all your current nodes to that record, then set the following in hiera instead:

profile::etcd::discovery: ""

Now run puppet on the etcd1001 node, and it should bring up an etcd cluster. You can verify this with:

etcdctl  -C cluster-health

If using the SRV records, you can run puppet on the other nodes of the cluster and they should come up and be configured correctly.

Once verified, flip the profile::etcd::cluster_bootstrap hiera variable from 'true' to 'false', and continue adding more nodes via the following procedure.

Adding a new member to the cluster

Say we want to add a new server to our cluster. The steps are as follows:

  1. Add the member via the members api, using the etcdctl tool:
    $ etcdctl -C member add conf1001
    Added member named conf1001 with ID 5f62a924ac85910 to cluster
    # Next line is broken down artificially for ease of reading
    Write down the output as it will be useful for our puppet changes.
  2. Assign the etcd role to the node in puppet. Also use hiera to set the following variable for the whole cluster:
    profile::etcd::discovery set to the value of ETCD_INITIAL_CLUSTER from the output of the preceding etcdctl command
  3. Run puppet on the host. It should join the cluster. Confirm this is the case with the other hosts in the cluster as well (the logs should stop complaining about not reaching the new member)
  4. Finally, add the new server to the SRV records that clients consume, if clients use SRV records
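
For orientation, the ETCD_INITIAL_CLUSTER value referred to in step 2 is a comma-separated list of `name=peer-url` pairs. The sketch below shows how such a string is assembled. The host names and the peer port 2380 (the upstream etcd default) are assumptions for illustration only; always prefer the value printed by etcdctl. The scheme defaults to http because, as noted earlier, peer communication does not use TLS yet.

```python
def initial_cluster(members, scheme="http", peer_port=2380):
    """Build an ETCD_INITIAL_CLUSTER-style string.

    members maps each etcd member name to its host name. The port and
    scheme defaults are assumptions, not values taken from this page.
    """
    return ",".join(
        f"{name}={scheme}://{host}:{peer_port}"
        for name, host in members.items()
    )


if __name__ == "__main__":
    # Hypothetical hosts under a placeholder domain.
    members = {
        "etcd1001": "etcd1001.example.org",
        "conf1001": "conf1001.example.org",
    }
    print(initial_cluster(members))
```

The string must list the existing members together with the new one, which is why copying the etcdctl output is the safer route.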

Removing a member from the cluster

  1. Verify that the node you want to remove is not the current leader, since removing the leader could cause trouble:
    $ curl -k -L https://etcd1001:2379/v2/stats/leader
    {"message":"not current leader"}
  2. Remove the server from the clients' SRV record
  3. Dynamically remove the server from the cluster:
    $ etcdctl -C member remove etcd1001
    $ etcdctl -C cluster-health
  4. Remove the server from the cluster's SRV record if present, or from the hiera variable profile::etcd::discovery if not using SRV records
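
The leader check in step 1 can be scripted. The sketch below takes the response body of the /v2/stats/leader request and answers "safe to remove" only when it matches the non-leader message shown above. Any other response, including unparseable output, is treated as "do not remove yet" and should be inspected by hand. The function name `safe_to_remove` is hypothetical.

```python
import json


def safe_to_remove(body):
    """Return True only if the node reported it is not the current leader.

    body is the raw response text from /v2/stats/leader. The check is
    deliberately conservative: anything other than the exact non-leader
    message returns False.
    """
    try:
        data = json.loads(body)
    except ValueError:
        return False
    if not isinstance(data, dict):
        return False
    return data.get("message") == "not current leader"


if __name__ == "__main__":
    print(safe_to_remove('{"message":"not current leader"}'))
```

A False result does not prove the node is the leader; it only means the expected non-leader answer was not seen.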

Recover a cluster after a disaster

In the sad case when Raft consensus is lost and there is no quorum anymore, the only way to recover the cluster is to restore the data from a backup. Backups are taken every night and stored in /srv/backup/etcd. The procedure to bring back the cluster is roughly as follows:

  • Stop all etcd instances that might be still running
  • Copy the backup to a new location and start etcd from there with the --force-new-cluster option, with the server listening on the public endpoints. It will start with peer urls bound to localhost.
  • Change the peer url of this server to what you'd expect it to be in normal situations
  • Add your other servers to the cluster, as follows:
    • Verify the original etcd data are removed
    • Add the server to the cluster logically with etcdctl
    • Start etcd in order to join the cluster.

As usual with etcd, the devil lies in the details of the command-line options; but there is a Python script that, given the current cluster configuration, can generate the correct commands you'll have to enter into a shell. It can be found here.
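
To make the "no quorum" condition at the start of this section concrete: a Raft cluster of N members needs a majority, floor(N/2) + 1, of them reachable to make progress. The short sketch below does that arithmetic; the function names are illustrative and not part of any etcd tooling.

```python
def quorum(total_members):
    """Smallest number of members that forms a majority."""
    return total_members // 2 + 1


def failures_tolerated(total_members):
    """How many members can be lost while a majority remains."""
    return total_members - quorum(total_members)


def has_quorum(total_members, healthy_members):
    """True if enough members are healthy to keep consensus."""
    return healthy_members >= quorum(total_members)


if __name__ == "__main__":
    for size in (1, 3, 5):
        print(size, quorum(size), failures_tolerated(size))
```

For example, a three-member cluster keeps working with one member down, but losing two members means the backup-based recovery described above is needed.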