Confd
Confd is a configuration management tool that watches a consistent key-value store such as etcd or ZooKeeper and regenerates config files from templates whenever the watched keys change; it can also tell the service that depends on those files to reload its configuration. It is well suited to coordinating changes across a cluster almost atomically when your application cannot use the key-value store directly as a config source. It should also make it easier to integrate more and more services over time.
How to add a configuration/service to confd
How do you write a template?
Templating in confd is based on the (quite awkward) Go text/template system, to which confd adds a few specific functions that you can use in pipelines. They are all listed in the confd docs.
What to expect from confd
Confd watches the keys listed in the service file and re-applies your template whenever they change. If the result differs from what it previously generated, confd updates the file on disk; if you have configured them, it first validates the new config file and, if it is valid, runs the reload command you specified. Please note that confd does not watch its own templates directory for changes, so if you change a template you need to restart confd. A short period without confd running causes no interruption of service, and any changes it missed in the meantime are applied once it is running again. If you are using our own confd puppet module, it will take care of the restart for you.
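The cycle described above (render, compare, validate, replace, reload) can be sketched in a few lines of Python. This is an illustration of the logic only, not confd's actual implementation: the function name apply_if_changed and the "{src}" placeholder in the check command are conventions invented for this sketch.

```python
import os
import subprocess
import tempfile


def apply_if_changed(rendered, dest, check_cmd=None, reload_cmd=None):
    """Install `rendered` at `dest` only when it differs from the current file.

    Returns True when the file was replaced, False otherwise.
    `check_cmd` may contain "{src}", which this sketch (hypothetically)
    replaces with the path of the staged candidate file.
    """
    try:
        with open(dest) as f:
            current = f.read()
    except FileNotFoundError:
        current = None
    if current == rendered:
        return False  # nothing changed: no write, no reload

    # Stage the candidate next to the destination so the final rename is atomic.
    fd, staged = tempfile.mkstemp(dir=os.path.dirname(dest) or ".")
    with os.fdopen(fd, "w") as f:
        f.write(rendered)

    if check_cmd:
        result = subprocess.run(check_cmd.format(src=staged), shell=True)
        if result.returncode != 0:
            os.unlink(staged)  # invalid candidate: keep the old file, skip reload
            return False

    os.replace(staged, dest)
    if reload_cmd:
        subprocess.run(reload_cmd, shell=True, check=True)
    return True
```

Calling apply_if_changed twice with the same rendered text leaves the file alone the second time, which is why an unchanged key set never triggers a service reload.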
What should I configure via confd, and what should I use puppet for instead?
Roughly speaking, puppet holds configuration, and etcd holds the state of resources. So, taking varnish as an example, the whole VCL logic should be configured via puppet, but the list of backends should probably be generated via confd. Which brings us to the next topic.
Making puppet and confd work together
The main strategy, whenever the same config file holds configuration and state, is to make puppet generate just the confd template files, and let confd create the actual config files and take care of the service.
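The two-stage idea can be illustrated with a small Python sketch. It is only an analogy: real puppet templates and confd templates use their own syntaxes, and the names puppet_stage, confd_stage, connect_timeout and the host names below are all hypothetical. The first stage fills in static configuration and leaves the state placeholder untouched; the second stage fills in the state.

```python
from string import Template

# Source file as kept in puppet for this sketch: one static setting and one
# placeholder that is deliberately left for the second stage.
SOURCE = """\
# managed by puppet (logic) and confd (state)
connect_timeout = $connect_timeout
$backends
"""


def puppet_stage(source, config):
    """Fill in static configuration; unknown placeholders are kept as-is."""
    return Template(source).safe_substitute(config)


def confd_stage(template_text, pooled_hosts):
    """Fill in the state (here, a list of pooled backends)."""
    lines = "\n".join(f"backend {host}" for host in sorted(pooled_hosts))
    return Template(template_text).substitute(backends=lines)


if __name__ == "__main__":
    confd_template = puppet_stage(SOURCE, {"connect_timeout": "5s"})
    print(confd_stage(confd_template, ["cp1002", "cp1001"]))
```

Only the second stage needs to run again when a backend is pooled or depooled, so state changes do not have to wait for a puppet run.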
Monitoring
Compilation is broken
Investigate /var/log/confd.log (syslog) on the alerting host for errors.
Possible causes:
- A service with no backends weighted/pooled
Stale template error files present
This means compilation is not currently broken, but was broken at some point in the past. This commonly occurs when adding a new LVS service, for instance.
If you are aware of the condition, then you can simply remove the error files. For instance, if you get this alert for the cloudceph service on puppetmaster2001's expansion of pybal state, you can run:
ssh puppetmaster2001.codfw.wmnet sudo rm /var/run/confd-template/.cloudceph*.err
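Before removing anything, it can help to see exactly which error files match. The following is a minimal Python sketch, assuming the error files follow the hidden-file naming shown in the command above (a leading dot, the service name, and a .err suffix); the function name stale_error_files is made up for this example.

```python
import glob
import os


def stale_error_files(directory, service=""):
    """Return the sorted paths of hidden *.err files for `service` in `directory`.

    With an empty `service`, every hidden .err file in the directory matches.
    """
    pattern = os.path.join(
        glob.escape(directory), "." + glob.escape(service) + "*.err"
    )
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    # Directory taken from the command above; run this on the alerting host.
    for path in stale_error_files("/var/run/confd-template", "cloudceph"):
        print(path)
```

The sketch only lists files; removal is still done with the rm command shown above once you have confirmed the list.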