PKI/CA Operations
The root CA is an offline signing CA. The root key is stored only on the root CA server and in (TODO: some offline location). To create a new intermediate certificate, you need to generate it manually on the root CA server and then copy the certificate content into puppet.
Intermediate Certificates
Creating an intermediate certificate is mostly a manual process; however, we use puppet to ensure the certificate is created with the correct parameters.
Adding a new intermediate
To add a new certificate we first need to add an entry to the profile::pki::root_ca::intermediates: array in hiera for the root CA server.
profile::pki::root_ca::intermediates:
- debmonitor
- discovery
- kafka
+ - test
Once you have done this you will need to run puppet on the root CA server (like pki-root1001.eqiad.wmnet) to create the public and private key pair. They should be created in /etc/cfssl/ssl/$intermediate_name and you should also see this in the puppet output.
$ sudo puppet agent -t
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Retrieving locales
Info: Loading facts
Info: Caching catalog for pki-root.pki.eqiad1.wikimedia.cloud
Info: Applying configuration version '(e739fdec88) Jbond - P:pki::root_ca: use correct title for intermediate certs'
Notice: The LDAP client stack for this host is: sssd/sudo
Notice: /Stage[main]/Profile::Ldap::Client::Labs/Notify[LDAP client stack]/message: defined 'message' as 'The LDAP client stack for this host is: sssd/sudo'
Notice: /Stage[main]/Profile::Pki::Root_ca/Cfssl::Cert[WMF_test_intermediate_ca]/File[/etc/cfssl/ssl/WMF_test_intermediate_ca]/ensure: created (corrective)
Notice: /Stage[main]/Profile::Pki::Root_ca/Cfssl::Cert[WMF_test_intermediate_ca]/Exec[Generate cert WMF_test_intermediate_ca]/returns: executed successfully (corrective)
Notice: /Stage[main]/Profile::Pki::Root_ca/Cfssl::Cert[WMF_test_intermediate_ca]/File[/etc/cfssl/ssl/WMF_test_intermediate_ca/WMF_test_intermediate_ca.pem]/mode: mode changed '0644' to '0440' (corrective)
Notice: /Stage[main]/Profile::Pki::Root_ca/Cfssl::Cert[WMF_test_intermediate_ca]/File[/etc/cfssl/ssl/WMF_test_intermediate_ca/WMF_test_intermediate_ca-key.pem]/mode: mode changed '0600' to '0440' (corrective)
Notice: /Stage[main]/Profile::Pki::Root_ca/Cfssl::Cert[WMF_test_intermediate_ca]/File[/etc/cfssl/ssl/WMF_test_intermediate_ca/WMF_test_intermediate_ca.csr]/mode: mode changed '0644' to '0440' (corrective)
Notice: Applied catalog in 4.94 seconds
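After the puppet run, a quick check can confirm that the expected files are in place. The sketch below assumes the three file names seen in the puppet output above (<name>.pem, <name>-key.pem and <name>.csr inside a directory named after the certificate). The function name check_intermediate_files is hypothetical and not part of any existing tooling.

```shell
# Hypothetical helper: verify that puppet created the certificate, key and CSR
# for an intermediate. The file naming is taken from the puppet output above.
# The body runs in a subshell, so variables stay local and "exit" only
# leaves the function.
check_intermediate_files() (
    base_dir=$1   # e.g. /etc/cfssl/ssl
    name=$2       # e.g. WMF_test_intermediate_ca
    status=0
    for suffix in .pem -key.pem .csr; do
        file="${base_dir}/${name}/${name}${suffix}"
        if [ -f "$file" ]; then
            echo "ok: $file"
        else
            echo "missing: $file" >&2
            status=1
        fi
    done
    exit "$status"
)
```

For example, check_intermediate_files /etc/cfssl/ssl WMF_test_intermediate_ca returns non-zero if any of the three files is missing.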
Once you have generated the new certificate you will need to:
1) Copy the certificate file to the public puppet repo (modules/profile/files/pki/intermediates/$intermediate_name.pem).
2) Copy the private key to the secret store (currently puppet private repo modules/secret/secrets/pki/intermediates/$intermediate_name-key.pem). For example:
(export INTERMEDIATE=aux ; sudo SSH_AUTH_SOCK=/run/keyholder/proxy.sock \
scp -3 pki-root1001.eqiad.wmnet:/etc/cfssl/ssl/${INTERMEDIATE}/${INTERMEDIATE}-key.pem \
puppetserver1001.eqiad.wmnet:/srv/git/private/modules/secret/secrets/pki/intermediates)
3) Add a dummy private key to the labs/private repo (echo nosecret > modules/secret/secrets/pki/intermediates/$intermediate_name-key.pem).
4) Add an entry under the profile::pki::multirootca::intermediates: key. Note the location of the certificate and key must go in the specified paths for puppet to find them e.g.
profile::pki::multirootca::intermediates:
  test:
    ocsp_port: 10001
    # The following parameters are optional and override the defaults
    nets:
      - 192.0.2.0/24
    auth_keys:
      default_auth:
        key: $custom_key
        type: standard
    profile:
      database:
        expiry: 8760h
        usages:
          - 'digital signature'
          - 'key encipherment'
          - 'server auth'
          - 'client auth'
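Steps 1 and 3 only touch local git checkouts, so they can be scripted together. The following is a minimal sketch, assuming the certificate has already been fetched from the root CA server and that both repositories are checked out locally. The function name stage_intermediate and the checkout locations are hypothetical; the paths inside the repositories are the ones given in the steps above. The real private key (step 2) is deliberately not handled here.

```shell
# Hypothetical helper covering steps 1 and 3 in local git checkouts.
# It never handles the real private key (step 2).
# The body runs in a subshell, so "exit" only leaves the function.
stage_intermediate() (
    cert_file=$1      # certificate already fetched from the root CA server
    name=$2           # intermediate name, e.g. test
    puppet_repo=$3    # local checkout of the public puppet repo (assumed path)
    labs_private=$4   # local checkout of labs/private (assumed path)

    cert_dest="${puppet_repo}/modules/profile/files/pki/intermediates"
    key_dest="${labs_private}/modules/secret/secrets/pki/intermediates"

    [ -f "$cert_file" ] || { echo "no such certificate: $cert_file" >&2; exit 1; }
    mkdir -p "$cert_dest" "$key_dest" || exit 1

    # Step 1: public certificate into the public puppet repo.
    cp "$cert_file" "${cert_dest}/${name}.pem" || exit 1
    # Step 3: placeholder key so the puppet compiler can resolve the secret.
    echo nosecret > "${key_dest}/${name}-key.pem"
)
```

The resulting files still need to be committed and reviewed in each repository as usual.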
Renewing an existing intermediate
The root PKI server automatically renews the intermediate certificates it manages and re-signs the public key on disk on the root CA server. The root CA server is defined in site.pp in the puppet repo; look for the role called pki::root.
The last round of renewals concerned the debmonitor and discovery intermediates and was tracked in T420993. We applied puppet patches to automatically force the renewal of all leaf certificates belonging to an intermediate cert that changes. On bare metal nodes this part should be automatic and handled transparently by puppet, which left open the question of service restarts and Kubernetes services.
The main difference between the two intermediates was their scope: debmonitor supported TLS for an auxiliary service, whereas discovery supported TLS for front-end services that, if broken, could have impacted external users.
We decided to handle the two renewals in different ways. To distribute the new debmonitor certificate we followed these steps:
- Fetch the certificate from /etc/cfssl/ssl/${ISSUER}.pem on the root CA server.
- Create a puppet patch that replaces the old cert. You can find the old certificates at profile/files/pki/intermediates/${ISSUER}-cert.pem. Remember that you also need to change the private key in the puppet private repo.
- Merge/deploy the puppet patches.
- Let puppet roll out the change and check if anything needs to be restarted.
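To judge how urgent a renewal is, or to confirm that the file fetched in the first step is the renewed certificate rather than the old one, the remaining validity can be checked with openssl. This is a minimal sketch, not part of the procedure used at the time: the function name cert_expires_within_days is hypothetical, and it relies only on the standard openssl x509 -checkend option.

```shell
# Hypothetical helper: succeed (exit 0) if the certificate expires within the
# given number of days, exit 1 if it remains valid for longer, and exit 2 if
# the file cannot be parsed as a certificate.
# The body runs in a subshell, so "exit" only leaves the function.
cert_expires_within_days() (
    cert=$1
    days=$2
    # Refuse to guess if the file is not a parseable certificate.
    openssl x509 -noout -in "$cert" 2>/dev/null || exit 2
    # -checkend exits 0 when the certificate is still valid after N seconds.
    if openssl x509 -noout -checkend "$((days * 86400))" -in "$cert" >/dev/null; then
        exit 1
    fi
    exit 0
)
```

A call such as cert_expires_within_days "${ISSUER}.pem" 90 could be used on a local copy of the fetched certificate; the 90-day threshold is only an illustration.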
We took a more conservative road for discovery: we created a new intermediate called discovery2026 and we moved clients to the new intermediate in steps, to avoid a single "rollout day" with potentially dangerous side effects and a lot of people involved at the same time. In this case the clients were of multiple kinds:
- Bare metal Envoy-fronted services, where TLS was handled by Envoy: in this case, puppet took care of deploying the new intermediate cert, requesting the new leaf cert and reloading Envoy.
- Bare metal services not using Envoy. In this case, some restarts were needed since not all of them reloaded when the TLS keys were changed by puppet.
- Kubernetes services. It turned out that cert-manager (which uses cfssl-issuer as a plugin to talk to the PKI infra's API) doesn't force a renewal of the leaf certificates once the intermediate is changed; it waits until their expiry is close before renewing them. Janis created https://gitlab.wikimedia.org/repos/sre/certoid to support this use case: certoid is a simple Go tool that forces the re-issue of leaf certificates signed by an intermediate, matching a certain regex filter. It was applied to all K8s clusters after switching their cfssl-issuer's config to the new discovery2026 intermediate and it worked nicely.
There is no definitive procedure to use when an intermediate needs to be changed; keep in mind the two use cases above and choose wisely :)
Signing Profiles
A signing profile is a way of specifying particular options that are to be used when generating certificates from a particular intermediate CA. These commonly include certificate duration and a specific auth_key, but may also include a number of other parameters that are documented on Cloudflare's GitHub.
Adding a new signing profile
There are three steps required in order to create a new signing profile.
- Add a new entry for the signing profile to the profile::pki::multirootca::intermediates hash in puppet. This is in the file hieradata/role/common/pki/multirootca.yaml. It should specify at least the auth_key value, which is a reference to the actual key in the private repo.
- Add a new auth key for this signing profile to the profile::pki::multirootca::default_auth_keys: hash in the private repo. This is in the file hieradata/role/common/pki/multirootca.yaml. A suitable value can be generated with openssl rand -hex 8.
- Add a dummy value of the same to the labs/private repo to allow the puppet compiler to refer to it too.
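The key generation in the second step can be wrapped with a small sanity check before the value is pasted into hiera. This is a sketch only: generate_auth_key simply runs the openssl command from the text, and is_valid_auth_key is a hypothetical helper that assumes keys have the shape that command produces (16 hexadecimal characters).

```shell
# Generate a key the way the text suggests: 8 random bytes, hex encoded.
generate_auth_key() {
    openssl rand -hex 8
}

# Hypothetical sanity check: exactly 16 hexadecimal characters, which is the
# shape produced by "openssl rand -hex 8" (an assumption about existing keys).
is_valid_auth_key() {
    case $1 in
        *[!0-9a-fA-F]*) return 1 ;;   # contains a non-hex character
    esac
    [ "${#1}" -eq 16 ]
}
```

For example, key=$(generate_auth_key) && is_valid_auth_key "$key" && echo "$key" prints the key only if it passes the check.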
Revoking certificates
If you are required to revoke a certificate you can use the cfssl-certs revoke command, which expects the public certificate file as input:
$ cfssl-certs -vvvv revoke -R cessationOfOperation /etc/cfssl/signers/debmonitor_discovery_wmnet/ca/debmonitor_discovery_wmnet.pem
DEBUG:root:running: /usr/bin/cfssl revoke -db-config /etc/cfssl/db.conf -serial 436540461805550888668031556404920484612797891140 -aki 3bada271e634bd1bfc80bf35718391d0ef691336 -reason cessationOfOperation
DEBUG:root:b''
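Since a revocation is hard to undo, it can be worth validating the arguments before running the command. The sketch below checks the reason against the CRL reason names defined in RFC 5280 and only prints the command it would run. The function names are hypothetical, and whether cfssl-certs accepts every one of these names is an assumption; only cessationOfOperation is confirmed by the example above.

```shell
# Succeed if the argument is one of the RFC 5280 CRLReason names.
is_crl_reason() {
    case $1 in
        unspecified|keyCompromise|cACompromise|affiliationChanged|superseded) return 0 ;;
        cessationOfOperation|certificateHold|removeFromCRL) return 0 ;;
        privilegeWithdrawn|aACompromise) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical dry run: validate the inputs and print the revoke command
# instead of executing it.
revoke_dry_run() {
    is_crl_reason "$1" || { echo "unknown reason: $1" >&2; return 1; }
    [ -r "$2" ] || { echo "cannot read certificate: $2" >&2; return 1; }
    echo "cfssl-certs revoke -R $1 $2"
}
```

Once the printed command looks right, it can be run by hand as in the example above.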
OCSP
OCSP responses are generated on the multirootca server; however, the OCSP signing certificate is created on the root CA server in a similar way to the intermediates, i.e. it is generated using the cfssl::cert resource and the files are maintained on the root CA server. You will, however, need to copy the certificate to the puppet repo (modules/profile/files/pki/ROOT/$CN.pem) and the secret store (pki/ROOT/$CN.pem). This is only an issue when bootstrapping or if you need to extend the life of the OCSP certificate.
New server/HW refresh
When adding a new server it should be as easy as giving it the pki::multirootca role and imaging it. However, the IP addresses of the pki servers are hardcoded in Kubernetes network policies (firewall rules), so it needs some coordination with the owners of the various k8s clusters. See https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/898759 for an example change adding pki2002.
Alt DNS names
At the time of writing the pki::multirootca servers use the puppet agent certificates to provide authentication. As the pki service listens on pki.discovery.wmnet we make use of the puppet dns_alt_names configuration. This can cause problems when rebuilding the server, as this option is not currently supported by the reimage scripts. It is therefore recommended to follow these steps when (re)building:
- first move the host into the spare::system role; this allows you to use the reimage scripts to rebuild the host and prevents it getting stuck
- once up, move the host into the pki::multirootca role and run puppet. You should see a change like the following (although puppet will fail)
--- /etc/puppet/puppet.conf.d/10-main.conf 2021-03-25 11:19:13.680926176 +0000
+++ /tmp/puppet-file20210330-30308-1y7a7nd 2021-03-30 11:46:43.552449866 +0000
@@ -14,7 +14,7 @@
server = puppet
ca_server = puppetmaster1001.eqiad.wmnet
-
+dns_alt_names = pki.discovery.wmnet
daemonize = false
http_connect_timeout = 60
http_read_timeout = 960
- once this is in place, run the sre.puppet.renew-cert cookbook to regenerate the new cert
$ sudo cookbook sre.puppet.renew-cert --allow-alt-names pki1001.eqiad.wmnet
- finally run puppet on the pki servers
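Before running the renew-cert cookbook, it may help to confirm that the dns_alt_names setting really landed in the puppet configuration. The following is a minimal sketch: the function name has_dns_alt_name is hypothetical, the file path in the usage note is the one shown in the diff above, and it assumes the setting is a single line holding a comma-separated list.

```shell
# Hypothetical helper: succeed if the puppet config file lists the given name
# in its dns_alt_names setting (treated as a comma-separated list).
# The function's exit status is that of the final grep.
has_dns_alt_name() (
    conf=$1   # e.g. /etc/puppet/puppet.conf.d/10-main.conf
    name=$2   # e.g. pki.discovery.wmnet
    # Extract the value, split it on commas, trim whitespace around each
    # entry, then look for an exact match.
    sed -n 's/^[[:space:]]*dns_alt_names[[:space:]]*=[[:space:]]*//p' "$conf" |
        tr ',' '\n' |
        sed 's/^[[:space:]]*//; s/[[:space:]]*$//' |
        grep -Fqx -- "$name"
)
```

For example, has_dns_alt_name /etc/puppet/puppet.conf.d/10-main.conf pki.discovery.wmnet should succeed once the change shown above has been applied.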
Bootstrapping
When installing a new root CA, or in the unfortunate event of having to revoke and replace the current root CA, one must follow a number of bootstrapping steps to create the initial root CA key. To this end, setting profile::pki::root_ca::bootstrap: true will instruct the root CA host to initialize the keypair. At the next puppet run you'll find ca.pem and ca-key.pem in /etc/cfssl/signers/<CA NAME>/ca/ .
Once you have the key you will need to upload the public part (ca.pem) of the key to the puppet/operations repo (the recommended location is modules/profile/files/pki/ROOT/$CN.pem) and the private