Cergen

cergen is a Python command line tool and library to help with managing and generating asymmetric key pairs and x509 certificates for use with TLS/SSL encryption and authentication.

The openssl CLI is useful and very featureful, but fairly complicated. The Java keytool CLI serves a similar purpose, but outputs files in a different format than openssl. cergen allows you to declare desired certificates and keys in a YAML manifest, and then idempotently generate them into files of various formats for use with different technologies.

cergen was originally based on cassandra-ca-manager, and expanded to generate more file formats and to integrate with custom CA implementations, including Puppet CA.

cergen is installed on puppetmaster frontends that are also CAs. It is intended to be used there to generate certificates signed by our Puppet CA, which are then committed to the secret module of the ops Puppet private repository.

For more information, see the cergen README.

Usage on Puppet CA host

cergen production certificate manifest files are checked into Puppet private. On puppetmaster1001, these can be found at /srv/private/modules/secret/secrets/certificates/certificate.manifests.d/. *.certs.yaml files in this directory declare certificates and keys that should be generated and managed, as well as any CAs that will be used to sign those certificates.

Most likely, you'll be using the Puppet CA to sign your certificates. The puppet_ca signer is declared in puppet_ca.certs.yaml:

# This is our Puppet CA.  It will be used to sign other certificates.
puppet_ca:
  class_name: puppet
  hostname: puppetmaster1001.eqiad.wmnet

Here puppet_ca is the name of the authority. You can refer to puppet_ca as the authority for the certificates you want to generate and have signed by our Puppet CA. In kafka_jumbo.certs.yaml, we declare a certificate for use by Kafka brokers in the jumbo-eqiad cluster:

# Certificates for the kafka-jumbo cluster.
kafka_jumbo-eqiad_broker:
  authority: puppet_ca
  key:
    password: 'XXXXXXXX'
    algorithm: ec
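
The link between the two manifests above can be sketched in Python. This is only an illustration, not cergen's own code: it assumes the manifests have already been parsed and merged into one dictionary, and that an entry carrying a class_name key is an authority while an entry carrying an authority key is a certificate. The function name undeclared_authorities is hypothetical.

```python
# Hypothetical sketch: check that every certificate refers to a declared authority.
# Assumes the *.certs.yaml files were already parsed and merged into one dict.


def undeclared_authorities(manifest):
    """Return {certificate_name: authority_name} for authorities that are not declared."""
    # Assumption: authorities are the entries that carry a class_name key.
    authorities = {name for name, entry in manifest.items() if "class_name" in entry}
    problems = {}
    for name, entry in manifest.items():
        authority = entry.get("authority")
        if authority is not None and authority not in authorities:
            problems[name] = authority
    return problems


# The two declarations shown above, written as plain Python data.
manifest = {
    "puppet_ca": {"class_name": "puppet", "hostname": "puppetmaster1001.eqiad.wmnet"},
    "kafka_jumbo-eqiad_broker": {
        "authority": "puppet_ca",
        "key": {"password": "XXXXXXXX", "algorithm": "ec"},
    },
}
```

An empty result from undeclared_authorities(manifest) means every certificate points at an authority that exists in the merged manifests.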

With the CA and certificates declared, we can now use cergen to check the status of our certificates. We keep generated files in /srv/private/modules/secret/secrets/certificates, so we need to provide that as the --base-path= option.

$ cergen --base-path=/srv/private/modules/secret/secrets/certificates /srv/private/modules/secret/secrets/certificates/certificate.manifests.d

Status of certificates ['kafka_jumbo-eqiad_broker']

Certificate(kafka_jumbo-eqiad_broker, authorities=[PuppetCA(puppetmaster1001.eqiad.wmnet_8140)]):
	/srv/private/modules/secret/secrets/certificates/kafka_jumbo-eqiad_broker/kafka_jumbo-eqiad_broker.key.private.pem: PRESENT (mtime: 2017-12-05T14:47:45.105453)
	/srv/private/modules/secret/secrets/certificates/kafka_jumbo-eqiad_broker/kafka_jumbo-eqiad_broker.key.public.pem: PRESENT (mtime: 2017-12-05T14:47:45.105453)
	/srv/private/modules/secret/secrets/certificates/kafka_jumbo-eqiad_broker/kafka_jumbo-eqiad_broker.crt.pem: PRESENT (mtime: 2017-12-05T14:47:46.417451)
	/srv/private/modules/secret/secrets/certificates/kafka_jumbo-eqiad_broker/kafka_jumbo-eqiad_broker.keystore.p12: PRESENT (mtime: 2017-12-05T14:47:46.433451)
	/srv/private/modules/secret/secrets/certificates/kafka_jumbo-eqiad_broker/kafka_jumbo-eqiad_broker.keystore.jks: PRESENT (mtime: 2017-12-05T14:47:47.261450)
	/srv/private/modules/secret/secrets/certificates/kafka_jumbo-eqiad_broker/truststore.jks: PRESENT (mtime: 2017-12-05T14:47:47.649449)
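
If you want to post-process this status listing, a small parser along the following lines may help. It is a sketch that assumes only the line shape visible in the sample output above (a path, a colon, an upper-case status word); the helper name files_by_status is hypothetical and cergen itself offers no such function as far as this page shows.

```python
import re

# Matches lines shaped like the sample above:
#   <absolute path>: PRESENT (mtime: 2017-12-05T14:47:45.105453)
# Header lines such as "Certificate(...):" do not start with "/" and are skipped.
STATUS_LINE = re.compile(r"^\s*(?P<path>/\S+): (?P<status>[A-Z_]+)\b")


def files_by_status(output):
    """Group file paths from cergen's status output by their status word."""
    result = {}
    for line in output.splitlines():
        match = STATUS_LINE.match(line)
        if match:
            result.setdefault(match.group("status"), []).append(match.group("path"))
    return result
```

Any status key other than PRESENT in the returned dictionary points at files that a run with --generate would try to create.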

Without the --generate flag, cergen just prints a list of the certificates it is managing and the status of the expected files. With --generate, cergen attempts to generate any files for certificates declared in its manifests that are not PRESENT. For these certificates, the key.private.pem and the crt.pem files are considered the canonical sources of data. The other files are subordinate to those. E.g., if the keystore.jks file is missing, it will be re-generated from the key.private.pem and crt.pem files. However, if either crt.pem or key.private.pem is missing, you will need to generate a brand new key and certificate: remove the certificate's remaining files, then run cergen again with --generate.
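
The canonical/subordinate rule can be sketched as a small decision function. This is a hedged illustration of the rule as described on this page, not cergen's implementation; the file suffixes come from the status output above, while the function name plan and the action labels are hypothetical.

```python
# Canonical files: everything else is derived from these two.
CANONICAL = ("key.private.pem", "crt.pem")
# Subordinate files, as listed in the status output above.
DERIVED = ("key.public.pem", "keystore.p12", "keystore.jks", "truststore.jks")


def plan(present):
    """Given the suffixes of the files that exist, return (action, missing_suffixes)."""
    present = set(present)
    missing_canonical = [s for s in CANONICAL if s not in present]
    if missing_canonical:
        # A canonical file is gone: a brand new key and certificate are needed.
        return ("new_key_and_certificate", missing_canonical)
    missing_derived = [s for s in DERIVED if s not in present]
    if missing_derived:
        # Only subordinate files are gone: they can be rebuilt from the canonical pair.
        return ("regenerate_derived", missing_derived)
    return ("nothing_to_do", [])
```

For example, a directory that lacks only keystore.jks falls into the regenerate_derived case, while one that lacks crt.pem requires a new key and certificate.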

NOTE: If you are regenerating a Puppet signed certificate, you must first remove the certificate from the Puppet CA. puppet cert clean <common_name> should do it.

Certificates that declare authority: puppet_ca will be auto-signed by our Puppet CA. If you need to remove a Puppet signed cert, remember to also destroy that cert using the puppet cert CLI, e.g. puppet cert destroy kafka_jumbo-eqiad_broker.

To avoid messing with certificates you are not concerned with, you can instruct cergen to only select specific certificates by using the -c flag. E.g.

cergen --generate -c 'kafka.*' ... might select all certificates whose names start with 'kafka'. See cergen --help for more options.
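
To preview which names a pattern would pick before running cergen, a sketch like the following can be used. It assumes the -c value is a regular expression matched from the start of each certificate name, which is consistent with the example above but is not confirmed here; the helper name select is hypothetical.

```python
import re


def select(names, pattern):
    """Return the names that match the pattern from their first character."""
    # Assumption: matching is anchored at the start of the name, like re.match.
    regex = re.compile(pattern)
    return [name for name in names if regex.match(name)]


names = ["kafka_jumbo-eqiad_broker", "SERVICENAME.discovery.wmnet", "my_kafka_client"]
selected = select(names, "kafka.*")
```

Under that assumption, my_kafka_client is not selected because 'kafka' does not appear at the start of its name.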

Cheatsheet

On the puppetmaster (e.g. puppetmaster1001.eqiad.wmnet), create /srv/private/modules/secret/secrets/certificates/certificate.manifests.d/SERVICENAME.certs.yaml as follows:

SERVICENAME.discovery.wmnet:
  authority: puppet_ca
  expiry: null
  alt_names: ["SERVICENAME.svc.eqiad.wmnet","*.whatever.org", ...]
  key:
    password: YOURSECRETPASSWORD
    algorithm: ec

Note: the service name / common name (i.e. SERVICENAME.discovery.wmnet) will be implicitly included in alt_names, no need to duplicate it there as well.
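
The implicit inclusion of the common name can be pictured with a few lines of Python. This is only a sketch of the behaviour described in the note; the ordering (common name first) and the function name effective_alt_names are assumptions, not something taken from cergen.

```python
def effective_alt_names(common_name, alt_names=()):
    """Return the names a certificate would cover, without duplicates."""
    names = [common_name]  # the common name is always included
    for name in alt_names:
        if name not in names:
            names.append(name)
    return names


covered = effective_alt_names(
    "SERVICENAME.discovery.wmnet",
    ["SERVICENAME.svc.eqiad.wmnet", "SERVICENAME.discovery.wmnet"],
)
```

Listing the common name again in alt_names is harmless in this sketch, but it adds nothing.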

If you set key.password, cergen will output encrypted private key files. If you need an unencrypted private key file, you can omit key.password and skip the openssl command below.

cergen -c 'SERVICENAME.*' --generate --base-path=/srv/private/modules/secret/secrets/certificates /srv/private/modules/secret/secrets/certificates/certificate.manifests.d

# Run the following from /srv/private (the relative paths assume it).

# For certificates with key.password, decrypt the private key:
openssl ec -in modules/secret/secrets/certificates/SERVICENAME.discovery.wmnet/SERVICENAME.discovery.wmnet.key.private.pem -out /srv/private/modules/secret/secrets/ssl/SERVICENAME.discovery.wmnet.key
# openssl will ask for YOURSECRETPASSWORD at this point

# For certificates without key.password, just copy the private key over:
cp modules/secret/secrets/certificates/SERVICENAME.discovery.wmnet/SERVICENAME.discovery.wmnet.key.private.pem /srv/private/modules/secret/secrets/ssl/SERVICENAME.discovery.wmnet.key

# In both cases, stage the resulting key:
git add modules/secret/secrets/ssl/SERVICENAME.discovery.wmnet.key

If everything went well, the certificate (the public part) is now under /srv/private/modules/secret/secrets/certificates/SERVICENAME.discovery.wmnet/SERVICENAME.discovery.wmnet.crt.pem. If your service is using sslcert::certificate with "use_cergen" then there's nothing else to do other than commit your changes to /srv/private.

Otherwise, copy the certificate to the operations/puppet repository under files/ssl/SERVICENAME.discovery.wmnet.crt. Notice the extension difference! The extension is .crt.pem on the puppetmaster, but it needs to be just .crt in operations/puppet.

The private key is /srv/private/modules/secret/secrets/ssl/SERVICENAME.discovery.wmnet.key. There's nothing else to do other than commit your changes to /srv/private.
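
Because the file names differ between locations, it can help to compute them rather than type them. The sketch below only builds the paths mentioned in this cheatsheet and touches no files; the function name cheatsheet_paths and the dictionary keys are hypothetical.

```python
from pathlib import PurePosixPath

# Directories named in this cheatsheet.
CERTIFICATES = PurePosixPath("/srv/private/modules/secret/secrets/certificates")
SSL = PurePosixPath("/srv/private/modules/secret/secrets/ssl")


def cheatsheet_paths(name):
    """Return the source and destination paths for one certificate name."""
    cert_dir = CERTIFICATES / name
    return {
        # Files written by cergen on the puppetmaster.
        "cergen_certificate": cert_dir / f"{name}.crt.pem",
        "cergen_private_key": cert_dir / f"{name}.key.private.pem",
        # Where the (decrypted or copied) private key ends up.
        "private_repo_key": SSL / f"{name}.key",
        # Relative to the operations/puppet checkout; note .crt, not .crt.pem.
        "operations_puppet_certificate": PurePosixPath("files/ssl") / f"{name}.crt",
    }


paths = cheatsheet_paths("SERVICENAME.discovery.wmnet")
```

Printing the values of paths gives a quick checklist of the files to look for after running cergen.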

In order for pcc to work properly, a dummy secret key (NOT the real one) needs to be added to the labs/private repository under modules/secret/secrets/ssl/SERVICENAME.discovery.wmnet.key. Please note that the labs/private repository is NOT private as the name might suggest, it is public. The key you add should look like this:

-----BEGIN RSA PRIVATE KEY-----
dummy
-----END RSA PRIVATE KEY-----

See https://gerrit.wikimedia.org/r/#/c/labs/private/+/523929/ as an example.
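
Since labs/private is public, it is worth double-checking that the file you are about to commit there really is the placeholder and not a real key. A minimal sketch of such a check follows; the function name is_dummy_key is hypothetical and the check accepts only the exact three-line placeholder shown above.

```python
# The exact placeholder lines shown above.
DUMMY_LINES = [
    "-----BEGIN RSA PRIVATE KEY-----",
    "dummy",
    "-----END RSA PRIVATE KEY-----",
]


def is_dummy_key(text):
    """Return True only if the text is exactly the three-line dummy key."""
    lines = [line.strip() for line in text.strip().splitlines()]
    return lines == DUMMY_LINES


placeholder = "\n".join(DUMMY_LINES) + "\n"
```

Anything that contains real key material, including an EC key, fails this check because its lines differ from the placeholder.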

Update a certificate

  • Update the YAML file /srv/private/modules/secret/secrets/certificates/certificate.manifests.d/SERVICENAME.certs.yaml, but only if the cert's manifest needs to change, for example to add a SAN (a plain cert renewal should not need any change to the manifest).
  • Remove the old certificate from the Puppet CA. This step revokes the cert in the Puppet CA, but as of April 2024 we don't have any revocation workflows for clients in production, so the operation shouldn't cause any harm to running clients.
  puppet cert clean <common_name>
  • Remove the following files from /srv/private/modules/secret/secrets/certificates/SERVICENAME.discovery.wmnet/:

SERVICENAME.discovery.wmnet.crt.pem
SERVICENAME.discovery.wmnet.csr.pem
SERVICENAME.discovery.wmnet.keystore.jks
SERVICENAME.discovery.wmnet.keystore.p12
truststore.jks

  • Update the certificate with:

cergen -c 'SERVICENAME.*' --generate --base-path=/srv/private/modules/secret/secrets/certificates /srv/private/modules/secret/secrets/certificates/certificate.manifests.d

  • The new cert should be under /srv/private/modules/secret/secrets/certificates/SERVICENAME.discovery.wmnet/SERVICENAME.discovery.wmnet.crt.pem.
  • Add the updated certificate to puppet under modules/profile/files/ssl/SERVICENAME.discovery.wmnet.crt.
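
The removal step above can be prepared with a short dry-run helper that only lists the paths and deletes nothing. It is a sketch based on the file list in this section; the function name files_to_remove is hypothetical. The key files are deliberately left out of the list, matching the steps above.

```python
from pathlib import PurePosixPath

BASE = PurePosixPath("/srv/private/modules/secret/secrets/certificates")
# Suffixes of the per-certificate files listed in the removal step above.
SUFFIXES = ("crt.pem", "csr.pem", "keystore.jks", "keystore.p12")


def files_to_remove(name, base=BASE):
    """Return the paths to delete before regenerating the certificate for name."""
    cert_dir = base / name
    paths = [cert_dir / f"{name}.{suffix}" for suffix in SUFFIXES]
    paths.append(cert_dir / "truststore.jks")  # not prefixed with the name
    return paths


to_remove = files_to_remove("SERVICENAME.discovery.wmnet")
```

Review the resulting list before deleting anything by hand, and remember to run puppet cert clean first as described above.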

Distribution via Puppet

Once you've generated your certificate and key files using cergen, you should commit them, as well as any changes you've made to certificate.manifests.d/. Then you can use the puppet secret template function to render the contents of the files into locations for use by your production service.

file { "/path/to/my/my_certificate_keystore.jks":
    content => secret('certificates/my_certificate/my_certificate.keystore.jks'),
}

See also the profile::kafka::broker class for another example.

Future work

It would be very convenient if the manual steps of declaring a certificate, generating it, and distributing the desired files were all abstracted away into a single handy puppet define. This might be possible, but is complicated. It might look something like this:

cergen::certificate { 'kafka_jumbo-eqiad_broker':
    destination => '/etc/kafka/ssl',
    properties  => {
      'authority' => 'puppet_ca',
      'key' => {
          'algorithm' => 'rsa',
          'password'  => hiera('passwords::blabla'),
      }
    }
}

This would allow us to declare a certificate in the class/node/location where it should be deployed. This is easier said than done, as Puppet signed certificates must be created and signed on the same host as the Puppet CA (you can't just use the Puppet CA HTTP API). Creating a define like the one above would require the following steps:

  • Declare the certificate on the host.
  • Using exported resources, realize the rendered cergen manifest.yaml on the Puppet CA host (e.g. puppetmaster1001).
  • Use the generate() Puppet function to run the cergen command on the puppetmaster, and commit the resulting certificate files to public Puppet and the private key files to the Puppet private repo.
  • Pull the certificate files using source, and the key files from Puppet private using the secret() function.

I wrote a PoC to do this, but it is totally untested and probably would not work as is.

The original Phabricator ticket tracking this work was T166167.

Migrating a service based on Envoy to CFSSL

Gather the list of alt_names for the new certificate

  1. Ensure the service already runs envoy
  2. Gather a list of the alt_names listed on the certificate
  3. Decide which alt_names need to be added to the new certificate. To check whether a name is still in use:
    • Domains in the .wikimedia.org realm should have a public IP address, so they can be checked from a web browser.
    • For domains in .wikimedia.org or on our intranet (e.g. .eqiad.wmnet), run the host command followed by the hostname from a server with internal access. It must return either an IP address or an error message like Host {HOSTNAME} not found: 3(NXDOMAIN).
    • If the host command returns an IP address, ping that address from a server with intranet access to see whether it points to a running server. If the IP does not point to any existing server, remove the alt_name from the list of alt_names for the new certificate.
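
The DNS part of this check can be sketched in Python for a longer list of names. This is an illustration under assumptions, not an existing tool: it only tests whether a name resolves (it does not ping the address), it keeps wildcard names for manual review because they cannot be resolved directly, and the function name split_by_resolution is hypothetical. It needs to run from a server with internal access for intranet names to resolve.

```python
import socket


def split_by_resolution(alt_names, resolve=socket.gethostbyname):
    """Split alt_names into (resolving_or_wildcard, not_resolving)."""
    keep, unresolved = [], []
    for name in alt_names:
        if name.startswith("*."):
            # Assumption: wildcards are kept and reviewed by hand.
            keep.append(name)
            continue
        try:
            resolve(name)
        except OSError:
            # socket.gaierror (e.g. NXDOMAIN) is a subclass of OSError.
            unresolved.append(name)
        else:
            keep.append(name)
    return keep, unresolved
```

The resolve parameter exists so the function can be exercised without network access; by default it uses the system resolver.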

Generating the new certificates

  1. Downtime the hosts that use the certificates to migrate. Ex. From the cumin host: sudo cookbook sre.hosts.downtime -r "{PHABRICATOR_TASK}" -M 30 {HOSTNAMES}.
  2. Disable Puppet on the hosts that use the certificate to migrate. Ex. From the cumin host: sudo cumin '{HOSTNAMES}' 'disable-puppet "{PHABRICATOR_TASK}"'.
  3. Send a Puppet patch with the following information to the role with certificates to migrate:
    profile::tlsproxy::envoy::ssl_provider: 'cfssl'
    profile::tlsproxy::envoy::global_cert_name: {CERTIFICATE_NAME}
    profile::tlsproxy::envoy::cfssl_options:
        hosts:
            - {ALT_NAMES}
    
    • See an example of a patch here.
  4. Merge the Puppet patch and run Puppet on the passive host. Ex. From the cumin host: sudo cumin '{PASSIVE_HOST}' 'run-puppet-agent'
  5. Ensure the new certificates for the passive host are in /etc/cfssl/csr.
  6. Ensure the new certificates for the passive host contain the correct alt_names with sudo openssl x509 -noout -ext subjectAltName -in /etc/cfssl/{CERTIFICATE}
  7. Ensure the new certificates are valid with HTTPbb
  8. Run Puppet on the active host. Ex. From the cumin host: sudo cumin '{ACTIVE_HOST}' 'run-puppet-agent'
  9. Ensure the new certificates for the active host are in /etc/cfssl/csr.
  10. Ensure the new certificates for the active host contain the correct alt_names with sudo openssl x509 -noout -ext subjectAltName -in /etc/cfssl/{CERTIFICATE}
  11. Ensure the new certificates are valid with HTTPbb
  12. Delete the old certificates in /etc/ssl/localcerts/ from both hosts.
  13. Restart the envoyproxy service on both hosts. Ex. From the cumin host: sudo cumin '{HOSTS}' 'systemctl restart envoyproxy'
  14. Ensure the site is using the new certificates. For domains in the .wikimedia.org realm, they must be accessible from an incognito window in a web browser.
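
The alt_names checks in steps 6 and 10 can be made less error-prone by comparing the openssl output against the expected list. The sketch below parses text in the usual shape of openssl x509 -noout -ext subjectAltName output (a header line followed by comma-separated DNS: entries); the function names parse_san_output and san_diff are hypothetical, and only DNS entries are considered.

```python
def parse_san_output(text):
    """Extract DNS names from 'openssl x509 -noout -ext subjectAltName' output."""
    names = []
    for line in text.splitlines():
        for item in line.split(","):
            item = item.strip()
            if item.startswith("DNS:"):
                names.append(item[len("DNS:"):])
    return names


def san_diff(expected, actual):
    """Return the names missing from, and unexpected in, the certificate."""
    return {
        "missing": sorted(set(expected) - set(actual)),
        "unexpected": sorted(set(actual) - set(expected)),
    }
```

Both lists in the result of san_diff being empty means the certificate carries exactly the alt_names that were planned.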