CAS-SSO/Administration


Stub page with information on how to debug/handle Apereo CAS as deployed by the Wikimedia Foundation.

Icinga

$vhost requires authentication

This Icinga check ensures that protected sites correctly redirect an unauthenticated connection back to https://idp.wikimedia.org. You can run the check manually from the Icinga host with the following command (use the -v switch to increase verbosity):

icinga1001 ~ $ # test cas-icinga.wikimedia.org
icinga1001 ~ $ IP=208.80.154.84
icinga1001 ~ $ VHOST=cas-icinga.wikimedia.org
icinga1001 ~ $ URI=/icinga
icinga1001 ~ $ /usr/lib/nagios/plugins/check_http -I ${IP} -H ${VHOST} -e 'HTTP/1.1 302' -d 'ocation: https://idp.wikimedia.org/login' -S -u ${URI}
HTTP OK: Status line output matched "HTTP/1.1 302" - 604 bytes in 0.005 second response time |time=0.004526s;;;0.000000;10.000000 size=604B;;;0
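
When check_http is not available, the same condition can be approximated with curl. The following is a minimal sketch, not the production check: the function name check_cas_redirect is hypothetical, and it assumes the response headers are piped in on stdin. It succeeds only if the status is 302 and the Location header starts with the IDP login URL.

```shell
# Hypothetical helper, not part of the Icinga configuration.
# Reads HTTP response headers on stdin. Returns 0 if the response is a
# 302 redirect to the IDP login page, 2 otherwise.
check_cas_redirect() {
    idp_login=${1:-https://idp.wikimedia.org/login}
    headers=$(tr -d '\r')    # strip the CR of the CRLF line endings
    status=$(printf '%s\n' "$headers" | sed -n '1p')
    location=$(printf '%s\n' "$headers" | sed -n 's/^[Ll]ocation: *//p' | sed -n '1p')
    case $status in
        HTTP/*" 302"*) ;;
        *) echo "CRITICAL: unexpected status line: $status"; return 2 ;;
    esac
    case $location in
        "$idp_login"*) echo "OK: redirects to $location" ;;
        *) echo "CRITICAL: unexpected Location: ${location:-<none>}"; return 2 ;;
    esac
}

# Example, reusing IP, VHOST and URI from the manual check above:
#   curl -s -o /dev/null -D - --resolve "${VHOST}:443:${IP}" "https://${VHOST}${URI}" | check_cas_redirect
```

Unlike the check_http invocation, this sketch accepts any HTTP version in the status line, since curl may negotiate HTTP/2.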

Common things to check:

  • The -u switch points to the correct protected URI
  • The check is hitting the correct vhost
  • mod_auth_cas is correctly installed and configured

To debug mod_auth_cas, update the vhost to set LogLevel debug and CASDebug On.

API endpoints

These commands need to be run locally from the IDPs.

The ssoSessions endpoint exports a JSON description of current sessions; access is restricted to the IDP hosts. It doesn't currently work, since we switched to MemcachedTicketRegistry, which doesn't implement the getTickets() function used by the actuator:

 curl https://idp.wikimedia.org/api/ssoSessions?type=ALL

To log out a user, first find the TGT by running "sudo memcdump localhost:11000" and checking the CAS audit.log. Then send a DELETE request to /api/ssoSessions/$TGT to terminate the session of that user, e.g.

 curl -X DELETE https://idp-test.wikimedia.org/api/ssoSessions/TGT-1-9DJVA0Mr8NvkBiWSHW3kuKmCH5ifJ1ya8FnA15N6TT6IKjQG3hNHrziQQS-mJAEfWeo-idp-test1001
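
The lookup and the DELETE request can be wrapped in two small helpers. This is a sketch only: the function names extract_tgts and logout_tgt and the IDP_HOST variable are hypothetical, and it assumes the TGT identifier appears in full in whatever text is piped in (key names in Memcached or entries in audit.log may be encoded or abbreviated differently).

```shell
# Hypothetical helper: print the unique TGT identifiers found on stdin,
# one per line.
extract_tgts() {
    grep -oE 'TGT-[A-Za-z0-9_-]+' | sort -u
}

# Hypothetical helper: terminate the SSO session for one TGT.
# IDP_HOST is an assumed variable, defaulting to the production IDP.
logout_tgt() {
    case $1 in
        TGT-*) curl -X DELETE "https://${IDP_HOST:-idp.wikimedia.org}/api/ssoSessions/$1" ;;
        *) echo "not a TGT: $1" >&2; return 1 ;;
    esac
}

# Example, run locally on an IDP host:
#   sudo memcdump localhost:11000 | extract_tgts
#   IDP_HOST=idp-test.wikimedia.org logout_tgt "$TGT"
```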

Session timeout handling

First some terminology:

  • A ticket-granting ticket (TGT) is a string generated by the CAS server containing user session details; it is issued after a successful authentication at the /login endpoint. When the TGT expires, no further SSO is possible. Conceptually, the lifetime of the TGT defines the lifetime of the SSO session. The TGT is kept in the ticket storage system on the IDP server.
  • A ticket-granting cookie (TGC) is a cookie in your browser which identifies itself against CAS. It allows the CAS server to match the user against a TGT.
  • A long term ticket granting ticket (LTGT) is used for automatic relogins when using the "remember me" feature, see below.

There are two ways for the user to authenticate when logging in:

  • If "Remember me" is enabled, a long term authentication session is requested. This creates a "Long-Term Ticket Granting Ticket". As long as this LTGT is valid and the user has a valid TGC, the session is automatically re-logged-in, which generates a new TGT. Security-wise this is acceptable if the computer has no other users, but it must not be used when e.g. logging in from a shared computer.
  • If "Remember me" is not enabled, the session lifetime is defined by the configuration parameters below and might need a relogin.

The configuration property cas.ticket.tgt.rememberMe.enabled enables/disables this feature.

To force a logout, either the logout/ endpoint can be accessed (which voids the TGT and the LTGT) or the TGC can be removed. When the TGC is gone, the TGT of the user cannot be accessed, so no SSO is possible. To render the TGC invalid when the browser session ends, set tgc.maxAge=-1 (TODO: what is the default)

TGTs follow an LRU policy for expiration (similar to an Apache session timing out). The configuration property cas.ticket.tgt.timeToKillInSeconds configures a time interval in seconds (all time intervals mentioned here are specified in seconds); if a TGT isn't used before this time frame has passed, the TGT expires.

The configuration property cas.ticket.tgt.maxTimeToLiveInSeconds defines an upper bound after which the TGT expires. A TGT is marked as expired once its creation time plus the defined interval is reached (TicketGrantingTicketExpirationPolicy.isExpired()).

There's also tgt.timeout.maxTimeToLiveInSeconds, which, if used, creates a TGT that remains valid as long as it is used at least once within the time interval, i.e. there is no upper bound. However, this setting seems to cause issues when used with a Memcached configuration.

Current Settings

cas.ticket.tgt.rememberMe.enabled: true
cas.ticket.tgt.rememberMe.timeToKillInSeconds: 604800
cas.ticket.tgt.timeToKillInSeconds: 3600
cas.ticket.tgt.maxTimeToLiveInSeconds: 604800
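
To reason about when a given session will end, the two expiry rules for a regular TGT can be written out with the values above. This is a sketch of the rules as described on this page, not the CAS implementation: the function name tgt_expired is hypothetical, the exact boundary handling (expired at exactly the limit or only after it) is an assumption, and "Remember me" sessions are not modelled.

```shell
# Values copied from the current settings, in seconds.
TIME_TO_KILL=3600          # cas.ticket.tgt.timeToKillInSeconds
MAX_TIME_TO_LIVE=604800    # cas.ticket.tgt.maxTimeToLiveInSeconds

# Hypothetical helper.
# Usage: tgt_expired CREATED LAST_USED NOW   (all in epoch seconds)
# Returns 0 if the TGT would be considered expired, 1 otherwise.
tgt_expired() {
    created=$1 last_used=$2 now=$3
    # Idle timeout: the TGT was not used within timeToKillInSeconds.
    [ $((now - last_used)) -ge "$TIME_TO_KILL" ] && return 0
    # Hard upper bound: creation time plus maxTimeToLiveInSeconds reached.
    [ $((now - created)) -ge "$MAX_TIME_TO_LIVE" ] && return 0
    return 1
}

# Example: a TGT created a week ago expires even if it was used a minute ago.
#   now=$(date +%s); tgt_expired $((now - 604800)) $((now - 60)) "$now" && echo expired
```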

Updating the CAS Debian package

  • Bump the version in debian/changelog with your change (e.g. enabling a new module or rebasing to a new upstream) and merge your change
  • Instructions for rebasing to a new CAS upstream release can be found at https://gerrit.wikimedia.org/r/plugins/gitiles/operations/software/cas-overlay-template/+/refs/heads/master/debian/README.Debian
  • On an idp-test host, enter /srv/cas-build/cas and run Puppet to refresh the repo
  • sudo dpkg-buildpackage --no-sign (the sudo could be dropped at some point: with "Rules-Requires-Root: no" set it almost builds, but the Gradle build temporarily starts a daemon which seems to need a privileged port. This needs some more poking, but simply using sudo is fine for now)

To fetch the package on apt1001:

 rsync "rsync://idp-test1001.wikimedia.org/cas-build-result/*cas*" .

Rebooting

We currently replicate all Memcached state changes to eqiad/codfw to be able to fail over. However, if we reboot an IDP node, the current session state is lost. As such we need to dump it pre-reboot and restore it post-reboot:

$ sudo /usr/local/sbin/memcached-dump dump -f /srv/cas/memcached.$(date +%s).dump
$ reboot
$ sudo /usr/local/sbin/memcached-dump restore -f /srv/cas/memcached.1608203240.dump
$ rm /srv/cas/memcached.1608203240.dump
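
Because the dump file name contains the epoch timestamp at dump time, it has to be looked up again after the reboot. A minimal sketch of that lookup follows; the function name latest_cas_dump is hypothetical, and it assumes dumps are named memcached.<epoch>.dump as in the commands above.

```shell
# Hypothetical helper: print the newest memcached.<epoch>.dump in a
# directory (default /srv/cas), judged by the epoch in the file name.
# Returns 1 if no matching dump exists.
latest_cas_dump() {
    dir=${1:-/srv/cas}
    best=""
    best_ts=-1
    for f in "$dir"/memcached.*.dump; do
        [ -e "$f" ] || continue
        ts=${f##*/memcached.}    # strip the directory and the prefix
        ts=${ts%.dump}           # strip the suffix, leaving the epoch
        case $ts in
            ''|*[!0-9]*) continue ;;    # skip names without a numeric epoch
        esac
        if [ "$ts" -gt "$best_ts" ]; then
            best=$f
            best_ts=$ts
        fi
    done
    if [ -n "$best" ]; then
        printf '%s\n' "$best"
    else
        return 1
    fi
}

# Example, after the reboot:
#   dump=$(latest_cas_dump) && sudo /usr/local/sbin/memcached-dump restore -f "$dump"
```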