Swift/ObjectExpiration

From Wikitech

An investigation into how object expiration works in Swift.

Swift supports expiring objects via custom request headers, X-Delete-At or X-Delete-After (see expiring objects in swift).
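
As a rough illustration of the two headers: X-Delete-At carries an absolute Unix timestamp, while X-Delete-After carries a number of seconds from now. The helper names below (expiry_headers, resolve_delete_at) are hypothetical and only build and interpret the header values; they do not talk to a Swift cluster.

```python
import time


def expiry_headers(delete_after=None, delete_at=None):
    """Build the expiry header for an object PUT/POST (hypothetical helper).

    delete_after: lifetime in seconds from now (X-Delete-After)
    delete_at:    absolute Unix timestamp (X-Delete-At)
    """
    if (delete_after is None) == (delete_at is None):
        raise ValueError("pass exactly one of delete_after or delete_at")
    if delete_after is not None:
        return {"X-Delete-After": str(int(delete_after))}
    return {"X-Delete-At": str(int(delete_at))}


def resolve_delete_at(headers, now=None):
    """Turn either header into the absolute deletion timestamp."""
    now = int(time.time()) if now is None else int(now)
    if "X-Delete-At" in headers:
        return int(headers["X-Delete-At"])
    return now + int(headers["X-Delete-After"])
```

For example, expiry_headers(delete_after=30 * 86400) would ask for deletion roughly a month after upload.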

Internally this is implemented with an "expirer account" to which an entry is added for each object that is scheduled for deletion. If the proxy spots the "TTL headers" during a regular request, it forwards them to the backend, which then does something like this:

 if delete_timestamp:
     # container name = delete timestamp; object name encodes what to delete
     add_to_account(".expired-objects", delete_timestamp,
                    "%(timestamp)s-%(account)s/%(container)s/%(object)s")

Inside the .expired-objects account there are therefore several containers, each named after the timestamp at which the objects listed in it should be deleted. Inside each container there are objects whose names encode which (account, container, object) to delete.
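
A minimal sketch of that naming scheme, following the description above literally. The function names are hypothetical, and the real implementation may group timestamps into containers differently.

```python
def queue_entry(delete_at, account, container, obj):
    """Return (queue_container, queue_object) for one pending expiration."""
    timestamp = int(delete_at)
    queue_container = str(timestamp)
    queue_object = "%d-%s/%s/%s" % (timestamp, account, container, obj)
    return queue_container, queue_object


def parse_queue_object(name):
    """Recover (timestamp, account, container, object) from a queue entry."""
    # The timestamp is all digits, so the first "-" ends it.
    timestamp, target = name.split("-", 1)
    # Object names may contain "/", so split at most twice.
    account, container, obj = target.split("/", 2)
    return int(timestamp), account, container, obj
```

Splitting at most twice on "/" matters because object names (for example thumbnail paths) can themselves contain slashes.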

This "queue" is processed in the background by swift-object-expirer, which reads the container listing and deletes objects accordingly. The process itself can have multiple levels of concurrency (added in 209c5ec418):

  • A single expirer process can run a pool of workers to farm out requests to delete objects
  • Several expirer processes can work on the same queue via a simple modulo-based sharding
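
The two levels can be sketched as follows. This is an illustration under assumptions, not the expirer's actual code: the hash function (SHA-256), the parameter names and the thread pool are all choices made for the example, and delete stands in for whatever issues the real DELETE request.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def belongs_to(name, processes, process):
    """True if expirer number `process` (0-based) owns this queue entry."""
    if processes <= 0:
        return True  # sharding disabled: a single process handles everything
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()
    return int(digest, 16) % processes == process


def expire(entries, delete, processes=0, process=0, concurrency=4):
    """Delete this process's share of the queue using a pool of workers."""
    mine = [e for e in entries if belongs_to(e, processes, process)]
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # list() forces all deletions to finish and surfaces any exception.
        list(pool.map(delete, mine))
    return mine
```

Because every process hashes the same entry name in the same way, each entry is picked up by exactly one of the processes without any coordination between them.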

caveats

  • We are running Icehouse (Swift 1.13), which does not contain the fix for expiring objects not available (8d20e0e927d)
  • If done for all thumbnails, this would increase the object count by however much thumbnail churn we have over a month (TODO: how much? TODO: check if correct), which we might not want. Another approach would be to expire only non-standard thumbnail sizes, which would save some space (TODO: how much?) without creating a big expiry queue. The assumption is that non-standard sizes are the minority (TODO: check this)
  • Other noteworthy commits on the object expirer that we got during the upgrade: