DONE move mediawiki reading originals to swift (aaron)
the deploy to test wiki on monday worked.
DONE updated squid and swift/rewrite.py to allow reads for originals (scheduled for monday 8/20)
squid change is acl work similar to how thumbnails got moved
rewrite does not need changes to accept non-thumbnails and get to the right bucket
finish building eqiad cluster
ms-be1004 is waiting on a replacement SSD (ETA friday 8/17)
ms-be1005 doesn't see any of its spinning disks. RobH to investigate
it's ok to continue building the cluster without those two hosts.
DONE upgrade to 1.5.0 (with ganglia statsd stuff disabled)
test in labs (lucid)
done. tested fetching existent and nonexistent thumbs. tested with mismatched proxies and storage servers.
test on eqiad (precise)
tested mixed cluster upgraded by hand. tested container creation, thumb creation, thumb fetching, lost object recovery.
need to test puppet rules (scheduled monday)
test mediawiki auth - Jan claims MW fails to auth against 1.4.4+. replicate his test, find and fix the problem (if replicable) (aaron)
to start before 8/28
sync content
test between eqiad-prod cluster and ??? (eqiad-test? labs?)
redo zones in pmtpa
audit and replace disks across all backends
rt-3282 and rt-3432
to do in sept
improve reaction-based documentation (instead of feature-based documentation)
what to do when a host fails; what to do when a nagios alert triggers (for each nagios alert); etc.
improve dead disk detection methods, automate alerting and replacing
installed and configured swift-drive-audit to find them.
how to hook into nagios?
set up swift-recon
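one possible answer to the nagios question above, as a rough sketch only: assuming swift-drive-audit unmounts the drives it flags, a nagios plugin can compare the expected device mount points against what is actually mounted and report through the standard plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The function name, the thresholds, and the /srv/node/... paths in the usage note are hypothetical, not taken from our config.

```python
import os

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_swift_mounts(expected, is_mounted=os.path.ismount, warn=1, crit=2):
    """Return (exit_code, message) for a list of expected mount points.

    expected   -- mount points that should hold swift devices (hypothetical list)
    is_mounted -- callable(path) -> bool; injectable so the check can be tested
    warn, crit -- number of missing devices that triggers each state (assumed values)
    """
    if not expected:
        return UNKNOWN, "UNKNOWN: no expected mount points configured"

    missing = [path for path in expected if not is_mounted(path)]

    if len(missing) >= crit:
        state, label = CRITICAL, "CRITICAL"
    elif len(missing) >= warn:
        state, label = WARNING, "WARNING"
    else:
        return OK, "OK: all %d swift devices mounted" % len(expected)

    return state, "%s: %d of %d swift devices not mounted: %s" % (
        label, len(missing), len(expected), ", ".join(missing))
```

usage would be something like calling check_swift_mounts(["/srv/node/sda1", "/srv/node/sdb1"]) from a small wrapper script, printing the message, and exiting with the returned code so nagios picks up the state.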
to do Sometime(tm)
enable 1.5 statsd ganglia stuff
disable ganglia-logtailer
disable local logging?
update ganglia view for new metrics
document how to switch from pmtpa to eqiad
container synchronization is an eventually consistent thing; how to synchronize the change?
have 2 users that interact with containers - one that can create / destroy containers and the other that can't
talk to aaron for more detail
upgrade pmtpa cluster from lucid to precise
add SSDs into ms-be1-5 to get hardware parity with the rest of the ms-be servers (and get the OS and local logs onto SSD instead of sharing with the object store)
LVS currently has the same monitoring URL for both pmtpa and eqiad, but the URL includes the account ID, which is different between the two clusters. Split the LVS config (lvs.pp line 663ish) into per-cluster entries so each cluster can have its own monitoring URL.
replace all the swift C2100 hardware with something that doesn't have hardware failures left and right
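for the LVS monitoring URL item above, a minimal sketch of the idea (in python rather than puppet, just to show the shape): keep one account ID per cluster and build each monitoring URL from it instead of sharing a single hardcoded URL. The account IDs below are placeholders, and the URL template is an assumption based on the usual swift path layout of /v1/<account>/<container>/<object>; neither is copied from lvs.pp.

```python
# Hypothetical per-cluster settings; the account IDs are placeholders,
# not the real pmtpa/eqiad values.
CLUSTER_ACCOUNTS = {
    "pmtpa": "AUTH_pmtpa-account-id",
    "eqiad": "AUTH_eqiad-account-id",
}

# Assumed URL shape; the container and object names are illustrative only.
MONITOR_TEMPLATE = "http://localhost/v1/{account}/monitoring/backend"


def monitor_url(cluster):
    """Build the health-check URL for one cluster from its own account ID."""
    try:
        account = CLUSTER_ACCOUNTS[cluster]
    except KeyError:
        raise ValueError("unknown cluster: %r" % cluster)
    return MONITOR_TEMPLATE.format(account=account)
```

with this layout, adding a third cluster is one more entry in the mapping, and a typo in a cluster name fails loudly instead of silently monitoring the wrong account.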