Deployments/Features Process/E3 Notes

From Wikitech
  • we have a lot of noisy errors
    • syncdir/scap, timeout errors
    • you're conflicted LOLNEWB vs OMGWHYDIDN'TYOUSAY?!
    • some of them aren't even mediawiki servers (eg: spence)
    • the machines should be a list; when one is taken out of rotation it should be removed from this list as well
    • treat it like compiling (either no warnings or they are suppressed)
    • syncdir, "no syntax errors found..."
      • phplint isn't invoked correctly
    • recoverable exceptions are aggregated on fluorine in logs
      • 3 people looking at that file
      • make this more attractive and accessible
      • TODO: ask Antoine about this
    • the mystery of scap
      • distinction between syncdir, syncfile, and scap
      • what's going on there
    • edit them so they echo out more useful indicators of what is *actually* happening
  • TODO: get Antoine to talk more about beta to E3
  • all test local
  • qunit on Sauce Labs - need this
  • easy browser testing on all browsers
  • pyramido - testing for next deployment (nearest thurs)
  • toro - anything they want to test without messing up pyramido
    • get qunit against these
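
The two ideas above about noisy sync output (keep the machines as a list, and treat output like compiler warnings) could be sketched roughly as follows. This is only an illustration under assumptions: the host names, the suppression pattern, and every function name here are hypothetical and are not taken from scap, syncdir, or syncfile.

```python
import re

# Hypothetical rotation list: hosts currently serving MediaWiki.
# The names are placeholders, not real inventory.
IN_ROTATION = ["mw1", "mw2", "mw3"]

# Assumed examples of output lines that carry no information.
SUPPRESS = [
    re.compile(r"no syntax errors", re.IGNORECASE),
]


def remove_from_rotation(hosts, name):
    """Return a new list without `name`, so a sync never targets it."""
    return [h for h in hosts if h != name]


def filter_output(lines):
    """Drop known-benign lines; anything left deserves attention."""
    return [ln for ln in lines if not any(p.search(ln) for p in SUPPRESS)]


def summarize(results):
    """Map host -> remaining lines, listing only hosts with something left.

    `results` maps each host name to the list of lines it printed.
    """
    report = {}
    for host, lines in results.items():
        kept = filter_output(lines)
        if kept:
            report[host] = kept
    return report
```

With something like this, a clean run prints nothing, so any line that does appear is worth reading, and a host that has been pulled from rotation can no longer produce timeout noise.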

Krinkle/Timo on qunit

  • if you have +2 it runs the test right away
    • all php unit tests are run, but only on sqlite
      • only php
    • none of cucumber or qunit are run in jenkins
  • E3 does a/b tests for specific time spans
  • reluctant to add deployment steps, even when precautionary
  • wish people understood the nature of the work better, time-sensitivity especially
    • just the fact that people notice it
  • no more manual steps, more automated if they're doing smart things
  • weirdness with two branches
    • for a short period of time only one (last days of cycle)
    • people want to deploy more often, but we don't do new branch cuts
    • a huge simplification if we had a deploy and a dev branch
  • Documenting caching
    • we use it on so many levels
    • problems in the first 5-10 minutes after deploy could go away
    • what are the implications of the message cache?
  • people who go in and ssh in to specific apaches vs those who don't
    • some animosity
    • we aren't training new employees to do it
    • TODO: how to poke around using shell?
  • Lightning deploy
    • there are needs to sync small things that are low risk but still need to go out
    • the cost of waiting until the next branch is too much
    • an attempt to create designated time to do it, "you break it, you bought it" - karma based ;)
    • others will be around
    • TODO: write a post to wikitech-l and wikitech.wm.org page about the purpose
  • big picture overview of server infrastructure
    • repurpose kraken diagram ;)
    • who would use it?
  • Subdomains on betalabs ?
    • certain things turned on/off for testing
  • when you modernize you feel you should say you don't need to know the details, it's all automated
    • shouldn't shield people too much
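
The note above that E3 runs A/B tests for specific time spans explains why deploy timing matters so much to them. A minimal sketch of a time-boxed test follows, assuming a hash-based 50/50 split; the function names, the experiment name, and the bucketing scheme are all hypothetical and do not describe E3's actual code.

```python
import hashlib
from datetime import datetime, timezone


def in_window(now, start, end):
    """True while the experiment is live (start inclusive, end exclusive)."""
    return start <= now < end


def bucket(user_id, experiment):
    """Assign a user to a bucket deterministically (assumed 50/50 split)."""
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return "test" if int(digest, 16) % 2 == 0 else "control"


def variant(user_id, experiment, now, start, end):
    """Return the user's bucket, or None outside the experiment window."""
    if not in_window(now, start, end):
        return None
    return bucket(user_id, experiment)


# Hypothetical one-week window.
START = datetime(2013, 1, 7, tzinfo=timezone.utc)
END = datetime(2013, 1, 14, tzinfo=timezone.utc)
```

If the code misses its deploy slot, the window above shrinks or is lost entirely, which is the time-sensitivity the notes describe.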