Talk:Deployments/Train vs backport
What makes up a "risky change"? Usually I feel more comfortable having my risky changes ride the train because then they get full exposure in the beta cluster and then a progressive rollout to the wikis. I think that should be recommended ("I'll merge this Tuesday after the branch cut") via "SHOULD" rather than being told to go into a backport/dedicated window, which doesn't get any of that integration testing.
- "Risky change" in this context was left to the developers' discretion -- the thinking there is that developers often have a keen sense of the risk of a given patchset.
- The train process can be a way to gain confidence about a risky patchset -- that's true. There are a few additional considerations though: (1) signal about your patchset could be buried in the noise of the other 400 patchsets on the train; (2) this patchset could mean that we hold back the other 400 patchsets that might all be very low risk -- there may be other means to "de-risk" this deployment that don't affect so many others; (3) the train adds an artificial time constraint. Quoting the Continuous Delivery book, "As pressure increases the defined process for collaboration between the development and deployment teams is subverted, in order to get the deployment done within the time allocated to the deployment team" -- tl;dr: time constraints make us make worse decisions.
- scaptrap! :D -- I suppose this is a similar idea. Thcipriani (talk) 21:38, 23 March 2021 (UTC)
- This is different than current practice. There may be cases where it's desirable or appropriate to de-risk a large change via the train process -- that's fine. The goal is to encourage more backports for several reasons:
- Smaller deploys are easier to reason about
- De-risking via the train may unnecessarily hold back the other 400 changes on the train
- The train adds an artificial time constraint to deployment
- Developers and patch authors may be better positioned to reason about the immediate impact of their deployment, but the train deployment may be happening at an inconvenient time. If developers deploy directly (and in isolation), the hope is that problems will be caught quickly. Additionally, since backport windows use mwdebug servers, problems may be caught with no user impact.
- To be clear, while this is different than current practice, nothing is changing with the train in the near term -- the train is a service that will continue to be offered. This is meant to move us in the direction of a more continuous delivery model -- I'm open to other ideas in this space.
Shouldn't fixes to critical regressions be deployed via a dedicated window (i.e., as soon as possible)? I suppose it depends on how critical it is, but I don't think we should give people the impression that they need to wait for the next backport window when something has significant user impact. --tgr (talk) 15:40, 19 March 2021 (UTC)