
Talk:Deployments

From Wikitech

Phase or group?

The schedule says things like "phase 1, phase 2, phase 3" and this page says "group0, group1, group2". Further, I don't think loginwiki is phase 1 or group0 anymore. --marktraceur (talk) 20:31, 14 November 2013 (UTC)Reply

Yeah...... so, I think I'mma gonna kill mw:Roadmap#Schedule_for_the_deployments (just that section, leave the rest) and replace it with a description of our WMF-specific release cycle (where, when, generally) and point to Deployments for the canonical list of what's coming when for specific wmfXXs. Greg Grossmeier (talk) 17:51, 15 November 2013 (UTC)Reply

Would it be useful to mark SWAT patches to be self-deployed by the author?

I've noticed that most patch authors with deployment privileges prefer to scap their own changes. However, it's not clear who plans to do this unless the reader has memorized the list of deployer names. When all patches will be self-deployed, there's no need to have any single person managing the SWAT process (AIUI) so tagging oneself might be useful? For discussion...

This would be possible to automate if Template:ircnick included some code to check against the list of deployers, and perhaps add a colored dot or icon next to the name if the author has permissions to scap. Awight (talk) 08:40, 18 November 2019 (UTC)Reply

Personally, when I'm intending to deploy something myself I don't put it in the SWAT window so as to leave more spaces for people without deployment rights. There's usually good times still open on the calendar outside of the windows. Anomie (talk) 14:04, 18 November 2019 (UTC)Reply

Outdated infobox?

The notes in the infobox currently put the first SWAT of the day at 6:00 Pacific, but the current calendar entries start at 4:00 PDT (which I assume is the same as Pacific). The infobox also claims that on Wednesday, the second SWAT of the day is at 10:00 Pacific, but in the current calendar it’s at 11:00 PDT just like on Monday and Thursday. Is the infobox wrong/outdated, or is the calendar being created incorrectly? --Lucas Werkmeister (WMDE) (talk) 11:17, 9 March 2020 (UTC)Reply

@Lucas Werkmeister (WMDE): Good spot, updated. Jforrester (talk) 18:54, 12 March 2020 (UTC)Reply
Great, thanks! --Lucas Werkmeister (WMDE) (talk) 10:58, 13 March 2020 (UTC)Reply

No deployments next week?

@Greg Grossmeier: We see on Deployments/Yearly calendar that there will be no deployments or backports next week, but the detailed Deployments calendar still includes regular backport windows. Please advise! —Awight (talk) 07:43, 7 June 2021 (UTC)Reply

Sorry about that, calendarbot saw there was no train and posted the new schedule. The yearly calendar is correct: no deploys except emergencies next week. Thcipriani (talk) 22:44, 7 June 2021 (UTC)Reply

Deployment schedule for weeks of August 30 is missing!

I notice that deployment schedule for weeks of August 30 is missing. Any plans to add them? --Agusbou2015 (talk) 20:46, 28 August 2021 (UTC)Reply

They were fixed by @Thcipriani in this edit, thanks for spotting! Jforrester (talk) 15:28, 30 August 2021 (UTC)Reply

Thursday

Please consider fixing Thursday's schedule: https://wikitech.wikimedia.org/w/index.php?title=Deployments&oldid=1949425

  • UTC evening backport and config training: 21:00–22:00 UTC: Deployer Brennen (brennen)
  • UTC late backport window: 21:00–22:00 UTC: Deployer Roan (RoanKattouw), Lucas (Lucas_WMDE), Martin (Urbanecm)

4nn1l2 (talk) 22:22, 15 February 2022 (UTC)Reply

@4nn1l2 I’ve moved the training window to the usual “afternoon backports” time, assuming that this is correct (pending review in 763268). Feel free to add your config change there. Lucas Werkmeister (WMDE) (talk) 16:34, 16 February 2022 (UTC)Reply

Wikimedia blog announcements still relevant?

@thcipriani: Page says "Deployments of new or major features should be announced on the Wikimedia blog". Not sure which blog that refers to nowadays (as it lacks a link). I assume it's no longer the official https://wikimediafoundation.org/news/ / meta:Wikimedia Blog, and https://techblog.wikimedia.org/ is also more for dedicated stories than announcements, and https://diff.wikimedia.org/ is more a community catch-all. Should the blog item be removed? --aklapper (talk) 08:09, 18 May 2022 (UTC)Reply

I honestly don't know what blog that referred to; it's definitely ambiguous now. I changed that bullet point to be two separate bullet points—one for notifying community, one for notifying engineers. There are links in the new text to a couple of "blog" disambiguation pages now. Does that look better to you? –Thcipriani (talk) 21:00, 23 May 2022 (UTC)Reply

Missing deployment schedules for September 12 and 19 weeks

The deployment schedules for September 12 and 19 weeks are missing. Could you add them? --Agusbou2015 (talk) 21:28, 11 September 2022 (UTC)Reply

Missing deployment schedules for October 3 week

The deployment schedule for October 3 week is missing. Why? --Agusbou2015 (talk) 18:27, 29 September 2022 (UTC)Reply

@Agusbou2015 Hey there, the bot went wrong. Now fixed. Jforrester (talk) 20:34, 29 September 2022 (UTC)Reply

Highlighting train and backport windows

When I look at this page, I only ever look for the train windows (to see if they're there this week) and the backport windows (to get my changes deployed). I suspect I am not alone in this. How about highlighting them with some more lively colors or icons? Bartosz Dziewoński (talk) 22:46, 3 November 2022 (UTC)Reply

I went ahead and just did it; today seems like a good time, since it's Thursday and no one will be deploying anything until Monday. I hope y'all like it. If it causes any issues, please revert. Bartosz Dziewoński (talk) 23:51, 3 November 2022 (UTC)Reply
This is rather neat, thank you. Should we also highlight the (automated but actual) production deploys of the train to test wikis? (" Automatic deployment of of MediaWiki, extensions, skins, and vendor to testwikis only") Jforrester (talk) 20:01, 4 November 2022 (UTC)Reply
@Jforrester Maybe? I have never noticed them before, and I don't know what they are. What's the difference between that and the group0 deployment? (Or where can I read about it? It's not mentioned on Deployments/Train or on Heterogeneous deployment/Train deploys.) Bartosz Dziewoński (talk) 17:45, 7 November 2022 (UTC)Reply
It's the automatic train deployment to 'test wikis', i.e. testwiki, testwikidatawiki, and labtestwiki ahead of group0. The docs you linked need updating, but this step was "Sync to cluster and verify on testwiki". Jforrester (talk) 17:55, 7 November 2022 (UTC)Reply

Missing deployments schedules for January 1 week

The deployment schedule for January 1 week is missing. Why? Agusbou2015 (talk) 15:38, 2 January 2023 (UTC)Reply

This was added in this edit yesterday; I imagine the delay in running the bot was due to people being on leave. Jforrester (talk) 14:24, 4 January 2023 (UTC)Reply

Missing deployments schedules for April 10 and 17 weeks

The deployment schedules for the April 10 and 17 weeks are missing. Why? --Agusbou2015 (talk) 17:07, 10 April 2023 (UTC)Reply

Hi there, this was done in this edit. Jforrester (talk) 18:28, 10 April 2023 (UTC)Reply

Deployment template

Not sure of the best place to bring this up, but I just wanted to make folx aware of {{Deploy}} — it works a little like this:

I'd be keen to hear any feedback, and if y'all think it'd be worth using? Samtar (talk) 13:58, 14 March 2024 (UTC)Reply

@Samtar: It feels a bit complicated? Given you can't use VE in nested contexts, and this would be used inside Template:Deployment calendar event card blocks, people are going to have to memorise the template, or copy-paste it from other uses, so e.g. they'd have to know to write 'config' and not 'site change' or 'logo' or whatever. I like it pushing people to fill in all the details though. Not sure. Jforrester (talk) 14:53, 14 March 2024 (UTC)Reply

Very belated comment but I'm finding the * built into the template very annoying to work with (probably due to some parser edge case I don't understand, rather than anything specific to the template). Sometimes I want something like

  • scap 1:
    • patch 1
    • patch 2
  • scap 2:
    • patch 3
    • patch 4

but I have no idea how to make it happen - *{{deploy|...}} doesn't work, ** {{deploy|...}} doesn't work, <ul><li>{{deploy|...}}</li></ul> works but makes a mess of the wikitext... --Tgr (WMF) (talk) 13:43, 5 March 2025 (UTC)Reply

@Tgr (WMF): I've added the nobullet parameter to the template, which removes the leading bullet point, e.g.
{{ircnick|TheresNoTime|Sammy}}
* {{deploy|type=config|gerrit=951042|title=IS: Enable Phonos on all projects|status=|nobullet=true}}
** {{deploy|type=config|gerrit=951042|title=IS: Enable Phonos on all projects|status=|nobullet=true}}

becomes:

Sammy (TheresNoTime)

does that help at all? — TheresNoTime (talk • they/them) 14:55, 9 April 2025 (UTC)Reply

Awesome, thank you! Tgr (WMF) (talk) 14:54, 10 April 2025 (UTC)Reply
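
For illustration only, here is a minimal Python sketch that generates the nested wikitext Tgr asked for, using the nobullet parameter described above. The helper names `deploy_item` and `scap_group` are hypothetical, the change number and title are simply reused from the example in this thread, and how the result renders depends on the actual {{Deploy}} template.

```python
def deploy_item(change, title, kind="config", depth=2):
    """Render one {{deploy}} call as a list item at the given bullet depth.

    nobullet=true asks the template to drop its own leading bullet, so the
    explicit '*' characters here control the nesting level instead.
    """
    template = (
        f"{{{{deploy|type={kind}|gerrit={change}|title={title}"
        f"|status=|nobullet=true}}}}"
    )
    return f"{'*' * depth} {template}"


def scap_group(label, patches):
    """Render a top-level bullet followed by one nested item per patch."""
    lines = [f"* {label}:"]
    lines += [deploy_item(change, title) for change, title in patches]
    return "\n".join(lines)


if __name__ == "__main__":
    # Change number and title reused from the example earlier in this thread.
    print(scap_group("scap 1", [("951042", "IS: Enable Phonos on all projects")]))
```

The output is plain wikitext, so it can be pasted into a calendar entry and adjusted by hand.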

"Announce changes..."

Announce changes to the ops mailing list ahead of time if they are likely to affect HTTP caching, introduce new cookies, or utilize new database tables.

I don't think this matches actual practice. What would be a more reasonable thing to write? Is the cookie thing about cookies with "session" in the name which prevent caching (in which case maybe we should just document that)? I'm not even sure what "utilize new database tables" means - writes to a table that wasn't used at all before? (That would have been created in close collaboration with DBAs anyway, right?) Any major changes to utilization of a DB table? Tgr (WMF) (talk) 13:47, 5 March 2025 (UTC)Reply

@Tgr (WMF): I think the new DB tables bit pre-dates the co-ordination with the DBAs, yes. I don't expect devs to magically know what cookie names/types might split the cache this week, so checking with SRE ServiceOps/Traffic before deployment seems like it's still good advice? Jdforrester (WMF) (talk) 15:08, 5 March 2025 (UTC)Reply
But why not just publicly document it and ask people to check that documentation before making changes?
It's also a weird warning because most such changes will happen via the train, not a backport window, and I don't think breaking our caching infrastructure via the train is less bad than breaking it via backports.
(Also, how many people even have access to ops-l?) Tgr (WMF) (talk) 16:06, 5 March 2025 (UTC)Reply
@Tgr (WMF): Per development policy, all 'exciting' new code (which would definitely include writing to new tables) is meant to be feature-flagged and only enabled by its own window. Anyone that deploys is meant to be on ops-l as a condition of that right, given the need to be aware of issues. Jdforrester (WMF) (talk) 22:36, 6 March 2025 (UTC)Reply
That line has been in here a while.
What about: "Announce changes to the ops mailing list ahead of time if you anticipate or are uncertain about noticeable impacts to database load or caching"?
re:problems with train being as bad as other windows—yes, but SRE is aware if they're seeing something bad and its timing is correlated with train, check the train. This is meant to make them aware of risky one-off windows. TCipriani (WMF) (talk) 22:53, 2 April 2025 (UTC)Reply
I'd think if you see something bad, you check SAL, and if it correlates with a backport, check the patch (or couple patches since these days we have to deploy them in batches due to time pressure). If it correlates with the train, there are about a hundred patches to check; and if you are really unlucky, it's some unexpected interaction between multiple patches, or unexpected interaction between the old and new versions of the code on different groups. Backports are vastly simpler, both conceptually and in scale.
Anyway that text sounds reasonable to me. Tgr (WMF) (talk) 10:34, 3 April 2025 (UTC)Reply
Done! I hope that this better aligns with reality. TCipriani (WMF) (talk) 15:33, 4 April 2025 (UTC)Reply

Bot to update deployment item status

I've been working on a bot which scans the deployment page for backport window items (iff they are using the correct template) and sets their deployment status to either done or not done. It also attempts to mark which deployer did the item's deployment (based on the SAL entry) and to link to said SAL entry (more info here). It's made a couple of very supervised edits (e.g. this, that, plus most of the history here), and before doing any more I'd like to just check that a) this is wanted and b) it's okay for me to do some larger, supervised test runs against Deployments. The code is available on GitHub. — TheresNoTime (talk • they/them) 15:12, 9 April 2025 (UTC)Reply

@TheresNoTime: This is rather fun, and I like it. Ideally deployers would be marking done/not-done as they go, but certainly filling in the blame/SAL link is not something I'd expect to be done by humans, and a belt-and-braces coverage approach with a bot seems great! Jdforrester (WMF) (talk) 17:30, 9 April 2025 (UTC)Reply
@TheresNoTime Really cool! I like it. Offhand, I don't think this conflicts with any other bots operating on this page, so larger tests seem fine to me. TCipriani (WMF) (talk) 19:33, 9 April 2025 (UTC)Reply
Given I'm spamming RecentChanges quite a bit, is there any chance someone could add it to the bot group? :-) — TheresNoTime (talk • they/them) 12:56, 10 April 2025 (UTC)Reply
@TheresNoTime: Done: https://wikitech.wikimedia.org/w/index.php?title=Special:Log&logid=995209 Jdforrester (WMF) (talk) 17:37, 10 April 2025 (UTC)Reply
Thanks! — TheresNoTime (talk • they/them) 08:51, 11 April 2025 (UTC)Reply
@TheresNoTime: Any updates on this? Is there anything I can do to help? Jdforrester (WMF) (talk) 13:27, 11 June 2025 (UTC)Reply
I think something broke slightly the last time I did a big test run, but the only delay is me putting some time back into it :D it's still high on my to-do list! — TheresNoTime (talk • they/them) 13:31, 11 June 2025 (UTC)Reply
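
To make the idea in this thread concrete, here is a rough Python sketch of the status-filling step. It is not the bot's actual code (that lives on GitHub, as noted above). It assumes every {{deploy}} call uses named parameters only, that the status values are the literal strings "done" and "not done", and that the set of deployed Gerrit change numbers has already been extracted from the SAL by some other step. The function names are hypothetical.

```python
import re

# Matches a single, non-nested {{deploy|...}} call (assumption: no nested templates).
DEPLOY_RE = re.compile(r"\{\{\s*deploy\s*\|([^{}]*)\}\}", re.IGNORECASE)


def parse_params(body):
    """Split 'a=1|b=2' into an ordered dict; assumes named parameters only."""
    params = {}
    for part in body.split("|"):
        key, _, value = part.partition("=")
        params[key.strip()] = value.strip()
    return params


def update_statuses(wikitext, deployed_changes):
    """Fill in empty status= fields based on a set of deployed change numbers."""

    def replace(match):
        params = parse_params(match.group(1))
        if params.get("status"):
            # A human (or an earlier run) already set a status; leave it alone.
            return match.group(0)
        deployed = params.get("gerrit") in deployed_changes
        params["status"] = "done" if deployed else "not done"
        rendered = "|".join(f"{key}={value}" for key, value in params.items())
        return "{{deploy|" + rendered + "}}"

    return DEPLOY_RE.sub(replace, wikitext)
```

A real run would presumably wait until the window has closed before marking anything "not done", and would also record the deployer and the SAL link, which this sketch leaves out.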

Delete the UTC morning backport window?

I've had a couple of experiences now, including today, where I or others scheduled patches for backport in this window and no deployers were available to do them. Would it make sense to take the "UTC morning backport window" off the calendar? This would reduce wasted time for backport patch writers, and would adjust expectations to more closely match the actual situation. Novem Linguae (talk) 07:10, 10 June 2025 (UTC)Reply

@Novem Linguae: The "European morning" window was added mostly for WMDE who wanted to deploy earlier in their day (instead of just at going-home time), which was reasonable. However, as you say, the windows only work if a deployer is around. Are people from WMDE not generally around to do the deploys any more?
(Ping @TCipriani (WMF) in case he didn't get auto-subscribed.) Jdforrester (WMF) (talk) 12:00, 10 June 2025 (UTC)Reply
"Are people from WMDE not generally around to do the deploys any more?" I think that's correct. I've experienced no deployer before, DreamRimmer and Bunnypranav experienced it today, and my friend mentioned it in DM. Novem Linguae (talk) 12:32, 10 June 2025 (UTC)Reply
This is actually my second time without a deployer in this window. Another time, three other patch writers and I were left hanging for half an hour, until I pinged Hashar, who was the only one I knew to be available at that time. Bunnypranav (talk) 12:44, 10 June 2025 (UTC)Reply
I have had similar experiences while deploying; the morning deployment is typically not worth it and I always make a mental note to not schedule patches for that slot if possible. Sohom Datta (talk) 14:08, 10 June 2025 (UTC)Reply
Some other options besides deleting it are renaming it to "WMDE backport window" so that non-WMDE folks stop signing up for it, or changing the deployers that get pinged to folks that are active at that time. Novem Linguae (talk) 19:14, 10 June 2025 (UTC)Reply
This is our least-used backport window https://people.wikimedia.org/~thcipriani/hourly-backports.png
But that window is needed to support WMDE (as @Jdforrester (WMF) mentioned), and to ensure people in far UTC+ timezones have any time during daylight hours where they can deploy. Here's a google sheet with deployment windows vs. timezones.
Sounds like the window is not undesirable, since folks here are frustrated trying to use it. I'd prefer to keep it and focus on deployer recruiting efforts. TCipriani (WMF) (talk) 00:29, 11 June 2025 (UTC)Reply
"[…] focus on deployer recruiting efforts." @TCipriani (WMF), out of interest, do you know how well these are going (both in general, and in terms of people signing up to be on the 'deployer list' for a recurring backport window)?
What prompted me to ask this question just now was noticing that @Bunnypranav (I hope you don't mind the ping) had to schedule their most recent patch for deployment three times (each for a UTC afternoon backport window) before there was someone in IRC who was able to deploy the patch (window 1 logs, window 2 logs, window 3 logs). In the third window, there was someone who was able to deploy Bunnypranav's patch (which added a new namespace to crhwiki), but they didn't feel confident to run the new-namespace cleanup script (which, to be clear, is genuinely fair enough on their part, but now means that both Bunnypranav & crhwiki have been left somewhat in limbo, with the new namespace created but namespaceDupes.php not run).
While I don't want to criticise any individuals involved in this process, I hope you will agree with me that this definitely seems like a far from ideal experience to say the least. Best, a smart kitten (mw // phab // talk) 09:42, 31 October 2025 (UTC)Reply
I would like to add to this that even when scheduling in the afternoon windows I had to reschedule thrice. I wonder how much more that would be if I did it in the morning window, based on prior experience. Just like a smart kitten said above though, I do not intend to blame anyone, but this is definitely not an ideal scenario. Bunnypranav (talk) 10:34, 31 October 2025 (UTC)Reply
Idea: Maybe the clinic duty SRE of the week should also be assigned to one of the daily backports. Then we'd be guaranteed to have a deployer at that backport. Novem Linguae (talk) 17:21, 31 October 2025 (UTC)Reply
I'm sorry for your struggles getting a namespace change out @Bunnypranav.
"@TCipriani (WMF), out of interest, do you know how well these are going (both in general, and in terms of people signing up to be on the 'deployer list' for a recurring backport window)?"
In the aggregate, 2025 has seen more deployers and backports than any previous year. That's a success due largely to efforts to make backport deployment easier and subsequently adding deployers to the pool.
But I agree that @Bunnypranav's experience was far from a success.
For backports, there are two big problems:
  1. a dearth of people who can deploy code at all
  2. a dearth of people with confidence deploying others' code/config (and running maintenance scripts)
This year we focused on the first problem, and that's gone well.
The other half of the problem, finding folks who are confident with deploying config/code for others and running subsequent maintenance scripts, is harder.
For a while we did deployment training. At the time I wondered about the efficacy of that training: we spent so much time training folks on the unnecessary complexity of the process that we seldom (read: basically never) got to things like "here's how to add a namespace." So we paused the time-intensive training and made deployment easier.
Maybe revamping the training now makes sense. Maybe we should focus on making common requests well-known and easy to handle by expanding Backport_windows/Deployers#Maintenance_scripts (or abstracting these things). I'm open to ideas. I'll flag this with some folks and see if they have thoughts, too.
Thanks for the ping @A smart kitten. TCipriani (WMF) (talk) 21:37, 31 October 2025 (UTC)Reply
Thanks for the reply, @TCipriani (WMF)!
"I'm open to ideas." Novem's idea for the clinic duty SRE to be assigned to one of the daily backports (I assume if no scheduled deployer is available for that backport window) sounds interesting -- I wonder if it's worth suggesting that to the SRE team / opening up a conversation about that? I had a similar idea myself (for a designated person/team to cover one/some of the backport windows in case no scheduled deployer is available), but - before Novem suggested SRE - RelEng was the team that originally came to my mind regarding that.
Another idea might be to put out a call internally for folks to sign up to be on the 'deployer list' for some recurring backport windows (and yeah, now that SpiderPig is a thing, perhaps it might be possible to offer training on e.g. the usual types of config modifications that can be proposed by volunteers)? Maybe people thinking about signing up for this could be offered the opportunity to 'shadow' experienced deployers for a number of backport windows, to hopefully get an idea & a feel for the types of [config] patches that usually get proposed in these windows.
As a side note, if there is somewhere that I can follow any updates on this, please let me know - I'd be interested to keep in the loop with developments on this front :) a smart kitten (mw // phab // talk) 11:16, 3 November 2025 (UTC)Reply
The intersection of knowledge, availability, and interest seems to be a long term challenge in backports in general and community configuration requests in particular. A specifically challenging aspect of community config changes is that they can involve running maintenance scripts that the deployer has never seen before. This can create anxiety and discomfort for the deployer as one might imagine. We have prizes for fixing the wikis after they get broken, but most folks are not excited to be involved in the breaking aspect due to lack of familiarity with the tools.

The current emphasis on volunteer operation of the windows feels like it is finding challenges similar to the ones that led to us stopping the Technical Advice IRC Meeting process in the past. Volunteering by Foundation and affiliate staff is still volunteering which makes it a thing that folks will reasonably have waxing and waning interest and availability for over time.

I like the idea of finding a rotating chore group with institutional support to assist with MediaWiki deployments and maintenance scripts. It is likely not well understood by the volunteer community that the SRE teams are largely detached from either activity in both their training and normal job responsibilities. The Release Engineering team has close connections to deployment, but they too often lack knowledge of how and when to use particular maintenance scripts. -- BryanDavis (talk) 19:08, 5 November 2025 (UTC)Reply
I also would like to keep in the loop and look forward to any and all implementations. A small thing I observed during the times I scheduled patches (less than many regulars, limited to morning and afternoon windows, still valid though) was that the people who are listed in the Deployers section are rarely ever the ones who actually are available and deploy. I don't know how regular it is in general, but I've seen it a lot. I'd say the team should look at making sure the people who are available are pinged, and those who aren't are removed, so that we do not get the impression that a window is active when it has no active and available deployers. Bunnypranav (talk) 15:29, 3 November 2025 (UTC)Reply

Clarify the day of late-UTC-night windows

Certain late-night windows can fall on different days in different time zones. For example, the next branch cut is Tuesday July 8th at 02:00 UTC, which is still Monday in PDT. The time box on the left reads "(Mon) 19:00–20:00 PDT", but this is redundant because the event is already in the section for Monday. Meanwhile, there is no indication that the UTC time (and potentially the user's local time) is NOT on Monday. So, I believe we should either:

  • Use UTC times for grouping items in sections. The example item above would be in the Tuesday section, and we'd leave the "(Mon)" for PDT.
  • Keep using PDT times for sections. But then we should remove the day indication from PDT times and add it to UTC times as needed.

In both cases, we would need to add the day indication for local time (as needed). Daimona Eaytoy (talk) 12:34, 4 July 2025 (UTC)Reply
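
The two options can be compared with a small Python sketch. It is only an illustration: PDT is modelled as a fixed UTC-7 offset rather than a real tz database entry, and `format_times` is a hypothetical helper, not something the calendar bot is known to use. The rule it applies is the one proposed above: the zone used for section grouping gets no day marker, and the other zone gets one only when its calendar day differs.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc
# Assumption: PDT as a fixed UTC-7 offset; a real calendar would use a tz database.
PDT = timezone(timedelta(hours=-7), "PDT")


def format_times(start_utc, section_tz, other_tz):
    """Return (section weekday, time in section zone, time in the other zone).

    The other zone's time is prefixed with its weekday only when it falls on
    a different calendar day than the section the event is listed under.
    """
    in_section = start_utc.astimezone(section_tz)
    in_other = start_utc.astimezone(other_tz)
    section_text = f"{in_section:%H:%M} {in_section.tzname()}"
    other_text = f"{in_other:%H:%M} {in_other.tzname()}"
    if in_other.date() != in_section.date():
        other_text = f"({in_other:%a}) {other_text}"
    return in_section.strftime("%A"), section_text, other_text


branch_cut = datetime(2025, 7, 8, 2, 0, tzinfo=UTC)

# Option 2: keep PDT sections, mark the day on the UTC time.
print(format_times(branch_cut, PDT, UTC))  # ('Monday', '19:00 PDT', '(Tue) 02:00 UTC')

# Option 1: group by UTC, mark the day on the PDT time.
print(format_times(branch_cut, UTC, PDT))  # ('Tuesday', '02:00 UTC', '(Mon) 19:00 PDT')
```

The same rule would apply to a reader's local time zone by passing it as `other_tz`.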