Agenda
- abandoned tool policy, standards committee inactivity
- Setting timeline for turning off the grid
- Quotas (Taavi)
Notes
Abandoned tool policy
TV: It's inactive, what's the historical background?
BD: We should reboot the committee, it's not been refreshed since 2017. We could ask on the general mailing list who is interested. Anyone with Toolforge admin rights could clear the backlog. The idea of the committee was not to have only paid staff decide.
Timeline for turning off the grid
SKS: We're reaching out to maintainers once more. 100 maintainers could not be reached because the emails on file bounced. Some of the tools are probably just experimental. We haven't communicated any specific dates so far, but loosely the migration will continue through the end of 2023.
AB: When I last talked to Nicholas about this, we discussed a "warning shot": an email saying that we're going to turn stuff off, wait some more, then turn off the tools. I don't have specific dates.
SKS: The last resort is to temporarily turn off a tool; if the maintainers come back in time we can still turn it back on. We want to do some more communication first.
AB: Currently the tool disabling process is tool-wide. We don't have a way to turn off Grid things but not Kubernetes things. Is it possible to have a tool running in both systems?
BD: You can only have one webservice, either Grid or K8s. But you can have more jobs.
AB: We need to split out the "disable" button. Or we can say that we'll disable both if we don't hear back.
BD: Can we just stop the Grid jobs?
AB: If we do and people show up and ask, we can tell them their code is still there.
BD: If we want intermediate blocks we need additional software development.
AB: There's a trivial way of stopping, but maybe not of blocking. I assumed it was a subset of the current shutdown process.
TV: We can delete crontabs; the user is free to recreate them, but we'll turn them off again.
AB: If the goal is to do all of them at the same time, there are easier paths.
BD: If we want to selectively block tools, I would block all new tools from using the grid as the first thing. We should've done that a year ago, but that's another story.
AB: Telling people "migrate or die" is obnoxious. Punishing users for using a tool they didn't know was unsupported makes everybody sad.
BD: But itâs the only way we find out about unmaintained tools.
AB: We don't need to write new software, it could be a 4-line Bash script. As a start we can just stop jobs.
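A minimal sketch of what such a "stop, don't block" script could do, written here as a dry run that only prints the commands. The tool name is hypothetical, and the sketch assumes Grid Engine's standard crontab/qdel tooling and Toolforge's tools.<name> account convention; the tool's code and data would stay untouched, so returning maintainers lose nothing.

```shell
# Hypothetical dry-run sketch of stopping (not blocking) a grid tool.
stop_tool() {
  local tool="tools.$1"               # Toolforge per-tool account convention
  echo "crontab -u $tool -r"          # would remove cron entries (recreatable)
  echo "qdel -u $tool"                # would delete the tool's grid jobs
}

stop_tool "example-tool"
```

Dropping the `echo`s would turn the dry run into the real 4-line script mentioned above.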
TV: Is it just a matter of sending an email and picking a date?
TV: What do we want to do about people without a valid email?
AB: I don't know what to do. We need to broadcast things also on Discord, etc.
BD: A thing I've done in the past was to look for SUL accounts that were associated. It turns out there was a bug that caused the email addresses to be blanked.
SKS: I used LDAP search, but sometimes it's still blank.
BD: The bug was: logging into wikitech might have blanked your email in LDAP, because the email address wasn't set in the MediaWiki db yet. So if the email is blank, it might not be the user's fault. The place to potentially find a mail to reach them is the SUL database, which is separate from wikitech. Sometimes you can correlate those accounts through Phabricator, sometimes from the Striker database (if they used OAuth). You have to poke at the db. Sometimes they have the same username. It's tedious.
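The LDAP lookup mentioned above might look roughly like this; the base DN and the username are assumptions for illustration, not details from the notes.

```shell
# Hypothetical lookup of a developer account's mail attribute in LDAP.
# The base DN is an assumption about the directory layout.
ldapsearch -x -LLL -b "ou=people,dc=wikimedia,dc=org" "(uid=exampleuser)" mail
```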
TV: Dev account emails are generally considered public, and SUL emails are not.
AB: About 100 emails bounced, but the number with blank emails should be smaller. Another way would be to broadcast to talk pages.
BD: We used to do it.
AB: It's a legit fallback if we can't find emails.
TV: It's something large enough to mention in the Tech News newsletter.
BD: I would always give 3 months' notice for this sort of thing. It seems like forever, but I think it's the right thing to do.
SKS: For people without emails, I wanted to reach out through the specific ticket for each tool. Would that work?
AB: It's totally possible that they have different Phab emails from wikitech emails. It's a lot of trouble, but doing all these things would be the nicest thing to do.
Default quotas
TV: I was looking this week at implementing them. Pick a reasonable number of pods and then define CPU and RAM quotas based on that. Is 10 pods reasonable?
AB: Are there any scenarios with multiple pods for a given tool? Can users scale? Do they have access to those APIs?
BD: Yes they can. They can create 6 pods that can communicate with each other.
AB: 10 seems like a big default.
BD: We haven't tracked cron jobs per tool. In Grid Engine you can request to run 70 jobs at the same time, and the Grid gives you 10 slots in parallel. K8s doesn't have similar functionality for queuing things that will eventually execute.
TV: The Job object does. If it doesn't have enough quota it will wait, so it will be equivalent. There is a limit for jobs, higher than the number of pods.
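A per-tool default along the lines discussed could be sketched with kubectl's quota command. The 10-pod figure and the higher job limit come from the discussion above; the namespace name, quota name, and CPU/RAM numbers are illustrative assumptions, not decisions from this meeting.

```shell
# Hypothetical default ResourceQuota for one tool's namespace.
# Pods capped at 10 (the figure discussed); Jobs capped higher so queued
# Jobs can wait for pod quota; CPU/RAM values are placeholders.
kubectl create quota tool-default \
  --namespace=tool-example \
  --hard=pods=10,count/jobs.batch=20,requests.cpu=2,requests.memory=8Gi
```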
BD: I don't think 10 sounds that big.
TV: It would be easier if we tracked quotas in Git. Essentially having maintain-kubeusers handle the quotas too.
BD: Sounds ok to me.
Action items
- Recruit more people to the Standards Committee (Komla?)
- Go through the tool adoption backlog