Portal:Toolforge/Admin/Monthly meeting/2025-06-17
Jun 17, 2025 | Toolforge monthly meeting
Attendees (shuffled)
- Francesco Negri
- Chuck Onwumelu
- David Caro
- Bryan Davis (bd808)
- Taavi Väänänen
- Andrew Bogott
- Seyram Komla Sapaty
Notes
- k8s upgrade workgroup progress
- OKR/hypotheses progress
- Decision request - Tool account management and Striker (phab:T394035)
k8s upgrade workgroup progress
- David started testing Kubernetes 1.30. The tests passed in lima-kilo, but Kyverno does not officially support 1.30, so Kyverno has to be upgraded.
OKR/hypotheses progress
- [SKS] Sustainability score, completing the proposal text
- [DC] Push-to-deploy beta: added more features for the minimal version of the beta. Created an admin page and started creating a user help page. The beta is still scheduled for the end of the month. I will also prepare the email to send around. There are more features we want to add, but to start with we will only support build-service continuous jobs. We’ll then add other types of jobs and pre-built images (where you just need to restart the job).
- [BD] Trying to think of an interesting trigger for supporting pre-built images. Where would it be valuable? A lot of the time you would want to also change the code on NFS.
- [TV] It can be useful if combined with images from the build service, e.g. running a mariadb command after a build service deploy, or a logrotate.
- [DC] No updates on Toolforge UI
Loki
- [TV] Collecting the logs in Loki is the easy part, the hard part is how to display logs to users. We have a working Loki deployment in lima-kilo that collects all stdout/stderr logs. I hope to deploy this to toolsbeta in the coming weeks, to test it with the Ceph-based S3 storage, and see how much storage it is going to need. Hopefully more interesting updates in a month or two.
Decision request - Tool account management and Striker (phab:T394035)
- [TV] The discussion is a bit stuck. It’s not so urgent, but maybe we can discuss this now.
…
- [BD] Trying to think of the threat model. Striker is now deployed in the prod realm to keep secrets that can interact with LDAP. We want to make it difficult to have access to the LDAP endpoints.
- [TV] Right now Striker can write to any LDAP groups or users, including the groups controlling access to Logstash, etc.
…
- [DC] We need to find a balance between a “fat API” and leaving as much logic as we can in the Toolforge/Striker codebase. What is not clear is the flow. What makes that API very attractive is that users don’t use it directly.
- [TV] Striker would be the only client. Right now we could use a shared secret between the API and Striker. But in the future, e.g. with Toolforge CLI, my concern is that at this point we would need to reimplement a new version of that API but suitable for these clients.
- [DC] The way I see it in the future we could expose what Striker can do with an API integrated with the Toolforge API, and that would in the backend interact with the LDAP API.
…
- [DC] Toolforge API and Striker now are completely separate.
- [AB] Right now Striker is both “View” and “Controller”. Can we split that and create a separate “Controller”?
- [BD] Toolforge API runs inside Toolforge. We have business logic in Striker that we don’t want to run in Toolforge. Another way of looking at it is: can we make Striker part of the Toolforge API? The threat model question is: is it ok to put the creds inside Toolforge API to make writes to LDAP?
…
- [AB] The most minimal thing we can do is a shim that runs in prod and is limited to LDAP, and is limited to things that we expect the Toolforge API can do.
- [DC] What we want to do is split the “Controller” into a secondary controller, with just what Striker needs. Extract a little component and move it to prod so we can leave the other parts.
- [TV] The disagreement is whether the service should only include logic ensuring that just Toolforge-related things can be touched, or whether it should also check that the Toolforge user is allowed to do that specific action.
- [AB] I think if Moritz were to review this (security-wise), he would not care.
…
- [FN] Could we possibly in the future move away from LDAP?
- [BD] Everything is designed around Unix users and groups, so it is theoretically possible but it would require removing not just NFS, but also bastions and anything relying on Unix auth.
- [DC] Have you ever planned to have developer accounts with a “portal” that gives you access both to Toolforge and to other things like Grafana, etc.?
- [AB] The backend is basically already like that, we just miss a single centralized UI.
- [DC] We would still need to connect to IDM even if we had our own datastore for users?
- [AB] If we were starting from scratch, auth would be in LDAP, and group membership would be stored elsewhere. But given we were using SSH, it was very convenient to store tools/group membership in LDAP. That could be in a different LDAP database? I don’t think we need to do that, but we could do that.
- [BD] LDAP is good for both authentication and authorization, SREs are using it and we should not recreate a new tool for doing that.
- [DC] The new LDAP shim should be limited to the tools subtrees, but should have no logic beyond that.
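
To illustrate the last point, here is a minimal sketch of what a "subtree only" check in such a shim could look like. It is not an agreed design: the base DN, the function names and the naive DN parsing are all assumptions made for illustration, and the check deliberately contains no per-user authorization logic.

```python
# Hypothetical base DN for illustration only; the real tools subtrees
# were not named in the meeting.
ALLOWED_SUBTREES = (
    "ou=servicegroups,dc=example,dc=org",
)


def split_dn(dn: str) -> list[str]:
    """Split a DN into lowercase, trimmed RDNs.

    Naive on purpose: it does not handle escaped commas or
    multi-valued RDNs, which a real shim would have to.
    """
    return [part.strip().lower() for part in dn.split(",") if part.strip()]


def in_allowed_subtree(dn: str) -> bool:
    """Return True only if dn sits strictly below one of the allowed bases."""
    target = split_dn(dn)
    for base in ALLOWED_SUBTREES:
        base_rdns = split_dn(base)
        # Compare whole RDNs, not raw string suffixes, so that
        # "ou=evilservicegroups,..." does not match "ou=servicegroups,...".
        if len(target) > len(base_rdns) and target[-len(base_rdns):] == base_rdns:
            return True
    return False
```

Under these assumptions the shim would reject any write whose target DN fails `in_allowed_subtree`, for example a group controlling access to Logstash, and leave the question of whether a given user may perform the action to its caller.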