User:Taavi/EnhancementProposals/Toolforge API OAuth 2 support
Material may not yet be complete, information may presently be omitted, and certain parts of the content may be subject to radical, rapid alteration. More information pertaining to this may be available on the talk page.
This document proposes a new OAuth 2.0 compatible system for authenticating requests made to Toolforge APIs[1], independent of the Kubernetes user certificates that are currently used for this purpose. The core OAuth protocol is well standardized and mature, so adopting it lets us re-use existing, mature libraries instead of building a custom implementation.
Background
Toolforge is moving from a Grid Engine based platform to a Kubernetes based platform.[2] As part of the migration process, we're building new Toolforge components with an API based design, where the user-facing interfaces (such as CLI tools) interact with a custom-built API service, which in turn interacts with Kubernetes and other software doing the actual work.[3] Decoupling the logic from the CLI tools unlocks an opportunity to build alternative frontends to manage tools, for example adding a web-based management interface to the Toolforge admin console.
Currently there is an API gateway, implemented in an earlier enhancement proposal, which authenticates the requests using client certificates issued by maintain-kubeusers. These certificates were originally created to authenticate connections to the Kubernetes API directly, and are not easily available outside NFS-connected Toolforge hosts. In order to use the APIs for non-CLI use cases, the authentication system needs to be improved.
Proposal
TODO, see also https://phabricator.wikimedia.org/T332478
Auth server
There will be a new Toolforge infrastructure component, the auth server (working title, exact name TBD due to [1]). The auth server implements an OAuth 2.0 (RFC 6749) compatible interface to request a Bearer token (RFC 6750) that can be used with requests to the individual APIs. Initially, the only way to authenticate a request for a Bearer token will be with the existing Kubernetes client certificates, but the intention is to extend this authentication mechanism to support future use cases.
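As a rough illustration of what a client would send, the sketch below builds a form-encoded token request body in the shape RFC 6749 defines and the `Authorization` header that RFC 6750 defines for using the resulting token. The choice of the `client_credentials` grant and the scope names `jobs` and `envvars` are assumptions made for this example only; the proposal does not fix either.

```python
from urllib.parse import urlencode


def build_token_request(scopes):
    """Build the form-encoded body of an OAuth 2.0 token request.

    Assumption: the client_credentials grant is used, with the client
    itself authenticated by its TLS client certificate.
    """
    return urlencode({
        "grant_type": "client_credentials",
        # RFC 6749 represents multiple scopes as one space-delimited string.
        "scope": " ".join(scopes),
    })


def bearer_header(token):
    """Build the header used to present a Bearer token to an API."""
    return {"Authorization": f"Bearer {token}"}
```

With these helpers, `build_token_request(["jobs", "envvars"])` yields `grant_type=client_credentials&scope=jobs+envvars`, which would be sent with the `application/x-www-form-urlencoded` content type.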
The auth server will require some custom business logic (for example, to map tool maintainers to tools). It's otherwise going to be quite simple, so I propose we build it ourselves instead of using one of the existing 'cloud-native' solutions, which are usually quite complicated and need a fair bit of configuration for custom use cases like these.
Essentially the auth server needs to do three things:
- Authenticate incoming requests and determine which user is making each request
- Check whether that user can access the requested tool and the specific scopes being requested
- Issue a token
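The three steps above can be sketched as a small pipeline. Everything here is hypothetical: the `MAINTAINERS` and `ALLOWED_SCOPES` tables, the tool and scope names, and the idea that the user name is read from an already verified client certificate are assumptions for illustration, not part of the proposal.

```python
from dataclasses import dataclass

# Hypothetical data. A real auth server would look this up elsewhere.
MAINTAINERS = {"example-tool": {"alice", "bob"}}
ALLOWED_SCOPES = {"example-tool": {"jobs", "envvars"}}


class AuthError(Exception):
    """Raised when a request cannot be authenticated or authorized."""


@dataclass(frozen=True)
class TokenGrant:
    user: str
    tool: str
    scopes: frozenset


def authenticate(cert_common_name):
    """Step 1: map the (already verified) client certificate to a user."""
    if not cert_common_name:
        raise AuthError("no client certificate presented")
    return cert_common_name


def authorize(user, tool, requested_scopes):
    """Step 2: check the user may act for the tool with these scopes."""
    if user not in MAINTAINERS.get(tool, set()):
        raise AuthError(f"{user} is not a maintainer of {tool}")
    extra = set(requested_scopes) - ALLOWED_SCOPES.get(tool, set())
    if extra:
        raise AuthError(f"scopes not allowed: {sorted(extra)}")


def issue(user, tool, requested_scopes):
    """Step 3: produce the grant that would be encoded into a token."""
    return TokenGrant(user, tool, frozenset(requested_scopes))


def handle_token_request(cert_common_name, tool, requested_scopes):
    """Run the three steps in order; any failure raises AuthError."""
    user = authenticate(cert_common_name)
    authorize(user, tool, requested_scopes)
    return issue(user, tool, requested_scopes)
```

Keeping the steps as separate functions means a later authentication mechanism could replace `authenticate` without touching the authorization or issuing logic.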
Bearer tokens
The backend API services will need to be able to verify the bearer token and get the necessary information from it. There are essentially two options for the token format:
- Make the token contents random and meaningless, store the valid tokens somewhere and use Token Introspection (RFC 7662) in the services to verify whether the tokens are valid.
  - Pros: Tokens can be easily revoked.
  - Cons: The auth server needs to track active tokens and becomes a SPOF. More complicated to implement.
- Use a signed format like JWTs to include all of the information in the token itself.
  - Pros: Easier to implement. The auth server does not need to verify each request.
  - Cons: Tokens can't be revoked once issued. We would possibly need to store renewal tokens (if we end up needing those) anyway.
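To make the second option concrete, here is a minimal sketch of a self-contained signed token in the JWT compact format, using an HMAC-SHA256 (HS256) signature and only the Python standard library. It is an illustration under assumptions, not a proposed implementation: the claim names are made up, a real deployment would use a maintained JWT library, and it might well prefer an asymmetric signature so that backend services only hold a public key.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64(data: bytes) -> str:
    # JWTs use URL-safe base64 without padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _unb64(text: str) -> bytes:
    # Restore the padding that _b64 stripped before decoding.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(claims: dict, key: bytes) -> str:
    """Auth server side: encode the claims and sign them."""
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [
        _b64(json.dumps(header, separators=(",", ":")).encode()),
        _b64(json.dumps(claims, separators=(",", ":")).encode()),
    ]
    signing_input = ".".join(parts)
    sig = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64(sig)


def verify_token(token: str, key: bytes, now=None) -> dict:
    """Backend side: check signature and expiry without calling the auth server."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
    except ValueError:
        raise ValueError("malformed token")
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    expected = hmac.new(key, signing_input, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(expected, _unb64(sig_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_unb64(payload_b64))
    current = time.time() if now is None else now
    if claims.get("exp", 0) <= current:
        raise ValueError("token expired")
    return claims
```

The expiry check is what stands in for revocation in this design: because an issued token cannot be recalled, a short `exp` lifetime bounds how long a leaked token stays usable.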
Scopes
Each token will be limited, using OAuth scopes, to the specific services it needs to access. This ensures that service accounts only have access to the services they really need.
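A backend service could enforce this with a check along the following lines. This is a sketch only: the scope names `jobs` and `envvars` and the `scope` claim key are assumptions, while the space-delimited, case-sensitive scope string follows RFC 6749.

```python
def has_scope(token_scope: str, required: str) -> bool:
    """Return True if the space-delimited scope string grants `required`."""
    return required in token_scope.split()


def require_scope(claims: dict, required: str) -> None:
    """Reject a request whose token does not carry the scope this service needs."""
    if not has_scope(claims.get("scope", ""), required):
        raise PermissionError(f"token lacks required scope: {required}")
```

For example, a token whose claims contain `{"scope": "jobs"}` would pass `require_scope(claims, "jobs")` but raise `PermissionError` for `require_scope(claims, "envvars")`.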
TBD details
Deployment details
TBD, depending on its database and networking needs
Workflow flowcharts
- Current workflow
- Proposed workflow
Footnotes
- [1] For example the jobs-framework-api, the Build Service API and the envvars API.
- [2] News/Toolforge Grid Engine deprecation
- [3] Wikimedia Cloud Services team/EnhancementProposals/Toolforge API gateway