User:Taavi/EnhancementProposals/Toolforge API authentication decoupling MVP

From Wikitech
This page is currently a draft.
Material may not yet be complete, information may presently be omitted, and certain parts of the content may be subject to radical, rapid alteration. More information pertaining to this may be available on the talk page.

This document is a proposal to implement a service to authenticate requests made to Toolforge APIs[1] independent of the Kubernetes user certificates that are currently used for this purpose.

Background

Toolforge has recently migrated from a Grid Engine-based platform to a Kubernetes-based platform.[2] As a part of the migration process, we're building new Toolforge components with an API-based design, where the user-facing interfaces (such as CLI tools) interact with a custom-built API service which in turn interacts with Kubernetes and other software doing the actual work.[3] Decoupling the logic from the CLI tools unlocks an opportunity to build alternative frontends to manage tools, such as a web-based management interface to the Toolforge admin console or a CLI tool that users can install on their local workstations.

Currently there is an API gateway, implemented in an earlier enhancement proposal, which authenticates requests using client certificates issued by maintain-kubeusers. These certificates were originally created to authenticate connections to the Kubernetes API directly, and are not easily available outside NFS-connected Toolforge hosts. In order to use the APIs for anything other than CLI tools running on Toolforge hosts, the authentication system needs to be improved.

At the time of writing, there is an ongoing decision request discussing interfaces to the LDAP directory that acts as the canonical data store for tool account and tool membership information. The proposal outlined in this document should be implemented only after the changes decided in that decision request have been implemented.[4]

Proposal

The primary goal of this proposal is to remove the direct dependency on Kubernetes API certificates. It does not aim to satisfy every use case; rather, the idea is to build a solid foundation that can then easily be extended for those use cases.

New service

There will be a new Toolforge infrastructure component, the auth server (working title, exact name TBD due to [1]).

The initial service will implement a single HTTP endpoint that decides whether a request is authorized. The API gateway calls this endpoint once for each incoming request and passes two pieces of information:

  • The Authorization HTTP header forwarded from the original HTTP request, with the format [Scheme] [Scheme-specific data]. The authentication schemes will be specified separately, with the section #Infrastructure tokens of this document defining the only scheme supported in the initial implementation. Each authentication scheme is responsible for extracting the developer/tool account the request is coming from; shared code then determines whether that user/tool can act on the target specified in the request context.
  • Context for the request, which the API gateway will create internally. In the initial implementation, this context will include the tool name the request is acting on.[5]

Based on that information, the new service will make a decision on whether the request is permitted or not.
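The decision flow described above can be sketched roughly as follows. This is a minimal illustration rather than the proposed implementation: the names `Identity`, `RequestContext`, `authorize` and `can_act_on` are hypothetical, the in-memory `maintainers` mapping stands in for the real tool membership data, and the permission rule (a tool may act only on itself, a developer only on tools they maintain) is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Set, Tuple


@dataclass(frozen=True)
class Identity:
    """Account extracted by an authentication scheme (hypothetical shape)."""
    kind: str  # "tool" or "developer"
    name: str


@dataclass(frozen=True)
class RequestContext:
    """Context created by the API gateway (hypothetical shape)."""
    tool: str  # name of the tool the request is acting on


# A scheme handler receives the scheme-specific data and returns the
# identity of the caller, or None if the credentials are not valid.
SchemeHandler = Callable[[str], Optional[Identity]]


def parse_authorization(header: str) -> Optional[Tuple[str, str]]:
    """Split '[Scheme] [Scheme-specific data]' into its two parts."""
    parts = header.strip().split(None, 1)
    if len(parts) != 2:
        return None
    scheme, data = parts
    # HTTP authentication scheme names are case-insensitive.
    return scheme.lower(), data


def can_act_on(identity: Identity, context: RequestContext,
               maintainers: Dict[str, Set[str]]) -> bool:
    """Shared check, independent of the scheme (assumed rule, see lead-in)."""
    if identity.kind == "tool":
        return identity.name == context.tool
    return identity.name in maintainers.get(context.tool, set())


def authorize(header: str, context: RequestContext,
              schemes: Dict[str, SchemeHandler],
              maintainers: Dict[str, Set[str]]) -> bool:
    """Return True if the request is permitted, False otherwise."""
    parsed = parse_authorization(header)
    if parsed is None:
        return False
    scheme, data = parsed
    handler = schemes.get(scheme)
    if handler is None:
        return False  # unknown scheme
    identity = handler(data)
    if identity is None:
        return False  # scheme rejected the credentials
    return can_act_on(identity, context, maintainers)
```

Keeping scheme handlers separate from the shared permission check means a new scheme only has to answer "who is this?", which matches the split of responsibilities described in the list above.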

Infrastructure tokens

An infrastructure token is a scheme to authenticate requests from trusted clients that can act as any tool account.[6] Infrastructure tokens are intended for infrastructure components not acting on a specific user request (e.g. writing new tool-specific database credentials as envvars), as well as other trusted code that uses some other method to securely identify the client (e.g. Striker).

Infrastructure tokens are implemented as short-lived JSON Web Tokens that encode information about the tool or developer that the infrastructure client is acting on behalf of.
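As an illustration of what such a token could look like, the sketch below issues and verifies a short-lived JWT using only the Python standard library. Several details are assumptions made for the example and are not specified by this proposal: the HS256 (shared secret) signing algorithm, the claim layout (`iss` for the infrastructure client, `sub` for the account being acted as), the 60 second default lifetime, and the function names `issue_token` and `verify_token`. A real implementation would more likely use an established JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as used by JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding first."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(key: bytes, client: str, acting_as: str,
                ttl: int = 60, now: Optional[int] = None) -> str:
    """Create a signed token stating that `client` acts as `acting_as`."""
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    # Hypothetical claim layout, chosen for this example only.
    claims = {"iss": client, "sub": acting_as, "iat": now, "exp": now + ttl}
    signing_input = (_b64url(json.dumps(header).encode()) + "."
                     + _b64url(json.dumps(claims).encode()))
    signature = hmac.new(key, signing_input.encode("ascii"),
                         hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def verify_token(key: bytes, token: str,
                 now: Optional[int] = None) -> Optional[dict]:
    """Return the claims if the token is valid and unexpired, else None."""
    now = int(time.time()) if now is None else now
    try:
        header_b64, claims_b64, sig_b64 = token.split(".")
        signature = _b64url_decode(sig_b64)
        expected = hmac.new(key, f"{header_b64}.{claims_b64}".encode("ascii"),
                            hashlib.sha256).digest()
        # Check the signature before trusting any of the token content.
        if not hmac.compare_digest(signature, expected):
            return None
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(claims_b64))
    except ValueError:
        return None  # malformed token
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        return None
    if not isinstance(claims, dict):
        return None
    exp = claims.get("exp")
    if not isinstance(exp, int) or now >= exp:
        return None  # missing expiry or expired
    return claims
```

For example, `issue_token(key, "striker", "tool:mytool")` would produce a token that `verify_token(key, token)` accepts for the next minute and rejects afterwards. The `tool:` prefix in the subject is likewise only an illustrative convention.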

Backwards compatibility with certificate-based authentication

The API gateway will use an infrastructure token to authenticate requests that use Kubernetes client certificates for authentication.
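One way to picture this compatibility path is sketched below. It assumes the gateway has already verified the client certificate and extracted the tool name from it, that the scheme is named `Infrastructure`, and that a request carrying its own Authorization header takes precedence over the certificate. None of these details are fixed by this proposal. The function name `upstream_authorization` and the injected `mint_token` callable are hypothetical.

```python
from typing import Callable, Mapping, Optional


def upstream_authorization(headers: Mapping[str, str],
                           cert_tool_name: Optional[str],
                           mint_token: Callable[[str], str]) -> Optional[str]:
    """Pick the Authorization value the gateway passes to the auth service.

    headers:        headers of the incoming request
    cert_tool_name: tool name taken from an already verified client
                    certificate, or None if no certificate was presented
    mint_token:     callable returning an infrastructure token for a tool
    """
    existing = headers.get("Authorization")
    if existing:
        # The client already uses one of the new schemes: forward as-is.
        return existing
    if cert_tool_name:
        # Legacy client: the gateway vouches for the certificate identity
        # by minting an infrastructure token on the tool's behalf.
        return "Infrastructure " + mint_token(cert_tool_name)
    return None  # no credentials at all
```

With this arrangement the auth service only ever sees Authorization headers, so certificate handling stays confined to the gateway and can be removed later without touching the new service.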

Client code

toolforge-weld should be updated to support the new authentication methods described here if required by its users.

Other options considered

idp.wikimedia.org

Implement this logic directly in the API gateway

Internal user catalog

This section is a partial list of changes required for the various infrastructure components using services behind the Toolforge API gateway.

replica-cnf-api-service
This service runs on the NFS server and is responsible for writing newly generated Wiki Replicas and ToolsDB credentials to a file on the NFS system as well as to envvars. The envvar writing logic could be migrated from using the Kubernetes API certificates in the tool home directory to using an infrastructure token.
Components API
The components API currently uses a Kubernetes client certificate with special "superuser" handling in the API gateway. It would be migrated to use an infrastructure token.
Toolforge CLIs
No changes are required for now, although these changes allow adding an option for users with admin certificates to select a tool to act on without having to use become.
Striker
Striker does not currently call any of the Toolforge APIs. The improvements proposed in this document are largely seen as a prerequisite for implementing such features, which would use an infrastructure token to call the APIs.

Further development options

User tokens

OAuth web authentication flow

Components API push-to-deploy tokens

Tool account management APIs

The write APIs exposed by the new tool account management server should be migrated to use this API so that non-Striker users can manage tools as well.

Striker APIs

We may wish to expose APIs for data canonically managed in Striker (tool information, Phabricator projects, GitLab repositories). Such APIs would use this new service to authenticate requests.

Footnotes

  1. For example the Jobs framework API, the Build service API and the envvars API.
  2. News/Toolforge Grid Engine deprecation
  3. Wikimedia Cloud Services team/EnhancementProposals/Toolforge API gateway (2023)
  4. Based on current discussion, this document is written with the assumption that Option Purple or a variant of that would be implemented.
  5. The API gateway code already parses the target tool name from the request URL.
  6. Think of user impersonation in Kubernetes