Help talk:Toolforge/Build Service

The Toolforge Build Service is currently in beta. This means that it is likely to change in non-backward-compatible ways and/or suffer from outages and bugs. Although we encourage you to try it out and give feedback, please do not use it for critical tools until it becomes more stable and this banner is removed.

Comments from Taavi

Some feedback in no particular order:

  • If pack is something you're intended to be able to run locally, it needs installation instructions (or at least a link to one).
    • related, does the version of pack matter? I'd assume the standard is mostly backwards-compatible, but worth double-checking.
  • Is tools-harbor.wmcloud.org/toolforge/heroku-builder-classic:22 ever expected to change? Is the last number a version or something else?
  • About converting a tool to use buildpacks:
    • Needs instructions on converting a tool to use buildpacks. Currently the guide starts from an app that's possible to build with pack which is not helpful.
    • Also needs details on supported languages and their specifics like installing dependencies.
    • What is a "stage" or a "multi-stage build"? For example if I have a PHP web tool, would that require multiple "stages"? (In general PHP apps use an external web server.)
  • This will need details for the differences existing tool maintainers will see and planned changes to that. Things like sssd differences, NFS plus the future plans of removing it at least partially.
  • I think it's worth adding a very large banner warning that this is experimental and things might be broken for a while. And an explicit mention about the jobs framework not being supported yet. And maybe clarify how the existing possibilities (grid and the current k8s offerings) will be supported once buildpacks are introduced.
  • Workflow-wise, if this is intended for relative newcomers, I think it'd be better to write this around having a tool that's in a Striker-created GitLab repo instead of http://url_to_your_git_repo.
    • and for more widespread adoption, Git itself might be something we need to explain and document.
  • Link to Help:Toolforge/Quickstart for creating a tool and setting up access. You can take inspiration for the wording from the "My first X tool" tutorials.
  • Clarify which commands need to be executed locally on your own machine and which should be executed at a bastion.

Majavah (talk!) 09:17, 3 May 2023 (UTC)

Thanks for the feedback!
> If pack is something you're intended to be able to run locally, it needs installation instructions (or at least a link to one).
Ack, though it's not required in any way, only for local testing (if you want). For reference when editing the page, the link is https://buildpacks.io/docs/tools/pack/
> related, does the version of pack matter? I'd assume the standard is mostly backwards-compatible, but worth double-checking.
Older versions of pack might not work (as the buildpacks use newer features), but anything newer will support the buildpacks/builder for a while, as the standard is quite focused on backwards compatibility (as you guessed).
> Needs instructions on converting a tool to use buildpacks. Currently the guide starts from an app that's possible to build with pack which is not helpful.
> Also needs details on supported languages and their specifics like installing dependencies.
Working on it: https://phabricator.wikimedia.org/T335359
> This will need details for the differences existing tool maintainers will see and planned changes to that. Things like sssd differences, NFS plus the future plans of removing it at least partially.
Working on it: https://phabricator.wikimedia.org/T335357
> Is tools-harbor.wmcloud.org/toolforge/heroku-builder-classic:22 ever expected to change? Is the last number a version or something else?
The tag is the current version of the upstream builder image (we are not building our own for the beta), so yes, it will change over time as upstream changes.
We can introduce a `:latest` tag for the image that always points to the latest version as a workaround, but I don't expect the version to change too often, and we will probably change the name of the builder sooner than that.
Note that building the image locally is not needed for it to work; it's just an easier way to test (especially because we don't provide detailed build logs yet).
> What is a "stage" or a "multi-stage build"? For example if I have a PHP web tool, would that require multiple "stages"? (In general PHP apps use an external web server.)
I guess you mean stacks. This means being able to have Python and PHP on the same image (e.g. if you have PHP code that runs a Python script, or vice versa), or, as a more common use case, having a React + Python app and being able to just pass the source code. As of now, you'll have to first compile the React app, commit the compiled code, and then build the image (npm/JS and Python are different stacks).
If you want to install packages, it's the same: not supported yet and under discussion (https://phabricator.wikimedia.org/T325799).
> I think it's worth adding a very large banner warning that this is experimental and things might be broken for a while. And an explicit mention about the jobs framework not being supported yet.
+1 for the big notice.
> And maybe clarify how the existing possibilities (grid and the current k8s offerings) will be supported when buildpacks will be introduced.
This is still under discussion. For sure the grid will not change at all from its current state; k8s might change, but there's no clear direction yet. I'm hoping that the beta (as an addition to the current offering) will show whether buildpacks are useful or not. If they are not, we will probably scrap the project; if they are, we will then decide how to continue (of course including all Toolforge roots, probably in a mixture of workgroup meetings/tasks/decision requests/...).
Sorry, that became too long, xd. Short answer: current offerings stay as they were, there's just a new offering (webservices built by the build service).
Hmm, so yep, maybe some clarification is due :)
> Workflow-wise, if this is intended for relative newcomers, I think it'd be better to write this around having a tool that's in a Striker-created GitLab repo instead of http://url_to_your_git_repo.
+1 for that, though it should be clear that any public git repo is supported.
> Link to Help:Toolforge/Quickstart for creating a tool and setting up access. You can take inspiration for the wording from the "My first X tool" tutorials.
The idea is to change/add a full tutorial there too, while this page is just for the first beta adopters, yes: https://phabricator.wikimedia.org/T324816
> Clarify which commands need to be executed locally on your own machine and which should be executed at a bastion.
Definitely. David Caro (talk) 09:47, 3 May 2023 (UTC)

Comments from trying it out

Now testing with a tool of my own (db-names, plain PHP):

  • The docker run command from the tutorial fails with unknown flag: --port. Swapping --port to -p works.
  • The PHP buildpack runner defaults to picking a random port. Hopefully that's not going to be an issue?
  • The columns on toolforge build show don't line up, which makes it confusing to read.
  • Please ensure the tool- prefix to Harbor namespaces gets added before people start using this.
  • My build failed!
    • Please include instructions on how to debug failed builds.
    • The tool account doesn't seem to have access to read the logs: toolforge build logs db-names-buildpacks-pipelinerun-jqfng failed with failed to get logs for task build-from-git : task build-from-git failed: pods "db-names-buildpacks-pipelinerun-jqfng-build-from-git-pod" is forbidden: User "db-names" cannot get resource "pods" in API group "" in the namespace "image-build". Run tkn tr desc db-names-buildpacks-pipelinerun-jqfng-build-from-git for more details.
    • The actual failure seems to be this: ERROR: failed to initialize analyzer: validating registry read access: ensure registry read access to tools-harbor.wmcloud.org/db-names/db-names:latest. This seems like an infrastructure issue?

Majavah (talk!) 11:41, 3 May 2023 (UTC)

One more thing: running pack locally printed out a bunch of [restorer] Warning: Buildpack 'heroku/php@0.0.0' requests deprecated API '0.4' warnings. Should we worry about those? It also seems like at least the local build relies on Docker Hub images, does it also do that on the real build pipeline? Majavah (talk!) 11:45, 3 May 2023 (UTC)
And the build finished successfully now after updating toolforge-cli. Another comment and another issue:
  • The build status is shown as 'ok' when it's building. That's confusing, I'd expected that 'ok' would mean 'complete'.
  • Trying to start the webservice fails with an error asking me to set --buildservice-image. I think this is supposed to be fixed in this commit, but the latest version (0.94) is not in the buster-tools repo. I don't know what value to provide it in the meantime.
Majavah (talk!) 13:42, 3 May 2023 (UTC)
> The docker run command from the tutorial fails with unknown flag: --port. Swapping --port to -p works.
I think that the docker option name is actually --publish xd, will change.
> The PHP buildpack runner defaults to picking a random port. Hopefully that's not going to be an issue?
Hmm, that might be an issue. As of right now, the ingress/service is hard-coding port 8000, so the container image has to be listening on that port :/
Will investigate and open a task for it. The main focus for the beta is Python, so I will prioritize that, but I will take a look at PHP too.
> The columns on toolforge build show don't line up, which makes it confusing to read.
Same.
> Please ensure the tool- prefix to Harbor namespaces gets added before people start using this.
This should have been shipped already, yes. Will confirm with Raymond and check.
> Please include instructions on how to debug failed builds.
Unfortunately, the logs can't be shown for everyone without the API, so it will take a bit before we can allow live debugging.
For now the only way to debug is locally with the built image, and after that from the few logs you can get.
> The tool account doesn't seem to have access to read the logs: toolforge build logs db-names-buildpacks-pipelinerun-jqfng failed with failed to get logs for task build-from-git : task build-from-git failed: pods "db-names-buildpacks-pipelinerun-jqfng-build-from-git-pod" is forbidden: User "db-names" cannot get resource "pods" in API group "" in the namespace "image-build". Run tkn tr desc db-names-buildpacks-pipelinerun-jqfng-build-from-git for more details.
You should have gotten an error like "logs are not yet available" or similar :/ is this the bastion that was running an older version of the cli?
> The build status is shown as 'ok' when it's building. That's confusing, I'd expected that 'ok' would mean 'complete'.
It should say 'running' too, shouldn't it? Can you share a screenshot or similar?
> Trying to start the webservice fails with an error asking me to set --buildservice-image. I think this is supposed to be fixed in this commit, but the latest version (0.94) is not in the buster-tools repo. I don't know what value to provide it in the meantime.
Looking, should be deployed. The value is the URL of the image that you built; the example/default should be informative enough, maybe it's in the newer code. Let me try after upgrading. David Caro (talk) 13:46, 3 May 2023 (UTC)
> The tool account doesn't seem to have access to read the logs: toolforge build logs db-names-buildpacks-pipelinerun-jqfng failed with failed to get logs for task build-from-git : task build-from-git failed: pods "db-names-buildpacks-pipelinerun-jqfng-build-from-git-pod" is forbidden: User "db-names" cannot get resource "pods" in API group "" in the namespace "image-build". Run tkn tr desc db-names-buildpacks-pipelinerun-jqfng-build-from-git for more details.
It's shown in the help, but not when trying to run it, will open a task. David Caro (talk) 13:58, 3 May 2023 (UTC)
> Trying to start the webservice fails with an error asking me to set --buildservice-image. I think this is supposed to be fixed in this commit, but the latest version (0.94) is not in the buster-tools repo. I don't know what value to provide it in the meantime.
Upgraded the cli, should have been fixed (and a nice example shown in the help). David Caro (talk) 13:58, 3 May 2023 (UTC)
It does say running on build show, but not on the list: https://phabricator.wikimedia.org/F36973640. Majavah (talk!) 14:06, 3 May 2023 (UTC)
  • The status issue seems to be tracked as task T332099 already.
  • This page will need documentation on how to read the logs from the running webservice pod.
  • In the logs I see sed: couldn't open temporary file .heroku/python/lib/python3.11/site-packages/sed9j7dNe: Permission denied. Will that cause any issues?
Majavah (talk!) 14:33, 3 May 2023 (UTC)
> The status issue seems to be tracked as task T332099 already.
I think it fell through the cracks xd, added the current iteration tag so it's on the radar.
> In the logs I see sed: couldn't open temporary file .heroku/python/lib/python3.11/site-packages/sed9j7dNe: Permission denied. Will that cause any issues?
Is this still happening? Is this at build time or run time? I'd like to debug it (feel free to dump it into a task). David Caro (talk) 16:41, 3 May 2023 (UTC)
Seems to happen every time a Python-based buildpack starts. It's visible in the k8s pod logs. Majavah (talk!) 16:47, 3 May 2023 (UTC)
Saw it, just opened task T335980. David Caro (talk) 16:42, 4 May 2023 (UTC)
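
For anyone debugging the port issue discussed in this thread: the ingress currently expects the container to listen on port 8000, so a quick local check can catch a mismatch before deploying. Below is a minimal sketch in Python, assuming the image was started locally with the container port published on the host as 8000 (for example via docker's --publish option). The is_listening helper is a made-up name for this example and is not part of toolforge-cli.

```python
import socket


def is_listening(host="127.0.0.1", port=8000, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for local testing only; 8000 mirrors the port
    mentioned in this thread, not a guaranteed platform setting.
    """
    try:
        # create_connection raises OSError (refused, timed out, ...) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example: report whether the locally running container accepts connections
print("port 8000 reachable:", is_listening())
```

If this prints False while the container is running, the app is probably listening on a different port or only on the container's loopback interface.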

Notes from testing all Heroku getting started templates

I decided to test all of the Heroku getting started templates locally to see how they work with our setup with a non-standard user and group and without the PORT environment variable.

  • Ruby: does not work as it launches on port 3000. Updating the port setting or passing PORT=8000 manually makes it work.
  • Node.js: does not work as it launches on port 5001. Updating the port setting or passing PORT=8000 manually makes it work.
  • Clojure: does not work as it launches on port 3000. Updating the port setting or passing PORT=8000 manually makes it work.
  • Python: does not work as it binds to 127.0.0.1 by default; adding --bind 0.0.0.0 to the Procfile fixes the issue. The permission error message from phab:T335980 is also logged.
  • Java: does not work as it launches on port 5000. Updating the port setting or passing PORT=8000 manually makes it work.
  • Gradle: does not work as it launches on port 5000. Updating the port setting or passing PORT=8000 manually makes it work.
  • Scala: crashes with a permission error (phab:T335865). If run with the default UID, crashes due to the lack of the PORT environment variable. Setting it to 8000 finally makes the template work.
  • PHP: no getting started template as far as I can see, but we already know the buildpack is broken with the user id setup (phab:T335865).
  • Go: fails as $PORT is not set. Works fine after setting it to 8000.

The main reason for these tests was to see if more buildpacks had permission issues, so I'm pleasantly surprised only PHP and Scala are totally broken. I also now think we should set PORT=8000 at runtime for all buildservice tools, since that seems to be the industry-standard way to indicate which port something should run on. Majavah (talk!) 14:35, 11 May 2023 (UTC)

This is amazing! Thanks a lot!
Yes, I agree that using the PORT env var is the way to go instead of hard-coding port 8000. I'll open a task so I don't forget (it was only in my head).
Should be fairly easy to set it on the webservice side. I'll do that right away as it unblocks many languages. David Caro (talk) 14:38, 11 May 2023 (UTC)
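
To illustrate the PORT convention discussed above, here is a minimal sketch of a Python web app that listens on whatever PORT says and falls back to 8000 when the variable is unset. It uses only the standard library. resolve_bind_address and HelloHandler are hypothetical names for this example, and the 8000 fallback simply mirrors the port mentioned in this thread rather than a guaranteed platform default.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def resolve_bind_address(environ=os.environ, default_port=8000):
    """Return (host, port), honouring PORT and falling back to default_port."""
    port = int(environ.get("PORT", default_port))
    # Bind on all interfaces: a server bound only to 127.0.0.1 is not
    # reachable from outside the container.
    return "0.0.0.0", port


class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"Hello from a PORT-aware tool\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def main():
    host, port = resolve_bind_address()
    HTTPServer((host, port), HelloHandler).serve_forever()

# Call main() from your entry point to start serving.
```

With this pattern the same code works whether the platform sets PORT or not, which is what most of the templates tested above needed.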