deploy should only upload a single tarball #192

Open
boegel opened this issue Jun 28, 2023 · 5 comments
boegel (Contributor) commented Jun 28, 2023

I think it makes sense for the bot to combine all tarballs that should be uploaded into a single tarball before it does the upload; that would limit the overhead in the staging repo quite a bit.
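
For illustration, the combining itself could be as simple as something like this (just a minimal sketch using Python's tarfile module, not the bot's actual code; the filenames are made up):

```python
import tarfile

def combine_tarballs(input_paths, output_path):
    """Merge the members of several tarballs into a single gzipped tarball."""
    with tarfile.open(output_path, "w:gz") as combined:
        for path in input_paths:
            with tarfile.open(path, "r:*") as src:
                for member in src.getmembers():
                    # extractfile() returns None for directories/symlinks,
                    # which addfile() accepts as well
                    fileobj = src.extractfile(member) if member.isfile() else None
                    combined.addfile(member, fileobj)

# hypothetical filenames, one tarball per CPU target
combine_tarballs(
    ["eessi-x86_64-generic.tar.gz", "eessi-x86_64-amd-zen2.tar.gz"],
    "eessi-combined.tar.gz",
)
```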

trz42 (Contributor) commented Jun 28, 2023

To me this sounds like jumping to a solution without describing the problem. There may be pros and cons with the current way tarballs are uploaded as well as with a different method.

boegel (Contributor, Author) commented Jun 29, 2023

Sorry for being a bit brief, I mostly opened this issue as a way to start the discussion on this. ;)

This is mainly motivated by the "storm" of pull requests to the staging repo every time a bot:deploy label is added in a software-layer PR.
If this happens for two or more PRs at the same time, you get a bit of a mess of PRs in the staging repo.

Moreover, getting separate PRs in the staging repo for each CPU target may quickly lead to "review/merge fatigue": we could end up just hitting merge rather than actually taking a peek at the contents of the tarball first, which will eventually lead to mistakes.

That said, I do admit there are downsides to only having a single tarball being uploaded as well, so I'm not sure what the best option is.

To me, a single tarball would be better than 8 separate ones, or at least it's worth a try - we can always switch back.

trz42 (Contributor) commented Jun 29, 2023

I see. The "storm" of PRs is created by the automated_ingestion.py script running on the Stratum-0.

In NESSI/staging, where we only get 6 PRs per software-layer PR because we only build for 6 CPU architectures, we manually label PRs with something like 'NESSI/2023.06' and 'SW/version'. The former is a bit of duplication (the version is in a PR's title); the latter adds information that currently cannot be easily seen in the PR overview.

I think it's worthwhile to discuss possible improvements. However, I'd also like to see the whole picture: how does a change affect the other steps in the overall procedure of creating, (re)building, (re)testing, (re)uploading, ingesting, and testing a PR?

A first assessment of the suggested change:

Pros:

  • fewer PRs to staging repo
  • easier to keep overview for those managing PRs in staging repo
  • fewer REST API calls to GitHub when processing tarballs

Cons:

  • larger tarballs may take longer to transfer (thus more likely to fail)
  • they also take longer to process on the Stratum-0 (verify checksum, analyse contents, ingestion)
  • probably more cumbersome to reject the ingestion of a tarball for a single architecture
    • though it's not clear if we want/need this ...
    • in NESSI, we have frequently had such situations in the past, on purpose, because we built for the same architecture on different clusters ...
    • particularly for generic builds this might still be interesting, to detect potential issues
    • could be made obsolete with a bot command deploy [FILTER]* that instructs the bot to only upload specific builds (see the sketch after this list)
  • testing on a different node than the build node/cluster becomes harder, because the test job can only download a tarball (from the S3 bucket) that includes the builds for all architectures, even though it would only test a single one
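
To make that last point a bit more concrete, such a deploy [FILTER]* command could boil down to a simple pattern match over the tarball names (purely a hypothetical sketch, this is not an existing bot feature):

```python
import fnmatch

def select_tarballs(tarball_names, filters):
    """Keep only tarballs whose name matches at least one filter pattern;
    no filters means: upload everything (the current behaviour)."""
    if not filters:
        return list(tarball_names)
    return [name for name in tarball_names
            if any(fnmatch.fnmatch(name, f"*{pattern}*") for pattern in filters)]

# e.g. `deploy generic` could then translate to:
select_tarballs(["eessi-x86_64-generic.tar.gz", "eessi-x86_64-amd-zen2.tar.gz"], ["generic"])
```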

boegel (Contributor, Author) commented Jul 4, 2023

During the bot sync meeting this morning, @bedroge pointed out that the issue isn't so much multiple tarballs, but multiple PRs to the staging repo.
So if we could somehow make only a single PR to the staging repo for a collection of tarballs (that arise from the same PR), I think that would already be a big help, and it would also counter some of the cons raised by @trz42.
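
Roughly speaking, the ingestion script could group the uploaded tarballs by the software-layer PR they originate from, and open one staging PR per group. A minimal sketch (assuming, purely for illustration, that the PR number is encoded in the tarball filename, which is not necessarily the actual naming scheme):

```python
import re
from collections import defaultdict

# assumption for this sketch: the source PR number appears in the filename,
# e.g. "eessi-...-pr192-....tar.gz"
PR_PATTERN = re.compile(r"pr(\d+)")

def group_by_source_pr(tarball_names):
    """Group tarball names by the software-layer PR they came from."""
    groups = defaultdict(list)
    for name in tarball_names:
        match = PR_PATTERN.search(name)
        groups[match.group(1) if match else "unknown"].append(name)
    return groups

# one staging PR could then be opened per group, listing all tarballs
# that belong to the same software-layer PR
```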

bedroge (Contributor) commented Jul 5, 2023

An advantage of having a single tarball is that the ingestion is more atomic: it will be added in one transaction for all architectures (or fail for all in case of an issue). With the current approach the repository is in some "weird" state for a while, where the software may be available for certain architectures, but not yet for others. In case of some issue with the Stratum 0 / ingestion, it could take even longer.
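
In other words, with a single tarball the ingestion could look roughly like this (a rough sketch of the usual cvmfs_server transaction/publish/abort workflow, not our actual ingestion script; the repository name is a placeholder):

```python
import subprocess
import tarfile

def ingest_atomically(tarball_path, repo="software.example.org"):
    """Ingest one combined tarball in a single CVMFS transaction:
    either all architectures become available, or none do."""
    subprocess.run(["cvmfs_server", "transaction", repo], check=True)
    try:
        with tarfile.open(tarball_path, "r:*") as tar:
            tar.extractall(path=f"/cvmfs/{repo}")
        subprocess.run(["cvmfs_server", "publish", repo], check=True)
    except Exception:
        # roll back the transaction so the repository is never left half-updated
        subprocess.run(["cvmfs_server", "abort", "-f", repo], check=True)
        raise
```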
