Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cirrus: use fastvm for builds #24120

Merged
merged 8 commits into from
Oct 14, 2024

Conversation

Luap99
Copy link
Member

@Luap99 Luap99 commented Oct 1, 2024

Builds now take over 10 mins, given golang compilation is parallelized by default we can give more cores to speed it up.

Does this PR introduce a user-facing change?

None

@Luap99 Luap99 added the No New Tests Allow PR to proceed without adding regression tests label Oct 1, 2024
@openshift-ci openshift-ci bot added release-note-none approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Oct 1, 2024
@Luap99
Copy link
Member Author

Luap99 commented Oct 1, 2024

This seems to safe around 3mins, not a lot but at least something. I will look to see what parts of the build jobs are slow to see if there are some ways to make it faster because AFAICT most time does not seem to be spend compiling so I think there is some room in our docs/validation process.

@Luap99 Luap99 force-pushed the cirrus-build-speed branch from 898332b to 5812f8e Compare October 7, 2024 12:56
@Luap99 Luap99 marked this pull request as draft October 7, 2024 13:24
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 7, 2024
@Luap99 Luap99 force-pushed the cirrus-build-speed branch from 5812f8e to 5d2390e Compare October 7, 2024 13:44
Copy link

Cockpit tests failed for commit 5d2390e07c1bfde20bc48295974897e66f371e06. @martinpitt, @jelly, @mvollmer please check.

@Luap99 Luap99 force-pushed the cirrus-build-speed branch 7 times, most recently from 8ac4299 to d9e0107 Compare October 8, 2024 13:25
Copy link

Cockpit tests failed for commit d9e0107f9610b01eb890196e64e8276be63182c6. @martinpitt, @jelly, @mvollmer please check.

@Luap99 Luap99 force-pushed the cirrus-build-speed branch from d9e0107 to 2e5ed0b Compare October 8, 2024 16:32
Luap99 added a commit to Luap99/automation_images that referenced this pull request Oct 10, 2024
Windows does not have zstd by default so we need to install it. In
particular I am looking at switching the repo archive to zstd as this
makes things much faster (over 1min in podman)[1] but the windows
testing is unable to extract that. While archiver added zstd support a
while back it is not in the version that is on chocolatey which seems a
bit out of date.

[1] containers/podman#24120

Signed-off-by: Paul Holzinger <[email protected]>
@Luap99 Luap99 force-pushed the cirrus-build-speed branch 3 times, most recently from 258f33f to 3f3f597 Compare October 10, 2024 15:50
@Luap99
Copy link
Member Author

Luap99 commented Oct 10, 2024

This should now safe around more than 4min of total CI time now.

Another PR on main with a Finished time of 10:22 from cirrus:
image

This PR with a Finished time of 05:51 from cirrus:
image

@Luap99
Copy link
Member Author

Luap99 commented Oct 10, 2024

@edsantiago PTAL (low prio) I think this is good to review, of course this will fail CI until #24227 is merged then I will rebase and mark as ready.

Builds now take over 10 mins, given golang compilation is parallelized
by default we can give more cores to speed it up.

Signed-off-by: Paul Holzinger <[email protected]>
This target runs several scripts in serial but they do not have any
dependencies so we can split them all into their own target so that make
-j can run the targets in parallel to speed this up.

Signed-off-by: Paul Holzinger <[email protected]>
The doc generation and the validate-binaries target can be run in
parallel as they do not depend on each other and a specific ordering. As
such we pass -j $(nproc) but also --output-sync=target to ensure the
output is not intermixed between several targets which could be harder
to read in case of errors.

Hower dus the complex podman-release target we can run podman-release
and validate-binaries at the same time as the dependencies are not right
and we run podman-release first in order to get the correct binaires
build.

Signed-off-by: Paul Holzinger <[email protected]>
The current podman-release-%.tar.gz target does a lot more then just
checking if we can build for the given arch, in particular it first
builds a local podman-remote for the remote-docs.sh script. This makes
things slow as we compile several things and then builda and package the
docs. Given the docs are not arch specific there is realy no point in
doing all that work. All we care about is if the bianries can build on
other arches to catch compile issue for otherwise untested arches.

This should make the CI Alt Arch. tasks much faster.

Signed-off-by: Paul Holzinger <[email protected]>
In particular the main build task already did a make vendor and a
regeneration of the completion scripts. This means the first tre_status
would pick up both changes so the suggestion would be off. And rerunning
the same thing again here just makes thing slower than they need to be.
In particular there was the bug that make completion even rebuild podman
because generate-bindings obviously updates the timestamps of the files
as they are overwritten.

We do however must run generate-bindings as it was not run before.

Signed-off-by: Paul Holzinger <[email protected]>
The script for aarch is exactly the same so there doesn't seem to be a
reason to duplciate it.

Signed-off-by: Paul Holzinger <[email protected]>
The repo tar process took over 1:20 min, with zstd it takes less than
10s so we safe over a minute by doing this.

Signed-off-by: Paul Holzinger <[email protected]>
@Luap99 Luap99 force-pushed the cirrus-build-speed branch from 3f3f597 to 9e35fea Compare October 11, 2024 09:25
@Luap99 Luap99 marked this pull request as ready for review October 11, 2024 09:26
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 11, 2024
Copy link
Member

@lsm5 lsm5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

.cirrus.yml Outdated
time tar xjf /tmp/repo.tbz -C $GOSRC
echo "$ARTCURL/Build%20for%20${DISTRO_NV}/repo/repo.tar.zst"
time $ARTCURL/Build%20for%20${DISTRO_NV}/repo/repo.tar.zst
time tar --zstd -xf /tmp/repo.tar.zst -C $GOSRC
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From info tar:

Reading compressed archive is even simpler: you don't need to specify
any additional options as GNU ‘tar’ recognizes its format automatically.

I have a slight predilection for eliminating the --zstd, and just letting tar figure it out itself. WDYT?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can try that and see if it works, the one thing that really messed me up is that we extract on macos as well and there is no gnu tar there so I am not sure if that can figure it out too. If not we end up with a case where one tar command has it and the other does not which might cause more confusion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh good point. I hate macs so much.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add this as extra commit let's see if it works or not, bsd tar (used on macos) can do that to per google search in theory

I tried to use tar on my local mac to test it but I just got this error and did not want to spend more time on it to figure out what is wrong there...

tar: Can't launch external program: zstd --no-check -3

If doesn't work I can just revert the commit and if it does work I can squash it into the zstd change if you prefer that.

Copy link
Member

@edsantiago edsantiago left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I'm curious about the tree_status changes, especially how much we can rely on the state being what you describe, but will go along with it.

Copy link
Contributor

openshift-ci bot commented Oct 14, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: edsantiago, lsm5, Luap99

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [Luap99,edsantiago,lsm5]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

tar should be smart enough to check the magic byte and use the correct
decompression algo based on that so there is no need to spell it out
explictly.

Signed-off-by: Paul Holzinger <[email protected]>
@edsantiago
Copy link
Member

/lgtm

no need to squash. Thank you!

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 14, 2024
@openshift-merge-bot openshift-merge-bot bot merged commit 11ab0b7 into containers:main Oct 14, 2024
81 checks passed
@Luap99 Luap99 deleted the cirrus-build-speed branch October 14, 2024 16:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. No New Tests Allow PR to proceed without adding regression tests release-note-none
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants