Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Decrease linter's memory usage #1194
Decrease linter's memory usage #1194
Changes from 8 commits
c79d7b7
f874627
d2aec0a
3e46cfa
513ebf1
4e0e596
21ff2c2
767ba19
6975fb6
ed39e84
11deaa7
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Can we consider increasing the Concurrency?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've not done extensive tests here. We already know the default concurrency of 4 on Github hosted runners is not working for us even when using the build constraints, it still consumes too much memory. Decreasing the concurrency to 1 without employing the build constraints also did not allow us to fit the linter into a hosted runner with 16 GB of physical memory. But using the build constraints and limiting the concurrency to 1 made it possible to fit onto a 16 GB machine. So, looks like we can try increasing the concurrency to 2 or 3.
The full linting is still done with the maximum available concurrency (in our case, with a concurrency of 4) with the constructed analysis cache. This concurrency limitation is only for the first cache construction phase.
Let me try with a concurrency of 2 and 3. We were initially doing our observations with a relatively larger resource provider (
ec2
). Switching toaccount
may have helped.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A concurrency of 4 for the initial analysis cache build phase on cold cache has failed here:
https://github.com/upbound/provider-aws/actions/runs/8246986803/job/22554209938?pr=1194
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a run with a concurrency of 1. It took ~21 min and ~14 GB of peak memory to complete the initial phase:
And here's another run with a concurrency of 3. It took ~14 min ~19 GB of peak memory to complete the initial phase:
Let's switch to a concurrency of 3 for the initial phase (concurrency of 4 has failed). The second phase already uses the max available concurrency.
If we observe any memory issues for the initial phase, we can decrease it later again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I remember correctly, this file was brought up in the context of the fragmentation of the registration phase. It will be needed when we want to add all the resources to the schema. I can't find any use of this function in the repo. Am I missing something, or do we really need this file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As explained here, the code that the linter runner is processing (analyzing) must compile. The API registration code (
apis/zz_register.go
) is one of the hot spots, which imports all the API packages and is thus costly to analyze. Theapis/linter_run.go
keeps the linter happy by providing a definition ofapis.AddToScheme
&apis.AddToSchemes
and the original definitions fromapis/zz_register.go
are excluded by the supplied build constraints to the linter. So, the definitions in this file are only used by the linter and the actual build uses the definitions fromapis/zz_register.go
.This also means that the generated file
apis/zz_register.go
is not analyzed. We can in theory distribute API registration across the individual resource provider (API group) modules but the cross-resource references necessitate a form of dependency resolution as discussed under theImplemented Strategy in this PR
section of the PR description. So avoiding these hot spots at the expense of not linting them is a trade-off we make here.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha! Thanks for the clarification. This part can also be a bit tricky for other contributors. So, I am leaving this comment unresolved for documenting purposes.