[FEATURE] [CI] Cache build artifacts to reduce build times #2188

Open
Swiddis opened this issue Oct 1, 2024 · 4 comments
Labels
enhancement (New feature or request), help wanted (Extra attention is needed)

Comments

@Swiddis
Collaborator

Swiddis commented Oct 1, 2024

Is your feature request related to a problem?
Coming from #2187, build times are very slow because of time taken to repeatedly compile the same modules. These build artifacts are stored in various /target directories around the OSD directory. A quick search shows 4 general glob patterns (relative to /OpenSearch-Dashboards):

  • src/core/target
  • src/plugins/*/target
  • packages/*/target
  • plugins/*/target <-- We are here

Somehow, we should cache these build artifacts.

What solution would you like?
The tricky thing about a global cache that matches all of these directories' source is that, in aggregate, these directories change very often, multiple times per day. This means trying to put all of them in one cache is going to be mostly fruitless. But individually, these different packages don't get changed that often: lots of the core plugins and packages have been stable for months or sometimes years, and the "hotspots" for code changes drift over time.

Ideally, we would be able to store one separate cache for each of these modules. This isn't supported very well by the cache action, however. I generally see two options in this direction:

  • Write a script to code-generate a large cache action YML that will cache all of these directories individually
  • Try to write a bash script directly in the action (I tried this in "Add compile step before Cypress runs in CI" #2187, but it didn't work for reasons I haven't fully debugged)
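
As a rough illustration of the first option, here is a minimal Python sketch of a generator that emits one cache step per module. It is not the approach tried in #2187. The function names, the `target-` key prefix, the `OpenSearch-Dashboards/` checkout path, and the choice to hash each module's whole directory are all assumptions made for the example; only `actions/cache` with its `path` and `key` inputs and the `hashFiles` expression are existing GitHub Actions features.

```python
from pathlib import Path

# Glob patterns from the issue, relative to the OpenSearch-Dashboards root.
TARGET_GLOBS = [
    "src/core/target",
    "src/plugins/*/target",
    "packages/*/target",
    "plugins/*/target",
]


def find_modules(root: Path) -> list[str]:
    """Return the module directories that own a target dir, sorted."""
    modules = set()
    for pattern in TARGET_GLOBS:
        # Drop the trailing "/target" so a module is found before its first build.
        module_pattern = pattern.rsplit("/", 1)[0]
        for path in root.glob(module_pattern):
            if path.is_dir():
                modules.add(path.relative_to(root).as_posix())
    return sorted(modules)


def cache_step(module: str) -> str:
    """Render one actions/cache step keyed on the module's own files.

    Assumption: the step runs right after checkout, so the hashed
    directory contains only source files at that point.
    """
    name = module.replace("/", "-")
    return "\n".join([
        f"- name: Cache {module}/target",
        "  uses: actions/cache@v4",
        "  with:",
        f"    path: OpenSearch-Dashboards/{module}/target",
        # hashFiles is evaluated by GitHub Actions, not by this script.
        f"    key: target-{name}-${{{{ hashFiles('OpenSearch-Dashboards/{module}/**') }}}}",
    ])
```

Joining `cache_step(m)` for every `m` in `find_modules(Path("OpenSearch-Dashboards"))` would give the list of steps to paste into (or write out as) the workflow file. Each module then gets its own cache entry, so a change in one plugin only invalidates that plugin's cache.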

What alternatives have you considered?

  • No caching: what we've been doing so far. It's worked historically, but build times keep climbing. The last time I tried to cache builds, our builds took 5 minutes; now (just over a year later) they take 12.
  • One mega-cache: may work in principle, but it's inefficient, and the cache would be missed more often than not (i.e., for every code change). The issue I saw the last time I tried this ("Remove Yarn caching in CI and switch to Retry" #965) is that eventually so many differences accumulate relative to the cached state that building all the changes takes longer than a clean build.
  • Only cache the core targets (skip plugins/*): also works in principle, but only marginally more efficient than the above since core also changes very often.

Do you have any additional context?
N/A

@Swiddis added the enhancement, untriaged, and help wanted labels and removed the untriaged label Oct 1, 2024
@Swiddis
Collaborator Author

Swiddis commented Oct 1, 2024

I'm looking at my cache files for osd-optimizer locally -- I see that it's actually relying on the modified times of all inputs. That means caching our own build targets is probably not going to work in GHA: pulling the source files always updates their mtimes, even if we save the target files. So we can only really cache the node_modules dependencies here (and make an issue upstream to use content hashes instead of timestamps). GHA cache will persist mtimes, so we may still be able to cache all the dependency builds.
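
To make the difference concrete, here is a small Python sketch comparing the two kinds of fingerprint. It is not osd-optimizer's code; the function names and the sample file are made up for the example, and the "fresh checkout" is simulated by bumping the file's mtime while leaving its bytes alone.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def mtime_fingerprint(path: Path) -> int:
    """Timestamp-style fingerprint: the file's modified time in nanoseconds."""
    return path.stat().st_mtime_ns


def content_fingerprint(path: Path) -> str:
    """Content-style fingerprint: SHA-256 of the file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def simulate_fresh_checkout(path: Path) -> None:
    """Mimic a new clone: identical bytes, but a later modified time."""
    stat = path.stat()
    later = stat.st_mtime_ns + 60 * 10**9  # one minute later
    os.utime(path, ns=(stat.st_atime_ns, later))


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "index.ts"  # hypothetical input file
    source.write_text("export const answer = 42;\n")

    before = (mtime_fingerprint(source), content_fingerprint(source))
    simulate_fresh_checkout(source)
    after = (mtime_fingerprint(source), content_fingerprint(source))

    print("mtime fingerprint unchanged:", before[0] == after[0])    # False
    print("content fingerprint unchanged:", before[1] == after[1])  # True
```

A cache validated by the first fingerprint is treated as stale after every checkout, while one validated by the second stays valid as long as the bytes match.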

Since dependencies depend on the package locks, we probably can get away with just storing based on the same key we use for the Yarn cache, and collecting all /targets in the existing cache. I'll see about making a PR for that when I get a moment.

@Swiddis
Collaborator Author

Swiddis commented Oct 2, 2024

Experimented in #2190 with just caching the modules and targets. The results aren't great -- the total build time is basically the same (though interestingly it has a completely different shape: a slower start of the build and a faster end). We probably need to get attention on opensearch-project/OpenSearch-Dashboards#8428 to get any noticeable improvement here.

[image: osd_build]

Interestingly, from the debug output on a cache hit, it looks like the caching is based on the bootstrap hashes, which explicitly say "These are for debugging, don't use these". Wonder what's up with that.

@Swiddis changed the title from "[FEATURE] [CI] Cache /targets" to "[FEATURE] [CI] Cache build artifacts to reduce build times" Oct 2, 2024
@Swiddis
Collaborator Author

Swiddis commented Oct 2, 2024

In the upstream issue, I developed a POC that fixes the cache reuse problem. The other side of this is figuring out when to refresh the cache: we still have the issue that over time more code changes will accumulate and cause the cache to lose its potency. I don't want to just regenerate the cache on every code change, but OTOH just depending on yarn.lock will probably not refresh as fast as we'd like.

My proposal:

  • Instead of a file hash, consider anchoring the caches to a time window. We can use current-date-time to generate a new cache weekly at worst. This is probably the most reliable form of "Let's keep the cache sort-of fresh without needing to rebuild everything on every commit."
  • In addition, I think we should still hash yarn.lock, since changes to it can cause sweeping cache invalidations (in particular if any transitive dependency of osd-optimizer gets updated).
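
The two bullets above could combine into a single cache key along these lines. This is only a sketch: the `osd-build` prefix, the ISO-week granularity, the 16-character hash truncation, and the function name are assumptions for illustration, and in a real workflow the same key would more likely be assembled from a date step plus `hashFiles` than from a script.

```python
import datetime
import hashlib
from pathlib import Path


def weekly_cache_key(lockfile: Path, today: datetime.date, prefix: str = "osd-build") -> str:
    """Build a key that rolls over every ISO week or whenever yarn.lock changes."""
    iso = today.isocalendar()
    year, week = iso[0], iso[1]  # ISO year and ISO week number
    lock_hash = hashlib.sha256(lockfile.read_bytes()).hexdigest()[:16]
    return f"{prefix}-{year}-W{week:02d}-{lock_hash}"
```

With this shape, every build from Monday through Sunday of one week shares a key as long as yarn.lock is untouched, so the cache is rebuilt at most once a week plus once per lockfile change.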

Alternatively: Figure out how to check if the compile time is taking longer than some threshold (15s?) and force-save the cache if so. Less straightforward than the above but more foolproof.
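
A minimal sketch of that threshold check, assuming the compile step can be wrapped in a Python callable. The function name and the idea of returning a boolean for a later save step to consume are hypothetical; the 15-second default comes from the figure floated above.

```python
import time
from typing import Callable


def should_refresh_cache(compile_step: Callable[[], None], threshold_seconds: float = 15.0) -> bool:
    """Run the compile step and report whether it exceeded the threshold."""
    start = time.monotonic()  # monotonic clock, unaffected by wall-clock changes
    compile_step()
    elapsed = time.monotonic() - start
    return elapsed > threshold_seconds
```

In CI, `compile_step` would wrap whatever command performs the build, and a `True` result would be passed on to the step that decides whether to save a fresh cache.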

@Swiddis
Collaborator Author

Swiddis commented Oct 3, 2024

Looking again at the core codebase, it looks like yarn.lock is updated semi-regularly there, ~2 times per week on average. It's probably effective enough to just cache based on yarn.lock as we have been. We can always update the strategy if we see specific hot packages causing cascading recompilation.
