
Default env & startup.jl loading pattern can cause excessive precompilation #56766

Open
IanButterworth opened this issue Dec 6, 2024 · 6 comments
Labels
packages Package management and loading

Comments

@IanButterworth
Member

IanButterworth commented Dec 6, 2024

Problem

A common usage pattern is for a user to have a package such as Revise.jl in their default env and load it in their startup.jl.

For example:

$ julia
# Revise & deps are loaded from the @1.11 Manifest
pkg> activate Foo
pkg> add Bar
# Bar & deps are precompiled based on Foo/Manifest.toml only, ignoring already loaded deps
julia> using Bar
# Because some deps were already loaded by Revise, the only way to load Bar here is to precompile a new ad-hoc version with the different loaded deps
[ Info: Precompiling Bar [xxx] (cache misses: wrong dep version loaded (1))

From the user's perspective, they have just sat through precompilation of their newly added packages twice, with no clear explanation of why.

The same can be true if the user is simply switching env after startup, but the fix there is easier to teach: start julia with --project to make loading Revise use the common dep versions from the target env.
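To check whether a session is in this state, the loaded package versions can be compared against what the active manifest resolves to. The following is a minimal sketch, not something Base or Pkg provides: `version_mismatches`, `loaded_versions`, and `manifest_versions` are hypothetical helper names, and matching packages by name rather than UUID is a simplifying assumption. It relies on `pkgversion` (Julia 1.9 or later), `Base.loaded_modules`, and `Pkg.dependencies()`.

```julia
using Pkg

# Hypothetical helper, not part of Base or Pkg.
# Given name => version maps for what is loaded and what the active manifest
# resolves to, return the packages whose loaded version differs.
function version_mismatches(loaded::AbstractDict{String,VersionNumber},
                            manifest::AbstractDict{String,VersionNumber})
    mismatches = Pair{String,Tuple{VersionNumber,VersionNumber}}[]
    for (name, loaded_version) in loaded
        manifest_version = get(manifest, name, nothing)
        manifest_version === nothing && continue  # not part of the target env
        if manifest_version != loaded_version
            push!(mismatches, name => (loaded_version, manifest_version))
        end
    end
    return sort!(mismatches; by = first)
end

# Versions of the packages currently loaded in this session.
# Modules without a recorded version (e.g. Base, Core) are skipped.
function loaded_versions()
    versions = Dict{String,VersionNumber}()
    for (pkgid, mod) in Base.loaded_modules
        v = pkgversion(mod)
        v === nothing || (versions[pkgid.name] = v)
    end
    return versions
end

# Versions recorded for the active project (direct and indirect deps).
function manifest_versions()
    versions = Dict{String,VersionNumber}()
    for (_, info) in Pkg.dependencies()
        info.version === nothing || (versions[info.name] = info.version)
    end
    return versions
end
```

After `pkg> activate Foo`, calling `version_mismatches(loaded_versions(), manifest_versions())` would list the deps that were loaded from the default env at a different version than Foo/Manifest.toml expects. A non-empty result suggests that `using Bar` may hit the "wrong dep version loaded" cache miss.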

Potential mitigations

  1. Julia could inform the user why the package is being precompiled again.
    Julia already does this via the debug info in serial precompile logs, i.e. wrong dep version loaded (1) above, but that is not the clearest messaging, as it is often accompanied by other cache miss reasons that aren't clear to people unfamiliar with loading mechanics. So Julia should probably add a clearer message.

  2. Provide a mechanism for packages like Revise to absorb deps as internal code, rather than a regular shared dep.
    Something like @internalize using OrderedCollections, which would expand to

Base.allow_load_indirect_deps(true) # new function
include(Base.find_package("OrderedCollections"))
using .OrderedCollections
Base.allow_load_indirect_deps(false)

but that would be harder to do for the deps of deps.
Also, if the packages people load in their startup.jl are more mixed-usage, e.g. Plots.jl, then it's not a complete solution.

  3. Unload & reload packages when needed. Perhaps prompt the user first?

  4. Some loading multiverse where sessions can fork into different loading states and have different versions loaded at the same time. Seems hard to do while maintaining the interop that a user might expect.

@IanButterworth IanButterworth added the packages Package management and loading label Dec 6, 2024
@KristofferC
Member

Alternatively, precompile should take loaded packages into account, on the assumption that the user will set up their state in the same way the next time they load the package. It would also ensure that loading the added package in the same session does not cause another precompile.

@IanButterworth
Member Author

Maybe precompile could cover both scenarios: current state and manifest state.

@IanButterworth
Member Author

IanButterworth commented Dec 6, 2024

Or, if different deps are unexpectedly already loaded, the load could just skip precompilation and load the package directly without generating .ji files etc.

Update: That's proposed here #56769

@Moelf
Contributor

Moelf commented Dec 8, 2024

This problem also occurs if a user starts a Jupyter notebook and runs a setup cell with something like:

using Pkg
Pkg.activate(...)

@IanButterworth
Member Author

IanButterworth commented Dec 10, 2024

The messaging on master is a little clearer around this now that JuliaLang/Pkg.jl#4109 is merged.

Image

@IanButterworth
Member Author

> Maybe precompile could cover both scenarios. Current state and manifest state.

I looked into precompiling for both the clean load state and any dirty load state, but I think doing it fully in parallel would add a lot of complication. Each job needs to figure out which dep job it is waiting for, and that isn't simple because I don't think the version of the dep is enough to determine it.

Some attempts at that: master...IanButterworth:julia:ib/precompile_clean_and_dirty

However, I think it would be pretty simple to check whether each dep needs to be precompiled for the dirty state after the clean-state version is done. But that would mean doubling the tail of the precompile jobs, e.g. precompiling Makie twice in a row, which might be annoying.
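As a rough illustration of that sequential approach, here is a toy model, not the actual Pkg precompile scheduler. `precompile_plan` is a hypothetical function, package names stand in for real precompile jobs, and `Baz` is an invented dependent of `Bar`. It assumes the package list is already in dependency order and that a package needs a dirty-state build whenever one of its deps is loaded at a mismatched version or itself needs a dirty-state build.

```julia
# Toy model only: returns the job queue as (package, state) tuples.
function precompile_plan(order::Vector{String},
                         deps::Dict{String,Vector{String}},
                         mismatched::Set{String})
    dirty = Set{String}()
    for pkg in order  # `order` is assumed to list dependencies before dependents
        pkg in mismatched && continue  # already loaded in the session; not rebuilt here
        if any(d -> d in mismatched || d in dirty, get(deps, pkg, String[]))
            push!(dirty, pkg)
        end
    end
    clean_jobs = [(pkg, :clean) for pkg in order]
    dirty_jobs = [(pkg, :dirty) for pkg in order if pkg in dirty]
    return vcat(clean_jobs, dirty_jobs)  # dirty jobs form the extra tail
end

order = ["OrderedCollections", "Bar", "Baz"]
deps = Dict("OrderedCollections" => String[],
            "Bar" => ["OrderedCollections"],
            "Baz" => ["Bar"])
plan = precompile_plan(order, deps, Set(["OrderedCollections"]))
```

In this model every dependent of a mismatched dep shows up twice in the queue, which is the doubled tail described above. With no mismatched deps the plan contains only the clean jobs.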
