Make the cabal-install solver multilibs-aware #6039

Open
fgaz opened this issue May 7, 2019 · 35 comments
Comments

@fgaz
Member

fgaz commented May 7, 2019

The Cabal part is done, but cabal-install's solver is not yet aware of the multiple public libraries feature, leading to late errors and issues like #6038.

@23Skidoo
Member

23Skidoo commented May 7, 2019

/cc @grayjay @kosmikus -- How hard do you think this would be to implement?

@grayjay
Collaborator

grayjay commented May 10, 2019

I think this would be pretty straightforward, at least for source packages, because it could be implemented similarly to enforcing build tool dependencies. The visibility field could probably be implemented similarly to the current handling of unbuildable components. I have a few questions about how visibility should behave, though.

  • Should the solver enforce visibility now? I wasn't sure if the comments about disabling the visibility field in #5848 (Prevent dependency on private library) applied here.
  • Should the solver just check the visibility fields in Library and InstalledPackageInfo? Can the main library be private?

There may be an issue with enforcing visibility for installed packages, because there is a hack where the solver treats installed internal libraries as separate packages with munged names, as described in the following note:

-- Note [Index conversion with internal libraries]
-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-- Something very interesting happens when we have internal libraries
-- in our index. In this case, we maybe have p-0.1, which itself
-- depends on the internal library p-internal ALSO from p-0.1.
-- Here's the danger:
--
-- - If we treat both of these packages as having PN "p",
-- then the solver will try to pick one or the other,
-- but never both.
--
-- - If we drop the internal packages, now p-0.1 has a
-- dangling dependency on an "installed" package we know
-- nothing about. Oops.
--
-- An expedient hack is to put p-internal into cabal-install's
-- index as a MUNGED package name, so that it doesn't conflict
-- with anyone else (except other instances of itself). But
-- yet, we ought NOT to say that PNs in the solver are munged
-- package names, because they're not; for source packages,
-- we really will never see munged package names.
--
-- The tension here is that the installed package index is actually
-- per library, but the solver is per package. We need to smooth
-- it over, and munging the package names is a pretty good way to
-- do it.

I'm not sure how difficult it will be to distinguish between inter- and intra-package dependencies.

@fgaz
Member Author

fgaz commented May 10, 2019

Should the solver enforce visibility now? I wasn't sure if the comments about disabling the visibility field in #5848 applied here.

Yes. In #5848 we're talking about disabling it at the syntax level, but internally it's still used.

Should the solver just check the visibility fields in Library and InstalledPackageInfo? Can the main library be private?

Yes to the first and no to the second, but I'll have to check the second one.

grayjay added a commit to grayjay/cabal that referenced this issue May 13, 2019
This commit tracks dependencies on sub-libraries by extending the functionality
for tracking executables that was added in
e86f838.  It also starts adding support for
library visibility, though it currently only works for source packages.  There
is a TODO for handling installed packages.
@grayjay
Collaborator

grayjay commented May 13, 2019

@fgaz Thanks. I made a PR (#6047), but it only checks the visibility of source packages so far. It reads the libVisibility field for all libraries, though I can easily change it to only check sub-libraries if that is the correct behavior.

@fgaz fgaz mentioned this issue Jun 26, 2019
31 tasks
grayjay added a commit to grayjay/cabal that referenced this issue Nov 21, 2019
This commit tracks dependencies on sub-libraries by extending the functionality
for tracking executables that was added in
e86f838.

It also starts adding support for library visibility, though it currently only
works for source packages.  There is a TODO for handling installed packages.

This commit handles visibility similarly to the way that the buildable field is
handled currently.  It only checks whether a component is made private by the
current environment and flag constraints at the start of dependency solving.
This means that the solver can treat a component as visible when the visibility
is controlled by an automatic flag, and the build can fail later, depending on
the value that is chosen for the flag.

Fixes haskell#6038.
@phadej phadej added this to the Triaged milestone Nov 28, 2019
@Mikolaj
Member

Mikolaj commented Nov 17, 2022

I wonder if a volunteer could easily help with the investigation. How would one inspect the "solver's view of the installed package index"? Run with -v3 or look at dist-newstyle/cache/plan.json or is there some special debug method (e.g., decoding any of the binary dist-newstyle/cache/*-plan files)?

@grayjay
Collaborator

grayjay commented Dec 1, 2022

The most direct way to see how the solver handles multiple installed instances is to look up the package by name in the Index returned by convPIs and print all returned instances. I think that -v3 would also work, but the test case would need to cause the solver to reject all versions of the installed package (e.g., with an unsatisfiable version constraint) in order to make the solver print them all.

I noticed that cabal has a --shadow-installed-packages flag, so the behavior may actually be configurable.

If the index returned by convPIs contains both instances of the package in the test, then I think it would also be useful to see whether it is possible to depend on either one of them.

@andreabedini
Collaborator

I have done some preparatory work and I am planning to work on this issue this week. @grayjay @Mikolaj Let me know if there's been any progress not reported here.

@andreabedini
Collaborator

Regarding package shadowing:

✦ ❯ rg solverSettingShadowPkgs
cabal-install/src/Distribution/Client/ProjectConfig/Types.hs
422:     --solverSettingShadowPkgs        :: Bool,

cabal-install/src/Distribution/Client/ProjectConfig.hs
258:  --solverSettingShadowPkgs        = fromFlag projectConfigShadowPkgs

cabal-install/src/Distribution/Client/ProjectPlanning.hs
1086:   -- . setShadowPkgs solverSettingShadowPkgs

@Mikolaj
Member

Mikolaj commented May 15, 2023

I don't know of any unreported progress. There is certainly reported interest in related things on #hackage, in https://discourse.haskell.org/t/new-hackage-server-features/2621/42, in haskell/hackage-server#1119.

@grayjay
Collaborator

grayjay commented May 16, 2023

@andreabedini Thanks for volunteering to work on this issue! I don't know of any additional progress, but I can give my current thoughts about test cases and the desired behavior.

I mentioned two test cases in #6039 (comment). For the first case, I'm assuming that regardless of the value of the flag --shadow-installed-packages, the solver won't choose an install plan that uses two instances of a non-multilibs package as dependencies of the same component. (I don't think we need to test this, due to the single instance restriction mentioned in #6459.) In my opinion, this means that the multilibs feature should enforce similar consistency and should prevent two instances of a multilibs package from being used as dependencies of the same component.

For the second test case in #6039 (comment), I wanted to give a more concrete example:

  1. Install package A-1.0 in the global package database. Package A-1.0 only contains one component, sublibrary B.
  2. Modify the source code of package A-1.0 so that it instead has sublibrary C as its only component. Install it in the global package database, too.
  3. Try to build package D-1.0 using "cabal build". Package D-1.0 has build-depends dependencies on both sublibrary B and sublibrary C. The build command should fail, since there is no instance of A-1.0 that contains both components.

Alternatively, the test case could avoid modifying source code by giving package A-1.0 a build flag that makes one sublibrary unbuildable when it is true and the other unbuildable when it is false and then installing both configurations.
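
The expected outcome of this test case can be sketched in a few lines of Haskell. This is only an illustration of the consistency rule, not solver code: the `Instance` type, its fields, and `satisfying` are hypothetical names, and each installation is modelled simply as the set of sublibraries it provides.

```haskell
import qualified Data.Set as Set

type Component = String

-- | One installation of a package, modelled as the sublibraries it provides.
-- (Hypothetical type, for illustration only.)
data Instance = Instance
  { instanceLabel :: String
  , instanceLibs  :: Set.Set Component
  } deriving (Eq, Show)

-- | Instances that provide every required sublibrary by themselves.
-- Sublibraries are never combined across different instances.
satisfying :: [Component] -> [Instance] -> [Instance]
satisfying required = filter providesAll
  where
    providesAll inst = Set.fromList required `Set.isSubsetOf` instanceLibs inst

-- | The two installations of A-1.0 from steps 1 and 2.
installedA :: [Instance]
installedA =
  [ Instance "A-1.0 (first install)"  (Set.fromList ["B"])
  , Instance "A-1.0 (second install)" (Set.fromList ["C"])
  ]
```

With these definitions, `satisfying ["B", "C"] installedA` is empty, which corresponds to the build of D-1.0 failing in step 3, while `satisfying ["B"] installedA` still finds the first installation.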

@andreabedini
Collaborator

Here are some notes from today. Some of this might be well known, but it wasn't to me :)

First question:

Install two instances of a non-multilibs package and see how they appear in the solver's view of the installed package index. If cabal already doesn't handle two installed instances of a non-multilibs package, then there is less reason to fully support multiple instances of multilibs packages.

❯ cat cabal.project 
active-repositories: :none
package-dbs: clear, global, /home/andrea/.local/state/cabal/store/ghc-9.2.7/package.db
extra-packages: zlib
constraints: zlib <0
❯ cabal build -v3 --dry-run zlib | grep '^\['
...
[__0] rejecting:
zlib-0.6.3.0/installed-6e56dc43f3ec2be12bd5546861b15931b096e743f56e443fc2faae3bceb1327a,
zlib-0.6.3.0/installed-727383b633c94a5c0b336dffa9dc4d80869f0f864348ed507f9050a29cacf41b,
zlib-0.6.3.0/installed-e3964c6e976321c89432925b0aa9fc494f0907ce8b7202161f602167c20533c2
(constraint from project config /home/andrea/tmp/two-cases/prj/cabal.project
requires <0)
...
❯ ghc-pkg --package-db /home/andrea/.local/state/cabal/store/ghc-9.2.7/package.db field zlib-0.6.3.0 id 
id: zlib-0.6.3.0-6e56dc43f3ec2be12bd5546861b15931b096e743f56e443fc2faae3bceb1327a
id: zlib-0.6.3.0-e3964c6e976321c89432925b0aa9fc494f0907ce8b7202161f602167c20533c2
id: zlib-0.6.3.0-727383b633c94a5c0b336dffa9dc4d80869f0f864348ed507f9050a29cacf41b

Multiple instances do not seem to be a problem.

Interlude:

The solver represents each library in the package as a separate PInfo. Intra-package and inter-package dependencies are both represented similarly, as dependencies between PInfos. This behavior seems correct, but it is misleading, because PInfo is short for package info and was initially designed to represent a whole package. There is currently no distinction between main libraries and sublibraries or intra-package dependencies and inter-package dependencies.

Indeed sublibraries are installed in the packagedb with one entry per component, with a mangled name. E.g.

❯ ghc-pkg --package-db /home/andrea/.local/state/cabal/store/ghc-9.2.7/package.db list | grep plutus-core
    plutus-core-1.1.1.0
    (z-plutus-core-z-index-envs-1.1.1.0)

where index-envs is a private sub-library of the plutus-core package and the main library depends on the sublibrary (via its unit-id). The two package-db entries look like this:

❯ ghc-pkg --package-db /home/andrea/.local/state/cabal/store/ghc-9.2.7/package.db field plutus-core-1.1.1.0 name,version,id
name: plutus-core
version: 1.1.1.0
id: plutus-core-1.1.1.0-d85fe72bf2ff0728e98538cf83f2240595161f111577c50ccb9d9e9c4f3eff21
❯ ghc-pkg --package-db /home/andrea/.local/state/cabal/store/ghc-9.2.7/package.db field z-plutus-core-z-index-envs-1.1.1.0 name,version,package-name,lib-name,id
name: z-plutus-core-z-index-envs
version: 1.1.1.0
package-name: plutus-core
lib-name: index-envs
id: plutus-core-1.1.1.0-l-index-envs-6b7f40b957785c8930efb43ea13c63e65f84ebf44c61618f8ce55506cdd8acbb

(Note that the sub-library has package-name and lib-name while the main library does not.)
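
As a rough illustration of the naming seen in the listing above, the registered name of a library could be modelled like this. It is a simplified sketch: `registeredName` is a hypothetical helper, not a Cabal function, and any escaping that the real munging scheme applies to unusual names is not modelled.

```haskell
type PackageName = String
type LibraryName = String

-- | Name under which a library appears in the package db.
-- The main library keeps the package name; a sublibrary is registered
-- under a name of the shape "z-<package>-z-<library>".
-- Simplification: escaping performed by the real scheme is ignored here.
registeredName :: PackageName -> Maybe LibraryName -> String
registeredName pkg Nothing    = pkg
registeredName pkg (Just lib) = "z-" ++ pkg ++ "-z-" ++ lib
```

For the example above, `registeredName "plutus-core" (Just "index-envs")` gives `"z-plutus-core-z-index-envs"`, matching the `name` field of the sublibrary entry.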

From the solver's point of view, all entries in ghc-pkg are separate instances. The index (result of convPIs) indeed shows two entries:

plutus-core-1.1.1.0 Inst (UnitId "plutus-core-1.1.1.0-d85fe72bf2ff0728e98538cf83f2240595161f111577c50ccb9d9e9c4f3eff21")
z-plutus-core-z-index-envs-1.1.1.0 Inst (UnitId "plutus-core-1.1.1.0-l-index-envs-6b7f40b957785c8930efb43ea13c63e65f84ebf44c61618f8ce55506cdd8acbb")

Second question:

For the second test case in #6039 (comment), I wanted to give a more concrete example:

Despite the clear instructions, this test turned out a bit trickier for me. I made a package with two sub-libraries and installed them in the user packagedb (I had to use cabal act-as-setup -- install --user but that's ok). There's a bit of a problem because Setup.hs install calls ghc-pkg update, which overwrites entries with the same name, so I could not easily install more instances.

That said, since each instance contains only one component, it looks meaningless to say "instance of A-1.0 that contains both components".

Plan for tomorrow:

Looking at the solver index, it looks like it's not demangling the component names. IIRC Distribution.Client.IndexUtils.getInstalledPackages does the demangling correctly. I need to double check.

The original problem I am running into is that the solver does not seem to be able to re-use public sublibraries in the packagedb, and ends up recompiling anything that depends on them. This is the behaviour I described in input-output-hk/haskell.nix#1662 (comment). Tomorrow I'll investigate how the solver sees the situation. IIRC the solver was complaining it could not use the installed package because some component was missing. Another thing to double check :)

@andreabedini andreabedini self-assigned this May 17, 2023
@grayjay
Collaborator

grayjay commented May 17, 2023

Thanks for investigating. I didn't realize that sublibrary names were still mangled in the package db.

Despite the clear instructions, this test turned out a bit trickier for me. I made a package with two sub-libraries and installed them in the user packagedb (I had to use cabal act-as-setup -- install --user but that's ok). There's a bit of a problem because Setup.hs install calls ghc-pkg update, which overwrites entries with the same name, so I could not easily install more instances.

That said, since each instance contains only one component, it looks meaningless to say "instance of A-1.0 that contains both components".

I'm not sure I understand this part, so I wanted to clarify that the definition of "instance" that I was using is a single installation of a given package and version. Instances can differ by having different build flags, versions of dependencies, etc. If the package has more than one library, then the instance will be split across multiple entries in the package db. A major part of the remaining work for this issue is allowing the solver to group installed libraries into instances.

The reason that package D depends on both sublibrary B and sublibrary C in the test case is that I wanted to ensure that cabal doesn't incorrectly group the two installed sublibraries from Package A-1.0 together as one instance when they are actually from two different instances.

The original problem I am running into is that the solver does not seem to be able to re-use public sublibraries in the packagedb, and ends up recompiling anything that depends on them. This is the behaviour I described in input-output-hk/haskell.nix#1662 (comment). Tomorrow I'll investigate how the solver sees the situation. IIRC the solver was complaining it could not use the installed package because some component was missing. Another thing to double check :)

If I understand correctly, this is the problem I described in #6039 (comment). The solver doesn't currently know how to satisfy dependencies from source packages to installed sublibraries. I think that the best way to solve this issue is to represent each installed package as a single PInfo, similarly to source packages. Then the solver could use the PInfo's map of components to track the availability of specific sublibraries.
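
A minimal sketch of that representation, assuming a simplified view of package-db entries. `Entry`, `Component`, `PackageIndex` and the functions below are hypothetical names and not the solver's actual types; the entry fields mirror the ghc-pkg fields (name, version, package-name, lib-name, id) shown earlier in the thread.

```haskell
import qualified Data.Map.Strict as Map
import Data.Maybe (fromMaybe)

-- | Simplified package-db entry (hypothetical type).
data Entry = Entry
  { entryName    :: String        -- "name" (munged for sublibraries)
  , entryVersion :: String        -- "version"
  , entryPackage :: Maybe String  -- "package-name", absent for main libraries
  , entryLib     :: Maybe String  -- "lib-name", absent for main libraries
  , entryUnitId  :: String        -- "id"
  }

data Component = MainLib | SubLib String
  deriving (Eq, Ord, Show)

type PackageKey = (String, String)  -- (package name, version)

-- | One index entry per package, each with a map of its installed libraries.
-- Caveat: keying on name and version alone would merge separate
-- installations of the same version; this sketch ignores that problem.
type PackageIndex = Map.Map PackageKey (Map.Map Component String)

-- | Recover the real package name from "package-name" when present.
packageKey :: Entry -> PackageKey
packageKey e = (fromMaybe (entryName e) (entryPackage e), entryVersion e)

component :: Entry -> Component
component = maybe MainLib SubLib . entryLib

-- | Group per-library entries under their owning package.
buildIndex :: [Entry] -> PackageIndex
buildIndex entries = Map.fromListWith Map.union
  [ (packageKey e, Map.singleton (component e) (entryUnitId e)) | e <- entries ]

-- | Is a specific library of a package available in the index?
hasComponent :: PackageKey -> Component -> PackageIndex -> Bool
hasComponent key c index = maybe False (Map.member c) (Map.lookup key index)
```

Fed the two plutus-core entries from the earlier listing, `buildIndex` would produce a single key `("plutus-core", "1.1.1.0")` whose component map holds both `MainLib` and `SubLib "index-envs"`, instead of two unrelated packages.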

@andreabedini
Collaborator

I'm not sure I understand this part, so I wanted to clarify that the definition of "instance" that I was using is a single installation of a given package and version. Instances can differ by having different build flags, versions of dependencies, etc. If the package has more than one library, then the instance will be split across multiple entries in the package db. A major part of the remaining work for this issue is allowing the solver to group installed libraries into instances.

Thank you for clarifying. I was indeed a bit confused about the meaning of "instance" in this discussion but I ended up adopting the one used in the solver (which identifies instances with packagedb entries). Now I understand you are using the term in a more general way and the difference is exactly what we need to teach to the solver.

This note is also relevant.

-- Note [Index conversion with internal libraries]
-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-- Something very interesting happens when we have internal libraries
-- in our index. In this case, we maybe have p-0.1, which itself
-- depends on the internal library p-internal ALSO from p-0.1.
-- Here's the danger:
--
-- - If we treat both of these packages as having PN "p",
-- then the solver will try to pick one or the other,
-- but never both.
--
-- - If we drop the internal packages, now p-0.1 has a
-- dangling dependency on an "installed" package we know
-- nothing about. Oops.
--
-- An expedient hack is to put p-internal into cabal-install's
-- index as a MUNGED package name, so that it doesn't conflict
-- with anyone else (except other instances of itself). But
-- yet, we ought NOT to say that PNs in the solver are munged
-- package names, because they're not; for source packages,
-- we really will never see munged package names.
--
-- The tension here is that the installed package index is actually
-- per library, but the solver is per package. We need to smooth
-- it over, and munging the package names is a pretty good way to
-- do it.
-- | Convert dependencies specified by an installed package id into
-- flagged dependencies of the solver.
--
-- May return Nothing if the package can't be found in the index. That
-- indicates that the original package having this dependency is broken
-- and should be ignored.

@andreabedini
Collaborator

The reason that package D depends on both sublibrary B and sublibrary C in the test case is that I wanted to ensure that cabal doesn't incorrectly group the two installed sublibraries from Package A-1.0 together as one instance when they are actually from two different instances.

Thinking about this, are we sure this is a problem? Two installed instances of Package A-1.0 will have all their dependencies fixed. Even if they end up being split into multiple libraries in the packagedb, their dependencies will either match or they won't.
This only concerns dependencies; I am not sure what other degrees of freedom we have and what coherence properties we expect.

@grayjay
Collaborator

grayjay commented May 19, 2023

Thank you for clarifying. I was indeed a bit confused about the meaning of "instance" in this discussion but I ended up adopting the one used in the solver (which identifies instances with packagedb entries). Now I understand you are using the term in a more general way and the difference is exactly what we need to teach to the solver.

This note is also relevant.

The solver was originally designed without sublibraries, so it expected there to be one library per installed package. Then that hack was added for internal (private) libraries. It worked because it isn't possible for a source package to depend on an installed library that is private. Now I think we should change the solver's installed package index to act as a map of packages, each with one or more libraries.

Thinking about this, are we sure this is a problem? Two installed instances of Package A-1.0 will have all their dependencies fixed. Even if they end up being split into multiple libraries in the packagedb, their dependencies will either match or they won't.
This only concerns dependencies; I am not sure what other degrees of freedom we have and what coherence properties we expect.

I don't think that we should mix libraries from different installations of a package, since they could differ in more ways than dependencies, such as cabal build flags, compiler flags, or even source code. In my opinion, it would be similar to mixing libraries from different versions of the package.

I also think that recording the instance of an installed library (with the InstanceUnitId idea from above) would simplify forcing the dependencies to match.

@angerman
Collaborator

A large question around this seems to be: what does "compatible" mean?
From a purely linking perspective, we shouldn't be mixing GHC ways. You can't have a profiling library work with a vanilla one.

From a dependency perspective, any "consumer" of a package finds that package compatible (with any other instance of that package) as long as the exposed API (symbols and relevant signatures) matches; anything else is a black box to the consumer.

This ignores the semantic part, where the symbols and signatures could stay the same but the behaviour/meaning is different. That's what we usually have the PVP for?

@grayjay
Collaborator

grayjay commented Nov 28, 2023

@angerman Is this comment for a different issue? This issue doesn't relate to GHC ways, except that they may both relate to InstalledPackageInfo.

@angerman
Collaborator

@grayjay possible. @andreabedini had been flooding me with issues 🙈

@andreabedini
Collaborator

This is the right issue, but maybe things got a bit confused. Let me try to remember and summarise the discussion. The solver being aware of multiple libraries means that it will have to decide what to do in the situation where

  1. pkg-a-1.2.3.4:lib1 is preinstalled
  2. pkg-a-1.2.3.4:lib2 depends on pkg-a:lib1 and has to be compiled

Under what conditions can the preinstalled package be re-used?

The current answer is never: thanks to the name mangling introduced with private sublibs, the solver does not even see the preinstalled pkg-a-1.2.3.4:lib1, so it always recompiles it.

The precise information about how pkg-a-1.2.3.4:lib1 was built has been lost, but we could hash this information into the unit-id.

The solver already knows that pre-installed packages have their dependencies fixed. So the preinstalled pkg-a-1.2.3.4:lib1 will never be used if its dependencies are incompatible with pkg-a-1.2.3.4:lib2.

Flags are something we could pack into the unit-id (and maybe we do already; Cabal and cabal-install use two different schemas and I always forget what goes into Cabal's one).

@grayjay suggests other things can go wrong:

I don't think that we should mix libraries from different installations of a package, since they could differ in more ways than dependencies, such as cabal build flags, compiler flags, or even source code. In my opinion, it would be similar to mixing libraries from different versions of the package.

Am I wrong in understanding that, in the above quote, you imply that the current behaviour is the only safe one? That is, we should never re-use a preinstalled sibling library? I believe this would be too conservative.

@grayjay
Collaborator

grayjay commented Dec 2, 2023

@angerman Now I understand. GHC ways is just an example of how installed packages can vary by installation. I think that we need to capture everything that you mentioned in the field that we use to identify an instance.

@andreabedini I actually hadn't considered the case that you described, where a package is only partially installed (only some of its sublibraries are installed). Do you think that that is likely to be a common case? I think that cabal should be able to handle it, but it seems like an edge case to me, like handling a broken installation.

I also realized that I don't know exactly what information goes into the unit ID. The InstanceUnitId design above relies on the unit ID of the main library capturing everything about how the whole package was built, including information about the other components. If the unit ID of the main library doesn't change when other components change, then maybe the unit ID isn't the best way to identify an instance.

The InstanceUnitId design could allow cabal to use two components of a package that were installed at different times, if the components were built in a compatible way (having the same unit ID for the main library).
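
To make the grouping rule concrete, here is a small sketch under the assumption that every installed library records the unit ID of its package's main library. `InstalledLib`, `libInstance`, `groupByInstance` and `fromOneInstance` are hypothetical names chosen for illustration, not existing cabal-install definitions.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

type UnitId = String

-- | Installed library tagged with the identifier of its instance.
-- Assumption: the identifier is the unit ID of the main library.
data InstalledLib = InstalledLib
  { libComponent :: String  -- component name, e.g. "lib1"
  , libInstance  :: UnitId  -- instance identifier (assumed field)
  }

-- | Collect the components that belong to each instance.
groupByInstance :: [InstalledLib] -> Map.Map UnitId (Set.Set String)
groupByInstance libs = Map.fromListWith Set.union
  [ (libInstance l, Set.singleton (libComponent l)) | l <- libs ]

-- | Can all wanted components be taken from a single instance?
fromOneInstance :: [String] -> [InstalledLib] -> Bool
fromOneInstance wanted libs =
  any (Set.fromList wanted `Set.isSubsetOf`) (Map.elems (groupByInstance libs))
```

Under this rule, two libraries installed at different times still count as one instance when they carry the same identifier, while libraries with different identifiers are never combined.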

The safest design would be for cabal to create a random unique ID for the instance whenever it installs a package, but I don't know if that is practical.

@andreabedini
Collaborator

andreabedini commented Dec 8, 2023

@andreabedini I actually hadn't considered the case that you described, where a package is only partially installed (only some of its sublibraries are installed). Do you think that that is likely to be a common case? I think that cabal should be able to handle it, but it seems like an edge case to me, like handling a broken installation.

From my POV, this whole issue was always about this but now I see the distinction you make.
Following up my own comment:

Am I wrong in understanding that, in the above quote, you imply that the current behaviour is the only safe one? That is, we should never re-use a preinstalled sibling library? I believe this would be too conservative.

Yes, I was wrong indeed. If we make public sublibraries visible to the solver and use a random (but common) unit id like you say, the solver will be able to pick a pre-installed sublibrary (while now it cannot) with absolutely zero risks.

Edit: no, the solver does not look at any id when it comes to pre-installed packages. It only matches on package name and version. Perhaps the first step is to make the solver aware of the configuration of pre-installed packages.
