Move many GHC.* modules to a new CLC-controlled library, base-legacy #295

Open
Ericson2314 opened this issue Oct 8, 2024 · 29 comments

@Ericson2314
Contributor

Ericson2314 commented Oct 8, 2024

We've had a little over a year of split base. It's a good start, but I don't believe we are yet on track for the big goal:

(Stated two ways, each implies the other)

Upgrading GHC should not force one to use a breaking-version-higher base.

It should be possible to have proper tight PVP version bounds on base, and build code with a new GHC on the day it comes out, without --allow-newer.
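
To make the two statements concrete, here is a minimal sketch of the arithmetic behind them, assuming the usual PVP reading in which the first two components (`A.B`) form the major version. The helper names are hypothetical and the version numbers are only illustrative.

```python
def parse(version):
    """Turn '4.19.1.0' into (4, 19, 1, 0)."""
    return tuple(int(part) for part in version.split("."))


def pvp_major(version):
    """Under the PVP, the first two components form the major version."""
    return parse(version)[:2]


def is_major_bump(old, new):
    """True if going from `old` to `new` changes the PVP major version."""
    return pvp_major(new) > pvp_major(old)


def within_tight_bound(version, lower):
    """True if `version` satisfies `>= lower && < next major after lower`."""
    a, b = pvp_major(lower)
    return parse(lower) <= parse(version) < (a, b + 1)
```

With a tight bound such as `base >=4.19 && <4.20`, a compiler that ships base 4.20 falls outside the bound, which is why a major bump of base on every GHC release forces either a revision or `--allow-newer`.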

Problems

Here are some things that are necessary but not sufficient to solve the above main problem.

Solution

A new library is created base-legacy.

Contents

GHC.* modules with the slightest whiff of instability (low threshold to clear) should be moved from base to base-legacy.

According to @bgamari's report, these are the currently-exposed modules in base that should not be exposed.
Thus, these modules would be moving to base-legacy.

Module "Current Indicated Stability" Desired visibility Stability risk Hackage Uses Action Notes
GHC.Arr internal internal 2 74 hide Available from array
GHC.ArrayArray internal internal 1 0 hide Available from GHC.Exts, deprecated legacy interface
GHC.Base internal internal 1 353 hide
GHC.Bits stable internal 1 0 hide Exists only to break import cycles
GHC.Constants internal internal 1 0 hide Now simply a re-export for GHC.Base
GHC.Event.TimeOut stable internal 3 0 hide
GHC.Exception.Type internal internal 2 2 hide Exists only to break import cycles within base
GHC.ExecutionStack.Internal internal 3 0 hide
GHC.Exts internal see document ? 1229 see document
GHC.Fingerprint.Type internal 2 18 hide Exists only to break import cycles within base
GHC.Float.ConversionUtils internal internal 1 0 hide
GHC.GHCi.Helpers internal internal 3 0 hide
GHC.IO.StdHandles internal internal 1 0 hide Exposed via System.IO
GHC.IOPort internal internal 3 0 hide Internal mechanism used by WinIO
GHC.Maybe internal 0 0 hide
GHC.TopHandler internal internal 2 6 hide
GHC.TypeLits.Internal internal 2 0 hide
GHC.TypeNats.Internal internal 2 0 hide
GHC.Char internal 1 6 internalize Everything useful available from Data.Char, exists only to break cycle with GHC.Enum
GHC.Conc.IO internal internal 2 3 internalize
GHC.Desugar internal internal 3 1 internalize Exports various bindings needed by GHC’s desugaring logic
GHC.Encoding.UTF8 internal internal 2 0 internalize Provides the UTF-8 decoder used by GHC internally
GHC.Fingerprint external 1 49 internalize Exists to support Typeable
GHC.Float.RealFracMethods internal internal 3 2 internalize Used by constant folding rules
GHC.GHCi internal internal 3 2 internalize
GHC.IO.FD internal internal 1 41 internalize
GHC.IO.Handle.Internals internal internal 3 34 internalize
GHC.IO.Handle.Text internal internal 2 5 internalize
GHC.IO.Handle.Types internal internal 3 52 internalize Exists to break import cycles
GHC.IO.SubSystem internal internal 3 6 internalize Implementation detail of GHC’s IO subsystem
GHC.IOArray internal internal 1 3 internalize This is exposed via the `array` package
GHC.Ix internal internal 0 1 internalize Exposed via Data.Ix
GHC.RTS.Flags internal 2 10 internalize Intended to mirror RTS's RtsFlags structure
GHC.Stats internal 2 47 internalize Keeping this unstable seems wise so that we can readily expose new RTS statistics
GHC.Storable internal internal 1 1 internalize
GHC.Weak.Finalize internal 1 0 internalize Exposed via GHC.Weak, should expose via System.Mem.Weak?
System.Posix.Internals internal internal 3 68 internalize
Type.Reflection.Unsafe 6 internalize Used in generated code
GHC.Show internal internal 0+1 105 internalize, export Show from new Data.Show module Show class doesn't have a home outside of GHC.Show
GHC.Enum internal internal 0 45 internalize, expose via a new Data.Enum module
GHC.Conc.Signal internal 1 3 internalize, expose via GHC.Conc

Additionally, for preexisting versions of base, base-legacy should reexport those modules from base.

Ownership

base-legacy is controlled by the CLC, just like base, but with the expectation that it will be more of a joint effort with GHC developers than base is.

The reason for it being controlled by the CLC is that (as described below in migration scheme) initially a very large number of packages should depend on base-legacy.

The reason for it being more a joint project with the GHC developers is it contains a bunch of misc stuff that is naturally unstable. base-legacy definitions in general deserve a better home: either they should be reworked/wrapped into something more stable deserving to live in a properly stable library (including base) or they should be jettisoned to something completely unstable like ghc-internal entirely.

Stability policy

base-legacy is explicitly not intended to match the original stability goal.

  • It contains things which are deemed too unstable to be precisely supported with the same interface across GHC versions.
  • It contains things which are also, for better or worse, in wide use, like GHC.Exts.

Beyond having breaking version bumps every GHC release due to misc incompatibilities, how stable base-legacy is to be is a matter between the CLC and GHC developers.

I hope that, due to the creation of small single-purpose libraries with freshly-vetted designs, as appears to be the outcome of #283, there will be fewer users of base-legacy over time. But this is just the author's personal hope. It is not a normative part of this proposal. The proposal should still be able to succeed even if base-legacy is in use for years to come.

Overall the stability of base-legacy should not be an "interesting" part of this proposal: the focus should instead be on the newly-slimmed down base becoming much more stable.

Migration scheme

This is the heart of the proposal!

Because the intent is to liberally move modules from base to base-legacy, it is fully expected that a huge number of packages (hundreds? thousands?) will be broken by this change.

CLC policy requires (see PROPOSALS.md) that a breaking change comes with patches for each affected package. In order to meet that bar, we have to make the migration very, very easy per package.

There is only one thing "easy" enough to meet that bar, and that is hackage revisions. We need to revise all these broken packages so that

```cabal
build-depends: base XXX
```

becomes

```cabal
build-depends: base XXX
-- exact GHC version tbd
if impl(ghc >= 9.14)
  build-depends: base-legacy YYY
```

This will ensure that the packages in fact do keep building with the new GHC, and we don't have a breakage apocalypse.

(I don't know if such a hackage revision is legal today. If it is not, we need to revise the policy / modify the Hackage server so that it is legal.)
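
As a rough illustration of how mechanical such a bulk revision could be, here is a minimal sketch of the rewrite. Everything in it is an assumption rather than part of the proposal: the function name `add_base_legacy` is hypothetical, the GHC version and the base-legacy bounds are parameters because both are still to be decided, and the line-based matching only handles a `build-depends` field written on a single line. A real tool would need a proper Cabal file parser.

```python
import re

# Matches a single-line build-depends field and captures indent and value.
FIELD_RE = re.compile(r"^(\s*)build-depends:(.*)$", re.IGNORECASE)
# Matches the package name `base`, but not `base-legacy` or `base64`.
BASE_RE = re.compile(r"(^|[,\s])base(?=[\s,<>=^]|$)")


def add_base_legacy(cabal_text, ghc_min, legacy_bounds):
    """Append a GHC-gated base-legacy dependency after each single-line
    build-depends field that mentions base. Deliberately naive."""
    out = []
    for line in cabal_text.splitlines():
        out.append(line)
        field = FIELD_RE.match(line)
        if field and BASE_RE.search(field.group(2)):
            indent = field.group(1)
            out.append(f"{indent}if impl(ghc >= {ghc_min})")
            out.append(f"{indent}  build-depends: base-legacy {legacy_bounds}")
    return "\n".join(out) + "\n"
```

For example, `add_base_legacy(text, "9.14", ">=0.1 && <0.2")` would add the conditional stanza under every matching field; the bounds shown here are made up for the example.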

Intended outcome

With this change, base should be a good bit smaller, and a good bit more stable.

This, on its own, should get us drastically closer to the main goal:

  1. The remaining interface between ghc-internal and base should be much smaller

  2. It should be possible to audit that subset much more closely, and not get hammered on reexport-causing breakage in base.

  3. The next version of GHC after this happens should ship with a base that is only a minor version higher --- i.e. the main problem we set out to fix is solved.

A few more things to say on point 3:

  • There may be other impediments to the minor version bump of base I propose above. But fixing them should be easier and less daunting than the massive breaking change I propose here. This step should be the hardest/scariest part, and everything afterwards easier/less scary in comparison.

    If what remains after is the hardest part, then I rescind this proposal in great sadness. We have sadly made too little progress and there are other big problems to fix before contemplating such a big breaking change.

  • The CLC may want to purposefully introduce breaking changes (as opposed to the accidental ones that we assume exist by bumping the base version majorly each release today). If they wish to do so, they can release a second base version compatible with the new GHC / ghc-internal. It is suggested, but not required, that both base versions have wide ghc-internal bounds so users have the option of upgrading base independently from upgrading GHC.

As a final note, it is intended that this change not shift the "balance of power" between GHC devs and CLC. The fact is that these two groups are still somewhat distrustful of one another --- let's not sugarcoat it --- so I assume anything that moves hundreds of definitions from the purview of one group to the other will be too controversial. That's fine. Having things go to base-legacy allows us to sort out these finer questions of proper interfaces and ownership on a more leisurely time scale.

Alternatives

I don't think there are viable alternatives to this plan:

  • If we try to argue whether individual modules belong in base, the process will take too long. We'll never get to solving the main problem above. Splitting base will have been in vain.

  • If we move widely used things to ghc-experimental or ghc-internal, we will make those packages too widely used. This will defeat the intended purpose of them being truly unstable and people being afraid to use them. This will just recreate the problems we have with base all over again. Only a new package that "contains nasty not-so-stable things that we nonetheless are at least temporarily trying to avoid breaking" fits the bill.

  • It would be nice to do this last, after all other impediments to only bumping the base version minorly in the next GHC have been fixed. But I don't think we know what the other problems fully are. I think we have to do this first to muster the resolve to do the remaining misc, easier steps.

    If we do agree to this, we can at least try to look ahead and anticipate what those problems are. If that becomes a significant amount of work, I can take this issue and turn it into an HF proposal.

@Ericson2314 Ericson2314 changed the title Move many GHC.* modules to a new CLC-controlled library, base legacy Move many GHC.* modules to a new CLC-controlled library, base-legacy Oct 8, 2024
@hasufell
Member

hasufell commented Oct 8, 2024

I think this would be the best outcome for the end user.

However:

  • it will double the work for CLC (process wise, the chair has to manage two packages now, members have to vote on two packages, follow discussions on two packages,... the amount of code isn't the issue)
  • it nullifies the advantage that GHC HQ gains from not being under CLC purview anymore... it will basically be so forever for anything that's in base right now... I'm not sure that will actually help with the base split and the separation of concerns. It seems like almost conflicting goals. And even if we had base-legacy: GHC HQ has no obligation to add e.g. new primops there, potentially causing more confusion for users of old API.

We'll basically be forking base. I can't see how that won't end in a disaster in terms of API, process and effort.

@TeofilC

TeofilC commented Oct 8, 2024

GHC.* modules with the slightest whiff of instability (low threshold to clear) should be moved from base to base-legacy.

Which modules in particular do you have in mind?

This is a bit of work, but it would also be good to compare recent breaking changes to modules to see which modules are de facto causes of breaking changes. It might very well be that certain modules that are closely coupled to GHC happen to be very stable in practice. And many modules that had a lot of recent additions might not have those in the future now that things can be exported from places that aren't base.

@doyougnu

doyougnu commented Oct 8, 2024

Yes please! I would be more than happy to throw some cycles at this.

@AndreasPK

If the main goal is to avoid version bumps I'm not convinced this proposal is worth the cost. In that regard I agree with bodigrim when he wrote:

That said, judging from my experience as a Hackage trustee, I disagree with your costs analysis here. Assuming nothing was broken for real, even if a maintainer does not have two minutes per year to make a revision, a client can say --allow-newer and carry on. Pure version bump is never a blocker, but breakage is.

It's also really unclear to me how much this proposal as written would help to avoid major version bumps, since there is no analysis of recent changes and their impact on the need for major version bumps.


If the goal is to create a smaller base there should be a clear vision for what the end result should look like. I don't think simply drawing the line at "everything in GHC.*" will lead to the best outcome there. Making a plan based on the stability assessment you linked seems like a better start than trying to move GHC.* out of base without closer inspection. The stability assessment linked might be a good start for a plan but definitely not the end.


I don't disagree with the spirit of the proposal but I think it would be much enhanced if:

  • It included an analysis for the reasons why major bumps have historically been necessary. Otherwise this proposal is essentially a shot in the dark.
  • It should not downplay the amount of breakage and follow ups required from such a change going forward. Yes it's "only" adding a new dependency for most projects. But adding a new dependency for most projects is a lot.
  • It should go into detail about what "a slight whiff of instability" means in practice. Is this based on history, how user facing the interface of a module was intended to be, personal judgement, the linked assessment or something else?
  • How do we avoid an outcome where most projects now just depend on base-legacy additionally while having all the same problems, but with even weaker stability guarantees for the code in base-legacy?
  • It seems like the proposal hopes for base-legacy use to be phased out and replaced by other things over time. Is this the case? If not, base-legacy would be a horrible name. If it's intended as one step towards some goal there should be more detail on what that goal is beyond this proposal when it comes to the parts of base this proposal intends to move. That would also help with coming up with a plan for how base-legacy should be maintained.

@hasufell
Member

hasufell commented Oct 8, 2024

Stability is just one concern. I find it much more important to draw better lines in terms of API and maintenance.

I don't see why any GHC specific module should live in base. It is the standard library. Why is it providing an interface to GHC (e.g. primops)?

Base is not a monorepo and as I see it it's partly misused to impose stronger guarantees (stability, etc.) on things that just happen to be there today, but simply don't belong there.

If primops users are scared their maintenance experience goes out the window as soon as we let GHC HQ handle it freely, then we might have a bigger problem here.

Why is there so little trust? Do those users feel they won't be heard or have to clean up after GHC devs?

I want to see this resolved, so that we can have good boundaries without suddenly regressing large parts of the ecosystem.

The zig-zag between GHC and CLC should be reduced. We achieve that through those efforts:

  • base split
  • better boundaries and less friction
  • gaining a shared understanding of what developer experience is expected of APIs that are candidates for deprecation and continue to maintain a certain standard even after dropping them from base

@Ericson2314
Contributor Author

@hasufell

it will double the work for CLC (process wise, the chair has to manage two packages now, members have to vote on two packages, follow discussions on two packages,... the amount of code isn't the issue)

I disagree, but I don't know how to convince you.

The simplest argument is that the CLC should put less effort into base-legacy, because we don't want anything to be there long term --- it's just a compat bandaid while we try to find better homes for things.

it nullifies the advantage that GHC HQ gains from not being under CLC purview anymore... it will basically be so forever for anything that's in base right now... I'm not sure that will actually help with the base split and the separation of concerns. It seems like almost conflicting goals.

I agree more than you think in that I don't want base-legacy to live forever. But @Bodigrim and others made I think a very fair point that e.g. GHC.Exts cannot just be chucked to ghc-internal today, because it is (unfortunately) too widely used.

This is supposed to be a compromise where things that are bad interfaces but widely used are kept around stable-ish without sullying base.

And even if we had base-legacy: GHC HQ has no obligation to add e.g. new primops there, potentially causing more confusion for users of old API.

They shouldn't put new things there, and that is good! We don't actually want people using base-legacy, if we put new things in better places, then it will incentivize people to stop using base-legacy.


Bottom line is, if you want to just remove a bunch of stuff from base and kick it back to ghc-internal and ghc-experimental, I would not complain in the slightest, but I don't think you would be able to convince everyone this is worth the breakage / ossifying those libraries against their intended goals. Prove me wrong please! But if you cannot, please reconsider this proposal as the next-best option.

@Ericson2314
Contributor Author

@TeofilC

Which modules in particular do you have in mind?

I am hoping we can rely entirely on @bgamari's analysis for this.

@Ericson2314
Contributor Author

@AndreasPK

I am a bit confused about the extent to which your suggestions go beyond just incorporating @bgamari's stability report. The idea is simply that if we move out all the modules which Ben judged have either caused stability problems in the past, or are likely to do so in the future, then the remaining base should be a lot more stable.

I think there is a good chance the remaining base also makes "more sense" thematically, but I don't think we need that to be true.

It included an analysis for the reasons why major bumps have historically been necessary. Otherwise this proposal is essentially a shot in the dark.

I thought @bgamari's stability column did this? Also remember that we don't need to solve for every breaking change. Breaking changes that live in base itself (and are not due to a reexport of a breaking thing in ghc-internal) are fine, because reinstallable base makes them not GHC-version-dependent.

It should not downplay the amount of breakage and follow ups required from such a change going forward. Yes it's "only" add a new dependency for most projects. But adding a new dependency for most projects is a lot.

I don't know what this means. We're doing hackage revisions of hundreds of packages; yes that's scary on the face of it! But we're adding a dependency that cannot break install plans. What do we actually think will go wrong?

It should go into detail about what a "a slight whiff of instability" means in practice. Is this based history, how user facing the interface of a module was intended to be, personal judgement, the linked assessment or something else?

I want to just start with Ben's assessment, and then CLC and GHC devs can debate individual modules. I don't personally want to get involved in the exact set of modules --- I more care that many things are moved, than exactly what is moved. Does that make sense?

How do we avoid an outcome where most projects now just depend on base-legacy additionally while having all the same problems. But with even weaker stability guarantees for the code in base-legacy.

base-legacy is supposed to have roughly the same stability guarantee of base today: things can be deprecated and removed from base and base-legacy. There is an expectation that base-legacy stuff is more likely to be deprecated, but this is just an expectation.

Insofar as the base-legacy deprecations do happen, is that the problem you are worried about? Well, that "problem" is also the solution! If base breakages need a migration plan, then base-legacy breakages need a migration plan too. Packages that need to use some to-be-removed part of base-legacy will be migrated to use some other library instead. Fewer and fewer things will use base-legacy over time. Eventually, we can retire it.

Likewise, if that doesn't happen, aw shucks, but that also means that base-legacy is more stable than we expected, and not posing a stability risk to users.

If it's intended as one step towards some goal there should be more detail on what that goal is beyond this proposal when it comes to the parts of base this proposal intends to move.

I am wary to do that because I don't want to debate the specific fates of specific modules. I hope that once @bgamari's analysis is turned into a provisional list of modules to move (I am down to do at least that much), we can agree that the mentioned modules are "suspicious" and that something should happen to them.

If we cannot agree some module is even dubious or likely to cause breakage, then we can just leave it in base.

@Bodigrim
Collaborator

Bodigrim commented Oct 8, 2024

The migration requirement outlined by @Bodigrim was "proposer has prepared patches for affected packages", so to meet that bar, we have to make the migration very, very easy per package.

To be clear, this is not my personal preference, this requirement is part of CLC policy as detailed in PROPOSALS.md.

(I don't know if such a hackage revision is legal today. If it is not, we need to revise the policy / modify the Hackage server so that it is legal.)

It is not: neither adding a conditional nor adding a new dependency is a legal revision. I'm not sure that it's a good idea to allow such revisions in general, it's too close to giving Hackage trustees powers to rewrite entire Cabal file at their whim.

@Ericson2314
Contributor Author

@Bodigrim thanks for these clarifications

To be clear, this is not my personal preference, this requirement is part of CLC policy as detailed in PROPOSALS.md.

I revised the issue to reflect that.

It is not: neither adding a conditional nor adding a new dependency is a legal revision. I'm not sure that it's a good idea to allow such revisions in general, it's too close to giving Hackage trustees powers to rewrite entire Cabal file at their whim.

I would not want to generalize the exception to the usual policy then: only adding a base-legacy dependency where there was already a base dependency, and only gated on a new enough GHC. No other modifications would be newly allowed.

@hasufell
Member

hasufell commented Oct 9, 2024

The simplest argument is that the CLC should put less effort into base-legacy, because we don't want anything to be there long term --- it's just a compat bandaid while we try to find better homes for things.

This sounds very counter-intuitive for end users. The reason they would depend on base-legacy is because they don't want to deal with the new changes.

But suddenly the package is deprecated or abandoned?

That's even more breakage.

I can't see the bigger picture here and I'm definitely not a fan of temporary packages.

@TeofilC

TeofilC commented Oct 9, 2024

I'm not sure if this bulk revision idea can work because it would be invalidated by the next release from the package maintainers. You can't get around the fact that you need to make these PRs and they need to be accepted by the maintainers.

@Bodigrim
Collaborator

Bodigrim commented Oct 9, 2024

This sounds very counter-intuitive for end users. The reason they would depend on base-legacy is because they don't want to deal with the new changes.

But suddenly the package is deprecated or abandoned?

That's even more breakage.

I agree. Why break everything twice? We can slowly deprecate and remove things from base in place, I don't see how an additional package helps with it.

Is there an assumption that the CLC applies the same stability requirements to all modules in base (and thus to reduce such requirements one has to move a module outside)? This is not so, see #146 (comment) and especially the comments immediately above.

What I would suggest is to mark all modules, which were marked as unstable in #146 (comment), as deprecated (at this point without any particular schedule for removal). Then after a couple of GHC releases we will be in good position to decide whether and when to remove them from base.

@Ericson2314
Contributor Author

OK I pulled the module list from @bgamari's report. I think this is needed to make the questions of cost and timeline a bit more quantitative.

@Bodigrim How long do you think it would take to deprecate all these modules via the normal process?

My hunch is that it will take quite a long time to find permanent homes for everything, so the two-step move allows us to clean up base sooner and asynchronously find better permanent homes for these things (or decide they should only be gotten from ghc-internal).

But by all means, if you think the regular process can go fast enough, claim an ambitious timeline and prove me wrong.

@Bodigrim
Collaborator

Bodigrim commented Oct 9, 2024

The list of modules in #146 (comment) is so internal-internal that I would not hesitate to deprecate them all in one go, suggesting to use ghc-internal or to pass -Wno-deprecations if one really needs them (my imagination fails me to see why).

@Ericson2314
Contributor Author

Ericson2314 commented Oct 9, 2024

@Bodigrim OK, glad to hear it! Formally deprecating the modules is a good start.

What about removing those deprecated modules? Or (I guess this works for my stated goal) declaring that PvP will be calculated ignoring changed modules?

I am worried that the amount of code using some of those modules, especially GHC.Exts, is large enough that deprecation warnings won't work to "organically" drive down usage in a timely manner.

@Bodigrim
Collaborator

Bodigrim commented Oct 9, 2024

What about removing those deprecated modules?

Once they are deprecated for at least one GHC release, someone would have to analyze Stackage and determine how widespread is remaining usage. If it's only a few packages which can be easily patched, we might be in a good position to remove.
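
A usage analysis of this kind could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `packages_importing` is hypothetical, the input is assumed to be a mapping from package name to the text of its Haskell source files (for example from an unpacked snapshot), and the regular expression only recognises plain and `qualified` imports that start at the beginning of a line.

```python
import re
from collections import Counter

# Captures the module name of `import M` and `import qualified M`.
IMPORT_RE = re.compile(
    r"^import\s+(?:qualified\s+)?([A-Z][A-Za-z0-9_.']*)", re.MULTILINE
)


def packages_importing(sources, modules):
    """Count, per module of interest, how many packages import it.

    `sources` maps a package name to a list of source file contents.
    `modules` is the set of deprecated module names to look for.
    """
    counts = Counter()
    for files in sources.values():
        used = set()
        for text in files:
            used.update(m for m in IMPORT_RE.findall(text) if m in modules)
        # Each package counts at most once per module.
        counts.update(used)
    return counts
```

A module whose count comes back as a handful of packages would be a candidate for removal with patches; one with hundreds would not.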

Or (I guess this works for my stated goal) declaring that PvP will be calculated ignoring changed modules?

No, it does not work for me.

especially GHC.Exts

GHC.Exts is not a part of #146 (comment).

@Ericson2314
Contributor Author

Even excluding GHC.Exts, we still have some nasty high numbers like 353 packages for GHC.Base.

Per https://edit.smart-cactus.org/EjfMUGhOSzCnGpEsGLR4iA?view which accompanies the spreadsheet, GHC.Exts does contain numerous unstable things, so something must be done about it. And that brings us to the highest 1229 packages number.

These are some serious challenges. Many of those packages are unmaintained or barely maintained. Do we wait indefinitely for maintainers to respond to deprecation warnings? Do we write hundreds of patches? Do we want to make tough calls about which ones of those packages "actually matter"?

A new library + cabal revision seems unrivalled to me in terms of quickly and safely getting this stuff out of base.

@hasufell
Member

Many of those packages are unmaintained or barely maintained.

This is not our problem.

Deprecation periods are an opportunity. They are not a guarantee that maintainers follow or care about them. But sticking to them gives maintainers who do care a powerful and graceful process to adjust in time.

It's good form.

@TeofilC

TeofilC commented Oct 10, 2024

Per https://edit.smart-cactus.org/EjfMUGhOSzCnGpEsGLR4iA?view which accompanies the spreadsheet, GHC.Exts does contain numerous unstable things, so something must be done about it. And that brings us to the highest 1229 packages number.

My understanding is that new primops are now going to be exported from ghc-experimental: https://gitlab.haskell.org/ghc/ghc/-/issues/25242

That should ameliorate some of the instability of GHC.Exts as most of the instability comes from adding new things (right?).

With its very high usage, I think the best thing we can do is deprecate and freeze the interface. (Same goes for GHC.Base imo)

@Ericson2314
Contributor Author

@hasufell I have nothing against some sort of standard deprecation period, but then what? What do we do if many of those packages get little attention and they are still using unstable GHC-isms from base?

If miraculously a bunch of people come out of the woodwork and upgrade their packages: hooray! That's fantastic! We don't need this! But I don't want to depend on miracles, I want a plan that works either way.

@Ericson2314
Contributor Author

Ericson2314 commented Oct 10, 2024

@TeofilC

That should ameliorate some of the instability of GHC.Exts as most of the instability comes from adding new things (right?).

I don't think so: adding new things does not force a major version bump under the PVP. Primops and other such implementation details are supposed to be free to come and go. We might be able to write shims in the case that they go, but I don't want to depend on that always being the case.

With its very high usage, I think the best thing we can do is deprecate and freeze the interface. (Same goes for GHC.Base imo)

Yeah for sake of argument I would be fine if base-legacy just contained those two modules. (Let's focus on the hard ones which drive the policy, and not worry about the easier ones which can be dealt with more different ways.)

To recap again

  1. I do want to deprecate it
  2. I am OK with attempting to freeze it, but I don't think that freeze will be fully successful for the reasons I described above
  3. As @Bodigrim says in #295 (comment) (and I think the CLC previously agreed), "trying is not good enough": any time a best-effort freeze fails, even on a deprecated module, we have to bump the major version number
  4. Given the high likelihood of failing to freeze the interface, and also the necessity of bumping the version number if that happens, I am only confident that we'll hit stable base in a timely manner if we move the modules out while they are still in use
  5. The only feasible way I can imagine doing that is new library + hackage revisions.

I don't want these sly tricks to be our only option for getting to a stable base, but given the situation I see no other option. If other people said "we see what you mean, but don't think stable base is worth doing tricks", well at least we would be disagreeing just on the level of opinions, not facts.

@TeofilC

TeofilC commented Oct 10, 2024

@Ericson2314 My main point of contention is that I'm not sure how this mass hackage revision plan can work.

What happens when the package maintainer next uploads a fresh version?
It sounds like they will be forced to deal with this breaking change, but then I'm not sure if the hackage revision buys us much. Ultimately the patch has to land in the upstream repository.

@Ericson2314
Contributor Author

Ericson2314 commented Oct 10, 2024

@TeofilC OK fair enough.

I am mainly worried about the abandoned packages. If a maintainer actually uploads a new version, I think it is decently likely that they have in fact tested with the latest GHC / read deprecation warnings / etc.

In addition, we can add a warning if a new version seemingly reverts a hackage revision. (Really, this is functionality we ought to have anyways.) It might be hard to do in general, but it should be easy enough for "you used to have a base-legacy dep, and now you don't and you didn't otherwise change your deps".
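
A rough sketch of that easy case, assuming we only compare package names and ignore version ranges. The function name `looksLikeRevert` is hypothetical; nothing like it exists in Hackage today as far as I know.

```haskell
import qualified Data.Set as Set
import Data.Set (Set)

-- Package names only; version bounds are ignored in this sketch.
type Deps = Set String

-- True when the new upload dropped the dependency that the revision had
-- added and changed nothing else about its dependency set.
-- (Hypothetical helper, not part of any existing tool.)
looksLikeRevert
  :: String  -- dependency added by the revision, e.g. "base-legacy"
  -> Deps    -- dependencies of the latest revised version
  -> Deps    -- dependencies of the newly uploaded version
  -> Bool
looksLikeRevert dep revised new =
  dep `Set.member` revised && new == Set.delete dep revised
```

An upload that also adds or removes some other dependency would not trigger the warning, which matches the "didn't otherwise change your deps" condition.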

@AndreasPK

I agree with @hasufell that the goal should be a base that makes sense (API and maintenance wise). I think he's also right about a temporary package being a bad idea. This just breaks things twice for people depending on those things.


Base is not a monorepo and as I see it it's partly misused to impose stronger guarantees (stability, etc.) on things that just happen to be there today, but simply don't belong there.

At least from my pov there have been breaking changes the CLC initially approved that GHC devs argued against, but little the other way around. I don't remember a situation where the CLC "imposed" stability on anything GHC related in the recent past. For additions to base it's a very different story, but most of these changes wouldn't have been breaking according to the PVP or by any other meaningful metric.

So higher stability for primops or similar is not a good reason to keep those in base imo.


It should not downplay the amount of breakage and follow ups required from such a change going forward. Yes it's "only" add a new dependency for most projects. But adding a new dependency for most projects is a lot.

I don't know what this means. We're doing hackage revisions of hundreds of packages; yes that's scary on the face of it! But we're adding a dependency that cannot break install plans. What do we actually think will go wrong?

You can do a hackage revision to avoid immediate breakage if trustees change the policy just for this. But this doesn't remove the need to eventually update the cabal file of every package affected.

And then there are also non-hackage related breakages that will be needed. As proposed this change will require for example updates to fix benchmarks in nofib that have worked as they are for over a decade without change.

Hackage revisions are only a temporary relief; a lot of the cost will be incurred after that. This doesn't mean we should automatically reject work along those lines, just that the benefit needs to be good enough to warrant it.


It included an analysis for the reasons why major bumps have historically been necessary. Otherwise this proposal is essentially a shot in the dark.

I thought @bgamari's stability column did this?

It did not. It really just uses "2: Exports internal implementation details which are likely to change" as a metric, where "likely to change" is based on Ben's (informed) opinion. While helpful in other ways, this tells us nothing about how many major version bumps could have been avoided without those modules.

If the main goal is to remove those modules from base in some fashion because it makes sense for various reasons, we should just say so. But if the main goal is strictly about major version bumps, and that's how the proposal is currently written, then starting by removing modules from base, requiring thousands of cabal file updates from users, just to potentially find out later that it has been in vain seems like an unproductive approach.

Also remember that we don't need to solve for every breaking change. Breaking changes that live in base itself (and are not due to a reexport of a breaking thing in ghc-internal) are fine, because reinstallable base makes them not GHC-version-dependent.

I don't think it will be as easy but I can still see your point in distinguishing "prevents reinstallable base" breakage and other breakage.


The simplest argument is that the CLC should put less effort into base-legacy, because we don't want anything to be there long term --- it's just a compat bandaid while we try to find better homes for things.

Personally I hate the idea of a temporary package like that. Either it's worth having a ghc-coupled package that's more strictly controlled via some council (CLC or otherwise); let's call that ghc-base. If so, this should be a permanent package and should get additions as GHC adds features.

Or the lower stability expectation makes the oversight and CLC process too burdensome. Then those interfaces should just be exposed by ghc in some fashion ghc devs see as reasonable. I think ghc-experimental would fit the bill.

Breaking things twice, once to move it into a clc controlled library and later moving it somewhere else seems like the worst of both worlds to me.

I agree with the concept of this proposal (even though not with all the reasoning). Personally I would like to see a variation of this plan where:

  • Some problematic modules are identified (I agree with the list for the most part) and a decision is made to deprecate them.
  • Once the decision is made it is announced. Interfaces without modern purpose like ArrayArray are deprecated with a deprecation cycle starting at that point. Everything else is made available through other packages like ghc-base/ghc-experimental.
  • Once a new place for all the things to be moved out of base has been found and established, they are deprecated in base.
  • After some period of deprecation these changes are made:
    • The deprecated modules are no longer part of PVP versioning.
    • Things that are very unstable or can be dropped with very low impact on the ecosystem are dropped at this point.
    • Things that have been stable or widely used should remain in base but deprecated unless they significantly impede efforts on base stability/reinstallability.
  • After an extended period of deprecation a decision should be made about dropping the remaining deprecated parts in base completely, or leaving them in place while deprecated until they cause trouble.

@Ericson2314
Contributor Author

@AndreasPK

Once a new place for all the things to be moved out of base has been found and established, they are deprecated in base.

How long do you expect this to take? If we can do it quickly (entirely within one release cycle, let's say) I am OK with that.

I was conservatively assuming it might take longer for things to unwind, but I'd be glad to be wrong.

The deprecated modules are no longer part of PVP versioning.

@Bodigrim and others on the CLC have said before that they don't want to do that.

@hasufell told me privately he is open to moving things out after the deprecation cycle regardless of whether people obeyed the warnings, but I am not sure whether the others are or the GHC devs are OK with that.

What I really want to do here is not to establish that my plan above is the only way, but to get everyone on board with the idea that we need to pick our poison.

One of

  1. PVP exceptions for deprecated modules
  2. Removal of deprecated things (after an a priori fixed migration period) that are still in use in (probably barely-maintained) Hackage packages
  3. Hackage revision tricks

is needed to deal with monsters like GHC.Exts and GHC.Base. I don't believe there is a "nice solution" for dealing with them that doesn't involve sacrificing something, and no one else has proposed one either.

Maybe it would just be good to vote on which method (or any others proposed) is the most preferred?

@AndreasPK

AndreasPK commented Oct 11, 2024

How long do you expect this to take? If we can do it quickly (entirely within one release cycle, let's say) I am OK with that.

I was conservatively assuming it might take longer for things to unwind, but I'd be glad to be wrong.

Well, for it to happen people need to agree on how to do it: which package(s), what exactly that will look like in terms of maintenance, plus some inevitable naming bikeshedding for the new module structure (if applicable) and the package name.

Once people agreed on a package, and if the package is intended to live in the ghc-repo and be under ghc maintenance I think it's realistic to be accepted on this timescale.

But I can't possibly know how long the decision making will take, or promise that someone on the GHC team will step up to write the required patch once the decision has been made. So that's the big question mark.

In terms of decision making, I kicked off a discussion on GHC's side in https://gitlab.haskell.org/ghc/ghc/-/issues/25326 on how GHC should expose functionality, which seems pretty relevant here.

An optimistic but imo still reasonable timeline would be to move things into ghc-experimental (or other packages) for 9.14, which establishes the new home for things, and then in 9.16 we could deprecate things in base. But it's always possible that we run into blockers.

@Bodigrim and others on the CLC have said before that they don't want to do that.

@hasufell told me privately he is open to moving things out after the deprecation cycle regardless of whether people obeyed the warnings, but I am not sure whether the others are or the GHC devs are OK with that.

Ultimately the goal (imo) seems to be to get into a position where it's possible to make base very stable and reinstallable. Once the "unstable" parts have been deprecated inside base for a while, there are multiple ways to deal with this.

  • Keep deprecated things in base but stop versioning them.
  • Move things out of base "at all cost"
  • Possible other not yet unveiled approaches.

But these things can easily be decided/voted on independently of the main payload of turning base into something that can reasonably be made reinstallable/stable.

Hackage revision tricks

I will just repeat that I don't think this is more than a thin bandaid that will come off the moment packages without updated cabal files are uploaded to hackage.


Not sure if we want to use this approach, but I had this terrifying idea today that if we:

  • Move these modules to ghc-experimental
  • Made ghc imply ghc-experimental as a dependent package implicitly by default unless instructed otherwise.
  • Made cabal include ghc-experimental in the build plan implicitly

Then there would be no breakage. For direct invocations of ghc I don't see a downside.

For cabal this could be reasonably gated behind a cabal spec. So when using Cabal-Version: >=X.Y a dependency on base would only be a dependency on base. But with a cabal version less than that Cabal would use a legacy mode where base is implied to mean base and whatever ghc-experimental version the ghc compiler used ships with.

This way if you upgrade your cabal file to a new spec you have to adjust things, but if you just compile your 20 year old blog post software that uses Haskell98 then things will just keep working.

This does seem very unprincipled, yet it seems like it should just work with no obvious downside?
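
As a toy model of the cabal side, assuming spec versions compare as (major, minor) pairs and dependencies are plain package names. The function `effectiveDeps` and the cutoff argument are hypothetical; the real cutoff ("X.Y" above) is undecided and Cabal has no such legacy mode today.

```haskell
-- A cabal spec version such as 3.0, written as (major, minor).
type SpecVersion = (Int, Int)

-- Dependencies the solver would actually see for a package.
-- At or above the cutoff, "base" means only base. Below it, a package
-- depending on base also gets ghc-experimental added implicitly.
-- (Hypothetical function; version ranges are ignored in this sketch.)
effectiveDeps
  :: SpecVersion  -- cutoff spec version (assumed, not decided anywhere)
  -> SpecVersion  -- cabal-version declared by the package
  -> [String]     -- declared build-depends, package names only
  -> [String]
effectiveDeps cutoff spec deps
  | spec >= cutoff                 = deps
  | "base" `notElem` deps          = deps
  | "ghc-experimental" `elem` deps = deps
  | otherwise                      = deps ++ ["ghc-experimental"]
```

With an assumed cutoff of `(3, 16)`, `effectiveDeps (3, 16) (2, 4) ["base"]` gives `["base","ghc-experimental"]`, while the same package declaring spec `(3, 16)` keeps just `["base"]`.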

@Ericson2314
Contributor Author

I tried to extract the uncontroversial part into #299.

@hasufell
Member

told me privately he is open to moving things out after the deprecation cycle regardless of whether people obeyed the warnings

I don't want to derail into another topic, but I view deprecation periods as a courtesy process.

It should be followed as strictly as possible, so that end users can adjust their expectations and processes.

I don't care at all whether half of hackage breaks because they disabled all warnings and didn't read ChangeLogs.

I'd go so far as to say that CLC impact analysis is kinda moot, as long as we have a proper deprecation process. Then we don't really need to constantly decide case by case and try to be smart. Hackage isn't even a good metric of overall ecosystem breakage, imo.

But when it comes to API that by design is unstable, I don't see much value of deprecations. Ultimately I want all of this API out of base.

However, adhering to PVP is non-negotiable.


6 participants