Move many `GHC.*` modules to a new CLC-controlled library, `base-legacy` #295
I think this would be the best outcome for the end user. However:
We'll basically be forking `base`. I can't see how that won't end in disaster in terms of API, process, and effort.
Which modules in particular do you have in mind? This is a bit of work, but it would also be good to compare recent breaking changes to see which modules are de facto causes of breaking changes. It might very well be that certain modules that are closely coupled to GHC happen to be very stable in practice. And many modules that had a lot of recent additions might not have those in the future, now that things can be exported from places that aren't `base`.
Yes please! I would be more than happy to throw some cycles at this.
If the main goal is to avoid version bumps, I'm not convinced this proposal is worth the cost. In that regard I agree with @Bodigrim when he wrote:

It's also really unclear to me how much this proposal as written would help to avoid major version bumps, since there is no analysis of recent changes and whether they actually forced major version bumps. If the goal is to create a smaller `base`, there should be a clear vision for what the end result should look like. I don't think simply drawing the line at "everything in `GHC.*`" provides that vision.

I don't disagree with the spirit of the proposal, but I think it would be much enhanced if:
Stability is just one concern. I find it much more important to draw better lines in terms of API and maintenance. I don't see why any GHC-specific module should live in `base`. It is the standard library. Why is it providing an interface to GHC (e.g. primops)? `base` is not a monorepo, and as I see it, it's partly misused to impose stronger guarantees (stability, etc.) on things that just happen to be there today but simply don't belong there. If primops users are scared that their maintenance experience goes out the window as soon as we let GHC HQ handle it freely, then we might have a bigger problem here. Why is there so little trust? Do those users feel they won't be heard, or that they will have to clean up after GHC devs? I want to see this resolved, so that we can have good boundaries without suddenly regressing large parts of the ecosystem. The zig-zag between GHC and CLC should be reduced. We achieve that through these efforts:
I disagree, but I don't know how to convince you. The simplest argument is that the CLC should put less effort into `base-legacy` than into `base` proper.
I agree more than you think, in that I don't want this stuff in `base` either. This is supposed to be a compromise where things that are bad interfaces but widely used are kept around stable-ish, without sullying `base`.
They shouldn't put new things there, and that is good! We don't actually want people using it. Bottom line is, if you want to just remove a bunch of stuff from `base`, you need a migration story like this one.
I am a bit confused as to what extent your suggestions go beyond just incorporating @bgamari's stability report. The idea is simply that if we move out all the modules which Ben judged have either caused stability problems in the past, or are likely to in the future, then I think there is a good chance the remaining `base` can be kept stable.
I thought @bgamari's stability column did this? Also remember that we don't need to solve for every breaking change. Breaking changes that live in `base-legacy` no longer force a major version bump of `base`.
I don't know what this means. We're doing hackage revisions of hundreds of packages; yes that's scary on the face of it! But we're adding a dependency that cannot break install plans. What do we actually think will go wrong?
I want to just start with Ben's assessment, and then CLC and GHC devs can debate individual modules. I don't personally want to get involved in the exact set of modules --- I more care that many things are moved, than exactly what is moved. Does that make sense?
Insofar as that happens, great. Likewise, if that doesn't happen, aw shucks, but also that means that
I am wary of doing that because I don't want to debate the specific fates of specific modules. I hope that once @bgamari's analysis is turned into a provisional list of modules to move (I am down to do at least that much), we can agree that the mentioned modules are "suspicious" and that something should happen to them. If we cannot agree some module is even dubious or likely to cause breakage, then we can just leave it in `base`.
To be clear, this is not my personal preference; this requirement is part of CLC policy as detailed in `PROPOSALS.md`.
It is not: neither adding a conditional nor adding a new dependency is a legal revision. I'm not sure that it's a good idea to allow such revisions in general; it's too close to giving Hackage trustees the power to rewrite entire Cabal files at their whim.
@Bodigrim thanks for these clarifications
I revised the issue to reflect that.
I would not want to generalize the exception to the usual policy then: only adding a `base-legacy` dependency would be allowed.
This sounds very counter-intuitive for end users. The reason they would depend on `base-legacy` is to keep their code building. But suddenly the package is deprecated or abandoned? That's even more breakage. I can't see the bigger picture here, and I'm definitely not a fan of temporary packages.
I'm not sure this bulk revision idea can work, because it would be invalidated by the next release from the package maintainers. You can't get around the fact that you need to make these PRs, and they need to be accepted by the maintainers.
I agree. Why break everything twice? We can slowly deprecate and remove things from `base` instead. Is there an assumption that the CLC applies the same stability requirements to all modules in `base`? What I would suggest is to mark all modules which were marked as unstable in #146 (comment) as deprecated (at this point without any particular schedule for removal). Then after a couple of GHC releases we will be in a good position to decide whether and when to remove them from `base`.
OK, I pulled the module list from @bgamari's report. I think this is needed to make the questions of cost and timeline a bit more quantitative. @Bodigrim, how long do you think it would take to deprecate all these modules via the normal process? My hunch is that it will take quite a long time to find permanent homes for everything, so the two-step move allows us to clean up `base` sooner. But by all means, if you think the regular process can go fast enough, claim an ambitious timeline and prove me wrong.
The list of modules in #146 (comment) is so internal-internal that I would not hesitate to deprecate them all in one go, suggesting users switch to `ghc-internal` instead.
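Mechanically, such a deprecation is just a pragma that GHC reports at use sites. A toy sketch (the binding name and message are invented for illustration; the real work would deprecate whole modules, which GHC supports with the same pragma placed after the module header):

```haskell
-- Sketch: how a DEPRECATED pragma surfaces to downstream users.
-- The name and message are illustrative, not from the proposal.
module Main where

{-# DEPRECATED oldHelper "Moved out of base; use its new home instead" #-}
oldHelper :: Int -> Int
oldHelper = (+ 1)

main :: IO ()
main = print (oldHelper 41)
```

Note that GHC only emits the warning at use sites in *other* modules, so importers see it while the defining package builds cleanly.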
@Bodigrim OK, glad to hear it! Formally deprecating the modules is a good start. What about removing those deprecated modules? Or (I guess this works for my stated goal) declaring that PVP will be calculated ignoring changed modules? I am worried that the amount of code using some of those modules, especially `GHC.Exts`, is very large.
Once they are deprecated for at least one GHC release, someone would have to analyze Stackage and determine how widespread is remaining usage. If it's only a few packages which can be easily patched, we might be in a good position to remove.
No, it does not work for me.
Even excluding `GHC.Exts`, the numbers are daunting. Per https://edit.smart-cactus.org/EjfMUGhOSzCnGpEsGLR4iA?view, which accompanies the spreadsheet, these are some serious challenges: many of those packages are unmaintained or barely maintained. Do we wait indefinitely for maintainers to respond to deprecation warnings? Do we write hundreds of patches? Do we want to make tough calls about which of those packages "actually matter"? A new library plus cabal revisions seems unrivalled to me in terms of quickly and safely getting this stuff out of `base`.
This is not our problem. Deprecation periods are an opportunity. They are not a guarantee that maintainers follow or care about them. But sticking to them gives maintainers who do care a powerful and graceful process to adjust in time. It's good form.
My understanding is that new primops are now going to be exported from `ghc-experimental`. That should ameliorate some of the instability of `GHC.Exts`. With its very high usage, I think the best thing we can do is deprecate and freeze the interface. (Same goes for `GHC.Base` imo.)
@hasufell I have nothing against some sort of standard deprecation period, but then what? What do we do if many of those packages receive little attention and are still using unstable GHC-isms from `base`? If miraculously a bunch of people come out of the woodwork and upgrade their packages: hooray! That's fantastic! We don't need this! But I don't want to depend on miracles; I want a plan that works either way.
I don't think so; adding new things does not force a major PVP bump. Primops and other such implementation details are supposed to be free to come and go. We might be able to write shims in the case that they go, but I don't want to depend on that always being the case.
Yeah, for the sake of argument I would be fine with that. To recap again:
I don't want these sly tricks to be our only option for getting to a stable `base`.
@Ericson2314 My main point of contention is that I'm not sure how this mass hackage revision plan can work. What happens when the package maintainer next uploads a fresh version?
@TeofilC OK, fair enough. I am mainly worried about the abandoned packages. If a maintainer actually uploads a new version, I think it is decently likely that they have in fact tested with the latest GHC / read deprecation warnings / etc. In addition, we can add a warning if a new version seemingly reverts a hackage revision. (Really, this is functionality we ought to have anyway.) It might be hard to do in general, but it should be easy enough for "you used to have a `base-legacy` dependency and now you don't" cases.
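The revert check described could start out very simple. A toy sketch (all names invented; real tooling would parse the `.cabal` files rather than take string lists) comparing the dependency names of the latest revision against a fresh upload:

```haskell
-- Toy sketch of a "did this upload silently revert a revision?" check:
-- flag any dependency names present in the latest Hackage revision but
-- missing from the newly uploaded version. Names are illustrative.
import Data.List ((\\))

droppedDeps :: [String]  -- deps in the latest revision
            -> [String]  -- deps in the new upload
            -> [String]  -- deps that silently disappeared
droppedDeps revised uploaded = revised \\ uploaded

main :: IO ()
main = do
  let revised  = ["base", "base-legacy", "containers"]
      uploaded = ["base", "containers"]
  -- Non-empty result => warn the uploader that a revision may have
  -- been reverted (here: the base-legacy dependency vanished).
  print (droppedDeps revised uploaded)
```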
I agree with @hasufell that the goal should be a base that makes sense (API and maintenance wise). I think he's also right about a temporary package being a bad idea. This just breaks things twice for people depending on those things.
At least from my pov, there have been breaking changes the CLC initially approved that GHC devs argued against, but little in the other direction. I don't remember a situation where the CLC "imposed" stability on anything GHC-related in the recent past. For additions to `base` it's a very different story, but most of these changes wouldn't have been breaking according to the PVP or by any other meaningful metric. So higher stability for primops or similar is not a good reason to keep those in `base` imo.
You can do a hackage revision to avoid immediate breakage, if trustees change the policy just for this. But this doesn't remove the need to eventually update the cabal file of every package affected. And then there are also non-hackage-related breakages that will need fixing. As proposed, this change will for example require updates to fix benchmarks in nofib that have worked as they are for over a decade without change. Hackage revisions are only temporary relief; a lot of the cost will be incurred after that. This shouldn't mean we should automatically reject work along those lines, just that the benefit needs to be good enough to warrant it.
It did not; it really just uses these `base` modules directly. If the main goal is to remove those modules from `base` in some fashion because it makes sense for various reasons, we should just say so. But if the main goal is strictly about major version bumps, and that's how the proposal is currently written, then starting by removing modules from `base`, requiring thousands of cabal file updates from users, just to potentially later find out that it has been in vain, seems like an unproductive approach.
I don't think it will be as easy, but I can still see your point in distinguishing "prevents reinstallable base" breakage from other breakage.
Personally I hate the idea of a temporary package like that. Either it's worth having a GHC-coupled package that's more strictly controlled via some council (CLC or otherwise); let's call that `ghc-base`. If so, this should be a permanent package and should get additions as GHC adds features. Or the lower stability expectation makes the oversight and CLC process too burdensome; then those interfaces should just be exposed by GHC in some fashion GHC devs see as reasonable. I think `ghc-experimental` would fit the bill. Breaking things twice, once to move things into a CLC-controlled library and later moving them somewhere else, seems like the worst of both worlds to me. I agree with the concept of this proposal (even though not with all the reasoning). Personally I would like to see a variation of this plan where:
How long do you expect this to take? If we can do it quickly (entirely within one release cycle, let's say) I am OK with that. I was conservatively assuming it might take longer for things to unwind, but I'd be glad to be wrong.
@Bodigrim and others on the CLC have said before that they don't want to do that. @hasufell told me privately he is open to moving things out after the deprecation cycle regardless of whether people obeyed the warnings, but I am not sure whether the others or the GHC devs are OK with that. What I really want to do here is not to establish that my plan above is the only way, but to get everyone on board that we need to pick our poison. One of these options is needed to deal with monsters like `GHC.Exts`. Maybe it would just be good to vote on which method (or any others proposed) is the most preferred?
Well, for it to happen, people need to agree on how to do it: what package(s), what that will look like in terms of maintenance, plus some inevitable naming bikeshedding for the new module structure (if applicable) and the package name. Once people have agreed on a package, and if the package is intended to live in the GHC repo and be under GHC maintenance, I think it's realistic for this to be accepted on that timescale. But I can't possibly know how long the decision making will take, or promise that someone on the GHC team will step up to write the required patch once the decision has been made. So that's the big question mark. In terms of decision making, I kicked off a discussion on GHC's side in https://gitlab.haskell.org/ghc/ghc/-/issues/25326 on how GHC should expose functionality, which seems pretty relevant here. An optimistic but imo still reasonable timeline would be to move things into `ghc-experimental` (or other packages) for 9.14, which establishes the new home for things, and then in 9.16 we could deprecate things in `base`. But it's always possible that we run into blockers.
Ultimately the goal (imo) seems to be to get into a position where it's possible to make `base` very stable and reinstallable. Once the "unstable" parts have been deprecated inside `base`, and have been deprecated for a while, there are multiple ways to deal with this.
But these things can easily be decided/voted on independently of the main payload of turning `base` into something that can reasonably be made reinstallable/stable.
I will just repeat that I don't think this is more than a thin bandaid that will come off the moment packages without updated cabal files are uploaded to hackage. Not sure if we want to use this approach, but I had this terrifying idea today that if we:
Then there would be no breakage. For direct invocations of ghc I don't see a downside. For cabal, this could reasonably be gated behind a cabal spec version, so when using an older spec things would behave as before. This way, if you upgrade your cabal file to a new spec you have to adjust things, but if you just compile your 20-year-old blog post software that uses Haskell98, then things will just keep working. This does seem very unprincipled, yet it seems like it should just work with no obvious downside?
I tried to extract the uncontroversial part to #299.
I don't want to derail into another topic, but I view deprecation periods as a courtesy process. They should be followed as strictly as possible, so that end users can adjust their expectations and processes. I don't care at all whether half of hackage breaks because they disabled all warnings and didn't read changelogs. I'd go so far as to say that CLC impact analysis is kinda moot, as long as we have a proper deprecation process. Then we don't really need to constantly decide case by case and try to be smart. Hackage isn't even a good metric of overall ecosystem breakage, imo. But when it comes to API that by design is unstable, I don't see much value in deprecations. Ultimately I want all of this API out of `base`. However, adhering to the PVP is non-negotiable.
We've had a little over a year of split `base`. It's a good start, but I don't believe we are yet on track for the big goal (stated two ways, each implies the other):

Problems

Here are some problems that are necessary but not sufficient to solve in order to reach the main goal above.

- `base` still contains too much `GHC.*` stuff that is not stable. See what @bgamari wrote in "Expose the new primops `isByteArrayWeaklyPinned#` and `isMutableByteArrayWeaklyPinned#` from GHC.Exts" #283 (comment) for example. Because `base` currently exports so much less-than-stable stuff, it currently has an exact version bound on `ghc-internal` (https://github.com/ghc/ghc/blob/9c9c790dbca89722080f47158001ac3920f11606/libraries/base/base.cabal.in#L33). That prevents a non-trivially-reinstallable `base`, which means that the only way to satisfy the original problem is for the CLC to never make a breaking change again, which is too high a bar.

- The CLC is (IMO reasonably) unwilling to jettison a bunch of modules from `base` without a viable migration strategy that can be executed on all affected public packages. See what @Bodigrim writes in "Expose the new primops `isByteArrayWeaklyPinned#` and `isMutableByteArrayWeaklyPinned#` from GHC.Exts" #283 (comment).

Solution
A new library is created: `base-legacy`.

Contents

`GHC.*` modules with the slightest whiff of instability (low threshold to clear) should be moved from `base` to `base-legacy`. According to @bgamari's report, the following currently-exposed modules in `base` should not be exposed. Thus, these modules would be moving to `base-legacy`:
| Module | Notes |
|---|---|
| `GHC.Arr` | `array` |
| `GHC.ArrayArray` | `GHC.Exts`, deprecated legacy interface |
| `GHC.Base` | |
| `GHC.Bits` | |
| `GHC.Constants` | `GHC.Base` |
| `GHC.Event.TimeOut` | |
| `GHC.Exception.Type` | |
| `GHC.ExecutionStack.Internal` | |
| `GHC.Exts` | |
| `GHC.Fingerprint.Type` | |
| `GHC.Float.ConversionUtils` | |
| `GHC.GHCi.Helpers` | |
| `GHC.IO.StdHandles` | `System.IO` |
| `GHC.IOPort` | |
| `GHC.Maybe` | |
| `GHC.TopHandler` | |
| `GHC.TypeLits.Internal` | |
| `GHC.TypeNats.Internal` | |
| `GHC.Char` | `Data.Char`, exists only to break cycle with `GHC.Enum` |
| `GHC.Conc.IO` | |
| `GHC.Desugar` | |
| `GHC.Encoding.UTF8` | |
| `GHC.Fingerprint` | |
| `GHC.Float.RealFracMethods` | |
| `GHC.GHCi` | |
| `GHC.IO.FD` | |
| `GHC.IO.Handle.Internals` | |
| `GHC.IO.Handle.Text` | |
| `GHC.IO.Handle.Types` | |
| `GHC.IO.SubSystem` | |
| `GHC.IOArray` | |
| `GHC.Ix` | `Data.Ix` |
| `GHC.RTS.Flags` | |
| `GHC.Stats` | |
| `GHC.Storable` | |
| `GHC.Weak.Finalize` | `GHC.Weak`, should expose via `System.Mem.Weak`? |
| `System.Posix.Internals` | |
| `Type.Reflection.Unsafe` | |
| `GHC.Show` | `Data.Show` module |
| `GHC.Enum` | `Data.Enum` module |
| `GHC.Conc.Signal` | `GHC.Conc` |
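Note that for Haskell source (as opposed to `.cabal` files) the move is invisible: the module names don't change, so imports of the listed modules keep compiling unchanged once the package depends on `base-legacy`. A small sketch using `GHC.Arr` (chosen arbitrarily from the list; this compiles against today's `base` too):

```haskell
-- Sketch: user code importing a moved module needs no source changes;
-- only the package's build-depends would gain base-legacy after the move.
import GHC.Arr (listArray, (!))

main :: IO ()
main = do
  let arr = listArray (0, 2) "abc"  -- Array Int Char
  print (arr ! 1)                   -- the element at index 1
```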
Additionally, for preexisting versions of `base`, `base-legacy` should reexport those modules from `base`.
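Mechanically, Cabal already has a field for exactly this kind of reexport shim: `reexported-modules`. A hypothetical `base-legacy.cabal` fragment (the bounds and the choice of modules here are invented for illustration):

```cabal
library
  -- For old base versions that still ship these modules,
  -- base-legacy just reexports them rather than defining them.
  build-depends:      base >=4.18 && <4.21
  reexported-modules: GHC.Arr, GHC.Maybe, GHC.TopHandler
```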
Ownership
`base-legacy` is controlled by the CLC, just like `base`, but with the expectation that it will be more of a joint effort with GHC developers than `base` is. The reason for it being controlled by the CLC is that (as described below in the migration scheme) initially a very large number of packages should depend on `base-legacy`. The reason for it being more of a joint project with the GHC developers is that it contains a bunch of misc stuff that is naturally unstable.
`base-legacy` definitions in general deserve a better home: either they should be reworked/wrapped into something more stable deserving to live in a properly stable library (including `base`), or they should be jettisoned entirely to something completely unstable like `ghc-internal`.

Stability policy
`base-legacy` is explicitly not intended to match the original stability goal; in particular, it will contain `GHC.Exts`. Beyond having breaking version bumps every GHC release due to misc incompatibilities, how stable `base-legacy` is to be is a matter between the CLC and GHC developers. Overall, the stability of `base-legacy` should not be an "interesting" part of this proposal: the focus should instead be on the newly slimmed-down `base` becoming much more stable.

Migration scheme
This is the heart of the proposal!

Because the intent is to liberally move modules from `base` to `base-legacy`, it is fully expected that a huge number of packages (hundreds? thousands?) will be broken by this change. CLC policy requires (see `PROPOSALS.md`) that a breaking change comes with patches for each affected package. In order to meet that bar, we have to make the migration very, very easy per package.

There is only one thing "easy" enough to meet that bar, and that is hackage revisions. We need to revise all these broken packages so that they additionally depend on `base-legacy`. This will ensure that the packages in fact do keep building with the new GHC, and we don't have a breakage apocalypse.

(I don't know if such a hackage revision is legal today. If it is not, we need to revise the policy / modify the Hackage server so that it is legal.)
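For a sense of scale per package, the hypothetical revision could be as small as this `build-depends` edit (the version bounds are invented for illustration):

```cabal
-- Before the revision:
--   build-depends: base >=4.14 && <4.22
-- After the revision: base-legacy added, providing the moved modules.
build-depends:
    base        >=4.14 && <4.22
  , base-legacy >=1.0  && <1.2
```

Because the revision only adds a dependency on a package that is always available alongside the new `base`, it cannot rule out any previously valid install plan.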
Intended outcome

With this change, `base` should be a good bit smaller, and a good bit more stable. This, on its own, should get us drastically closer to the main goal:

1. The remaining interface between `ghc-internal` and `base` should be much smaller.
2. It should be possible to audit that subset much more closely, and not get hammered on reexport-causing breakage in `base`.
3. The next version of GHC after this happens should ship with a `base` that is only a minor version higher --- i.e. the main problem we set out to fix is solved.

A few more things to say on point 3:
There may be other impediments to the minor version bump of `base` I propose above. But fixing them should be easier and less daunting than the massive breaking change I propose here. This step should be the hardest/scariest part, and everything afterwards easier/less scary in comparison. If what remains after is in fact the hardest part, then I rescind this proposal in great sadness: we have made too little progress, and there are other big problems to fix before contemplating such a big breaking change.

The CLC may want to purposefully introduce breaking changes (as opposed to the accidental ones that we assume exist by bumping the `base` version majorly each release today). If they wish to do so, they can release a second `base` version compatible with the new GHC / `ghc-internal`. It is suggested, but not required, that both `base` versions have wide `ghc-internal` bounds, so users have the option of upgrading `base` independently from upgrading GHC.

As a final note, it is intended that this change not shift the "balance of power" between GHC devs and the CLC. The fact is that these two groups are still somewhat distrustful of one another --- let's not sugarcoat it --- so I assume anything that moves hundreds of definitions from the purview of one group to the other will be too controversial. That's fine. Having things go to `base-legacy` allows us to sort out these finer questions of proper interfaces and ownership on a more leisurely time scale.

Alternatives
I don't think there are viable alternatives to this plan.

If we try to argue whether individual modules belong in `base`, the process will take too long. We'll never get to solving the main problem above. Splitting `base` will have been in vain.

If we move widely used things to `ghc-experimental` or `ghc-internal`, we will make those packages too widely used. This will defeat the intended purpose of them being truly unstable and of people being afraid to use them. This will just recreate the problems we have with `base` all over again. Only a new package that "contains nasty not-so-stable things that we nonetheless are at least temporarily trying to avoid breaking" fits the bill.

It would be nice to do this last, after all other impediments to only bumping the `base` version minorly in the next GHC have been fixed. But I don't think we fully know what the other problems are. I think we have to do this first to muster the resolve to do the remaining misc, easier steps. If we do agree to this, we can at least try to look ahead and anticipate what those problems are. If that becomes a significant amount of work, I can take this issue and turn it into an HF proposal.