Introduce GHC.X.hackage proposal #27

Closed
wants to merge 6 commits into from


bgamari
Contributor

@bgamari bgamari commented Mar 7, 2022

For a few years now the GHC team has maintained head.hackage, a set of patches to a subset of Hackage that allows Hackage packages to be built against pre-release compilers. While head.hackage has been an invaluable resource for testing GHC, its user base has been limited to GHC developers.

In the past few months, there have been a few proposals seeking to lower the cost of migrations to new GHC releases. A common theme in these discussions has been the long lead-time for adoption of new releases due to the largely serial nature of library migration. We propose to extend head.hackage as a tool to help users migrate and test their libraries.

Rendered version

@nomeata

nomeata commented Mar 7, 2022

Thanks for writing this up!

The proposal envisions one overlay per major GHC version. But a major GHC version may not be the only case where a large-scale migration needs to happen: if a library far down in the dependency tree (or up) changes in ways that need adjustments in many libraries, it seems we are in a similar situation. Could we have Hackage overlays for these as well?

Thinking in the other direction: is having multiple overlays the right call? I see the advantage of not worrying about other GHC versions, and the implicit garbage collection, but there is also redundancy (likely the patch for GHC-9.14 is also needed for GHC-9.16). How would a single “compat patches overlay” repo fare?

@tomjaguarpaw
Contributor

Sorry to be a petty bureaucrat, but "What's the process for submitting a HFTP?" says

Before submitting a HFTP, it is required that you ... Discuss your idea on the Haskell.org Discourse. ... Create a Discourse topic under the category "Haskell Foundation", that starts with "Pre-HFTP" ... Proposing your ideas on the Discourse is not an optional step

I don't see such a discussion on the Haskell Discourse. If I have missed it could you please link it? If not, could you please start the preliminary discussion there?

@bgamari
Contributor Author

bgamari commented Mar 21, 2022

I have started a discussion here.

@TeofilC

TeofilC commented Mar 27, 2022

This proposal looks great. I'd be keen to help out once the ball gets rolling.

I have a couple of questions/suggestions:

  1. Keeping track of compatible packages
    I feel like it would be quite helpful to have a website that shows which packages from hackage build with the help of the overlay. If a package can't be built, it could show the error or a list of failing dependencies.
    This would differ from head.hackage's current behaviour as it only tries to build patched packages or those explicitly mentioned, rather than all of hackage.
    I think right now it's hard to tell which packages are actually compatible with new GHCs other than by trying to build them yourself.
    And increasing visibility might lead to more patches, and a better idea of ecosystem compatibility.
    The release of the new stackage nightly that uses GHC-9.2 seemed to have this effect. It became clearer which packages were failing and led to a flurry of patches.

  2. Stack compatibility
    As far as I can tell, the proposal doesn't mention compatibility with stack, and I don't think stack supports a similar additional hackage facility like cabal. Do you have plans for stack support?
    I think one way this could work is to make available a patched version of the latest lts and/or nightly snapshots.

@gbaz
Collaborator

gbaz commented Apr 6, 2022

To update: we met today and we look on this favorably. As it appears active work on fleshing out implementation is being undertaken by the stability working group, we'll hold off on a formal approval until everything shapes up.

@Ericson2314
Contributor

In the (very long) discussion thread in haskell/core-libraries-committee#22 it came up that doing library impact assessments today is also difficult.

I would like to see GHC.X.hackage also help with that. In particular, it should be very easy to make changes to base on a branch, make the necessary changes to the corresponding Hackage overlay on a branch, and kick off a "rebuild the world" integration test with those two branches.

In particular, a proposed litmus test for pulling the plug and doing the Data.List breaking change was first having patches (or merged changes) in all Stackage libraries bringing them into compliance. I think that is a very fine hurdle for breaking changes to clear, and if this proposal provides good infrastructure for that, such as I just mentioned, it won't even be a hard hurdle to clear.

@Ericson2314
Contributor

@Bodigrim left a very nice description in ghc-proposals/ghc-proposals#287 (comment) of how the Simplified Subsumption regression test was botched. That's a bummer, but it also makes crystal clear how this proposal can help.

When the way to do such a test is well documented/automated, and the baseline control is well-maintained (so unrelated failures don't ruin the results), doing such experiments will be much easier and more reliable.

@chreekat
Member

Hi! Since I'm the "HF resource" (devops engineer) who might work on this, I thought I'd check on status. I don't have a personal opinion one way or the other yet since I'm not really plugged in, but if everybody else has a consensus, maybe we could finish crossing the t's and dotting the i's.

@simonpj
Contributor

simonpj commented May 11, 2022

Hi Bryan, thank you.

I suggest:

  1. You review the proposal, with Ben, in the light of whatever feedback we have gotten so far -- and of course your own views.

  2. Explicitly consult, via direct email, with appropriate folk from

    • stack,
    • cabal,
    • hackage
    • core libraries committee

    to check that they are on board.

  3. Revise and polish in the light of this feedback.

  4. Broadcast a call to library authors to invite their feedback. "Here's a plan, we'd like to consult you". In principle they could be contributing now, but there are thousands of library authors, and few of them will be following this repo.

In this phase I'd suggest also writing personally to the maintainers of a few dozen key libraries (i.e. ones with many dependencies).

Actually doing all this takes a bit of bandwidth, which we are always short of, but which your presence will help with a lot.

It's as much about building consensus as about technical content!

All this is just my suggestion... the rest of the Stability Working Group may have other views.

@bgamari
Contributor Author

bgamari commented May 11, 2022

Indeed, this is a project which we hope to have you work on, @chreekat.

@ndmitchell

I have attempted to use head.hackage in the past, and it didn't work for me. The specific problem I encountered was that a package I depended on had a policy of not releasing to real Hackage, only head.hackage, until a stable GHC was released. That package required a changed API to be compatible with the new GHC, so I was left with the choices:

  1. Land the patch in the GitHub repo, and break my ability to make releases if anything else comes up.
  2. Don't land the patch in the GitHub repo and make releases to head.hackage instead.
  3. Use branches, cherry pick etc. adding lots of cost to my development workflow.

All of those options were unpleasant. The specific package in question was very low down in the tree. I appreciate that head.hackage can parallelise updating dependencies, but uploading to Hackage must be serial. I think it is very important that head.hackage doesn't cause people to delay on uploading to Hackage, which it most definitely has in the past.

@chshersh

As much as I want to reduce the churn around upgrading to a newer GHC, I'm afraid this proposal is a step in the wrong direction. It will waste lots of time by many people, potentially increase the fragmentation of the already fragmented Haskell community, and increase frustration in some areas (even if reducing it in others). I'm going to elaborate in detail on why I think so.

Let's start with flaws in the described workflow:

  1. The maintainer of C must wake up, fix their package, and make a Hackage release. The maintainers of A and B can do nothing at this point.

This process is terrible in lots of ways:

  • It is utterly serial. The maintainer of B cannot lift a finger until the maintainer of C has not only fixed package C, but also uploaded a new release to Hackage.

This is not true. Maintainers A and B can submit fixes to package C instead of doing nothing. Both cabal-install and stack allow depending on specific commits of packages, not only on Hackage versions. So people can contribute patches directly to packages and test those patches in their own packages without waiting for the Hackage release.

Each step has multiple serial parts: often the maintainer will merge a patch (perhaps in response to prompting) but not do a release, blocking further progress.

We can help maintainers by simplifying Hackage upload workflow or providing necessary CI integrations (e.g. in form of documented CI workflows with examples). This is immediately helpful to everyone and will improve the situation straightaway without paying the high cost of this proposal.

If a maintainer is unavailable for any reason, the entire dependency tree of that package is blocked. In an ecosystem with hundreds of widely used packages, the chances of every single maintainer being available in a timely fashion are close to zero.

The GHC.X.hackage proposal doesn't solve this problem. First of all, this is not a problem in Haskell at the moment since everyone can contribute patches upstream and packages can depend on specific commits. Secondly, if by saying that "the entire dependency tree of that package is blocked" the proposal authors mean "fixes are not on Hackage", the GHC.X.hackage proposal won't help here because it's, well, not Hackage either.

It is not clear to even a willing and available maintainer when they need to wake up and do some work. Often it is up to a motivated individual (say the maintainer of an application that uses A) to sequentially bug the maintainers of C and B and A in sequence so they know that they are able to do something.

Having a more predictable GHC release schedule helps with that. If maintainers know that they need to wake up e.g. every February 18 to update to a newer GHC, they can sleep in peace for the rest of the year. Another proposal, about a tick-tock release cycle for GHC, helps more here and moves in the right direction.


So to me, it looks like the only feature GHC.X.hackage allows is the ability to write a single line in the package configuration — an alias of a bunch of patches — instead of enumerating all patches one by one. This is a nice benefit but I don't think it's worth the cost.

  1. GHC.X.hackage simply creates double work. In addition to creating a patch directly to the used package, a contributor also needs to open a patch to GHC.X.hackage (if they're using GHC.X.hackage). Not to mention, that GHC.X.hackage maintainers need to spend time reviewing those patches in addition to the original maintainer reviewing the patch.
  2. GHC.X.hackage widens the fault horizon. What will happen is that maintainers will always use GHC.X.hackage and we'll effectively have two Hackages. And everyone must look in two places to understand what's going on when things go wrong. And they will go wrong. There's no way to guarantee that the patch proposed to GHC.X.hackage will be the same patch merged to the original package. And when this happens, it'll create lots of frustration.

The main explanation of why the Haskell community has delayed GHC support (besides having breaking changes in GHC) is the fact that widely-used packages are maintained by single volunteers. And volunteers don't have to do what you want them to do, they'll do what they want and whenever they want.

If a maintainer of a widely-used library disappeared from the Haskell community forever, then GHC.X.hackage made its way to perpetuity by having this patch and requiring everyone to always use GHC.X.hackage. Otherwise, GHC.X.hackage just asked lots of people to do lots of redundant work.


In conclusion, I feel that this proposal tries to solve the problem of having scarce volunteering resources by requiring volunteers to do even more which obviously doesn't work.

@tfausak

tfausak commented May 27, 2022

I was in the midst of writing my own comment, but @chshersh said everything I was going to say. (And said it better!)

For what it's worth, I've never used head.hackage. When upgrading GHC, I've used the Git source feature of both Cabal (source-repository-package) and Stack (extra-deps) to work around problems.

@parsonsmatt

(I have not read the proposal itself yet, so my thoughts here are entirely untainted with any idea of what we're talking about)

Having just done a bit of work developing compatibility with GHC 9.4 for the yesod ecosystem, I actually found the process to be quite smooth. I started with persistent, since I maintain that library, and I used ghcup to acquire the 9.4 prerelease and begin compiling it. I setup a cabal.project and set --allow-newer, and traced the build failures. When a package failed to build, I forked it, fixed it, made a PR, and then referred to my fork using cabal.project's source-repository-package feature.
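
A minimal sketch of what such a cabal.project might look like. The compiler name, the scoped allow-newer list, the fork URL, and the commit hash are all illustrative assumptions, not taken from the actual persistent or yesod setup:

```cabal
-- cabal.project: a sketch of the workflow above, not the real configuration.
packages: .

-- Assumed name of the prerelease compiler binary installed via ghcup.
with-compiler: ghc-9.4.1

-- Ignore upper bounds on boot libraries that are bumped by the new GHC.
allow-newer: *:base, *:ghc-prim, *:template-haskell

-- Hypothetical fork carrying a fix that is not yet released to Hackage.
-- The tag is a placeholder; use the commit of the open pull request.
source-repository-package
    type: git
    location: https://github.com/example-user/cereal
    tag: 0123456789abcdef0123456789abcdef01234567
```

Once the upstream pull request is merged and released, the source-repository-package stanza can be deleted and the build falls back to the Hackage version.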

It took about two days of labor to get yesod's tests passing with GHC 9.4.

The primary difficulty, IMO, is updating the strict version upper bounds. I wonder if there's a way to automate that process - ie, if cabal test --allow-newer completes successfully with a base that is above the bounds, we can bump the bounds. Or if there's a way to "flag" these packages in a semi-automated fashion.

After that, actually implementing the changes needed in base and ghc-prim were relatively straightforward. The pre-release notes and Migration Guide answered all of my questions.

The main advantage that a GHC.X.hackage would have is that it would share this work. I have a PR to yesod that lists a cabal.project that makes it work with GHC 9.4, which you'd copy/paste into your own project, along with depending on that commit of yesod, and then you can test your app. But presumably many such projects have happened or will happen - not just for yesod, but for servant or snap or beam etc. And then every one of those will run into the same build issues with cereal, foundation, vector, etc that need patches. And - unless they see my PRs to those repos and save themselves the work - may end up duplicating it.

So, if GHC.X.hackage allows us to collaboratively provide these patches - then that's great! But if I'm dependent on an absent maintainer (ie foundation) to upload to this Hackage overlay, then we're in a bad spot. But why should I (some rando) get to upload to a Hackage overlay, for some package I'm just doing a drive-by contribution for?

Makes me think that the 'proper' fix is to have an index of Haskell packages that are known-to-not-build (either via constraints, base < 4.17 or via attempted builds that fail), linked to a set of potential fixes (that can possibly be community submitted, without guarantee of correctness/completeness - maybe even just links to GitHub PRs or commits). So when I go to build shakespeare, it says "The Hackage version is known-to-not-build with GHC 9.4. There's a git commit available here. Want to add this to your cabal.project?"

@hasufell
Contributor

I think there's a plethora of issues:

  1. having clear visibility of upcoming API breakages, build failures and automatically notifying hackage package maintainers about those (can we detect this automatically? Can this be done semi-automatic as part of GHC release process? How do we notify?)
  2. having a central place for communication/coordination of migrations. Currently the closest we have to this is stackage issue tracker, e.g. wrt aeson-2: aeson-2.0.0.0 commercialhaskell/stackage#6217
  3. an easy way for people to start their work based on the currently available patchset

I think the main issue is figuring out build failures and having clear public visibility about those. This is something that should be part of hackage itself. And even point 2 could be argued to be part of hackage.

This proposal somewhat fixes point 3, but I think that's not the biggest issue of all. head.hackage could simply just be a cabal.project.local pointing to all the upstream PRs instead of another hackage repo. The reason it's a hackage repo is probably because for GHC development it might be unrealistic or unfeasible to create upstream PRs for every single patch. But that is exactly what we want.

@parsonsmatt

OK, reading the proposal and learning what head.hackage even is as a 'Hackage Overlay',

The source of GHC.X.hackage is held in a Git repository, and accepts patches from community members who need not be the package maintainer. For example, the maintainer of package A might submit patches to fix B and C. This will use the same infrastructure as the existing head.hackage repository: here is an example of a PR for head.hackage.

The workflow is a bit clunky. I'd really prefer to just say "Here's a GitHub pull request, please make the relevant patch stuff for me." I'm already going to be making a PR upstream with my fixes, might as well share the work here.

One problem with integrating this into my workflow is that many packages I maintain are part of multi-package repositories. So a change to persistent means I need to actually, like, figure out how to use git in some way and cherry-pick the changes over, for each package that needs to be changed. Implementing a 1:many repo:package relationship support would make this trivially easy: eg, git clone my sweet persistent fork, checkout a branch, and generate relevant patch files for the relevant packages that need to change.
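
For comparison, the cabal.project route can express this one-repository-to-many-packages relationship through the subdir field of a source-repository-package stanza. A rough sketch, where the fork location and commit hash are hypothetical and the package list is only an example:

```cabal
-- One fork, several packages built from the same commit.
-- Location and tag are placeholders, not a real fork.
source-repository-package
    type: git
    location: https://github.com/example-user/persistent
    tag: 0123456789abcdef0123456789abcdef01234567
    subdir:
        persistent
        persistent-sqlite
        persistent-postgresql
```

An overlay workflow with the same convenience would need to turn one such branch into a separate patch per listed package.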

The inclusion criteria for a patch are:
2. The patch should represent a patch-version level change according to the PVP, in particular it should not change the package’s API.

Hm. Some packages re-export things from ghc-prim or base in a way that makes this impossible. Or, the change to base etc requires a breaking change or addition to the API. What's the expected solution there?

For the cabal.project and extra-source-repositories approach, it's fine - I just depend on what I need and fix what needs fixing, making patches as-we-go. A new minor/major version bump as part of upgrading to a new GHC is somewhat expected, IMO.

(ah, this is covered slightly further down in the doc)

6.3 Precognitive releases

This whole section seems really dicey. I think the discussion is mostly correct - the system is too error-prone to be worth any investment of energy. I'm happy having a PR up that links to dependent PRs that require release before a Hackage version can be made, and providing a cabal.project or stack.yaml that can be used in the interim.

But perhaps B requires no updates at all to work with GHC X (this is a common case). Then this message would be over-conservative. Maybe Hackage could proactively set the Tested-With field, by building the package and running its test suite? Or maybe we need two fields: a manual one and an automatic one.

I've repeated over-and-over again that we have a problem with upper bounds, and that is that we conflate "This is known-not-to-work" and "This is not-known-to-work."

For a bound "This is not-known-to-work," we can always just test it and see if it works when the relevant version comes out. Then we either update it to "This is known-to-work" or "This is known-to-not-work."

If Hackage had some support for automagically bumping upper bounds on GHC boot packages (eg base, ghc-prim, etc) that would be pretty sweet. It seems much more intractable if we're trying to globalize that, though, since you'd need a huge matrix...

The presence of allow-newer has made this significantly easier, at least.

hmmmmm 🤔

My overall impression is that this is a lot more work than my current workflow. It'd be nice to share some of the work (I'm sure I'm not the only one to patch cereal, even if I'm the only one to have made a PR), but if folks aren't willing to make PRs upstream with their patches, they're probably also not willing to make a PR to a Hackage overlay.

Really, I just want to be able to share extra-source-repository stanzas, such that it's trivial for me to contribute one and it's trivial for me to find them.

@tfausak

tfausak commented May 27, 2022

Really, I just want to be able to share extra-source-repository stanzas, such that it's trivial for me to contribute one and it's trivial for me to find them.

With Stack, you can create and share a custom snapshot that includes a bunch of extra-deps using Git sources. In fact we use this custom snapshot approach to have a unified set of dependencies among internal packages. (Our snapshot is public, but you probably don't want to depend on it.) The community or the Haskell Foundation or GHC developers could provide a snapshot to rally around when upgrading GHC.

With Cabal, I don't think it's quite as nice. As @hasufell mentioned, you can cram all the same stuff in a cabal.project.local. But I don't think there's a good way to use such a shared config from Cabal directly. People would have to download the file and include it in their project. Not the end of the world, but not as seamless as throwing resolver: some-url in your stack.yaml file. (Also I think Cabal preemptively downloads all Git sources, whereas Stack only grabs them when needed.)

For Cabal developers, head.hackage (and this proposal) provide a developer experience that's similar to Stack's. You just add a new repository to your Cabal configuration and you're good to go. That makes me wonder: Is it possible to make an ad hoc Hackage overlay from a cabal.project file? That would allow someone to produce a cabal.project using source-repository-packages for, say, GHC 9.4; and then share that configuration as an overlay.
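
As a sketch of what "adding a new repository" means in practice, an overlay is declared with a repository stanza in cabal.project. The URL and key threshold below are assumptions based on head.hackage's published setup, and the root keys are deliberately left as placeholders; the real values must be copied from the overlay's own instructions:

```cabal
-- Sketch of an overlay declaration; verify every value against the
-- overlay's documentation before use.
repository head.hackage.ghc.haskell.org
    url: https://ghc.gitlab.haskell.org/head.hackage/
    secure: True
    key-threshold: 3
    root-keys:
        <root key fingerprint 1>
        <root key fingerprint 2>
        <root key fingerprint 3>

-- Let the overlay's package versions take precedence over Hackage's.
active-repositories: hackage.haskell.org, head.hackage.ghc.haskell.org:override
```

A per-release GHC.X.hackage would presumably be consumed the same way, with a different repository name and URL.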

I'm just thinking out loud here, so feel free to ignore all this. I mainly wanted to point out that you can already get something like GHC.X.hackage using a custom snapshot with Stack.

@hasufell
Contributor

But I don't think there's a good way to use such a shared config from Cabal directly. People would have to download the file and include it in their project.

No, they won't. The next cabal release will support include directives for cabal.project including fetching them from remotes.

These may also include resolver constraints, so it's much more ergonomic than using stack for this.
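
Assuming the feature ships as an import field in cabal.project, sharing a set of patches could look roughly like this. The URL is hypothetical; the remote file would hold the shared source-repository-package stanzas and constraints:

```cabal
-- cabal.project pulling in a shared, remotely hosted fragment.
-- The URL is a placeholder for wherever the community file would live.
import: https://example.org/ghc-9.4-compat.project

packages: .
```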

@tfausak

tfausak commented May 27, 2022

The next cabal release will support include directives for cabal.project including fetching them from remotes.

Great! That was news to me, so I hunted down the pull request: haskell/cabal#7783

@brandonchinn178

This is not true. Maintainers A and B can submit fixes to package C instead of doing nothing. Both cabal-install and stack allow depending on specific commits of packages, not only on Hackage versions. So people can contribute patches directly to packages and test those patches in their own packages without waiting for the Hackage release.

I want to echo this. Upgrading my company's codebase to GHC 9 was a not-difficult process, as I can just make forks to upstream repos and reference the new commits in extra-deps.

I'd rather see updates to this issue: haskell/cabal#7821

@gbaz
Collaborator

gbaz commented May 27, 2022

It sounds like the main concern is duplicate work. Perhaps the process for "constructing" the head overlay can make use of people submitting pointers to patches automatically, so as to reduce duplication here?

@hasufell
Contributor

It sounds like the main concern is duplicate work.

I'm not sure it is. As I pointed out I'm not convinced this solves the core issue, which is visibility of required work, overview of current patches and a communication platform.

head.hackage seems to be somewhat specific to GHC workflow.

What makes us think it will help engage less active maintainers? It only talks about the patching workflow.

@Ericson2314
Contributor

Ericson2314 commented Jun 3, 2022

@david-christiansen I would extend the problem statement to also mention that head.hackage is underutilized because it is only useful for GHC developers and not the community at large. Without trying to predestine a head.hackage-inspired solution, we can still say the situation today is thus "balkanized", and this is a tragedy that leaves both GHC HEAD and newly-released GHCs less battle-tested than they might be otherwise.

@Ericson2314
Contributor

Ericson2314 commented Jun 3, 2022

Accordingly, this is why I am not really worried about what technical form the initial GHC.X.Hackage takes.

To the extent it is hard to contribute to, regular users will use it without contributing back, but that is fine! Merely having more consumers of the thing will inspire a few people that don't mind the technical hurdles to work on it more anyways --- we can also argue it's firstly the GHC team's responsibility to ensure GHC.X.Hackage is good enough so that upon release there is proof to the community that the new GHC is usable / isn't too disruptive.

If later the GHC team feels overburdened keeping the thing up to date themselves, and the community is clamoring to help out, then we can refine the technical details to make it more convenient, but for the initial step I'm quite happy to just worry about the social problem of head.hackage only benefiting one group of people, GHC devs, not regular users.

@hasufell
Contributor

hasufell commented Jun 3, 2022

Accordingly, this is why I am not really worried about what technical form the initial GHC.X.Hackage takes.

Yes. I'm also leaning towards "just do it" without a proposal. A proper proposal that tries to put the pieces together can happen later.

@chshersh

chshersh commented Jun 6, 2022

I personally wouldn't benefit from GHC.X.hackage explicitly (but I might implicitly, only time will tell) as I won't be using this overlay myself. My existing workflow works well for me. I also like how I can easily see the libraries from my cabal.project that still have pending patches. I don't see an easy way to achieve that level of visibility with GHC.X.hackage.

However, this proposal being the HF Tech Proposal, if accepted, may actually make my life more difficult (I can speak only for myself) by:

  • Introducing community fragmentation
  • Requiring people to look in more places when investigating issues
  • Having already busy volunteers and maintainers do more work, making them less productive
  • Creating opportunities for people to blame me for poor due diligence if I don't review patches to GHC.X.hackage for my libraries and don't want to engage in this opt-in process
  • and much more other fun stuff

I always have this feeling in Haskell that people are eager to dive into complicated tech solutions just to avoid dealing with humans. Yes, dealing with humans is not easy but the entire Haskell ecosystem is built by those humans. It's not sustainable for the entire community if you find it easier to maintain an entire overlay of patches and implement various strategies for retiring patches just to avoid asking maintainers "How could we help you?" or simply understanding that people might disappear.


I don't like derailing the conversations around the proposal by suggesting different solutions. But if the ultimate goal is to release new GHC versions faster, then a proposal like the below one helps to achieve the goal in a more community-friendly way:

If I'm on my 2-week vacation and someone has a burning need to support new GHC in my library ASAP, I'd rather let them merge their patches directly to my project and release it to Hackage without my involvement at all instead of contributing patches to some external overlay because I'm not available.


Addressing some specific details,

Centralising a way for people to work together to generate patch sets across hundreds of packages might unlock that suppressed potential. And it might make less work for the package authors too, because they just have to merge an already-tested PR, rather than generate it.

I feel like Stackage already does this and it's quite successful. I don't benefit from this process as it requires using the stack build tool (which is fair) and I'm using cabal-install for my personal projects. And I agree that having a centralised place to track all upstream changes would be a good thing to do and I indeed would benefit from this. However, this feels like a different proposal.

the current situation is not too bad for a maintainer who:

  • Is competent and adept at navigating the Haskell ecosystem (can find broken upstream packages, diagnose them, make and submit fixes, and thinks this is not a big deal)

This can be solved by having better and more visible documentation. Lots of Haskell developers simply don't know they can depend on specific commits of packages, even though this feature has been available for more than 6 years in the Haskell ecosystem!

  • Does not have too big a dependency footprint (so that fixing their broken upstream packages is feasible)

The assumption here is that the maintainer of a package with lots of dependencies will wake up first and go on a crusade to fix upstream packages. But this is not always the case. I personally wait for at least several months before I even try to upgrade to a newer GHC. Usually, the existing GHC works fine for me and I have no urgency to upgrade.

  • Has the time to do the work (two days of labour for Matt!)

The same is true for GHC.X.hackage. Someone has to contribute a patch. It can be anyone who has free time and enough competency to write the patch, not necessarily the person who needs it. Describing a workflow for contributing patches upstream (e.g. in the form of a blog post, or better, an official guides section) will help lots of people and will significantly lower the threshold for being an active member of the Haskell community.

Yes. I'm also leaning towards "just do it" without a proposal

Having HF support for this proposal puts it into completely different perspective. For instance, one of the primary HF goals is to be the glue that connects the entire ecosystem. I don't feel this proposal supports this goal and, in fact, goes against it.

Without the proposal and having official HF support - go for it, you have my blessing 👼🏻

I'm not in a position to tell volunteers what to do in their free time. If some people want to implement GHC.X.hackage and others want to use it -- good for them 👍🏻 As always, people can do whatever they want in their free time. If it doesn't hurt me and makes life easier for someone else -- I'm happy for them and it doesn't bother me at all 👏🏻

@Ericson2314
Contributor

@chshersh Well, let's start with the purpose of this proposal, which is simple:

Foster a way for the community as a whole to collaborate on patched packages so GHC releases are maximally usable on day 1.

How this is accomplished is completely secondary. If the means are too controversial I would advocate splitting the proposal so we can first agree on the goal and then decide on the means. Would that address your concerns?

I always have this feeling in Haskell that people are eager to dive into complicated tech solutions just to avoid dealing with humans.

Following what I said above, this proposal should instead be aimed at getting us to start dealing with other humans more, so we don't just suffer in isolation, redundantly creating the same patched-package workarounds. It should be pro human interaction!

@hasufell
Contributor

hasufell commented Jun 6, 2022

How this is accomplished is completely secondary. If the means are too controversial I would advocate splitting the proposal so we can first agree on the goal and then decide on the means. Would that address your concerns?

#27 (comment)

@bgamari
Contributor Author

bgamari commented Jun 13, 2022

Thanks to @david-christiansen for summarizing the current state of the discussion. This is extremely useful.

To reiterate a few of his points:

Maintainers have objected that this overlay process would increase their workload, rather than decrease it. The fear is that they are expected to both fix the library and send a patch to GHC.X.Hackage. I'm not sure whether this is the intent of the proposal - from what I can see, the intention was that non-maintainers could contribute to GHC.X.Hackage, while maintainers could do a release on Hackage instead.

David's interpretation is precisely right. GHC.X.Hackage is intended as a means for managing the problem of slow maintainers, which arises for at least two reasons. First, there is time availability: maintainers have lives, go on vacations, and have busy spells. Such is life, and currently we have no good mechanism for managing this fact.

The second reason is a bit trickier: some maintainers are (understandably) reluctant to merge patches updating their packages for a new GHC release until the release is final. Working around this fact was the original motivation for developing head.hackage, since otherwise testing GHC prereleases was extraordinarily difficult and required significant duplication of effort.

A Git-centric workflow would rule out packages that aren't in Git. In principle, Hackage doesn't impose a choice of version control system. Supporting some packages may still require arbitrary patch files.

Indeed this is why we use patch files: they are the least-common-denominator when it comes to representing changes to packages. That being said, I should note that the existing scripts for managing head.hackage largely hide patches from the user. Specifically, adding a patch is typically as easy as:

$ patch-tool unpack-patch acme-missiles
$ cd packages/acme-missiles
$ # make your change
$ git commit
$ patch-tool update-patches

There has been a critique that this proposal will not create a common work-list of tasks to be done to update the world. The analogy is to a Linux distribution, where updates trigger massive rebuilds with failures reported, so that volunteers have an idea of what works and what doesn't.

It's true that the proposal doesn't attempt to solve the "work-list" problem. However, I would argue that this is an orthogonal problem and one that is to some extent already addressed by the Hackage matrix builder (although it seems that it's fallen into quite a state of disrepair, sadly).

@chshersh says,

If I'm on my 2-week vacation and someone has a burning need to support new GHC in my library ASAP, I'd rather let them merge their patches directly to my project and release it to Hackage without my involvement at all instead of contributing patches to some external overlay because I'm not available.

The Haskell Party Proposal is a nice proposal, but it doesn't address the same problem that we are attempting to address here. It requires that maintainers opt-in, which many maintainers will be understandably reluctant to do. This means that the serial nature of ecosystem evolution is essentially unchanged.

@chshersh

@bgamari

The Haskell Party Proposal is a nice proposal, but it doesn't address the same problem that we are attempting to address here. It requires that maintainers opt-in, which many maintainers will be understandably reluctant to do. This means that the serial nature of ecosystem evolution is essentially unchanged.

Apologies for misinterpreting the main goal behind this proposal.

I thought the main goal is to "allow maintainers to upload their libraries to Hackage with the full support of the latest GHC and all dependencies so not-so-advanced Haskell users can use them after the new GHC release as quickly as possible without the need to learn about Hackage overlays" (which, obviously, the GHC.X.Hackage proposal doesn't solve) and not "solve the serial nature of the ecosystem evolution" (which is not solved by the Haskell Party proposal, as you correctly noticed).

In that case, maybe this goal can be highlighted better in the motivation section? E.g. a separate paragraph of bold text after a phrase like "The proposal solves the following problem: ...". Currently, it's too ambiguous in the "Motivation" section and can be misinterpreted as this GitHub thread shows.

@hasufell
Contributor

hasufell commented Jun 14, 2022

It's true that the proposal doesn't attempt to solve the "work-list" problem. However, I would argue that this is an orthogonal problem and one that is to some extent already addressed by the Hackage matrix builder (although it seems that it's fallen into quite a state of disrepair, sadly).

There really seems to be a misunderstanding about the motivation then.

If the motivation is to have quicker adoption on hackage of new GHC releases, then the workflow problem (which is not addressed by hackage matrix at all) is far more important than patch gathering.

To me it seems that this proposal is more focussed on GHC and less on ecosystem. In that case probably no one disagrees with the proposal, but it should be made clear then that it doesn't attempt to solve the larger ecosystem issues.

@simonpj
Contributor

simonpj commented Jun 14, 2022

To me it seems that this proposal is more focussed on GHC and less on ecosystem. In that case probably no one disagrees with the proposal, but it should be made clear then that it doesn't attempt to solve the larger ecosystem issues.

I would love us to focus our scarce attention and resources on the most important problems first. For me, the motivation is

  • To make it as frictionless as possible for anyone to use a new GHC release with a set of libraries that work with it.

As it stands, the proposal is about de-serialising the workflow. But if there is an even more pressing issue around workflow, maybe someone can sketch out what the problem there is, and how we might address it? To be concrete, here is what I think you may be saying. We need a centralised way to see:

  • What packages have been patched to work with GHC X
    • In a Hackage release
    • As a patch held in GHC.X.Hackage
  • Which packages need to be patched
  • Among those that need to be patched, which are most urgent: their dependencies have already been patched, and they have a lot of dependents.

So a kind of public work-list. Is that what you mean? Or maybe something else. You know far more than I do about the issues here.
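As a rough illustration of the "most urgent" criterion above, here is a minimal Python sketch. The package names and the dependency map are made up, and `patched` is assumed to hold every package already known to build with GHC X (whether through a Hackage release or a GHC.X.Hackage patch). It is a sketch under those assumptions, not a description of any existing tool.

```python
from collections import defaultdict

def urgent_worklist(deps, patched):
    """Return unpatched packages whose direct dependencies are all patched,
    ordered by number of direct dependents (most first, ties by name)."""
    # Invert the dependency map: package -> set of packages that depend on it.
    dependents = defaultdict(set)
    for pkg, ds in deps.items():
        for d in ds:
            dependents[d].add(pkg)
    # A package is "ready" when it still needs work and nothing it
    # depends on is blocking it.
    ready = [
        pkg for pkg, ds in deps.items()
        if pkg not in patched and all(d in patched for d in ds)
    ]
    return sorted(ready, key=lambda p: (-len(dependents[p]), p))

# Hypothetical ecosystem: package -> direct dependencies.
deps = {
    "base-like": [],
    "text-like": ["base-like"],
    "parser": ["text-like"],
    "csv": ["text-like"],
    "json": ["text-like", "parser"],
    "web": ["json", "parser"],
}
patched = {"base-like", "text-like"}

worklist = urgent_worklist(deps, patched)
print(worklist)
```

Here `parser` comes before `csv` because two other packages are waiting on it, while `json` and `web` do not appear yet because their dependencies are still unpatched.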

@hasufell
Contributor

hasufell commented Jun 14, 2022

As it stands, the proposal is about de-serialising the workflow. But if there is an even more pressing issue around workflow, maybe someone can sketch out what the problem there is, and how we might address it? To be concrete, here is what I think you may be saying. We need a centralised way to see:

  • What packages have been patched to work with GHC X
    • In a Hackage release
    • As a patch held in GHC.X.Hackage
  • Which packages need to be patched
  • Among those that need to be patched, which are most urgent: their dependencies have already been patched, and they have a lot of dependents.

So a kind of public work-list. Is that what you mean? Or maybe something else. You know far more than I do about the issues here.

Yeah, the way I would envision it is somewhat like this:

Motivation

This proposal attempts to decrease the adoption time of new GHC releases in the community and make contributing patches to the ecosystem easier.

Proposal requirements

  1. the proposal must primarily decrease the work for maintainers (these are the people in charge of making releases, so that should be the absolute priority)
  2. the proposal must show how to connect contributors and maintainers
  3. the proposal must describe how these points improve adoption time

Workflow sketch

  1. There must be a publicly visible list of packages that require patching for a specific GHC version
    • the list must include all boot libraries
    • the list should at least consider all packages existing in the latest stackage release
    • the list should expose information about the exact build failures and the build process (cabal flags)
    • the list must be easily searchable
  2. There must be an easy way to subscribe to such build failures for a specific package for anyone
    • the subscription must be opt-in
    • there must be a way to easily unsubscribe
  3. There must be a central communication platform that is specific to fixing build failures of packages wrt new GHC versions
    • there must be a way to search and filter for package related issues on that communication platform
    • this communication platform should be able to integrate with a project's main issue tracker (if any)
  4. There must be a way to easily submit patches
    • the patch workflow must not require users to submit the same patch in two different places
    • it should be ensured that the maintainer is notified of such patches
  5. There must be a way for maintainers to easily merge patches
    • maintainers must be able to understand the context of those changes (e.g. prior discussions, unreleased GHC features, ...)
  6. There should be an easy way for maintainers to approve GHC adoption patches ahead of time
    • this process must be opt-in
    • this process must allow specifying constraints (number of reviewers, patch authors, time frame, reverse dep testing, ...)

Scope

The scope is currently about adapting to newer GHC versions, but could be extended to all ecosystem adoption issues (compare with the aeson-2.0 change).


My personal opinion is that as much as possible of these solutions should be part of hackage itself.

Ideally, there would be a way to simply upload patches on hackage, then a bot would submit a PR to the maintainer's GitHub repository (if any) or send it via email. Hackage would automatically update the list of submitted patches, add references to the pull request and somehow monitor changes to the PR itself (if any).

Those things would be nice to have and the underlying tools to achieve the workflow can easily be switched. Hence the proposal focuses on the workflow, not the tools.

Not all of these things need to be solved at once.

@bgamari
Contributor Author

bgamari commented Jun 15, 2022

To me it seems that this proposal is more focussed on GHC and less on ecosystem. In that case probably no one disagrees with the proposal, but it should be made clear then that it doesn't attempt to solve the larger ecosystem issues.

I'm not convinced that there is truly much of a difference between these two sets of requirements; I have personally found head.hackage to be quite useful while updating my own projects for new GHC releases.

For what it's worth, I agree that your proposal would be useful and would solve another set of problems. However, it strikes me as quite ambitious.

This is of course not a bad thing, but the GHC.X.hackage proposal was intended to be minimal in the sense that it takes something which we already have and which already addresses a clear purpose, and extends it to slightly widen that purpose, allowing more people to benefit from it. The investment needed to reap the benefits from GHC.X.hackage is quite small; however, if the consensus is that these benefits are also quite small then I'm happy to shelve the proposal.

@hasufell
Contributor

hasufell commented Jun 15, 2022

@bgamari apologies... I wasn't really criticizing the proposal on its own. Only the scope/motivation it describes. I think the proposal is an improvement and should be implemented. Versioning the overlays makes sense for various reasons.

My idea would simply be to

  1. refine the workflow goals and agree on them
  2. prioritize them and figure out which ones can be implemented swiftly to get high return
  3. keep the workflow description as an ultimate goal and slowly work towards implementing a coherent user experience

I believe that the GHC.X.hackage proposal can simply be executed in parallel. We might then later agree that it's part of the workflow solution. And even if it's not, it may still serve a purpose for other use cases (such as GHC, bleeding edge CIs etc.).


As such: I propose to narrow the scope of the proposal or be deliberately more vague about what problems it solves: e.g. "we want to version the head.hackage overlay to allow users to rely on them for different GHC versions that are not yet fully supported on hackage, potentially facilitating a patch workflow that allows users to more easily contribute".

@phadej

phadej commented Jul 24, 2022

The proposal doesn't seem to mention Non-maintainer uploads. There is clearly an overlap, and the proposal should discuss the differences and which option to pick when.

@david-christiansen
Contributor

I think that the revision to the proposal that @bgamari just pushed at least resolves some of the ambiguities, and hopefully in the process addresses some of the concerns expressed here. In particular, it clarifies that:

  • GHC.X.Hackage will contain non-maintainer-contributed patches
  • It describes the relationship between the current widespread practice of local version overrides (e.g. in a stack.yaml or cabal.project file) and GHC.X.Hackage
  • It provides an example workflow to clarify the intended use
  • It specifies that patches should be sent to a package maintainer before including them in GHC.X.Hackage, to minimize divergence over time

Thoughts?

@phadej

phadej commented Sep 13, 2022

The revised proposal still doesn't mention NMUs.

@david-christiansen
Contributor

@phadej I specifically attempted to address this concern, which makes me think that I have misunderstood you!

I interpret your concern as asking whether patches in the GHC.X.Hackage overlay can be uploaded by non-maintainers. The answer is "yes" - in fact, non-maintainers are expected to be the primary contributors of patches to GHC.X.Hackage.

But, given your comment, I suspect that you in fact have a different concern related to NMUs. Can you expand upon it a bit? Thanks!

@phadej

phadej commented Sep 14, 2022

@david-christiansen
Contributor

Ah, OK, @gbaz explained to me what the concern is - sorry for being a bit dense!

What do you think about the following:

Hackage has a procedure for non-maintainers uploading package updates in cases where the original maintainer cannot be reached, which is called the non-maintainer upload process (NMU). This process is orthogonal - GHC.X.Hackage is a way for the community to self-organize short-term workarounds to immediate upgrade difficulties, while the NMU process is a way to deal with maintainers who are not reachable at all for a longer period of time. In particular, GHC.X.Hackage can accept patches with a lower degree of scrutiny due to it being a temporary source of workarounds rather than a permanent repository of new versions.

@bgamari
Contributor Author

bgamari commented Sep 14, 2022

Oh dear, it looks like @david-christiansen and I raced. I just pushed the following language to address NMUs:

The proposed mechanism is intended to complement Hackage's existing non-maintainer upload (NMU) policy, where non-maintainers can propose source changes to packages in some cases. Specifically, while Hackage NMUs are intended to be carefully-audited long-lived releases, GHC.X.Hackage patches are more transient with a correspondingly lower-overhead process and faster turnaround time.

@david-christiansen, do feel free to amend as you see fit.

@michaelpj

Sorry, I've been very out of the loop here, but I wanted to add a useful observation. We've been prototyping something like this for use inside IOHK, and we really wanted to be able to gracefully include patched versions of packages.

What we realised is that PVP versions can have an arbitrary number of components, and it turns out that cabal deals perfectly well with these. So you can adopt a policy whereby a patched version of X-1.2.3 should be included as X-1.2.3.0.0.0.0.1, which has the desired properties of a) being very unlikely to clash with a genuine release and b) being strictly greater than the released version.

I think this scheme can alleviate a lot of the worries about which version number to use. Just pick a scheme like "add four (or N) zeros and then start incrementing".
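To make the ordering argument concrete, here is a small Python sketch. It assumes that versions compare component by component, with a shorter version sorting before any longer version that extends it; Python tuples happen to behave that way, so they stand in for cabal's version ordering here. The helper names `parse_version` and `patched_version` are made up for illustration.

```python
def parse_version(s):
    """Turn a dotted version string into a tuple of ints for comparison."""
    return tuple(int(part) for part in s.split("."))

def patched_version(released, revision=1, zeros=4):
    """Build an overlay version: the released version, N zeros, then a counter."""
    return released + "." + "0." * zeros + str(revision)

overlay = patched_version("1.2.3")
print(overlay)

# The overlay version is strictly greater than the release it patches...
assert parse_version("1.2.3") < parse_version(overlay)
# ...but still sorts below any genuine follow-up release, so a real
# 1.2.3.1 or 1.2.4 supersedes the patched version.
assert parse_version(overlay) < parse_version("1.2.3.1")
assert parse_version(overlay) < parse_version("1.2.4")
# A second patch revision sorts after the first.
assert parse_version(overlay) < parse_version(patched_version("1.2.3", revision=2))
```

With the default of four zeros this reproduces the `X-1.2.3.0.0.0.0.1` example above.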

@gbaz
Collaborator

gbaz commented Sep 14, 2022

GHC.X.hackage repositories continue to exist for old versions of GHC in perpetuity. They are not retired, although it is expected that they will become irrelevant in time as proper releases are made in Hackage.

I would want to add a further element here -- while GHC.X.hackage repos will continue to exist, I would hope there is a "sunset" (perhaps the release of the following ghc) where they stop accepting packages.

In particular, we wish to encourage that these repos are used strictly for testing and compatibility work -- having a cutoff on accepting new patches would encourage that they not become long-term sources of truth for ongoing non-testing and non-compatibility development.

@simonpj
Contributor

simonpj commented Sep 14, 2022

I would hope there is a "sunset" (perhaps the release of the following ghc) where they stop accepting packages.

I understand the motivation here but surely the sun can set only when all the packages with patches in GHC.X.Hackage have been updated in Hackage. And that timetable is in the hands of package authors. Perhaps it would help if GHC.X.Hackage was accompanied by some kind of status board showing which packages still needed updating in Hackage? Package authors might find that helpful because they could readily see when their dependencies are all up to date.

If the sun sets (e.g. GHC.X.Hackage is withdrawn for version X) before all packages have been updated then that version X of GHC will abruptly become unavailable to clients who are depending on those patches. That would not be a good look.

@phadej

phadej commented Sep 14, 2022

If the sun sets (e.g. GHC.X.Hackage is withdrawn for version X) before all packages have been updated then that version X of GHC will abruptly become unavailable to clients who are depending on those patches. That would not be a good look.

I understood what @gbaz said as stop accepting new patches, not that the repository itself would go offline.

@simonpj
Contributor

simonpj commented Sep 14, 2022

I understood what @gbaz said as stop accepting new patches, not that the repository itself would go offline.

Ah, I did not understand that. If it goes into the proposal, let's make that clear.

But still, it would mean that a user U would be unable to use GHC.X.Hackage to patch an obscure package P, whose maintainer was absent. Thus GHC version X would be permanently out of reach to U, which again seems undesirable.

@gbaz
Collaborator

gbaz commented Sep 14, 2022

But still, it would mean that a user U would be unable to use GHC.X.Hackage to patch an obscure package P, whose maintainer was absent. Thus GHC version X would be permanently out of reach to U, which again seems undesirable.

Oleg's interpretation is correct. But this would not mean that X would be out of reach to U. They could simply fork P and use a cabal.project or the like to point to the repo.

It would mean that GHC.X.Hackage would not be a place to continue to perform "testing and compatibility work" for GHC version X. However, this "sunset" would occur only after GHC version X+1 was released, at which point that is where future testing and compatibility would make sense to be directed to. I believe such a sunset would nudge users to treat GHC.X.Hackage as a testing and compatibility area, and not as a permanent source of truth in a way that has relatively minor costs -- i.e. those most affected would only be those who were tempted to use it for unintended purposes to begin with. Further it relieves the GHC.X.Hackage maintainers of the burden of having to accept and review patches for all past GHCs indefinitely.

By the way, I believe the concerns expressed in section 6.3 on "precognitive releases" are too negative. Consider the case where P depends on Q at version 1.1.0, which fails to work on GHC X but is patched in GHC.X.Hackage. I claim that releasing a P that depends on some Q 1.1.x (i.e. with PVP upper bounds) remains safe. In the best case, Q 1.1.1 is released promptly, provides compatibility with X, and all is good. If no such Q is ever released, then P's bounds remain correct, and P just remains unbuildable with GHC X. If a compatible Q is later released, but as Q version 2, and P still builds with it, then only a bounds revision need be performed -- no new release. Only if Q is released at version 2 and P does not work with it does a new version of P need to be released that is compatible with Q -- and the total cost of having made one incremental release is relatively minor, and has done no harm.
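The four cases in this argument can be written out as a small decision function. This is only a sketch: `action_for_p` and its result strings are hypothetical, and it hard-codes the example's assumption that P's bounds admit exactly the Q 1.1.x series (PVP major version 1.1).

```python
NOTHING_UNBUILDABLE = "no action; P stays unbuildable with GHC X"
NOTHING_IN_BOUNDS = "no action; existing bounds already admit the fixed Q"
REVISE_BOUNDS = "revise P's bounds on Hackage; no new release"
NEW_RELEASE = "publish a new P release"

def action_for_p(q_release, p_builds_with_it=True):
    """Decide what P's maintainer must do after a precognitive release.

    q_release: version string of the first GHC-X-compatible Q on Hackage,
               or None if no such release ever appears.
    p_builds_with_it: whether the already-released P builds against it.
    """
    if q_release is None:
        return NOTHING_UNBUILDABLE
    # PVP major version is the first two components (A.B).
    major = tuple(int(c) for c in q_release.split(".")[:2])
    if major == (1, 1):
        # Fix landed inside the 1.1.x series that P already allows.
        return NOTHING_IN_BOUNDS
    if p_builds_with_it:
        return REVISE_BOUNDS
    return NEW_RELEASE

for q, builds in [("1.1.1", True), (None, True), ("2", True), ("2", False)]:
    print(q, builds, "->", action_for_p(q, builds))
```

Only the last case costs a new release of P, which is the point being made above.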

@bgamari
Contributor Author

bgamari commented Oct 31, 2022

It seems that the community is generally somewhat lukewarm with respect to GHC.X.Hackage. Based on this feedback, and rather than going through the formal vote process, we're now planning on just doing a little trial to get some experience: we'll do one iteration of GHC.X.Hackage (perhaps GHC.9.6.Hackage), see how it works in practice, and evaluate it based on the reception. If it's a runaway success, great! If there is not widespread adoption, then we'll be in a position to figure out whether the lack of adoption is technical (and can be fixed) or if the solution just doesn't match community needs.

Thank you again for your feedback!

@bgamari bgamari closed this Oct 31, 2022