Package registry #24
I think of
So I'm in favor of tracking this data. Some of it seems to belong in the package set, and some outside. Unfortunately, I don't have time to work on it right now. Let me ask though: given we already decided that |
I guess I'm mainly trying to understand the ideas behind psc-package better, so that I can help work out how Pulp should use it, and I'm also trying to imagine what the purescript ecosystem would look like without Bower and trying to work out what I might be able to do to help there. In addition to that, though, I do consider the things I've written to be problems for psc-package which I would maybe like to fix or investigate through separate projects. Re git: if we can find some space on some server, wouldn't it be simpler to just have a tarball per package per version? Mirroring git repos won't be fun if the history has been rewritten since the last time you pulled from the upstream repo and a fast-forward merge is impossible. We also can't just delete and reclone the whole repo, as tags could have been mutated or removed. To clarify my position re ease of use; I agree that it would be nice to make it easier to publish packages but that's not really what I mean here. My worry is that by only having package sets there will be a social effect that leads people to avoid creating and publishing packages at all. (Yes, of course there is bower, but I am starting to think that bower's lack of a proper solver means that ideally it wants replacing.) |
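To make the "tarball per package per version" idea concrete, here is a minimal in-memory sketch, assuming a registry that keys each tarball by name and version and refuses to overwrite an existing entry. The `TarballStore` class and its methods are hypothetical names for illustration, not part of psc-package or any existing registry.

```python
from typing import Dict, Tuple


class TarballStore:
    """Hypothetical append-only store: one tarball per (package, version)."""

    def __init__(self) -> None:
        self._tarballs: Dict[Tuple[str, str], bytes] = {}

    def publish(self, name: str, version: str, data: bytes) -> None:
        key = (name, version)
        # Unlike a Git tag, a published version can never be mutated or replaced.
        if key in self._tarballs:
            raise ValueError(f"{name}@{version} is already published")
        self._tarballs[key] = data

    def fetch(self, name: str, version: str) -> bytes:
        # Raises KeyError if that version was never published.
        return self._tarballs[(name, version)]
```

Because entries are only ever added, mirroring such a store reduces to copying files that the mirror does not have yet, which avoids the rewritten-history and mutated-tag cases described above.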
@hdgarrood I get what you're saying, but as a user of libraries, I really like that package sets give a kind of guarantee of maintenance. There have been plenty of packages on Pursuit which I wanted to use but couldn't, because the author hadn't updated them in months. They were still listed in Pursuit, though, leading me to believe that they would work (even though they don't). I would rather have a small set of high-quality maintained libraries than a large set of low-quality unmaintained libraries (which is what you get with other package managers like npm). This does run the risk that a library which I am using will be dropped in a future package set, but if anything I consider that a good thing, because it means that people will bug the author to update the library (or fork the library). Providing workarounds for unmaintained libraries just encourages more unmaintained libraries. Having a strict "must maintain" policy encourages maintenance. Yes, it puts more pressure on library authors, but since there are far more library users than library authors, I think that's okay. In my opinion, the library user experience is more important than the library author experience, because libraries are useless by themselves (they are only useful when used by another library or application). So there is a natural asymmetry which is biased toward library users. So I think the only thing that is necessary is the ability for somebody to "take over" an unmaintained package. That way, if the author doesn't maintain the package, somebody else can. |
As someone who is both a library user and a library author, I very much disagree. If there was an expectation of maintenance, as there is with package sets, most of my libraries would never have been published. Even though they do often lag behind, I know that people find them useful. I'm very strongly against any policy that puts unnecessary pressure on library authors or encourages library users to bug library authors. That's not at all the tone I want the PureScript library ecosystem to have. We know from experience that the Stackage model (i.e. a package registry with no maintenance expectations, plus package sets with maintenance expectations for authors who want to commit to it) works well and scales. I also don't buy the argument that it will be hard to find packages that work well. It's easy: just don't search the full registry; instead, restrict your search to a package set. I know a tool that can search within a package set doesn't exist now, but it easily could. |
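As a rough illustration of searching within a package set rather than the full registry, here is a minimal sketch. It assumes the registry is available as a mapping from package name to description and the package set as a mapping from package name to pinned version; the function name, package names, and versions below are all made up for the example.

```python
from typing import Dict, List


def search_in_set(registry: Dict[str, str],
                  package_set: Dict[str, str],
                  query: str) -> List[str]:
    """Return matching package names, restricted to members of the package set."""
    q = query.lower()
    return sorted(
        name
        for name, description in registry.items()
        if name in package_set
        and (q in name.lower() or q in description.lower())
    )


# Hypothetical data: the registry lists everything, the set only curated packages.
registry = {
    "prelude": "Core type classes and functions",
    "old-parser": "Parser combinators, unmaintained",
    "parsing": "Parser combinators",
}
package_set = {"prelude": "v2.1.0", "parsing": "v3.0.0"}

results = search_in_set(registry, package_set, "parser")
```

Here `old-parser` stays published in the registry but never shows up in a search scoped to the set, which is the separation being argued for.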
@hdgarrood Just to be clear, when I say "bug the library author" I mean "file a bug report about updating the library", or "make a pull request updating the library", or "send a polite e-mail explaining the situation", that sort of thing. I don't mean waking them up at 2 AM to pester them. |
Ok good, thanks for clarifying. My position is unchanged, though. |
What about using the npm registry for packages which don't want to commit to the package set? With And we will need to use npm anyways, because there are some PureScript packages which use JavaScript packages. So that avoids needing to create a separate registry. |
That does seem like a good option, but I would rather not get into this discussion right now - for now, I am really just hoping to ascertain how psc-package should interact with a centralised registry, if at all, rather than details like which registry we use, whether we make our own, etc. |
I don't have time for a full reply now, but I wanted to just clear up a couple of things:
I think we can implement a centralized package repository, but it fundamentally would be different from |
Ok great, thanks. I can't remember if I've said this elsewhere but I expect having version bounds would help with curation once package sets start to become a bit larger, incidentally, having seen how Stackage operates. When/if you have time I'd be interested to hear about your views on Git as the base for |
(possibly dumb question) @paf31 imagine all the PureScript packages you need to do something are in package-sets; what would you use bower for (since JS dependencies probably come from npm)? @hdgarrood I think Git can be great for availability, because it is inherently distributed. I can imagine e.g. specifying multiple locations for a package in package.json. It also looks like it is very easy to host a 'private repository': clone the packages repository and add your private packages where merge conflicts are unlikely (e.g. at the bottom). I think in the long run it could grow the ecosystem. If I can easily split my application into a bunch of libraries in Git repositories and ensure they all build together, then making the more generic ones available to everyone else is a matter of moving the Git repository to github/bitbucket/gitlab and sending a pull request on 'package-sets' once it is far enough along to be interesting to someone else. Maintaining community infrastructure is an expensive and often thankless experience, from what I see in various communities. Less infrastructure = more time available to develop libraries, the compiler and other more interesting parts of the ecosystem. |
FYI (sorry if this is the wrong place, could not find a more appropriate one at the moment) to make it a bit easier to add existing packages to the package set, I wrote a small shell script https://gist.github.com/mostalive/54dbbf388f6ca58795d6ae37fef22890 that generates most of a I'd be happy to translate this to haskell as a subcommand of The other thing I found useful is a oneliner to extract dependencies from 'bower.json' for use in 'psc-package.json':
|
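For readers who want to do the same extraction without a shell, here is a minimal Python sketch. It assumes that Bower dependencies carry a `purescript-` prefix and that the `depends` field of `psc-package.json` expects the names without it; the function name and the sample manifest are hypothetical.

```python
import json
from typing import List


def bower_to_psc_depends(bower_json_text: str) -> List[str]:
    """Extract dependency names from a bower.json document, prefix stripped."""
    bower = json.loads(bower_json_text)
    prefix = "purescript-"
    names = bower.get("dependencies", {}).keys()
    # Assumption: psc-package names drop the "purescript-" prefix used on Bower.
    return sorted(n[len(prefix):] if n.startswith(prefix) else n for n in names)


# Hypothetical manifest, for illustration only.
sample = """
{
  "name": "purescript-example",
  "dependencies": {
    "purescript-prelude": "^2.1.0",
    "purescript-maybe": "^2.0.0"
  }
}
"""

depends = bower_to_psc_depends(sample)
```

The version ranges are dropped on purpose: with a package set, the versions come from the set rather than from the individual package.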
I know that Git is inherently distributed, but that doesn't actually address the availability issues I have described earlier in this thread and in purescript/purescript#2526. I am still not aware of any good way of handling a case where a package author rewrites the Git history between releases if we continue to use Git as the base. |
I see (I did read that thread, there's a lot to take in). Oversimplifying things, I was thinking of a trade-off. Which has a greater chance:
The chance of 1. increases with the size of the package set. At the same time so does 2. Why would anyone do 1? What I then understand from your question at 2526 is: how can we limit the amount of work that has to be done by others when a contributed package gets removed or is broken? Or how can we prevent it? In the case of the broken tag: |
This is not a trade off we are necessarily forced to make; if packages are distributed as .tar.gz files via a system like IPFS or BitTorrent, it's conceivable that everything apart from perhaps publishing new versions of packages would still work.
Certainly people should be able to say "I no longer have the inclination or time to maintain this" and we should have a process for allowing someone else to take over. But once version x.y.z of package A is published on a package registry it should be available indefinitely (except for in very, very unusual cases e.g. the contents of the package are likely to cause legal issues). If, as a package author, you're not comfortable with that, then don't publish your package.
No, my question is how can we prevent this from happening in the first place, because it is entirely avoidable.
This has already been suggested and people have already described why it won't work. In summary: we need to be able to reliably obtain package A at version x.y.z. If a package manager just fails with a checksum mismatch error during installation (and all a package manager could reasonably do in that scenario is to just fail with a checksum mismatch error), that's essentially useless from the point of view of the developer trying to install their project's dependencies. |
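A minimal sketch of the checksum scenario, assuming the package manager records a SHA-256 digest when a version is added and recomputes it at install time. The names `ChecksumMismatch` and `verify_tarball` are hypothetical.

```python
import hashlib


class ChecksumMismatch(Exception):
    """Raised when downloaded contents differ from the recorded digest."""


def verify_tarball(data: bytes, expected_sha256: str) -> bytes:
    """Return the data unchanged if its SHA-256 digest matches, else raise."""
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256:
        # The check detects the change, but it cannot recover the original
        # contents, so the installation can only be aborted at this point.
        raise ChecksumMismatch(f"expected {expected_sha256}, got {actual}")
    return data
```

The check tells the developer that version x.y.z is no longer what it was, but the dependency still cannot be installed, which is why detection alone does not solve the availability problem.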
Thank you for the detailed reply @hdgarrood . I'm mulling it over. |
Following on from purescript/purescript#2526. I am thinking about the architecture of psc-package, and in particular, thinking about how it differs from Stackage in that Stackage is an extra layer in front of Hackage, and Hackage is a centralized package registry which provides:
I also think that having a centralised registry which is separate from curated package sets provides an important option for publishing packages for authors who might struggle to find time to keep their packages up to date; if the only option is submitting to a package set, I think we risk discouraging people from publishing their packages at all.
Another related issue that has just occurred to me: I think it's quite far from ideal that if someone were publishing their packages only through psc-package and also uploading them to Pursuit, the information about dependencies and bounds which would be passed to purs publish via --manifest psc-package.json on Pursuit would essentially be meaningless. Since the package author would not actually be using it in the course of developing their package, I expect in most cases it would quickly go out of date.
It is probably obvious by now that I would quite like to have a centralised package registry of some kind. However, I appreciate that this would amount to quite a lot of work. So I'm really opening this issue to ask: do you agree that it is worth addressing these issues by creating a centralised registry and modifying psc-package to use it, and if not, is that because of how much work it would be or because of something else?