Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Corepack should not append hash to version #316

Open
rotu opened this issue Oct 11, 2023 · 12 comments
Open

Corepack should not append hash to version #316

rotu opened this issue Oct 11, 2023 · 12 comments

Comments

@rotu
Copy link

rotu commented Oct 11, 2023

corepack use and corepack up add build metadata to the version, resulting in a semver string that doesn't match any released version.

e.g. "packageManager": "[email protected]+sha256.c362077587b1e782e5aef3dcf85826399ae552ad66b760e2585c4ac11102243f"

The actual version is simply "10.2.0".

This causes problems for tools that expect to find an existing version: moonrepo/proto#231

Additionally, corepack does not tolerate existing build metadata:

e.g. "packageManager": "[email protected]+some-custom-build"

corepack install
Adding [email protected]+some-custom-build to the cache...
Internal Error: Digest method not supported

Related: npm/cli#1479

@rotu rotu changed the title Corepack creates invalid version Corepack should not append hash to version Oct 11, 2023
@aduh95
Copy link
Contributor

aduh95 commented Oct 11, 2023

Why shouldn't Corepack append hash to the version? I can see you link to the original PR introducing the functionality, you have probably read that it was security concerns that led to that decision, what are your thoughts on that?

Additionally, corepack does not tolerate existing build metadata

If that's a problem for someone, PRs welcome, but I doubt that there's a use case for specifying metadata except a hash.

@rotu
Copy link
Author

rotu commented Oct 12, 2023

Why shouldn't Corepack append hash to the version? I can see you link to the original PR introducing the functionality, you have probably read that it was security concerns that led to that decision, what are your thoughts on that?

There are a couple of reasons:

  1. It makes the packageManager field no longer human-friendly to edit, since it now has a human-readable aspect and a machine-generated aspect.
  2. The version is at the discretion of the author of the package; this hash is an artifact of this tool, not part of the package version.
  3. The hash is not a property of the code; it's a property of the tarball. It depends not just on the version but on the "resolved" url from package-lock.json (which has no analog in corepack and will be a problem for private repositories).

Yes, I think it makes sense to hash the package manager. It doesn't make sense to put it in the version string.

@aduh95
Copy link
Contributor

aduh95 commented Oct 12, 2023

Then don’t push the hash / edit it out, no? If you want human editable, it seems only faire that you have to edit it manually, and it’s completely optional. But I have a hard time seeing why you would not chose to give up the security of the hash. Or do you have a suggestion to store the hash somewhere else?

@rotu
Copy link
Author

rotu commented Oct 12, 2023

The hash could be stored in package.json in a different field. I'm thinking "peerDependenciesMeta" is appropriate. e.g.:

"peerDependenciesMeta": {
  "pnpm": {
    "resolved": "https://registry.npmjs.org/pnpm/-/pnpm-8.9.0.tgz",
    "integrity": "sha512-74hZk44fBTe5/PAwkEQxE5Lzs4s0QXbmzU/e4hsiVSSwrCobCK4q4t3Vs/9LjKSW1neOlQ8+fJ9VW4EyWYJEHA=="
  }
}

An "integrity" field is the approach adopted by npm and by node in analogy with the Subresource Integrity W3C spec.

@arcanis
Copy link
Contributor

arcanis commented Oct 12, 2023

It doesn't really make sense in peerDependencies, as the package manager is not a peer dependency, and even if it was peer dependencies are resolved dynamically based on their ancestors, so it wouldn't make sense for them to have "resolved" or "integrity" fields.

I understand your point that, strictly speaking, a second field could perhaps have been a good idea at the time, but it seems extremely low impact (if it even has any?) to change that now compared to the potential to break tools that already expect Corepack to work this way.

Also worth noting that nothing in the semver spec forbids using the build metadata the way Corepack does, and that it's as far as I know impossible to publish versions on the npm registry w/ build metadata (yarnpkg/berry#5785 (comment)).

@rotu
Copy link
Author

rotu commented Oct 12, 2023

Peer dependencies are packages which a package expects to be present on the host system but necessarily installed by the package manager. That describes build tools and package managers to a tee. It would also be perfectly fine to put this in a new field like “packageManagerLock”.

I’m not sure what you mean by “peer dependencies are resolved dynamically based on their ancestors”.

What other tool expect the packageManager field to include build metadata? I suspect it’s only proto, which merely strips it out and therefore wouldn’t break if it were relocated.

The semver spec is written with the tacit assumption that it is the package author defining the version!

I believe that npm does tolerate build metadata in package.json#version; it just doesn’t use it in resolving which package to download, and therefore won’t tolerate it in the tarball’s filename.

@aduh95
Copy link
Contributor

aduh95 commented Oct 12, 2023

The semver spec is written with the tacit assumption that it is the package author defining the version!

I think it's fair to say that's your interpretation, that's not how I read the spec.

I believe that npm does tolerate build metadata in package.json#version; it just doesn’t use it in resolving which package to download, and therefore won’t tolerate it in the tarball’s filename.

Here's what they say in their docs: The "version" field must be in the form x.x.x and follow the semantic versioning guidelines.
In practice, I'm not sure they enforce it, at least it doesn't complain when running npm pack on a package that has metadata in its "version", and generate a tarball with the metadata in its filename.

Anyway, to come back to the topic on hand, I agree with @arcanis that changing the syntax now is a non-zero cost that seems hard to justify unless the current syntax breaks something. If someone opens a PR to move the hash to somewhere else, we would probably still want to support the current syntax.

@rotu
Copy link
Author

rotu commented Oct 12, 2023

Yes, npm ignores part of the package version. That's probably a mistake in retrospect. I'm glad they support prerelease identifiers even though those linked pages don't mention it!

Using it as "free real estate" in corepack not only relies on npm's decision here, but precludes package managers from using build metadata, since it would break corepack!


Say I'm super-paranoid and download and build the package manager myself.

In my peerDependencies, I'd have:

"npm": "github:npm/cli#57a895781dda201956c645be3e8bb8ac93eade1f"

And the integrity fields would look something like:

"version": "10.2.0",
"resolved": "git+ssh://[email protected]/npm/cli.git#57a895781dda201956c645be3e8bb8ac93eade1f",
"integrity": "sha512-yRjLG88QeGJqtBorxfHq/92pxlOJhPOd2Xm+sHkmygMoKgQbe6rnHotejEtGMvrWMS8YaBE5hNiGS75a2e3/hQ=="

By contrast, corepack's current design of associating a hash directly with the version number precludes this. You can't link the package manager back to source code, because it's tied specifically to a tarball (presumably the one on npm's public repo).


Changing the syntax now is a non-zero cost, but it is going to get more expensive the further down the line we go. Whenever this is done, it should probably also use sha512 and base-64 encode the result so it can be visually compared with the integrity field in the npm Registry.

I'm starting to think that maybe the packageManager field should have just the brand (e.g. npm/yarn/pnpm/bun) and that the versioning constraint belongs in peerDependencies.

@aduh95
Copy link
Contributor

aduh95 commented Oct 13, 2023

Using it as "free real estate" in corepack not only relies on npm's decision here, but precludes package managers from using build metadata, since it would break corepack!

I don’t see how we “rely on npm decision” here, we rely on the semver spec (which arguably is a limitation, but one that makes sense IMO). Quoting the spec: “Build metadata MUST be ignored when determining version precedence. Thus two versions that differ only in the build metadata, have the same precedence”.
I think there’s a misunderstanding on what means “build metadata” in the semver spec, my interpretation of the spec is that their use in Corepack makes perfect sense.

corepack's current design of associating a hash directly with the version number precludes this. You can't link the package manager back to source code, because it's tied specifically to a tarball

Sure you can, you may need to specify the hash of the source code tarball in a different field, but there’s no reason Corepack compat would get in your way.
If you build the package manager in the Corepack’s cache, you could even use Corepack with your custom build.

it should probably also use sha512

FYI Corepack already supports all the hash algorithms supported by Node.js, including SHA-512.


Anyway, I feel it’s not productive to keep discussing, everyone has made their point. If someone opens a PR to propose changes, we can discuss the details.

@rotu
Copy link
Author

rotu commented Oct 13, 2023

I don’t see how we “rely on npm decision” here, you rely on the semver spec (which arguably is a limitation, but one that makes sense IMO). Quoting the spec: “Build metadata MUST be ignored when determining version precedence. Thus two versions that differ only in the build metadata, have the same precedence”.

That's determining version precedence, not subsumption, and I think it's the core of our misunderstanding. If the user specifies 1.0.1, then npm does not give them 1.0.2, even though it has higher precedence.

It's npm's decision to ignore the build metadata entirely, not just ignore it for precedence purposes.


Another confusing consequence of the current setup is that, passing corepack use the value of an existing packageManager field (1) doesn't check the passed hash (2) renders corepack confused:

$ corepack use "[email protected]+sha256.c362077587b1e782e5aef3dcf85826399ae552ad66b760e2585c4ac11102243f"
Installing [email protected]+sha256.c362077587b1e782e5aef3dcf85826399ae552ad66b760e2585c4ac11102243f in the project...
$ corepack up
Usage Error: Invalid package manager specification in package.json; expected a semver version
$ npm pkg get packageManager
"[email protected]+sha256.c362077587b1e782e5aef3dcf85826399ae552ad66b760e2585c4ac11102243f+sha256.c362077587b1e782e5aef3dcf85826399ae552ad66b760e2585c4ac11102243f"

@arcanis
Copy link
Contributor

arcanis commented Oct 13, 2023

That's determining version precedence, not subsumption, and I think it's the core of our misunderstanding

I don't think there's a misunderstanding; we understand you believe your interpretation is the right one, but we believe the spec doesn't make this clear, and that your interpretation is what it is - just one possible interpretation.

And interestingly it doesn't even seem to be supported by the semver package, as running semver.valid on a version with build metadata simply strips the metadata. You perhaps can argue that the semver package isn't the semver spec, but at this point it's really splitting hairs.

@ziyuang
Copy link

ziyuang commented May 28, 2024

By the way is it possible to specify whether to use SHA256 or SHA512? corepack under Ubuntu attaches SHA512 hash while it attaches SHA256 hash under Windows.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants