[QUESTION] semver: Why is "+build" information ignored when "-pre" information is allowed #1479

Closed
tresf opened this issue Jul 1, 2020 · 23 comments

Comments

@tresf

tresf commented Jul 1, 2020

What / Why

NPM ignores +<build> information, which is used to represent detailed build release information. Why?

  • An archived bug report asking a similar question: npm-version disregards build metadata npm#12825

  • A near-identical question is asked on stackoverflow: https://stackoverflow.com/q/29972999/3196753

    I would like to attach a build number to my project in package.json. I'm looking for the best way to do so.

    I've found that node-semver recognizes a string as a build number if preceded by +. For example this would be build 123.

    1.0.0+123
    

    However, the NPM version module will also accept this format, but trims off the build number in package.json. How should I go about representing the build number in package.json?

    The accepted answer states the following:

    ... having different builds of the same version does not make sense from an npm semver perspective ...

    However, if that's the case, why is prerelease allowed? Looking at larger projects, it's common to use +<build> to represent build information.

  • Popular software with +<build> information:

    • OpenJDK: 13.0.2+8
    • AdoptOpenJDK: 11.0.7+10
    • IntelliJ: 11.0.6+8
  • Quoting why minus shouldn't be used (semver.org):

    ... A pre-release version MAY be denoted by appending a hyphen and a series of dot separated identifiers immediately following the patch version. Identifiers MUST comprise only ASCII alphanumerics and hyphens [0-9A-Za-z-]. Identifiers MUST NOT be empty. Numeric identifiers MUST NOT include leading zeroes. Pre-release versions have a lower precedence than the associated normal version. A pre-release version indicates that the version is unstable and might not satisfy the intended compatibility requirements as denoted by its associated normal version. Examples: 1.0.0-alpha, 1.0.0-alpha.1, 1.0.0-0.3.7, 1.0.0-x.7.z.92, 1.0.0-x-y-z.--.

    Oddly, it appears that improper use of this hyphen has become contagious... for example, Visual Studio Code shows the following in the About dialog:

    • V8: 7.8.279.23-electron.0

    ... (is -electron.0 a pre-release version of Electron, or should this be +electron.0)

I believe there's a valid use-case for +<build> information and that ignoring it forces odd workarounds, such as using pre-release prefixes when they're invalid.

Where

  • npm public registry
@tresf tresf changed the title [QUESTION] semver: why os "+build" information ignored when "-pre" information is allowed [QUESTION] semver: Why is "+build" information ignored when "-pre" information is allowed Jul 1, 2020
@shadowspawn
Contributor

shadowspawn commented Jul 2, 2020

(Not sure what you mean by "allowed". Do you mean displayed in the "versions" listed on npmjs.org, or what gets added to package.json, or package-lock.json?)

In Semantic versioning, pre-release versions are included in version precedence and have a lower precedence than the associated normal version. So 1.0.0-0 < 1.0.0 < 1.1.0

In Semantic versioning, the build metadata is ignored when determining version precedence, so the expression >13.0.2+8 is effectively the same as >13.0.2, and there are circumstances where it is clearer to leave out the build metadata.

https://semver.org/#spec-item-10

Build metadata MAY be denoted by appending a plus sign and a series of dot separated identifiers immediately following the patch or pre-release version. Identifiers MUST comprise only ASCII alphanumerics and hyphens [0-9A-Za-z-]. Identifiers MUST NOT be empty. Build metadata MUST be ignored when determining version precedence. Thus two versions that differ only in the build metadata, have the same precedence. Examples: 1.0.0-alpha+001, 1.0.0+20130313144700, 1.0.0-beta+exp.sha.5114f85, 1.0.0+21AF26D3----117B344092BD.
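
To make the two precedence rules above concrete, here is a minimal, self-contained sketch of SemVer 2.0.0 ordering. It is not the node-semver implementation: parseVersion, compareIdentifiers and comparePrecedence are hypothetical names, and the parser is simplified (for instance, it does not reject numeric identifiers with leading zeroes).

```javascript
// Simplified SemVer 2.0.0 precedence, for illustration only.
function parseVersion(input) {
  const match = /^(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?(?:\+([0-9A-Za-z.-]+))?$/.exec(input);
  if (!match) throw new TypeError(`Not a valid version: ${input}`);
  return {
    core: [Number(match[1]), Number(match[2]), Number(match[3])],
    prerelease: match[4] ? match[4].split(".") : [],
    build: match[5] ? match[5].split(".") : [], // parsed, but never compared
  };
}

function compareIdentifiers(a, b) {
  const aNum = /^\d+$/.test(a);
  const bNum = /^\d+$/.test(b);
  if (aNum && bNum) return Math.sign(Number(a) - Number(b));
  if (aNum !== bNum) return aNum ? -1 : 1; // numeric identifiers sort lower
  return a < b ? -1 : a > b ? 1 : 0; // ASCII order for the rest
}

function comparePrecedence(left, right) {
  const a = parseVersion(left);
  const b = parseVersion(right);
  for (let i = 0; i < 3; i++) {
    if (a.core[i] !== b.core[i]) return a.core[i] < b.core[i] ? -1 : 1;
  }
  // A version without a pre-release outranks the same version with one.
  if (a.prerelease.length === 0 || b.prerelease.length === 0) {
    return Math.sign(b.prerelease.length - a.prerelease.length);
  }
  const shared = Math.min(a.prerelease.length, b.prerelease.length);
  for (let i = 0; i < shared; i++) {
    const result = compareIdentifiers(a.prerelease[i], b.prerelease[i]);
    if (result !== 0) return result;
  }
  // All shared identifiers are equal: the longer pre-release list ranks higher.
  return Math.sign(a.prerelease.length - b.prerelease.length);
}

console.log(comparePrecedence("1.0.0-0", "1.0.0"));       // -1
console.log(comparePrecedence("1.0.0", "1.1.0"));         // -1
console.log(comparePrecedence("13.0.2+8", "13.0.2"));     // 0
console.log(comparePrecedence("1.0.0-foo", "1.0.0-bar")); // 1
```

The build field is kept by the parser but never consulted by the comparison, which is why 13.0.2+8 and 13.0.2 compare as equal.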

@isaacs
Contributor

isaacs commented Jul 2, 2020

What @shadowspawn said is 100% accurate.

The reason that npm drops build info from SemVer versions and ranges is that it's not relevant for the purpose of dependency resolution. (Ie, 1.0.0+foo is "the same" as 1.0.0+bar or just 1.0.0, as far as SemVer is concerned.) Unlike build metadata, prerelease version identifiers are relevant. 1.0.0-foo is not the same as 1.0.0-bar, and both are meaningfully different from 1.0.0.

@isaacs isaacs closed this as completed Jul 2, 2020
@tresf
Author

tresf commented Jul 2, 2020

Thanks for the thorough explanation and it makes sense. I guess the issue is not related to NPM but rather related to the use-cases for semver to begin with. As identified, it's actually quite common for large, mass-consumed projects to rely on build information as important information. As a result, projects which -- properly -- ignore this information inadvertently and indirectly encourage abuse of the pre-release area for distinction. That's not an argument against, nor the fault of, nor can be fixed by NPM, but I don't see this problem going away.

Perhaps, as a compromise, projects that knowingly expect non-patch-level changes (e.g. major.minor.patch) could simply never exit pre-release status. This would result in versions such as 1.0.0-0, 1.0.0-1, 1.0.0-2 -- an unfortunate side-effect of honoring the semver spec. Another compromise is to veer away from relying on +<build> entirely, which I believe is the point of .patch to begin with, but projects will have their own philosophy on this. I digress. 🍻

@ljharb
Contributor

ljharb commented Aug 4, 2020

Sequencing in this ecosystem is defined by the semver package's sort function.

@drs9222

drs9222 commented Mar 22, 2021

I tend to think just because it isn't used for determining precedence does not mean it should be removed altogether. It is still valid semver and removes valid information about the package's origins, which can be helpful.

@solarmosaic-kflorence

Yes, I find it extremely frustrating and un-intuitive that npm removes build metadata from versions.

@isaacs
Contributor

isaacs commented Sep 24, 2021

@drs9222 @solarmosaic-kflorence Can you elaborate on specifically what you'd like to use this build metadata for?

I find it extremely frustrating and un-intuitive

What need was frustrated? What intuition was violated? What do you attempt to do, and then find yourself blocked by npm stripping off build metadata? Where did you expect to see build metadata, and find it lacking?

Just to be super clear:

  • Build metadata will always be ignored when resolving version ranges (for reasons stated above).
  • Two versions that differ only in build metadata will not ever be allowed to both be published. (Ie, you can't publish both [email protected]+build.1 and [email protected]+build.2, because that would create ambiguity about what [email protected] should refer to.)

For this reason, the simplest approach is to just strip it from published versions, lockfiles, and the like. We have a name, semver, and sha512 integrity value, which uniquely addresses and identifies any npm artifact.

If there's some utility we're missing in that approach, and we can provide it somehow (in a way that doesn't break older versions of npm, of course), then that's worth exploring. But the important thing is to identify the goal first, and then work towards a feasible implementation.

@jwdonahue

@isaacs

The goal is to permanently attach meta data to the version string for whatever purpose your customers deem necessary. They should not have to justify those reasons to you. The build meta tag is part of the SemVer 2 specification. The fact that it's not needed for sorting is not relevant to the discussion of whether the tag should persist. You don't need to strip it from the version string to ignore it for sorting.

@solarmosaic-kflorence

solarmosaic-kflorence commented Sep 25, 2021

I agree with @jwdonahue -- the point is, it is part of semver for a reason, and people use it for a reason. NPM forcefully modifying my valid semver is frustrating and unintuitive because I would not expect NPM to be modifying a valid semver at all. In fact, in my opinion this means that NPM does not support the entire semver spec, only a subset. I can appreciate that NPM wants to internally ignore the build metadata on the semver for reasons, but honestly I should not have to even know about that.

@ljharb
Contributor

ljharb commented Sep 25, 2021

Doesn't npm follow https://semver.org/spec/v1.0.0.html, since that's all that existed when npm started?

@solarmosaic-kflorence

solarmosaic-kflorence commented Sep 27, 2021

Even if that's the case, 2.0.0 has been out since 2013, which is quite a long time ago. Also, FWIW, https://github.com/npm/node-semver supports v2.0.0.

@solarmosaic-kflorence

See also npm/node-semver#264

@isaacs
Contributor

isaacs commented Sep 28, 2021

Node-semver implements the semver 2.0 specification, and has since semver 2.0's formalization. (In fact, as I was involved with the process of finalizing semver 2.0 and ironing out a lot of the ambiguities in semver 1.0, node-semver was one of the first implementations to do so, shipping support for semver 2.0 slightly before semver 2.0 was formally published.)

The goal is to permanently attach meta data to the version string for whatever purpose your customers deem necessary. They should not have to justify those reasons to you.

it is part of semver for a reason, and people use it for a reason.

Look, I'm not saying it's somehow "bad" or that you have to "answer" for your use of build metadata or whatever. But you're asking for the npm cli and registry to change something that is extremely load bearing, which will be time consuming and risky to implement, and may cause disruption in other parts of the ecosystem, inconveniencing or even harming other members of our community. When asked "what for?", if your best answer is "I have my reasons!", well... sorry, that's a WONTFIX. Working as intended, please find another way to satisfy your secret needs.

In fact, I'm not even sure which example of semver build metadata stripping is actually bothering you. We semver.valid() or semver.parse() every time we touch a version number, which is in a lot of places, in the CLI and on the registry. So not only would we just have to make a lot of changes in a lot of places, we'd also have to make sure that all the assumptions that "same string = same version" hold true, everywhere we make that assumption.

So I know it might seem like I'm being obstinate here, but since (a) the semver specification explicitly states that two versions that differ only in build metadata have the same precedence, and (b) the npm registry guarantees that there can only ever be a single published artifact of a given name/version combination, and (c) the npm cli's primary use of semver strings is comparing them against dependency ranges, the simplest and most effective way to calculate dependency graphs that correctly satisfy the dependency contract is to ignore build metadata entirely. And the best way to ensure that we're always ignoring it is to strip it off any time we canonicalize a version string for comparison, or any time we store it someplace where it will be used for comparison.

Just to grab one arbitrary example, right now, let's say you publish a package with "version": "1.2.3+foo" in the package.json file. If you then published "version": "1.2.3+bar", we would have to not allow it. If we did allow it, it would cause some serious problems.

Currently, we do this by normalizing the version number, and putting the manifest in an object at versions['1.2.3'] (without the build metadata). We also permanently store the publish time as time['1.2.3'], so that any subsequent publish will be blocked if it has the same version.

Because, if you do npm install [email protected], it wouldn't be clear which one of those should be used; they have equal precedence, so according to the spec, they're explicitly identical priority, which means we'd have to be inventing new arbitrary rules for choosing one build over another. These could quite easily end up being undocumented implicit rules, subject to change by accident and without warning, likely to be done differently by other npm registry clients. Just stripping the build metadata ensures that we stay out of this situation entirely, avoiding any such questions. We can see that there's already a 1.2.3, even though this new publish is v1.2.3+foo and the previous one was =1.2.3+bar, and throw it back.
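
The publish guard described above can be sketched with a small in-memory model. This is an assumption-laden illustration, not npm registry code: createPackument, stripBuild and publish are hypothetical names, the error message is made up, and stripBuild only mimics the normalization implied by the v1.2.3+foo / =1.2.3+bar example.

```javascript
// Hypothetical in-memory model of one package document ("packument").
function createPackument(name) {
  return { name, versions: {}, time: {} };
}

// Drop a leading "v" or "=" and everything from "+" onward.
function stripBuild(version) {
  return version.replace(/^[v=]+/, "").split("+")[0];
}

function publish(packument, manifest, now = new Date()) {
  const key = stripBuild(manifest.version);
  // time[key] is stored permanently, so it blocks any later publish
  // whose version normalizes to the same key.
  if (Object.prototype.hasOwnProperty.call(packument.time, key)) {
    throw new Error(`Version ${key} has already been published`);
  }
  packument.versions[key] = { ...manifest, version: key };
  packument.time[key] = now.toISOString();
  return key;
}

const examplePackument = createPackument("example-package");
publish(examplePackument, { name: "example-package", version: "=1.2.3+bar" });

let secondPublishRejected = false;
try {
  publish(examplePackument, { name: "example-package", version: "v1.2.3+foo" });
} catch (err) {
  secondPublishRejected = true;
}

console.log(Object.keys(examplePackument.versions), secondPublishRejected);
```

Because both inputs normalize to the key 1.2.3, the second publish is refused and no rule for choosing between builds is ever needed.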

I'm not playing dumb or being difficult when I say that I literally don't know where you're asking us to "stop stripping the build metadata". We strip it everywhere, and we always assume that it will have been stripped.

So, I'm sorry, yes, you really will have to spell out exactly what you're trying to do, where you expect build metadata to be included that it isn't, and provide some justification for including it, some use case that would warrant doing this work.

If you want to put build metadata in your package.json, that's fine. You can also include it in your git tags, and anywhere else. npm just won't do anything with it, and the easiest way to ensure we don't accidentally start doing something with it is to not include it in npm's data model at all.

Show me something you want to do today but can't because build metadata is stripped somewhere, and we'll do what we can to empower you if it's at all feasible.

@solarmosaic-kflorence

solarmosaic-kflorence commented Sep 29, 2021

I am working on CI/CD pipelines at our company. Our standard practice is to generate pre-release builds off pull request branches that look like: 1.0.0-[datetime]+[commit_sha]. This results in our published artifacts containing the commit SHA in the build metadata, which we use to track which commit the artifact is associated with. That is the primary point of metadata, it provides further context for the thing it is applied to. This is useful when viewing the artifacts or when displaying the pre-release version in other places such as in the web browser or other files related to the build in question. For example, when we do static site deployments, it makes it very easy to ensure the version of the web page you are looking at was deployed from the commit you expected (by referencing the version in the package.json). Otherwise you would have to do something like comparing the [datetime] values, which are not as human friendly.

This pattern worked great everywhere until I tried to apply it to NPM (and, to be fair, later Docker which also does not support + currently). Much to my surprise, using npm version [our_build_version] resulted in 1.0.0-[datetime] instead of 1.0.0-[datetime]+[commit_sha]. This came as a surprise because I did not intuitively expect NPM to do anything with the valid semver version I gave it. So now we have two choices (well, technically three, storing and accessing the metadata somewhere else just for NPM projects). We can make it much harder for our developers and other members of our organization to verify what the source for a build is by only using 1.0.0-[datetime], or we can try to hack it into the version anyways by using something like a - in place of the +, which could potentially cause issues elsewhere due to the aforementioned way NPM resolves versions (and I have a feeling, a common step taken when someone realizes they can't use +). In the case of our static site deployments, it's really only the modification of the package.json we care about as we don't publish these projects into the NPM registry, so we can also get around this issue by modifying the package.json in other ways outside of npm version, but again, frustrating.
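
As a rough sketch of the scheme and the workaround described above: the snippet below assembles a 1.0.0-[datetime]+[commit_sha] string, shows what remains once build metadata is dropped, and sets the version in package.json text directly instead of going through npm version. The timestamp layout (UTC, YYYYMMDDHHmmss), the commit SHA and all function names are assumptions for illustration.

```javascript
// Assumed timestamp layout: 2021-09-29T15:04:05.000Z -> 20210929150405
function formatTimestamp(date) {
  return date.toISOString().replace(/\D/g, "").slice(0, 14);
}

function buildVersion(base, date, commitSha) {
  return `${base}-${formatTimestamp(date)}+${commitSha}`;
}

// What is left after the build metadata has been stripped.
function withoutBuildMetadata(version) {
  const plus = version.indexOf("+");
  return plus === -1 ? version : version.slice(0, plus);
}

// Workaround: edit the package.json text directly, keeping the metadata.
function setVersion(packageJsonText, version) {
  const pkg = JSON.parse(packageJsonText);
  pkg.version = version;
  return JSON.stringify(pkg, null, 2) + "\n";
}

const fullVersion = buildVersion(
  "1.0.0",
  new Date(Date.UTC(2021, 8, 29, 15, 4, 5)), // 29 Sep 2021, 15:04:05 UTC
  "3f2a9c1" // made-up short commit SHA
);

console.log(fullVersion);                       // 1.0.0-20210929150405+3f2a9c1
console.log(withoutBuildMetadata(fullVersion)); // 1.0.0-20210929150405
console.log(setVersion('{"name":"example-site","version":"0.0.0"}', fullVersion));
```

Since the pre-release part already carries the timestamp, versions built this way stay distinct for ordering purposes even when the commit SHA is dropped.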

I get that NPM implemented an internal solution to resolving versions that doesn't take build metadata into account at all and hence it would be a big lift to support it. I'm just telling you as an end user that the current behavior is confusing and frustrating from an end user perspective and I would like to be able to retain the metadata in my build versions. I'm also a little frustrated that the response has been to seemingly shut down the conversation rather than just acknowledging it would be a nice feature to have that may not be prioritized quickly due to the complexity in implementing it.

EDIT: after typing all of this up and re-reading this entire thread, I don't really see how pointing out specifically how we use the metadata information really adds a lot to this conversation, as everyone has pretty much already answered the basic points about why the metadata is useful to exist in the build version in the first place. There are of course workarounds, but we now have to do those just for NPM and not for other parts of our infrastructure.

@jwdonahue

jwdonahue commented Sep 29, 2021

Another use is to embed the build machine name or version of the build environment in the meta tag so that test labs can pull prerelease versions that match specific meta tags for testing. In environments where cloud processes are not allowed to write back to the repo, it provides a way to append information that may be needed for correlation with reported bugs.

I have used them to do A/B testing of products produced by different tool chains or versions thereof.

@isaacs
Contributor

isaacs commented Sep 29, 2021

I don't really see how pointing out specifically how we use the metadata information really adds a lot to this conversation

Actually this helps a lot. Because we strip build metadata from hundreds of different places, but in order to consider retaining it, or even to know what you're actually asking for, I'd have to know which instances of stripping the metadata are getting in your way. In this case, it's the difference between "oh, that's easy" and "that is beyond impossible".

We could pretty easily make npm version not strip metadata when writing to package.json. However, a published package will still have the build metadata stripped in the package manifest on the registry. For example, http://npm.im/@isaacs/testing-semver-build-metadata has a version in the package.json of 1.0.0+build.1, but the version reported on the website and in the registry is just 1.0.0. (In the meantime, you could edit your package.json another way, since everything else will prevent build metadata from being relevant later on down the line.)

Would it satisfy your needs to have npm version 1.0.0+build.1 retain the build metadata in the package.json file, even if it was still stripped from the registry manifest of published packages? Because that's relatively straightforward. If you need it to be retained on the registry and in dependencies and lockfiles, that's where it gets difficult.

@isaacs
Contributor

isaacs commented Sep 29, 2021

If you need it to be retained on the registry and in dependencies and lockfiles, that's where it gets difficult.

Just to clarify: there are layers here. The version number is found in 4 main places in the package document that might interest you:

  1. The key in the versions collection, referencing the published version manifest.
  2. The "version" field in the manifest for each published version.
  3. The key in the time collection, indicating the time published (and preventing any future publishes with that version number).
  4. It is part of the dist.tarball url, which typically has the form {registry}/{package name}/-/{package name}-{version}.tgz

(1) and (3) are high risk to change. I could see us (or another registry implementation) quite easily mixing that up, and ending up with multiple 1.0.0 publishes which differ only in build metadata as a result.

(2) is somewhat flexible, but I would be on the lookout for a client relying implicitly on the fact that versions[x].version === x always, and having bugs creep in there.

(4) is completely arbitrary (and in fact, those dist.tarball urls are not and have never been guaranteed, and some registry implementations serve them from completely different sorts of urls). I'd be somewhat on the lookout for issues stemming from + as an archaic encoding for " ", but that's probably something we could address by just using something other than + to indicate it in the url, perhaps even stuffing the build metadata after the tgz in a hash section, so it's the same actual filename being requested.

There's also the possibility (lowest risk, not sure if it meets your needs, though) of adding that data to a different field in the dist object, so it's there, but just not attached to the version, and we ignore it.
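
The four locations listed above can be pictured with a hypothetical package document. The registry URL and package name below are placeholders, buildPackument and tarballUrl are illustrative helpers, and the tarball URL simply follows the typical form quoted above (which, as noted, is not guaranteed).

```javascript
function tarballUrl(registry, name, version) {
  return `${registry}/${name}/-/${name}-${version}.tgz`;
}

function buildPackument(registry, name, version, publishedAt) {
  return {
    name,
    versions: {
      [version]: {                        // (1) key in the versions collection
        name,
        version,                          // (2) "version" field of the manifest
        dist: {
          tarball: tarballUrl(registry, name, version), // (4) part of the URL
        },
      },
    },
    time: { [version]: publishedAt },     // (3) key in the time collection
  };
}

const samplePackument = buildPackument(
  "https://registry.example.test",
  "example-package",
  "1.0.0",
  "2021-09-29T00:00:00.000Z"
);

// The invariant a client might rely on implicitly: versions[x].version === x
const keysMatchVersions = Object.keys(samplePackument.versions).every(
  (key) => samplePackument.versions[key].version === key
);

// Why "+" needs care in URLs: form-style decoding reads it as a space.
const decodedQueryValue = new URLSearchParams("v=1.0.0+build.1").get("v");
const percentEncoded = encodeURIComponent("1.0.0+build.1");

console.log(samplePackument.versions["1.0.0"].dist.tarball);
console.log(keysMatchVersions);  // true
console.log(decodedQueryValue);  // "1.0.0 build.1"
console.log(percentEncoded);     // "1.0.0%2Bbuild.1"
```

If (2) kept build metadata while (1) did not, keysMatchVersions would become false, which is the kind of implicit assumption flagged above.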

@tresf
Author

tresf commented Sep 29, 2021

When asked "what for?", if your best answer is "I have my reasons!", well... sorry, that's a WONTFIX

As the OP, I can state that my use-case violates the semver specification and falls into WONTFIX. I don't necessarily agree with this behavior based on my workflow, but my workflow would violate the spec and for that reason, I have to live with this, and for that reason, I'm not chiming in on the request to retain this information.

My use-case is pretty simple:

  • JS published multiple places (e.g. v1.2.3)
  • Notice after publishing that a very minor change was needed, want to publish v1.2.3+1
  • NPM rejects v1.2.3+1 because it's effectively v1.2.3 still.

I use other SemVer libraries and they behave the same way as NPM; comparing build information is non-standard and for that reason, ignored by default. In my case, my issue is that I'm not using SemVer properly. I'm using the .3 as a minor version, but this is actually dedicated for a patch version. This means my next version should always bump the minor number (e.g. v1.2.3 -> v1.3.0), and I would increment the zero when a new version arrives. I don't and won't immediately change this (and I'm not the only one using versioning this way, examples provided in the issue description) but NPM is behaving properly and for those reasons, I have to live with the nuances of my own decisions. 🍻

@ljharb
Contributor

ljharb commented Sep 29, 2021

There’s also the “versions” key in the packument? Unless that’s derived from the time key.

@isaacs
Contributor

isaacs commented Sep 29, 2021

@ljharb If you're referring to npm view <pkg> versions returning the list of published versions, that's just Object.keys(packument.versions), for human consumption.

@isaacs
Contributor

isaacs commented Sep 29, 2021

In the case of our static site deployments, it's really only the modification of the package.json we care about as we don't publish these projects into the NPM registry, so we can also get around this issue by modifying the package.json in other ways outside of npm version, but again, frustrating.

Oh, 😂 sorry, I completely missed that you said you don't care about the published impact.

So, yeah, this is a thing that can be done. Maybe my npm registry braindump above will serve future discussions ;)

@solarmosaic-kflorence

@isaacs that is correct, for this particular use-case just preserving the build metadata in the package.json would fix our issues. Since we use a datetime string for the pre-release version, this same pattern would also work for published packages, just with the build metadata stripped, which is still not ideal in my opinion but I understand that is a more complicated problem to solve. At least in those cases we can still refer to the package.json in the published library in order to find the metadata, which is still an improvement.

@isaacs
Contributor

isaacs commented Sep 29, 2021

@solarmosaic-kflorence If the build metadata is just the git commit, you can also refer to the gitHead field in a published manifest. (This isn't currently set for packages that are published from a monorepo workspace folder rather than the root of the project, but that will be addressed soon, and it's there for most packages that are published from the root of their git repos.)
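
A small sketch of that lookup, assuming a package document has already been fetched and parsed. The document below is a hand-written stand-in with a made-up commit hash, and commitFor is a hypothetical helper; as noted above, gitHead may be absent for some packages, so the helper returns null in that case.

```javascript
// Returns the commit recorded for a published version, or null if the
// version is unknown or the manifest has no gitHead field.
function commitFor(packument, version) {
  const manifest = packument.versions[version];
  if (!manifest || typeof manifest.gitHead !== "string") return null;
  return manifest.gitHead;
}

// Hand-written stand-in for a fetched package document.
const fetchedPackument = {
  name: "example-package",
  versions: {
    "1.0.0": {
      name: "example-package",
      version: "1.0.0",
      gitHead: "0123456789abcdef0123456789abcdef01234567", // made-up hash
    },
    "1.0.1": { name: "example-package", version: "1.0.1" }, // no gitHead
  },
};

console.log(commitFor(fetchedPackument, "1.0.0")); // 0123456789abcdef0123456789abcdef01234567
console.log(commitFor(fetchedPackument, "1.0.1")); // null
```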
