Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[bugfix]: removing whitespace from malformed version ranges #37

Merged
merged 11 commits into from
Mar 3, 2023

Conversation

kindoflew
Copy link
Contributor

@kindoflew kindoflew commented Feb 17, 2023

Resolves #32

Wow, this was supposed to be a quick swap out of replaceAll, but I realized that it isn't actually that simple (when is it ever? 😅)

Context

Initially, the range.replaceAll(' ', '') call was there to handle malformed version strings (like if there is a space between >= and the actual number). However, the comments in this issue in compare-versions made me realize that this isn't actually a good idea.

TL;DR -- whitespace actually has semantic meaning in certain version ranges.

Take the following version range, for example:

1.2.7 || >=1.2.9 <2.0.0

The whitespace between >=1.2.9 and <2.0.0 represents a logical AND. if we stripped that out, it'd be a newly malformed version range. so, following @arielj's suggestion, i went with RegEx.

The RegEx removes whitespace in four situations:

  1. Between the characters of the comparator operators (ie, < =)
  2. Between the operators and the version number (ie, ^ 12)
  3. Between the digits and decimals in the version number (ie, 1 2 .0 . 0)
  4. Between two versions in a logical AND range (ie, >=4.2.0 <=8.0.0)

(in the last case, it leaves one space, so each version can be asserted separately)

I believe this is good enough for now (thanks Sandi Metz 😂).

Not to mention, if you read through the above-linked issue in compare-versions, the author is currently working on supporting this out-of-the-box, so we can stop worrying about it when the new major version is released.

My RegEx might probably does suck

I tested it on regex101.com and it works. but i'm sure it could be optimized somehow. it's long and ugly and has lots of lookaheads/lookbehinds and logical ORs.

Also, it doesn't take into account whitespace in the "extended portions" of a version (ie, 12.0.0-rc.1.canary.beta) but that probably shouldn't come up for Node versions (I think?).

Another thought is maybe handling malformed version ranges more gracefully (instead of throwing an error). maybe we can have it output something like:

 {
  compatible: 'malformed',
  range: '1 2 . 0 . 0'
}

I figured that was outside the scope of this PR though. This should fix the issue for now, for the majority of cases (and then it can be compare-versions' problem 😂)

Details

Instead of using if/else logic when there's a || present, we treat every range the same:

  1. split at || (if present)
  2. remove whitespace using that obnoxiously long RegEx
  3. split the remaining range at ' ' (if present)
  4. assert every remaining range with satisfies

QA

To confirm the original bug is gone, just downgrade your Node version to something v14 or below and run dev. It should pass now.

To confirm the RegEx works, I guess you could use an online regex tester for weird edge cases. Or set up a mock project with dependencies that have weird ranges specified.

// ie, '>=1.2.9 <2.0.0', because we want to split those later
function removeWhitespace(range: string) {
const extraSpaceRegEx =
/((?<=(<|>))(\s+)(?=(=)))|(?<=(<|>|=|\^|~))(\s+)(?=\d)|((?<=(\d|\.|\*|x|X))(\s+)(?=(\d|\.|\*|x|X)))|(?<=\d)(\s+)(?=\s<|>)/g;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks scary xD, I'd like to give it a try into changing this to something less scary, can you share some of the malformed strings and the expected output?

it might be a good idea to add tests for this function with input/output examples

cause I can imagine there's some other solution (maybe combining regexp with strip, or smaller regexps to replace different parts one at a time)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi lew :D

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unfortunately, JavaScript doesn't have a native strip for strings. we could break down the RegEx into each conditional one though and then just chain replace calls?

const comparatorWhitespace = /((?<=(<|>))(\s+)(?=(=)))/g;
const comparatorAndVersionWhiteSpace = /(?<=(<|>|=|\^|~))(\s+)(?=\d)/g;
const versionWhitespace = /((?<=(\d|\.|\*|x|X))(\s+)(?=(\d|\.|\*|x|X)))/g;
const betweenWhitespace = /(?<=\d)(\s+)(?=\s<|>)/g;

return range
  .trim()
  .replace(comparatorWhitespace, '')
  .replace(comparatorAndVersionWhiteSpace, '')
  .replace(versionWhitespace, '')
  .replace(betweenWhitespace, '');

also, here are some test cases for things that would cause satisfies to throw:

> =12.0.0
> 12.0.0
1 2. 0 .0
>= 12.0.0
> = 1 2 . 0 . 0
> = 4 . 2 . 0     < 8. 0 .0

Copy link
Contributor Author

@kindoflew kindoflew Feb 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oohhh, i just came up with a waaay shorter RegEx using negative lookahead:

/(\s)(?!(<|>|^|~))/g

That matches any whitespace that is not immediately followed by < | > | ~ | ^

It's not perfect (because I'm not sure if you can do an AND range like >=4.2.0 8.0.0, in which case this would remove the whitespace between them), but it should cover all the cases in the above comment.

Copy link
Contributor

@KostiantynPopovych KostiantynPopovych Feb 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @kindoflew, I think we need to move forward with a fully working solution, as this /(\s)(?!(<|>|^|~))/g is not quite yet, let's consider the original one you suggested, for now. We can improve it in the future, but it will be good to have it merged asap.

Copy link
Contributor Author

@kindoflew kindoflew Mar 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@KostiantynPopovych oh yeah, you're right. i thought i solved something else but i didn't 😅. i have too many versions of this locally.

anyway, the issue isn't with >=4.2.0 8.0.0, it's with >=4.2.0 8.0.0.

but, i think i finally figured it out (i didn't benchmark execution time, but with the optimizations in #39, i don't think it'll matter much. plus i've been looking at how big libraries like axios and semver do regex stuff, and they seem to have multiple replace and match calls all over the place and do just fine).

anyways, this (hopefully) is the final version i came up with:

function removeWhitespace(range: string) {
  const comparatorWhitespace = /((?<=(<|>))(\s+)(?=(=)))/g;
  const comparatorAndVersionWhiteSpace = /(?<=(<|>|=|\^|~))(\s+)(?=\d)/g;
  const remainingRange =
   /(((\d\s*){1,3})\.\s*(\d\s*){1,3}\.\s*((0\s*)|(\d\s*){1,2})(?!(((<|>)=?)|~|^)?(((\d\s*){1,3})\.)|(\d)))/g;
  return range
    .trim()
    .replace(comparatorWhitespace, '')
    .replace(comparatorAndVersionWhiteSpace, '')
    .replace(remainingWhitespace, (match: string) => match.replace(/\s+/g, ''));
}

The first two regexes (comparatorWhitespace and comparatorAndVersionWhitespace) strip out the whitespace between what the names say. this would leave us with things like:

>=4.2.0      >=18.0.0
>=4.2. 0     1 8 .0.0   
>=1 2.  0.0
>1 2.0.1 0    1
1 2 . 1 0 . 0 
>=1 2.0.1 0
>=1   2.0.0
>=4.2.0      8.0
>=4.2.1  0      8

The last regex (remainingRange) matches a single SemVer version that may or may not have whitespace between digits or digits and decimals. and then we replace it with the same string with whitespace removed. this leaves us with either a whitespace-less single range or an AND range with a single space in between. we still need the last filter call because the last regex will not match on something like >1 <3, which is a valid range (minus the extra whitespace).

Copy link
Contributor Author

@kindoflew kindoflew Mar 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@KostiantynPopovych however, looking at what node-semver does, they assume the only additional spaces will be between comparator operator characters or between comparator operators and the version number -- they consider additional space between digits/decimals to be malformed and just throw an error.

we could do that (but instead of throwing, maybe we just return { compatible: undefined, range: {malformedRange} } or maybe even { compatible: 'invalid', range: {malformedRange} }. what do you think of that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kindoflew this { compatible: 'invalid', range: {malformedRange} } looks great to me! I think it will be good to ask why node-semver decide to mark it as malformed 🤔

Copy link
Contributor Author

@kindoflew kindoflew Mar 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think because the RegEx to parse it is impossible? or at least computationally expensive.

consider the case where you have the following, malformed range:

<1   2

is that supposed to be "less than 12" or "less than 1 AND exactly 2"? i think because there are so many different variations for SemVer strings, you kind of have to follow the happy path for the most part and just throw when you can't be sure. at least that's my assumption from researching this stuff the past few weeks ¯\_(ツ)_/¯

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds reasonable) So, I believe we can go with this approach.

@KostiantynPopovych
Copy link
Contributor

Great job, @kindoflew, and thank you!

@KostiantynPopovych KostiantynPopovych merged commit 5961124 into upgradejs:main Mar 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug]: depRange.replaceAll is not a function
3 participants