-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[bugfix]: removing whitespace from malformed version ranges #37
[bugfix]: removing whitespace from malformed version ranges #37
Conversation
src/queries/getPackageData.ts
Outdated
// ie, '>=1.2.9 <2.0.0', because we want to split those later | ||
function removeWhitespace(range: string) { | ||
const extraSpaceRegEx = | ||
/((?<=(<|>))(\s+)(?=(=)))|(?<=(<|>|=|\^|~))(\s+)(?=\d)|((?<=(\d|\.|\*|x|X))(\s+)(?=(\d|\.|\*|x|X)))|(?<=\d)(\s+)(?=\s<|>)/g; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks scary xD, I'd like to give it a try into changing this to something less scary, can you share some of the malformed strings and the expected output?
it might be a good idea to add tests for this function with input/output examples
cause I can imagine there's some other solution (maybe combining regexp with strip, or smaller regexps to replace different parts one at a time)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hi lew :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unfortunately, JavaScript doesn't have a native strip
for strings. we could break down the RegEx into each conditional one though and then just chain replace
calls?
const comparatorWhitespace = /((?<=(<|>))(\s+)(?=(=)))/g;
const comparatorAndVersionWhiteSpace = /(?<=(<|>|=|\^|~))(\s+)(?=\d)/g;
const versionWhitespace = /((?<=(\d|\.|\*|x|X))(\s+)(?=(\d|\.|\*|x|X)))/g;
const betweenWhitespace = /(?<=\d)(\s+)(?=\s<|>)/g;
return range
.trim()
.replace(comparatorWhitespace, '')
.replace(comparatorAndVersionWhiteSpace, '')
.replace(versionWhitespace, '')
.replace(betweenWhitespace, '');
also, here are some test cases for things that would cause satisfies
to throw:
> =12.0.0
> 12.0.0
1 2. 0 .0
>= 12.0.0
> = 1 2 . 0 . 0
> = 4 . 2 . 0 < 8. 0 .0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oohhh, i just came up with a waaay shorter RegEx using negative lookahead:
/(\s)(?!(<|>|^|~))/g
That matches any whitespace that is not immediately followed by < | > | ~ | ^
It's not perfect (because I'm not sure if you can do an AND range like >=4.2.0 8.0.0
, in which case this would remove the whitespace between them), but it should cover all the cases in the above comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @kindoflew, I think we need to move forward with a fully working solution, as this /(\s)(?!(<|>|^|~))/g
is not quite yet, let's consider the original one you suggested, for now. We can improve it in the future, but it will be good to have it merged asap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@KostiantynPopovych oh yeah, you're right. i thought i solved something else but i didn't 😅. i have too many versions of this locally.
anyway, the issue isn't with >=4.2.0 8.0.0
, it's with >=4.2.0 8.0.0
.
but, i think i finally figured it out (i didn't benchmark execution time, but with the optimizations in #39, i don't think it'll matter much. plus i've been looking at how big libraries like axios
and semver
do regex stuff, and they seem to have multiple replace
and match
calls all over the place and do just fine).
anyways, this (hopefully) is the final version i came up with:
function removeWhitespace(range: string) {
const comparatorWhitespace = /((?<=(<|>))(\s+)(?=(=)))/g;
const comparatorAndVersionWhiteSpace = /(?<=(<|>|=|\^|~))(\s+)(?=\d)/g;
const remainingRange =
/(((\d\s*){1,3})\.\s*(\d\s*){1,3}\.\s*((0\s*)|(\d\s*){1,2})(?!(((<|>)=?)|~|^)?(((\d\s*){1,3})\.)|(\d)))/g;
return range
.trim()
.replace(comparatorWhitespace, '')
.replace(comparatorAndVersionWhiteSpace, '')
.replace(remainingWhitespace, (match: string) => match.replace(/\s+/g, ''));
}
The first two regexes (comparatorWhitespace
and comparatorAndVersionWhitespace
) strip out the whitespace between what the names say. this would leave us with things like:
>=4.2.0 >=18.0.0
>=4.2. 0 1 8 .0.0
>=1 2. 0.0
>1 2.0.1 0 1
1 2 . 1 0 . 0
>=1 2.0.1 0
>=1 2.0.0
>=4.2.0 8.0
>=4.2.1 0 8
The last regex (remainingRange
) matches a single SemVer version that may or may not have whitespace between digits or digits and decimals. and then we replace it with the same string with whitespace removed. this leaves us with either a whitespace-less single range or an AND range with a single space in between. we still need the last filter
call because the last regex will not match on something like >1 <3
, which is a valid range (minus the extra whitespace).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@KostiantynPopovych however, looking at what node-semver
does, they assume the only additional spaces will be between comparator operator characters or between comparator operators and the version number -- they consider additional space between digits/decimals to be malformed and just throw an error.
we could do that (but instead of throwing, maybe we just return { compatible: undefined, range: {malformedRange} }
or maybe even { compatible: 'invalid', range: {malformedRange} }
. what do you think of that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kindoflew this { compatible: 'invalid', range: {malformedRange} }
looks great to me! I think it will be good to ask why node-semver
decide to mark it as malformed 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think because the RegEx to parse it is impossible? or at least computationally expensive.
consider the case where you have the following, malformed range:
<1 2
is that supposed to be "less than 12" or "less than 1 AND exactly 2"? i think because there are so many different variations for SemVer strings, you kind of have to follow the happy path for the most part and just throw when you can't be sure. at least that's my assumption from researching this stuff the past few weeks ¯\_(ツ)_/¯
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds reasonable) So, I believe we can go with this approach.
Great job, @kindoflew, and thank you! |
Resolves #32
Wow, this was supposed to be a quick swap out of
replaceAll
, but I realized that it isn't actually that simple (when is it ever? 😅)Context
Initially, the
range.replaceAll(' ', '')
call was there to handle malformed version strings (like if there is a space between>=
and the actual number). However, the comments in this issue incompare-versions
made me realize that this isn't actually a good idea.TL;DR -- whitespace actually has semantic meaning in certain version ranges.
Take the following version range, for example:
The whitespace between
>=1.2.9
and<2.0.0
represents a logical AND. if we stripped that out, it'd be a newly malformed version range. so, following @arielj's suggestion, i went with RegEx.The RegEx removes whitespace in four situations:
< =
)^ 12
)1 2 .0 . 0
)>=4.2.0 <=8.0.0
)(in the last case, it leaves one space, so each version can be asserted separately)
I believe this is good enough for now (thanks Sandi Metz 😂).
Not to mention, if you read through the above-linked issue in
compare-versions
, the author is currently working on supporting this out-of-the-box, so we can stop worrying about it when the new major version is released.My RegEx
mightprobably does suckI tested it on regex101.com and it works. but i'm sure it could be optimized somehow. it's long and ugly and has lots of lookaheads/lookbehinds and logical ORs.
Also, it doesn't take into account whitespace in the "extended portions" of a version (ie,
12.0.0-rc.1.canary.beta
) but that probably shouldn't come up for Node versions (I think?).Another thought is maybe handling malformed version ranges more gracefully (instead of throwing an error). maybe we can have it output something like:
I figured that was outside the scope of this PR though. This should fix the issue for now, for the majority of cases (and then it can be
compare-versions
' problem 😂)Details
Instead of using if/else logic when there's a
||
present, we treat every range the same:||
(if present)' '
(if present)satisfies
QA
To confirm the original bug is gone, just downgrade your Node version to something
v14
or below and rundev
. It should pass now.To confirm the RegEx works, I guess you could use an online regex tester for weird edge cases. Or set up a mock project with dependencies that have weird ranges specified.