Fix a out of bound bug in parse_url query #1746
Signed-off-by: Haoyang Li <[email protected]>
build
Tested again with the current code from the plugin; passed.
Co-authored-by: Chong Gao <[email protected]>
src/main/cpp/src/parse_uri.cu (outdated)
// stop matching early after it can no longer contain the string we are searching for
while (h + n_bytes < h_end) {
  bool match_needle = false;  // initialize to false to prevent empty query key
This does not seem to handle the corner case where needle = '' (the empty string).
Should match_needle be set to true?
Is this a valid case: find_query_part('=value', '')?
Very nice catch, thanks.
The test code that generates the expected results in the Java test has a bug: it returns null for the case '=value', '', while Spark returns value. So you are right, it should be true here.
Can the current code handle this case?
Only the query part is passed to this function, so I think we are fine for this case.
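The empty-key case discussed in this thread can be sketched on the host side as follows. This is a minimal illustration, not the kernel code from parse_uri.cu: the function name find_query_value, its signature, and the parameter-skipping logic are assumptions made for the example. It shows how the loop condition from the reviewed snippet (start offset plus key length strictly below the end) keeps every read in bounds while still letting an empty key match a parameter such as "=value".

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical host-side sketch; the real kernel in parse_uri.cu may differ.
// Returns the value of the first "key=value" parameter in `query`, if any.
std::optional<std::string_view> find_query_value(std::string_view query, std::string_view key)
{
  std::size_t pos = 0;
  // Same shape as the reviewed condition `h + n_bytes < h_end`:
  // the byte right after the key must still be inside the query.
  while (pos + key.size() < query.size()) {
    // An empty key compares equal trivially, so "=value" matches key "".
    bool const match_key = query.compare(pos, key.size(), key) == 0;
    if (match_key && query[pos + key.size()] == '=') {
      std::size_t const value_start = pos + key.size() + 1;
      assert(value_start <= query.size());  // guaranteed by the loop condition
      std::size_t value_end = query.find('&', value_start);
      if (value_end == std::string_view::npos) { value_end = query.size(); }
      return query.substr(value_start, value_end - value_start);
    }
    // No match here: jump to the start of the next '&'-separated parameter.
    std::size_t const amp = query.find('&', pos);
    if (amp == std::string_view::npos) { break; }
    pos = amp + 1;
  }
  return std::nullopt;
}
```

Under these assumptions, find_query_value("=value", "") yields "value", which matches the Spark behavior described above, and a key that is absent yields no result.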
build
Fixes #1745
#1740 adds new test cases that use "https://www.nvidia.com/vote.php?=50" as the URL and "" (empty string) as the key in a parse_url query-with-key kernel, which leads to an out-of-bounds access detected by compute sanitizer.
In the following code:
when haystack = "=50" and needle = "" are passed in, find_length is 3 - 0 + 1 = 4, so in line:
h will be out of bounds.
We use find_length to stop matching early once the remaining haystack can no longer contain the string we are searching for, and find_length = haystack.size_bytes() - n_bytes is a tighter bound for it.
This PR will unblock CI and submodule-sync. Integration tests and the big dataset test passed on the plugin side.
(I have decided to keep -DUSE_SANITIZER=ON in my build command forever...)
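The bound arithmetic in the description can be checked with a small host-side sketch. The helper names below are hypothetical and are not taken from parse_uri.cu; the sketch only assumes that the search loop tries start offsets 0 through find_length - 1, which is how the description uses find_length.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helpers for illustration; they are not taken from parse_uri.cu.

// Bound before the fix, as described above: size - n_bytes + 1.
constexpr std::size_t loose_find_length(std::size_t haystack_size, std::size_t n_bytes)
{
  return haystack_size - n_bytes + 1;
}

// Tighter bound from this PR: size - n_bytes.
constexpr std::size_t tight_find_length(std::size_t haystack_size, std::size_t n_bytes)
{
  return haystack_size - n_bytes;
}

// Largest start offset visited when the loop tries offsets 0 .. find_length - 1.
inline std::size_t last_start_offset(std::size_t find_length)
{
  assert(find_length > 0);  // an empty range has no last offset
  return find_length - 1;
}

// haystack "=50" has 3 bytes (valid offsets 0..2); needle "" has 0 bytes.
static_assert(loose_find_length(3, 0) == 4, "old bound allows 4 start offsets");
static_assert(tight_find_length(3, 0) == 3, "new bound allows 3 start offsets");
```

With the old bound the last start offset is 3, one past the final valid byte of "=50"; with the tighter bound it is 2, which stays inside the haystack.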