Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[querystring] avoid Object.prototype conflicts including __proto__ #5650

Closed
wants to merge 5 commits into from
Closed

[querystring] avoid Object.prototype conflicts including __proto__ #5650

wants to merge 5 commits into from

Conversation

WebReflection
Copy link
Contributor

In order to solve #5642 , keeping an eye and actually most likely improving performance for #728 , this PR is meant to show a better way to deal with inheritance.

Mostly all problems would be solved using const obj = Object.create(null); instead of const obj = {}; on top, but we all know that's going to break around more code than it will solve.

This PR is one option, the alternative is to grab, each time, at runtime, Object.getOwnPropertyNames(Object.prototype) and use double indexOf per each key, one for the known keys in object, one for the known keys in its prototype in case keys in object aren't known.

As most common CS related problem say, "hashes" are faster than O(N*?) searches in an array and this is the reason I've proposed a solution based on in operator first, but we know JS hases aren't even close to be comparable to structs or heap operations.

@mscdex mscdex added the querystring Issues and PRs related to the built-in querystring module. label Mar 10, 2016
@MylesBorins
Copy link
Contributor

ci: https://ci.nodejs.org/job/node-test-pull-request/1896/

a few nits.

  • It would be nice to see this come in with a new test to verify that this indeed fixes the edge case that originally spawned this change (proto)
  • please update the commit message as per our contributing guide lines
    1. The first line should be 50 characters or less and contain a short description of the change prefixed with the name of the changed subsystem (e.g. "net: add localAddress and localPort to Socket").
    2. Keep the second line blank.
    3. Wrap all other lines at 72 columns.

@WebReflection
Copy link
Contributor Author

Yeah ... you know ... I'd love to provide a test like

var qsTestCases = [
  ['__proto__=1',
   '__proto__=1',
   {'__proto__': '1'}],

... etc etc ...

The problem is that your assert.deepEqual swallows __proto__ too so this fails:

it doesn't, I was writing {"__proto__":1} as if it was parsed by JSON.parse, solved in the next commit.

$ make -j8 test
make -C out BUILDTYPE=Release V=1
make[1]: Entering directory '/home/webreflection/code/node/out'
make[1]: Nothing to be done for 'all'.
make[1]: Leaving directory '/home/webreflection/code/node/out'
ln -fs out/Release/node node
Running main() from gtest_main.cc
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from UtilTest
[ RUN      ] UtilTest.ListHead
[       OK ] UtilTest.ListHead (0 ms)
[----------] 1 test from UtilTest (0 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (0 ms total)
[  PASSED  ] 1 test.
/usr/bin/python tools/test.py --mode=release message parallel sequential -J
=== release test-querystring ===                                               
Path: parallel/test-querystring
assert.js:89
  throw new assert.AssertionError({
  ^
AssertionError: {} deepEqual { __proto__: '1' }
    at /home/webreflection/code/node/test/parallel/test-querystring.js:99:10
    at Array.forEach (native)
    at Object.<anonymous> (/home/webreflection/code/node/test/parallel/test-querystring.js:98:13)
    at Module._compile (module.js:417:34)
    at Object.Module._extensions..js (module.js:426:10)
    at Module.load (module.js:357:32)
    at Function.Module._load (module.js:314:12)
    at Function.Module.runMain (module.js:451:10)
    at startup (node.js:151:18)
    at node.js:965:3
Command: out/Release/node /home/webreflection/code/node/test/parallel/test-querystring.js
[01:06|% 100|+ 1046|-   1]: Done                                               
Makefile:115: recipe for target 'test' failed
make: *** [test] Error 1

However, with my changes in, everything is just fine as you can see in my console after building with the PR you've nit-picked ;-)

$ ./out/Release/node 
> require('querystring')
{ unescapeBuffer: [Function],
  unescape: [Function: qsUnescape],
  escape: [Function],
  encode: [Function],
  stringify: [Function],
  decode: [Function],
  parse: [Function] }
> require('querystring').parse('a=1')
{ a: '1' }
> require('querystring').parse('__proto__=1')
{ __proto__: '1' }
> Object.getOwnPropertyNames(require('querystring').parse('__proto__=1'))
[ '__proto__' ]
> 

Now try to execute any of those commands in current stable node or latest from master and see different results.

As Summary: what should I do about tests? I kinda feel stuck in a loop I should PR fixes for assert.deepEqual so it won't suffer __proto__ but then I won't have tests to demonstrate that because your testing framework is broken ^_^;;

Please let me know how to proceed.

@mscdex
Copy link
Contributor

mscdex commented Mar 11, 2016

I would like to see benchmark results before and after this change.

@mscdex
Copy link
Contributor

mscdex commented Mar 11, 2016

Also, I wouldn't say assert is broken. If you need to test a very special case (like __proto__) then do so manually. You don't have to use assert, as long as you're throwing an error when the new test fails.

@WebReflection
Copy link
Contributor Author

I would like to see benchmark results before and after this change.

Me too but here's the catch: querystring is broken, we need to fix it. It's not an option to not fix it because maybe it got slightly slower.

I hope you agree, otherwise there won't ever be interoperability with query strings all over the web, where standards didn't accept edge cases like __proto__ or any method of the Object.prototype.

@mscdex
Copy link
Contributor

mscdex commented Mar 11, 2016

@WebReflection I'm not arguing about the need for a fix. I'm more interested in finding the best performing solution to the problem.

@WebReflection
Copy link
Contributor Author

Also, I wouldn't say assert is broken. If you need to test a very special case (like proto) then do so manually. You don't have to use assert, as long as you're throwing an error when the new test fails.

I disagree. assert.deepEqual should deeply check equality, not create objects that didn't exist and were not part of the test. it does, I was testing the wrong object.

Despite what I think, what should be the way to test that in your opinion ?

I can use Object.keys(querystring.parse('__proto__=1'))[0] === '__proto__' ... would that be enough?

@WebReflection
Copy link
Contributor Author

My apoligies, I've realized you're right, assert.deepEqueal is not responsible, my test is. Will push a better one right now and see if my assumption about swallowing objects was right.

@mscdex
Copy link
Contributor

mscdex commented Mar 11, 2016

Doing something like this should be fine if nothing else:

const result = querystring.parse('__proto__=1');
const keys = Object.keys(result);
assert.deepEqual(keys, ['__proto__']);
assert.strictEqual(result[keys[0]], '1');

@WebReflection
Copy link
Contributor Author

I've done hopefully a better test, I've used JSON.parse('{"__proto__":"1"}') which will create an own __proto__ property and indeed assert.deepEqual works fine.

Like I've said, it's half past midnight here so I'm probably better off coding now.

I'll try to quickly change the commit message if I manage, I don't know how to bench best, common, and worst cases against previous version though, any hint would be appreciated.

Best Regards

@mscdex
Copy link
Contributor

mscdex commented Mar 11, 2016

Running the benchmark/querystring/querystring-parse.js benchmark should be fine. Just have before and after binaries and then do this (from the project root):

$ node benchmark/compare.js -r -g ./node-binary-after-changes ./node-binary-before-changes -- querystring querystring-parse

Also make sure that you copy the binary from out/Release/node for at least one of the cases and not just copying ./node since the latter is just a symlink and using two symlinks that point to the same binary won't help much :-)

You may need to do that a few times to get a decent feel for how the changes affected performance as the results can vary between runs depending on the overall system load and other factors.

@WebReflection
Copy link
Contributor Author

These are two runs of the bench.

The version in ~/tmp/node/Release/node is the version built with this patch, the version in out/Release/node is the current file in master without the patch.

To be extra clear, the version in ~/tmp/Release/node doesn't fail:

$ ~/tmp/node/Release/node 
> require('querystring').parse('__proto__')
{ __proto__: '' }
> 

while the one in out/Release/node does:

$ out/Release/node 
> require('querystring').parse('__proto__')
{}
> 

Following the result of the bench, it's kinda expected, like I've commented in the code.
The worst case scenario is where in needs to be followed by indexOf checks because there are many pairs. Although everything else goes faster (plus it's safer and it solves all Object.prototype related shenanigans)

$ node benchmark/compare.js -r -g ~/tmp/node/Release/node out/Release/node -- querystring querystring-parse
running /home/webreflection/tmp/node/Release/node
querystring/querystring-parse.js
running out/Release/node
querystring/querystring-parse.js

querystring/querystring-parse.js type=noencode n=1000000: /home/webreflection/tmp/node/Release/node: 583690 out/Release/node: 610220 ........ -4.35%
querystring/querystring-parse.js type=multicharsep n=1000000: /home/webreflection/tmp/node/Release/node: 471900 out/Release/node: 544890 ... -13.40%
querystring/querystring-parse.js type=encodemany n=1000000: /home/webreflection/tmp/node/Release/node: 350630 out/Release/node: 356800 ...... -1.73%
querystring/querystring-parse.js type=encodelast n=1000000: /home/webreflection/tmp/node/Release/node: 515560 out/Release/node: 554390 ...... -7.00%
querystring/querystring-parse.js type=multivalue n=1000000: /home/webreflection/tmp/node/Release/node: 462950 out/Release/node: 525150 ..... -11.84%
querystring/querystring-parse.js type=multivaluemany n=1000000: /home/webreflection/tmp/node/Release/node: 197960 out/Release/node: 239460 . -17.33%
querystring/querystring-parse.js type=manypairs n=1000000: /home/webreflection/tmp/node/Release/node: 144000 out/Release/node: 126990 ....... 13.39%

[webreflection@archibold node]$ node benchmark/compare.js -r -g ~/tmp/node/Release/node out/Release/node -- querystring querystring-parse
running /home/webreflection/tmp/node/Release/node
querystring/querystring-parse.js
running out/Release/node
querystring/querystring-parse.js

querystring/querystring-parse.js type=noencode n=1000000: /home/webreflection/tmp/node/Release/node: 479990 out/Release/node: 620210 ....... -22.61%
querystring/querystring-parse.js type=multicharsep n=1000000: /home/webreflection/tmp/node/Release/node: 533420 out/Release/node: 568270 .... -6.13%
querystring/querystring-parse.js type=encodemany n=1000000: /home/webreflection/tmp/node/Release/node: 350320 out/Release/node: 363190 ...... -3.54%
querystring/querystring-parse.js type=encodelast n=1000000: /home/webreflection/tmp/node/Release/node: 504480 out/Release/node: 556740 ...... -9.39%
querystring/querystring-parse.js type=multivalue n=1000000: /home/webreflection/tmp/node/Release/node: 475120 out/Release/node: 531250 ..... -10.57%
querystring/querystring-parse.js type=multivaluemany n=1000000: /home/webreflection/tmp/node/Release/node: 193320 out/Release/node: 239250 . -19.20%
querystring/querystring-parse.js type=manypairs n=1000000: /home/webreflection/tmp/node/Release/node: 145510 out/Release/node: 140270 ........ 3.73%

Is there anything else I should do?

@evanlucas
Copy link
Contributor

Weird...I got this benchmark on one out of 5 runs:

running ./node
querystring/querystring-parse.js
running ./node_before
querystring/querystring-parse.js

querystring/querystring-parse.js type=noencode n=1000000: ./node: 688640 ./node_before: 624930 ........... 10.20%
querystring/querystring-parse.js type=multicharsep n=1000000: ./node: 642810 ./node_before: 598410 ........ 7.42%
querystring/querystring-parse.js type=encodemany n=1000000: ./node: 423780 ./node_before: 343480 ......... 23.38%
querystring/querystring-parse.js type=encodelast n=1000000: ./node: 603400 ./node_before: 408220 ......... 47.81%
querystring/querystring-parse.js type=multivalue n=1000000: ./node: 543010 ./node_before: 292040 ......... 85.93%
querystring/querystring-parse.js type=multivaluemany n=1000000: ./node: 246270 ./node_before: 229860 ...... 7.14%
querystring/querystring-parse.js type=manypairs n=1000000: ./node: 177630 ./node_before: 134940 .......... 31.64%

The others showed similar results to @WebReflection's

@WebReflection
Copy link
Contributor Author

Maybe something else was happening during the test and it got slower ...cause it makes no much sense otherwise, the in operator on a pain object is at least as fast as, but most likely faster than, an indexOf operation. If you find a way to reproduce please let me know cause in my Linux it's always more or less the same. Thanks for checking

In order to solve #5642 , keeping an eye and actually most likely
improving performance for #728, this PR is meant to show
a better way to deal with inheritance.

As most common CS related problem say, "hashes" are faster than O(N*?)
searches in an array and this is the reason I've proposed a solution
based on `in` operator first.
@WebReflection
Copy link
Contributor Author

not sure I've messed up with the flow here, I've tried to clean up the commit WebReflection@b121816

If you want a fresh new PR let me know so I can reset this one and make a better (single) one.

@jasnell
Copy link
Member

jasnell commented Mar 15, 2016

This PR is fine, the commits can be squashed and cleaned up on landing if necessary.

@WebReflection
Copy link
Contributor Author

damn it, bad timing. I've pushed a freesh new clean PR #5722 and dropped the previous fork for this one.

@jasnell
Copy link
Member

jasnell commented Mar 15, 2016

;-) no worries

@jasnell jasnell closed this Mar 15, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
querystring Issues and PRs related to the built-in querystring module.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants