Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

3x faster setImmediate #6436

Closed
wants to merge 12 commits into from
Closed

Conversation

andrasq
Copy link

@andrasq andrasq commented Apr 28, 2016

Checklist
  • tests and code linting passes
  • a test and/or benchmark is included
  • the commit message follows commit guidelines
Affected core subsystem(s)
Description of change

Sped up setImmediate processing by 60% to over 400% through faster linkedlist creation,
faster immediate object creation, not wrapping closures around callbacks, and invoking
the callbacks faster from an optimizable function without using .call

Sped up clearImmediate by delaying the linked list update until processImmediate pulls the
immediate off the queue, where it's done more efficiently. This speeds up clearImmediate 3x,
with the benefit stacking on top of the setImmediate speedup.

Added benchmarks for setImmediate and clearImmediate.

Andras added 4 commits April 27, 2016 20:05
use L.create() factory to create access-optimized linkedlist objects
Save the setImmediate callback arguments into an array instead of a
closure, and invoke the callback on the arguments from an optimizable
function.

  60% faster setImmediate with 0 args (15% if self-recursive)
  4x faster setImmediate with 1-3 args, 2x with > 3
  seems to be faster with less memory pressure when memory is tight

Changes:
- use L.create() to build faster lists
- use runCallback() from within tryOnImmediate
- create immediate timers with a function instead of new
- just save the arguments and not build closures for the callbacks
Instead of unlinking from the immediate queue immediate on clear,
put off the unlink until processImmediate where it's more efficient.

  3x faster clearImmediate processing

  The the benefits stack with the setImmediate speedups, ie total gain
  of 3.5x with 0 arguments and 4-5x with 1-3 args.

Changed the code to defer unlinking from the immediate queue until
processImmediate consumes the queue anyway.
Timings for sequential and concurren setImmediate with and without
arguments, and set + clearImmediate.
@addaleax addaleax added timers Issues and PRs related to the timers subsystem / setImmediate, setInterval, setTimeout. performance Issues and PRs related to the performance of Node.js. labels Apr 28, 2016
@@ -6,6 +6,13 @@ function init(list) {
}
exports.init = init;

// create a new linked list
function create() {
var list = { _idleNext: null, _idlePrev: null };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be const.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also wonder if there would be any performance benefit to instead creating a new instance of an object that doesn't inherit from Object.prototype? For example:

function LinkedListNode() {
  this._idleNext = null;
  this._idlePrev = null;
}
LinkedListNode.prototype = Object.create(null);

function create() {
  const list = new LinkedListNode();
  init(list);
  return list;
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed to const.

I tried the Object.create version above, and it's inclusive. Some test runs (10-20 sec runs)
show a 2% advantage one way, then changing the loop/repeat count (where loops*repeats
is the number of objects created) flips the advantage the other way.

@mscdex
Copy link
Contributor

mscdex commented Apr 28, 2016

Perhaps the benchmarks could more closely resemble the existing setImmediate() benchmarks in benchmark/misc (which probably need to be moved to benchmark/timers now, but that is a separate PR)? Specifically when I landed setImmediate() performance improvements awhile back, @bnoordhuis had commented that the callbacks should not reference variables in parent scopes if at all possible.


function Immediate() { }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a breaking change

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, nvm, #6206 has not landed yet

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this would be breaking if #6206 had landed, as it doesn't get rid of the Immediate class, and instead just moves the property assignments into the constructor.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That code changed since the comment was left here.

@andrasq
Copy link
Author

andrasq commented Apr 28, 2016

@mscdex re the benchmarks, oops, I didn't realize misc already had immediate tests.
I patterned the ones I wrote on the timers/timeout.js benchmarks ("patterned" as in cut-and-paste
reused them, pretty much).

Each test function is called once, the closure should be built once and reused across all callbacks.
The tests are strctured as eg

function test() {
  function cb() { if ... bench.end(K) }
  for (...) { setImmediate(cb) }
}

Changing it to

  for (...) { setImmediate(function cb() { if ... bench.end(K) }) }

halves the throughput, suggesting that the closure is reused in the above case and
not reused below. (The tests also measure the cost of passing arguments to the callback,
and the 0-argument depth test needs a closure to know when to stop.)

A different question is whether my benchmarks add enough value to keep, or if I should just ditch them
in favor of the ones that already exist.

@@ -502,24 +502,26 @@ Timeout.prototype.close = function() {
};


var immediateQueue = {};
L.init(immediateQueue);
var immediateQueue = L.create();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really understand why this change is needed or is helpful but I guess it does clean things up a little.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prevents a hidden class change

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at top-level it was changed to match processImmediate (and it cleans things up a bit).
In processImmediate it's a speedup, L.create() returns an access-optimized object.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I bet the object ends up being access-optimized after enough runs anyway. Not to mention we can make objects access-optimized explicitly.

That said - this change improves style anyway and is better coding.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably, though depends on how many immediates are queued in an event loop cycle.
The immediateQueue is re-created every time, so the optimization would not persist
(and there would be a run-time cost for the conversion).

Out of curiosity, what are the ways of creating access-optimized objects? I know about
objects created with { ... }, the prototype properties of new objects, (the this. properties
as assigned in the constructor?), and assigning an object as the prototype of a function
forces optimization. Any others?

Copy link
Member

@benjamingr benjamingr Apr 30, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andrasq conveniently, here is a StackOverflow answer I wrote about a technique petkaantonov used in bluebird (with making an object as a prototype of a function).

This also works with Object.create so I guess that's another one.

this. properties assigned in the constructor is the "standard" way though, it's even "easier" than object literals and it makes the fact it's static obvious to v8 (object literals work fine too usually).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be explicit - this specific change LGTM, even if it's not faster but it probably is given the old code.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's a great writeup, and a handy reference, thanks!

@andrasq
Copy link
Author

andrasq commented Apr 29, 2016

pushed edits (some more consts)

@benjamingr
Copy link
Member

Minus new Immediate instead of createImmediate - looks good to me.

Thanks for this fix.

@andrasq
Copy link
Author

andrasq commented May 1, 2016

pushed edits (new Immediate)

@benjamingr
Copy link
Member

LGTM although I meant for new Immediate to take the arguments in the constructor.

Fishrock123 pushed a commit that referenced this pull request Jul 5, 2016
Now uses a new L.create() factory to create access-optimized linkedlist
objects.

PR-URL: #6436
Reviewed-By: Benjamin Gruenbaum <[email protected]>
Reviewed-By: Jeremiah Senkpiel <[email protected]>
Fishrock123 pushed a commit that referenced this pull request Jul 5, 2016
Save the setImmediate() callback arguments into an array instead of a
closure, and invoke the callback on the arguments from an optimizable
function.

  60% faster setImmediate with 0 args (15% if self-recursive)
  4x faster setImmediate with 1-3 args, 2x with > 3
  seems to be faster with less memory pressure when memory is tight

Changes:
- use L.create() to build faster lists
- use runCallback() from within tryOnImmediate()
- save the arguments and do not build closures for the callbacks

PR-URL: #6436
Reviewed-By: Benjamin Gruenbaum <[email protected]>
Reviewed-By: Jeremiah Senkpiel <[email protected]>
Fishrock123 pushed a commit that referenced this pull request Jul 5, 2016
Timings for sequential and concurren setImmediate() with and without
arguments, and set + clearImmediate().

PR-URL: #6436
Reviewed-By: Benjamin Gruenbaum <[email protected]>
Reviewed-By: Jeremiah Senkpiel <[email protected]>
@Fishrock123 Fishrock123 mentioned this pull request Jul 5, 2016
Fishrock123 added a commit that referenced this pull request Jul 6, 2016
Notable changes:

* buffer: Added `buffer.swap64()` to compliment `swap16()` &
`swap32()`. (Zach Bjornson) #7157
* build: New `configure` options have been added for building Node.js
as a shared library. (Stefan Budeanu)
#6994
  - The options are: `--shared`, `--without-v8-platform` &
`--without-bundled-v8`.
* crypto: Root certificates have been updated. (Ben Noordhuis)
#7363
* debugger: The server address is now configurable via
`--debug=<address>:<port>`. (Ben Noordhuis)
#3316
* npm: Upgraded npm to v3.10.3 (Kat Marchán)
#7515 & (Rebecca Turner)
#7410
* readline: Added the `prompt` option to the readline constructor.
(Evan Lucas) #7125
* repl / vm: `sigint`/`ctrl+c` will now break out of infinite loops
without stopping the Node.js instance. (Anna Henningsen)
#6635
* src:
  - Added a `node::FreeEnvironment` public C++ API. (Cheng Zhao)
#3098
  - Refactored `require('constants')`, constants are now available
directly from their respective modules. (James M Snell)
#6534
* stream: Improved `readable.read()` performance by up to 70%. (Brian
White) #7077
* timers: `setImmediate()` is now up to 150% faster in some situations.
(Andras) #6436
* util: Added a `breakLength` option to `util.inspect()` to control how
objects are formatted across lines. (cjihrig)
#7499
* v8-inspector: Experimental support has been added for debugging
Node.js over the inspector protocol. (Ali Ijaz Sheikh)
#6792
  - *Note: This feature is experimental, and it could be altered or
removed.*
  - You can try this feature by running Node.js with the `--inspect`
flag.

Refs: #7441
PR-URL: #7550
Fishrock123 added a commit that referenced this pull request Jul 6, 2016
Notable changes:

* buffer: Added `buffer.swap64()` to compliment `swap16()` &
`swap32()`. (Zach Bjornson) #7157
* build: New `configure` options have been added for building Node.js
as a shared library. (Stefan Budeanu)
#6994
  - The options are: `--shared`, `--without-v8-platform` &
`--without-bundled-v8`.
* crypto: Root certificates have been updated. (Ben Noordhuis)
#7363
* debugger: The server address is now configurable via
`--debug=<address>:<port>`. (Ben Noordhuis)
#3316
* npm: Upgraded npm to v3.10.3 (Kat Marchán)
#7515 & (Rebecca Turner)
#7410
* readline: Added the `prompt` option to the readline constructor.
(Evan Lucas) #7125
* repl / vm: `sigint`/`ctrl+c` will now break out of infinite loops
without stopping the Node.js instance. (Anna Henningsen)
#6635
* src:
  - Added a `node::FreeEnvironment` public C++ API. (Cheng Zhao)
#3098
  - Refactored `require('constants')`, constants are now available
directly from their respective modules. (James M Snell)
#6534
* stream: Improved `readable.read()` performance by up to 70%. (Brian
White) #7077
* timers: `setImmediate()` is now up to 150% faster in some situations.
(Andras) #6436
* util: Added a `breakLength` option to `util.inspect()` to control how
objects are formatted across lines. (cjihrig)
#7499
* v8-inspector: Experimental support has been added for debugging
Node.js over the inspector protocol. (Ali Ijaz Sheikh)
#6792
  - *Note: This feature is experimental, and it could be altered or
removed.*
  - You can try this feature by running Node.js with the `--inspect`
flag.

Refs: #7441
PR-URL: #7550
Fishrock123 added a commit that referenced this pull request Jul 6, 2016
Notable changes:

* buffer: Added `buffer.swap64()` to compliment `swap16()` &
`swap32()`. (Zach Bjornson) #7157
* build: New `configure` options have been added for building Node.js
as a shared library. (Stefan Budeanu)
#6994
  - The options are: `--shared`, `--without-v8-platform` &
`--without-bundled-v8`.
* crypto: Root certificates have been updated. (Ben Noordhuis)
#7363
* debugger: The server address is now configurable via
`--debug=<address>:<port>`. (Ben Noordhuis)
#3316
* npm: Upgraded npm to v3.10.3 (Kat Marchán)
#7515 & (Rebecca Turner)
#7410
* readline: Added the `prompt` option to the readline constructor.
(Evan Lucas) #7125
* repl / vm: `sigint`/`ctrl+c` will now break out of infinite loops
without stopping the Node.js instance. (Anna Henningsen)
#6635
* src:
  - Added a `node::FreeEnvironment` public C++ API. (Cheng Zhao)
#3098
  - Refactored `require('constants')`, constants are now available
directly from their respective modules. (James M Snell)
#6534
* stream: Improved `readable.read()` performance by up to 70%. (Brian
White) #7077
* timers: `setImmediate()` is now up to 150% faster in some situations.
(Andras) #6436
* util: Added a `breakLength` option to `util.inspect()` to control how
objects are formatted across lines. (cjihrig)
#7499
* v8-inspector: Experimental support has been added for debugging
Node.js over the inspector protocol. (Ali Ijaz Sheikh)
#6792
  - *Note: This feature is experimental, and it could be altered or
removed.*
  - You can try this feature by running Node.js with the `--inspect`
flag.

Refs: #7441
PR-URL: #7550
lukesampson pushed a commit to ScoopInstaller/Scoop that referenced this pull request Jul 7, 2016
### Notable changes

* **buffer**: Added `buffer.swap64()` to compliment `swap16()` & `swap32()`. (Zach Bjornson) [#7157](nodejs/node#7157)
* **build**: New `configure` options have been added for building Node.js as a shared library. (Stefan Budeanu) [#6994](nodejs/node#6994)
  - The options are: `--shared`, `--without-v8-platform` & `--without-bundled-v8`.
* **crypto**: Root certificates have been updated. (Ben Noordhuis) [#7363](nodejs/node#7363)
* **debugger**: The server address is now configurable via `--debug=<address>:<port>`. (Ben Noordhuis) [#3316](nodejs/node#3316)
* **npm**: Upgraded npm to v3.10.3 (Kat Marchán) [#7515](nodejs/node#7515) & (Rebecca Turner) [#7410](nodejs/node#7410)
* **readline**: Added the `prompt` option to the readline constructor. (Evan Lucas) [#7125](nodejs/node#7125)
* **repl / vm**: `sigint`/`ctrl+c` will now break out of infinite loops without stopping the Node.js instance. (Anna Henningsen) [#6635](nodejs/node#6635)
* **src**:
  - Added a `node::FreeEnvironment` public C++ API. (Cheng Zhao) [#3098](nodejs/node#3098)
  - Refactored `require('constants')`, constants are now available directly from their respective modules. (James M Snell) [#6534](nodejs/node#6534)
* **stream**: Improved `readable.read()` performance by up to 70%. (Brian White) [#7077](nodejs/node#7077)
* **timers**: `setImmediate()` is now up to 150% faster in some situations. (Andras) [#6436](nodejs/node#6436)
* **util**: Added a `breakLength` option to `util.inspect()` to control how objects are formatted across lines. (cjihrig) [#7499](nodejs/node#7499)
* **v8-inspector**: Experimental support has been added for debugging Node.js over the inspector protocol. (Ali Ijaz Sheikh) [#6792](nodejs/node#6792)
  - **Note: This feature is _experimental_, and it could be altered or removed.**
  - You can try this feature by running Node.js with the `--inspect` flag.
@andrasq
Copy link
Author

andrasq commented Jul 10, 2016

@mjsalinger perhaps some of the commits got squashed; all the expected edits are there.
The 4 (less 1) original + 8 code review change sets got merged as 3 commits.

@MylesBorins
Copy link
Contributor

I'm setting this as don't land. please change to LTS watch if you think this should land in LTS

@Fishrock123
Copy link
Contributor

@thealphanerd if it cleanly applies you should be able to land it

@MylesBorins
Copy link
Contributor

@Fishrock123 it does not land cleanly without other timers changes

@Fishrock123
Copy link
Contributor

@thealphanerd ok, probably not important unless it is needed for other fixes. (Although it would be nice.)

Fishrock123 added a commit to Fishrock123/node that referenced this pull request Dec 9, 2016
This was originally changed in 6f75b66
but it appears unnecessary and benhcmark results show little difference
without the extra property.

Refs: nodejs#6436
@zhoujinhai
Copy link

@Fishrock123 I have the same problem , did you save it?

<--- Last few GCs --->

[9852:000001B18947FDB0]   770113 ms: Mark-sweep 1401.2 (1457.9) -> 1398.7 (1457.9) MB, 16.4 / 0.0 ms  (+ 0.0 ms in 0 steps since start of marking, biggest step 0.0 ms, walltime since start of marking 18 ms) allocation failure GC in old space requested
[9852:000001B18947FDB0]   770136 ms: Mark-sweep 1398.7 (1457.9) -> 1398.6 (1426.4) MB, 21.0 / 0.0 ms  last resort
[9852:000001B18947FDB0]   770158 ms: Mark-sweep 1398.6 (1426.4) -> 1398.6 (1425.9) MB, 22.5 / 0.0 ms  last resort


<--- JS stacktrace --->

==== JS stack trace =========================================

Security context: 0000003FB9CA9891 <JS Object>
    2: new constructor(aka Request) [E:\nodecrawler\node_modules\request\request.js:128] [pc=0000004DC395835B](this=000000583D4E4AD1 <a Request with map 0000035C122A1B51>,options=000000583D4E4DC9 <an Object with map 0000025475FDD539>)
    4: request(aka request) [E:\nodecrawler\node_modules\request\index.js:54] [pc=0000004DC3957474](this=000000530CD82311 <undefined>,uri=000000583D4E8241 <an O...

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
my code is here:
var crawler = require('crawler');
var config = require('./config');
var fs = require('fs');
// var videoList = config.videoList;
//var read = require('./read');
var debug = require('debug')('nightmare:crawler');

console.time("程序运行时间");//开始运行时间
//从videoList文件中获取视频列表
var read = fs.readFileSync(__dirname+'/videoList.json',function(err){
	if(err){
		console.log(err);
	}
	debug('读取videoList文件');
	console.log('successed!');
});
var videoList = JSON.parse(read);

var informationList = [];
var c = new crawler({
	// maxConnection : 1000,
	//rateLimit : 1000,
	forceUTF8 : true,
	callback : function(error,res,done){
		if(error){
			console.log(error);
		}else{	
			information(res,done);
			done();
		}
		// console.log(informationList);
		// console.log(informationList.length);
	}
});

// //爬取订阅号列表

for(i=0;i<videoList.length;i++){
	c.queue({
		uri: videoList[i].url,
		proxy: 'http://127.0.0.1:61481'
	});
}

targos pushed a commit to targos/node that referenced this pull request Oct 21, 2017
This was originally changed in 6f75b66
but it appears unnecessary and benhcmark results show little difference
without the extra property.

Refs: nodejs#6436
targos pushed a commit that referenced this pull request Oct 23, 2017
This was originally changed in 6f75b66
but it appears unnecessary and benhcmark results show little difference
without the extra property.

Refs: #6436
PR-URL: #16355
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Jeremiah Senkpiel <[email protected]>
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Anatoli Papirovski <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Refael Ackermann <[email protected]>
addaleax pushed a commit to ayojs/ayo that referenced this pull request Oct 26, 2017
This was originally changed in 6f75b66
but it appears unnecessary and benhcmark results show little difference
without the extra property.

Refs: nodejs/node#6436
PR-URL: nodejs/node#16355
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Jeremiah Senkpiel <[email protected]>
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Anatoli Papirovski <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Refael Ackermann <[email protected]>
addaleax pushed a commit to ayojs/ayo that referenced this pull request Dec 7, 2017
This was originally changed in 6f75b66
but it appears unnecessary and benhcmark results show little difference
without the extra property.

Refs: nodejs/node#6436
PR-URL: nodejs/node#16355
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Jeremiah Senkpiel <[email protected]>
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Anatoli Papirovski <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Refael Ackermann <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Issues and PRs related to the performance of Node.js. timers Issues and PRs related to the timers subsystem / setImmediate, setInterval, setTimeout.
Projects
None yet
Development

Successfully merging this pull request may close these issues.