Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce syscalls on require() #1920

Closed

Conversation

pierreinglebert
Copy link
Contributor

In some conditions, require makes many useless syscalls trying to find files in non-existent directories.

For example, this is a require('moment'); in one subfolder of my project, the lib/sub/node_modules doesn't even exist.

stat64("/Users/pierre/Projects/require_test/lib/sub/node_modules/moment\0", 0x7FFF5FBFE5D8, 0x9)          = -1 Err#2
stat64("/Users/pierre/Projects/require_test/lib/sub/node_modules/moment.js\0", 0x7FFF5FBFC270, 0x9)               = -1 Err#2
stat64("/Users/pierre/Projects/require_test/lib/sub/node_modules/moment.json\0", 0x7FFF5FBFC270, 0x9)             = -1 Err#2
stat64("/Users/pierre/Projects/require_test/lib/sub/node_modules/moment.node\0", 0x7FFF5FBFC270, 0x9)             = -1 Err#2
open("/Users/pierre/Projects/require_test/lib/sub/node_modules/moment/package.json\0", 0x0, 0x1B6)                = -1 Err#2
stat64("/Users/pierre/Projects/require_test/lib/sub/node_modules/moment/index.js\0", 0x7FFF5FBFE578, 0x1B6)               = -1 Err#2
stat64("/Users/pierre/Projects/require_test/lib/sub/node_modules/moment/index.json\0", 0x7FFF5FBFE578, 0x1B6)             = -1 Err#2
stat64("/Users/pierre/Projects/require_test/lib/sub/node_modules/moment/index.node\0", 0x7FFF5FBFE578, 0x1B6)             = -1 Err#2

This PR divides by 7 the number of syscalls when requiring a module from a directory without node_modules at the cost of adding 1 from a directory with.

From my tests with a whole process doing only require('express'), I pass from 760 stat64 to 574 (-25%) and from 106 open_nocancel to 58 (45%).

@pierreinglebert pierreinglebert changed the title Reduce syscalls on require Reduce syscalls on require() Jun 8, 2015
@@ -142,6 +142,8 @@ Module._findPath = function(request, paths) {

// For each path
for (var i = 0, PL = paths.length; i < PL; i++) {
// don't search further if path doesnt exist
if(paths[i] && internalModuleStat(paths[i]) <= 0) continue;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should at least be a space here between if and (.

The comment should be modified slightly to read: // Don't search further if path doesn't exist

@mscdex
Copy link
Contributor

mscdex commented Jun 8, 2015

The subsystem the commit is targeting should be module instead of just lib.

@brendanashworth brendanashworth added the module Issues and PRs related to the module subsystem. label Jun 9, 2015
@Fishrock123
Copy link
Contributor

Is this before or after #1801, or does that not make a difference?

EDIT: ah that may not matter.

@pierreinglebert
Copy link
Contributor Author

It's after but it makes no difference.

@sam-github
Copy link
Contributor

If this can improve node startup time, I'm all for it. Some of my CLIs take 400-500ms to start up, all in require, mostly statting non-existent directories. That doesn't sound like much, but under unit test, when the CLIs are spawned dozen's of times per test, it adds up very quickly.

@trevnorris
Copy link
Contributor

Change looks sound.

@isaacs Know of a reason why this would be a problem?

@Fishrock123
Copy link
Contributor

Ah, didn't realize this was a PR. SGTM..

CI: https://jenkins-iojs.nodesource.com/view/iojs/job/iojs+any-pr+multi/787/

@Fishrock123
Copy link
Contributor

Looking at the CI I do not think this patch works on FreeBSD, SmartOS, or Windows.

@pierreinglebert
Copy link
Contributor Author

The errors are suite strange, both freebsd throw 'Error: listen EADDRINUSE 0.0.0.0:12346'

@trevnorris
Copy link
Contributor

It does appear that everything that failed is network related. @rvagg You seen this happen before with Jenkins?

@jbergstroem
Copy link
Member

I found lingering processes on some machines (freebsd, smartos). We just discussed this in the build group and will be looking at making these tests more robust and/or reaping processes on test exit.

@trevnorris
Copy link
Contributor

@jbergstroem should we be good to run CI again?

@jbergstroem
Copy link
Member

@trevnorris feel free. The windows bots at least seems to give proper feedback (800+ failing tests because they can't be run): https://jenkins-iojs.nodesource.com/view/iojs/job/iojs+any-pr+multi/787/nodes=win2008r2/tapTestReport/test.tap-5/

@trevnorris
Copy link
Contributor

@jbergstroem thanks. looks like jenkins hasn't been happy the last few builds. :(

Having another go at it: https://jenkins-iojs.nodesource.com/job/iojs+any-pr+multi/789/

@jbergstroem
Copy link
Member

@trevnorris yeah, its really unfortunate.

@isaacs
Copy link
Contributor

isaacs commented Jun 9, 2015

The change seems fine to me. However, I'd love to see a test for this. Is it possible to monkey-patch fs.statSync and track which things get statted to create a failing test?

@isaacs
Copy link
Contributor

isaacs commented Jun 9, 2015

Oh, I guess with the change in #1801, testing this might be a bit tricky. @bnoordhuis do you have any insight?

If it's just not possible, and it doesn't make any existing tests fail (and doesn't, eg, break npm's or request's or express's tests) then LGTM.

@trevnorris
Copy link
Contributor

this does make windows blow up hard:

not ok 1 - test-arm-math-exp-regress-1376.js
# module.js:336
# throw err;
# ^
# Error: Cannot find module 'c:\workspace\iojs+any-pr+multi\nodes\win2008r2\test\parallel\test-arm-math-exp-regress-1376.js'
# at Function.Module._resolveFilename (module.js:334:15)
# at Function.Module._load (module.js:284:25)
# at Function.Module.runMain (module.js:469:10)
# at startup (node.js:117:18)
# at node.js:946:3

I'm guessing this error is specifically related to this PR.

@chrisdickinson
Copy link
Contributor

I wonder how many packages test for presence of a module, generate it on "does not exist", then re-require it? I wouldn't advocate such an approach, of course, but it's within the realm of possibility. Do we want to support that use case?

@trevnorris
Copy link
Contributor

@chrisdickinson how would that even be implemented?

@chrisdickinson
Copy link
Contributor

@trevnorris

try {
  var ourCompiledModule = require('./generated.js')
} catch (err) {
  fs.writeFileSync(__dirname + '/generated.js', 'module.exports = 3', 'utf8');
}

Also, consider the REPL use case:

> require('asdf')
Error: Cannot find module 'asdf'
    at Function.Module._resolveFilename (module.js:337:15)
    at Function.Module._load (module.js:287:25)
    at Module.require (module.js:366:17)
    at require (module.js:385:17)
    at repl:1:1
    at REPLServer.defaultEval (repl.js:155:27)
    at bound (domain.js:254:14)
    at REPLServer.runBound [as eval] (domain.js:267:12)
    at REPLServer.<anonymous> (repl.js:309:12)
    at emitOne (events.js:82:20)
>^Z
$ npm install asdf
...
$ fg
> require('asdf')

@trevnorris
Copy link
Contributor

@chrisdickinson do you mention that as a possible new feature? Or is that something that works today?

During require search, check the path exists before searching further in it.
@pierreinglebert
Copy link
Contributor Author

On Unix systems, when the request path was absolute, it removed all the search paths but not on Windows, the path.resolve() was handling it.
I replaced the '/' checking with path.isAbsolute so it should work better now.

@chrisdickinson
Copy link
Contributor

@trevnorris as a thing that works today that would not work if we cached "ENOENT" results to avoid syscalls.

@pierreinglebert
Copy link
Contributor Author

This PR doesn't cache 'ENOENT' between require() calls, it just avoid calling stat/open on files if the directory doesn't exist. That wont break your usecase.

@pierreinglebert
Copy link
Contributor Author

@trevnorris Could you run this PR on jenkins again ?

@bnoordhuis
Copy link
Member

@trevnorris
Copy link
Contributor

@bnoordhuis thanks.

CI looks happy. LGTM.

@chrisdickinson
Copy link
Contributor

One second – going to check this locally first just to make sure / prove that I'm just being paranoid.

@chrisdickinson
Copy link
Contributor

Myth: CONFIRMED. I was just being paranoid. LGTM :)

Fishrock123 pushed a commit that referenced this pull request Jun 18, 2015
require() now checks that the path exists before searching
further in it.

PR-URL: #1920
Reviewed-By: Isaac Z. Schlueter <[email protected]>
Reviewed-By: Trevor Norris <[email protected]>
Reviewed-By: Chris Dickinson <[email protected]>
Reviewed-By: Jeremiah Senkpiel <[email protected]>
@Fishrock123
Copy link
Contributor

Thanks, landed in a71ee93!

@@ -142,6 +142,8 @@ Module._findPath = function(request, paths) {

// For each path
for (var i = 0, PL = paths.length; i < PL; i++) {
// Don't search further if path doesn't exist
if (paths[i] && internalModuleStat(paths[i]) < 1) continue;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess that one needs the path._makeLong fix (#1991) ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, yes, I think so. PR: #2011

@rvagg rvagg mentioned this pull request Jun 18, 2015
rvagg added a commit that referenced this pull request Jun 23, 2015
PR-URL: #1996

Notable changes

* module: The number of syscalls made during a require() have been
  significantly reduced again (see #1801 from v2.2.0 for previous
  work), which should lead to a performance improvement
  (Pierre Inglebert) #1920.
* npm:
  - Upgrade to v2.11.2 (Rebecca Turner) #1956.
  - Upgrade to v2.11.3 (Forrest L Norvell) #2018.
* zlib: A bug was discovered where the process would abort if the
  final part of a zlib decompression results in a buffer that would
  exceed the maximum length of 0x3fffffff bytes (~1GiB). This was
  likely to only occur during buffered decompression (rather than
  streaming). This is now fixed and will instead result in a thrown
  RangeError (Michaël Zasso) #1811.
rvagg added a commit that referenced this pull request Jun 23, 2015
PR-URL: #1996

Notable changes

* module: The number of syscalls made during a require() have been
  significantly reduced again (see #1801 from v2.2.0 for previous
  work), which should lead to a performance improvement
  (Pierre Inglebert) #1920.
* npm:
  - Upgrade to v2.11.2 (Rebecca Turner) #1956.
  - Upgrade to v2.11.3 (Forrest L Norvell) #2018.
* zlib: A bug was discovered where the process would abort if the
  final part of a zlib decompression results in a buffer that would
  exceed the maximum length of 0x3fffffff bytes (~1GiB). This was
  likely to only occur during buffered decompression (rather than
  streaming). This is now fixed and will instead result in a thrown
  RangeError (Michaël Zasso) #1811.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
module Issues and PRs related to the module subsystem.
Projects
None yet
Development

Successfully merging this pull request may close these issues.