Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: fix flaky child-process-fork-regr-gh-2847 #5422

Closed
wants to merge 1 commit into from

Conversation

santigimeno
Copy link
Member

Pull Request check-list

Please make sure to review and check all of these items:

  • Does make -j8 test (UNIX) or vcbuild test nosign (Windows) pass with
    this change (including linting)?
  • Is the commit message formatted according to CONTRIBUTING.md?
  • If this change fixes a bug (or a performance problem), is a regression
    test (or a benchmark) included?
  • Is a documentation update included (if this change modifies
    existing APIs, or introduces new ones)?

Affected core subsystem(s)

test

Description of change

The test is still failing sometimes because when trying to establish the
second connection, the server is already closed. Bring back the code
that handled this case and was removed in the last refactoring of the
test: 4e4b260.

@mscdex mscdex added child_process Issues and PRs related to the child_process subsystem. test Issues and PRs related to the tests. labels Feb 24, 2016
@mscdex
Copy link
Contributor

mscdex commented Feb 24, 2016

CI: https://ci.nodejs.org/job/node-test-pull-request/1734/

The CI infrastructure isn't all back up yet, but some nodes are.

@Trott
Copy link
Member

Trott commented Feb 24, 2016

LGTM if CI is green.

I de-ES6-ized it and made sure it still failed on io.js v3 so it still catches the regression it's designed to catch.

@mscdex
Copy link
Contributor

mscdex commented Feb 25, 2016

@mscdex
Copy link
Contributor

mscdex commented Feb 25, 2016

@santigimeno
Copy link
Member Author

2 failures out of 9999. Things have improved but not completely :(

@santigimeno
Copy link
Member Author

Updated the PR so it ignores errors when sending the 2nd handle to the worker

var s = send(function(err) {
// Ignore errors when sending the second handle because the worker
// may already have exited.
if (err) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps it would be better to at least check that it's a possibly expected error?

@santigimeno santigimeno force-pushed the regre-gh-2847 branch 2 times, most recently from 2dfbfe9 to 4e8a0ca Compare February 25, 2016 10:22
@santigimeno
Copy link
Member Author

PR update to address @mscdex comment. Thanks

@mscdex
Copy link
Contributor

mscdex commented Feb 25, 2016

// Ignore errors when sending the second handle because the worker
// may already have exited.
if (err) {
assert.strictEqual(err.code,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Might be good just to follow the format of the previous case of swallowing ECONNREFUSED in the file: https://github.com/santigimeno/node/blob/regre-gh-2847/test/parallel/test-child-process-fork-regr-gh-2847.js#L19-L23

@Trott
Copy link
Member

Trott commented Feb 25, 2016

If it passes CI and stress test, LGTM. (Left a nit, but like all nits, it can be ignored. Up to you.)

@Trott
Copy link
Member

Trott commented Feb 25, 2016

Oh, and I ran the current version (with ES6-isms replaced) against iojs v3 and it fails as expected, so that's good.

@santigimeno
Copy link
Member Author

In last stress test, there was 8 failures because I was only ignoring ECONNREFUSED error. Should we make a new run with logs to see what other errors are happening, or just ignore every error as it was before the first commit I made here: 4e4b260?
/cc @mscdex @Trott

@mscdex
Copy link
Contributor

mscdex commented Feb 25, 2016

@santigimeno Maybe try the logging first?

@santigimeno
Copy link
Member Author

Ok, I have followed @Trott recommendations and throw if error is different from ECONNREFUSED. Could you run the stress test again? Thanks

@mscdex
Copy link
Contributor

mscdex commented Feb 25, 2016

@mhdawson
Copy link
Member

@Trott
Copy link
Member

Trott commented Feb 26, 2016

@santigimeno I think it will work if you swallow ECONNREFUSED on line 51 (like you are already doing) and swallow ECONNRESET on line 65 (instead of ECONNREFUSED).

@santigimeno
Copy link
Member Author

@Trott yes, but there are other ECONNREFUSED that I'm not sure where they come from. From https://ci.nodejs.org/job/node-stress-single-test/527/:

not ok 1 test-child-process-fork-regr-gh-2847.js
# events.js:155
#       throw er; // Unhandled 'error' event
#       ^
# 
# Error: connect ECONNREFUSED 127.0.0.1:12346
#     at Object.exports._errnoException (util.js:859:11)
#     at exports._exceptionWithHostPort (util.js:882:20)
#     at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1059:14)

I will push a new PR with more logs

@santigimeno
Copy link
Member Author

Can somebody run the stress test again? Thanks

@mscdex
Copy link
Contributor

mscdex commented Feb 27, 2016

@Trott
Copy link
Member

Trott commented Feb 27, 2016

And just in case it's challenging to find the corresponding console log: https://ci.nodejs.org/job/node-stress-single-test/546/nodes=ppcbe-fedora20/console

@santigimeno
Copy link
Member Author

I think (hope) it's now fixed. Can you run the stress test? Thanks!

@mscdex
Copy link
Contributor

mscdex commented Feb 28, 2016

@Trott
Copy link
Member

Trott commented Feb 28, 2016

That looks like it worked.

CI: https://ci.nodejs.org/job/node-test-commit/2371/

The test is still failing sometimes because when trying to establish the
second connection, the server is already closed. Bring back the code
that handled this case and was removed in the last refactoring of the
test. Also ignore the errors that might happen when sending the second
handle to the worker because it may already have exited.
@santigimeno
Copy link
Member Author

Fixed linting error.

@Trott
Copy link
Member

Trott commented Feb 29, 2016

@mscdex
Copy link
Contributor

mscdex commented Feb 29, 2016

CI is green.

Trott pushed a commit to Trott/io.js that referenced this pull request Feb 29, 2016
The test is still failing sometimes because when trying to establish the
second connection, the server is already closed. Bring back the code
that handled this case and was removed in the last refactoring of the
test. Also ignore the errors that might happen when sending the second
handle to the worker because it may already have exited.

PR-URL: nodejs#5422
Reviewed-By: Rich Trott <[email protected]>
@Trott
Copy link
Member

Trott commented Feb 29, 2016

Landed in 98b721e

@Trott Trott closed this Feb 29, 2016
@Fishrock123 Fishrock123 mentioned this pull request Mar 1, 2016
5 tasks
Fishrock123 pushed a commit that referenced this pull request Mar 2, 2016
The test is still failing sometimes because when trying to establish the
second connection, the server is already closed. Bring back the code
that handled this case and was removed in the last refactoring of the
test. Also ignore the errors that might happen when sending the second
handle to the worker because it may already have exited.

PR-URL: #5422
Reviewed-By: Rich Trott <[email protected]>
MylesBorins pushed a commit that referenced this pull request Mar 17, 2016
The test is still failing sometimes because when trying to establish the
second connection, the server is already closed. Bring back the code
that handled this case and was removed in the last refactoring of the
test. Also ignore the errors that might happen when sending the second
handle to the worker because it may already have exited.

PR-URL: #5422
Reviewed-By: Rich Trott <[email protected]>
MylesBorins pushed a commit that referenced this pull request Mar 21, 2016
The test is still failing sometimes because when trying to establish the
second connection, the server is already closed. Bring back the code
that handled this case and was removed in the last refactoring of the
test. Also ignore the errors that might happen when sending the second
handle to the worker because it may already have exited.

PR-URL: #5422
Reviewed-By: Rich Trott <[email protected]>
@refack refack mentioned this pull request Apr 27, 2017
3 tasks
refack added a commit to refack/node that referenced this pull request May 19, 2017
According to the explanation in nodejs#3635#issuecomment-157714683
And as a continuation to nodejs#5422 we also ignore EMFILE
"No more file descriptors are available,so no more files can be opened"

PR-URL: nodejs#12698
Fixes: nodejs#10286
Refs: nodejs#3635 (comment)
Refs: nodejs#5178
Refs: nodejs#5179
Refs: nodejs#4005
Refs: nodejs#5121
Refs: nodejs#5422
Refs: nodejs#12621 (comment)
Reviewed-By: Gibson Fahnestock <[email protected]>
Reviewed-By: Santiago Gimeno <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Colin Ihrig <[email protected]>
MylesBorins pushed a commit that referenced this pull request Jul 17, 2017
According to the explanation in #3635#issuecomment-157714683
And as a continuation to #5422 we also ignore EMFILE
"No more file descriptors are available,so no more files can be opened"

PR-URL: #12698
Fixes: #10286
Refs: #3635 (comment)
Refs: #5178
Refs: #5179
Refs: #4005
Refs: #5121
Refs: #5422
Refs: #12621 (comment)
Reviewed-By: Gibson Fahnestock <[email protected]>
Reviewed-By: Santiago Gimeno <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Colin Ihrig <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
child_process Issues and PRs related to the child_process subsystem. test Issues and PRs related to the tests.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants