Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

worker: avoid potential deadlock on NearHeapLimit #38403

Closed
wants to merge 1 commit into from

Conversation

santigimeno
Copy link
Member

It can happen that the NearHeapLimit callback is called while calling
the oninit() function on MessagePort construction causing a deadlock
when the Worker::Exit() method is called, as the mutex_ was already
held on the CreateEnvMessagePort() method. To fix it, just use the
mutex_ to protect the child_port_data_ variable and avoid holding it
when creating the MessagePort.
Also, return early from Worker::Run() if the worker message port
could not be created.

Fixes: #38208.

A couple of things I'd like to point out:

  • I've only been able to reproduce the issue from the original report in the v12.x branch, but I think that potentially it could happen on newer branches too.
  • I've implemented this solution on the assumption that the mutex is only needed to protect child_port_data_ based on this comment, if that's not the case, it won't work. Another option I've tried that also works was using a recursive mutex, but I don't know if that solution would be acceptable.

@santigimeno santigimeno self-assigned this Apr 25, 2021
@github-actions github-actions bot added c++ Issues and PRs that require attention from people who are familiar with C++. needs-ci PRs that need a full CI run. worker Issues and PRs related to Worker support. labels Apr 25, 2021
@nodejs-github-bot

This comment has been minimized.

@Trott
Copy link
Member

Trott commented Apr 29, 2021

Test failure on Windows is relevant.

@nodejs-github-bot

This comment has been minimized.

@nodejs-github-bot

This comment has been minimized.

@nodejs-github-bot

This comment has been minimized.

@nodejs-github-bot

This comment has been minimized.

@nodejs-github-bot

This comment has been minimized.

@nodejs-github-bot

This comment has been minimized.

@nodejs-github-bot

This comment has been minimized.

@nodejs-github-bot

This comment has been minimized.

@santigimeno
Copy link
Member Author

I think this is ready to go now, just in case anyone wants to check again. I have changed the test a bit so it passes on every platform while also keeps deadlocking in v12.x. Thanks

@nodejs-github-bot

This comment has been minimized.

@nodejs-github-bot

This comment has been minimized.

@nodejs-github-bot

This comment has been minimized.

It can happen that the `NearHeapLimit` callback is called while calling
the `oninit()` function on `MessagePort` construction causing a deadlock
when the `Worker::Exit()` method is called, as the `mutex_` was already
held on the `CreateEnvMessagePort()` method. To fix it, just use the
`mutex_` to protect the `child_port_data_` variable and avoid holding it
when creating the `MessagePort`.
Also, return early from `Worker::Run()` if the worker message port
could not be created.

Fixes: nodejs#38208
@nodejs-github-bot

This comment has been minimized.

@nodejs-github-bot
Copy link
Collaborator

CI: https://ci.nodejs.org/job/node-test-pull-request/39891/

@juanarbol juanarbol added the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Sep 10, 2021
juanarbol pushed a commit to juanarbol/node that referenced this pull request Sep 11, 2021
It can happen that the `NearHeapLimit` callback is called while calling
the `oninit()` function on `MessagePort` construction causing a deadlock
when the `Worker::Exit()` method is called, as the `mutex_` was already
held on the `CreateEnvMessagePort()` method. To fix it, just use the
`mutex_` to protect the `child_port_data_` variable and avoid holding it
when creating the `MessagePort`.
Also, return early from `Worker::Run()` if the worker message port
could not be created.

Fixes: nodejs#38208

PR-URL: nodejs#38403
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: Joyee Cheung <[email protected]>
Reviewed-By: Rich Trott <[email protected]>
Reviewed-By: Juan José Arboleda <[email protected]>
@juanarbol
Copy link
Member

Landed in 540f9d9 🎉

@juanarbol juanarbol closed this Sep 11, 2021
BethGriggs pushed a commit that referenced this pull request Sep 21, 2021
It can happen that the `NearHeapLimit` callback is called while calling
the `oninit()` function on `MessagePort` construction causing a deadlock
when the `Worker::Exit()` method is called, as the `mutex_` was already
held on the `CreateEnvMessagePort()` method. To fix it, just use the
`mutex_` to protect the `child_port_data_` variable and avoid holding it
when creating the `MessagePort`.
Also, return early from `Worker::Run()` if the worker message port
could not be created.

Fixes: #38208

PR-URL: #38403
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: Joyee Cheung <[email protected]>
Reviewed-By: Rich Trott <[email protected]>
Reviewed-By: Juan José Arboleda <[email protected]>
BethGriggs pushed a commit that referenced this pull request Sep 21, 2021
It can happen that the `NearHeapLimit` callback is called while calling
the `oninit()` function on `MessagePort` construction causing a deadlock
when the `Worker::Exit()` method is called, as the `mutex_` was already
held on the `CreateEnvMessagePort()` method. To fix it, just use the
`mutex_` to protect the `child_port_data_` variable and avoid holding it
when creating the `MessagePort`.
Also, return early from `Worker::Run()` if the worker message port
could not be created.

Fixes: #38208

PR-URL: #38403
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: Joyee Cheung <[email protected]>
Reviewed-By: Rich Trott <[email protected]>
Reviewed-By: Juan José Arboleda <[email protected]>
@BethGriggs BethGriggs mentioned this pull request Sep 21, 2021
1 task
richardlau pushed a commit that referenced this pull request Nov 26, 2021
It can happen that the `NearHeapLimit` callback is called while calling
the `oninit()` function on `MessagePort` construction causing a deadlock
when the `Worker::Exit()` method is called, as the `mutex_` was already
held on the `CreateEnvMessagePort()` method. To fix it, just use the
`mutex_` to protect the `child_port_data_` variable and avoid holding it
when creating the `MessagePort`.
Also, return early from `Worker::Run()` if the worker message port
could not be created.

Fixes: #38208

PR-URL: #38403
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: Joyee Cheung <[email protected]>
Reviewed-By: Rich Trott <[email protected]>
Reviewed-By: Juan José Arboleda <[email protected]>
@richardlau richardlau mentioned this pull request Nov 26, 2021
richardlau pushed a commit that referenced this pull request Dec 14, 2021
It can happen that the `NearHeapLimit` callback is called while calling
the `oninit()` function on `MessagePort` construction causing a deadlock
when the `Worker::Exit()` method is called, as the `mutex_` was already
held on the `CreateEnvMessagePort()` method. To fix it, just use the
`mutex_` to protect the `child_port_data_` variable and avoid holding it
when creating the `MessagePort`.
Also, return early from `Worker::Run()` if the worker message port
could not be created.

Fixes: #38208

PR-URL: #38403
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: Joyee Cheung <[email protected]>
Reviewed-By: Rich Trott <[email protected]>
Reviewed-By: Juan José Arboleda <[email protected]>
@richardlau richardlau mentioned this pull request Dec 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
author ready PRs that have at least one approval, no pending requests for changes, and a CI started. c++ Issues and PRs that require attention from people who are familiar with C++. needs-ci PRs that need a full CI run. worker Issues and PRs related to Worker support.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

worker_threads.terminate results in no response on v12.x
9 participants