Close open connections from parent after fork #16953
Output from a run without clearing the connection pool. This shows all of the commonly hit issues: incompatible marshal file, too large packet, and simply the wrong message being returned:
# Close all open DRb connections so that connections in the parent's memory space
# which is shared due to forking the child process do not pollute the child's DRb
# connection pool. This can lead to errors when the children connect to a server
# and get an incorrect response back.
That's worded as well as it can be. Nice job!
app/models/miq_worker.rb
#
# ref: https://bugs.ruby-lang.org/issues/2718
def self.close_drb_pool_connections
  require 'drb/drb'
#MAJOR HACK approaching...
It's a stable api since it's been there since at least 2003 😆 🤣
This is as clean as it's going to get. 👍
Maybe say "to go to the wrong drb client"
Great point, also @Fryguy mentioned we might want to synchronize on that mutex, so I'll make both of those changes.
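For readers following the review, here is a minimal sketch of what closing the pooled connections while holding the pool's mutex could look like. It is not the code merged in this PR. It assumes that `DRb::DRbConn` keeps its pool in class-level `@pool` and `@mutex` instance variables, as Ruby releases of this era do; the method name `close_pooled_drb_connections` is hypothetical. On Ruby versions where that internal layout differs, the guard makes the method a no-op.

```ruby
require 'drb/drb'

# Hypothetical helper, not the PR's implementation.
# ASSUMPTION: DRb::DRbConn stores its pool in the class-level instance
# variables @pool (an Array of connections) and @mutex. These are private
# internals, so the method checks for them before touching anything.
def close_pooled_drb_connections
  conn_class = DRb::DRbConn
  unless conn_class.instance_variable_defined?(:@pool) &&
         conn_class.instance_variable_defined?(:@mutex)
    return 0 # different internal layout: nothing we know how to close
  end

  mutex = conn_class.instance_variable_get(:@mutex)

  # Hold the same mutex DRbConn uses so no other thread can check a
  # connection in or out while the pool is being emptied.
  mutex.synchronize do
    pool = conn_class.instance_variable_get(:@pool)
    closed = pool.size
    pool.each(&:close)
    conn_class.instance_variable_set(:@pool, [])
    closed # number of connections that were closed
  end
end
```

A forked child would call such a helper before making its first DRb call, so that it opens its own connection instead of reusing one inherited from the parent.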
DRb::DRbConn keeps a global pool of open connections, which is shared by child processes when they are forked from a parent. If the parent executes a DRb call prior to forking a child process, the child picks up this open connection and uses it, which can cause replies from the server to go to the wrong DRb client. There is a long-standing Ruby bug, https://bugs.ruby-lang.org/issues/2718, which describes the issue and has reproducer code attached. Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1385038
ea7e425 to 271ae40
Okay updated the commit message per #16953 (comment) and synchronize on the |
@agrare can we have a method called |
Checked commit agrare@271ae40 with ruby 2.3.3, rubocop 0.52.0, haml-lint 0.20.0, and yamllint 1.10.0
@agrare can you add the branches to backport this to?
@jrafanie done, between the original BZ and the 3 clones this PR has a combined PM score of 30,820 😆
…ctions_after_fork Close open connections from parent after fork (cherry picked from commit b6062af) Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1542735
Gaprindashvili backport details:
…ctions_after_fork Close open connections from parent after fork (cherry picked from commit b6062af) Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1481378
Fine backport details:
I opened https://bugs.ruby-lang.org/issues/14471 to reopen the existing Ruby bug.
…ctions_after_fork Close open connections from parent after fork (cherry picked from commit b6062af) Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1481677
Euwe backport details:
…rb_connections_after_fork Close open connections from parent after fork (cherry picked from commit b6062af) Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1481378
DRb::DRbConn keeps a global pool of open connections, which is shared by
child processes when they are forked from a parent. If the parent
executes a DRb call prior to forking a child process, the child picks up
this open connection and uses it, which can cause replies from the server
to go to the wrong DRb client.
The connection pool in question is here
There is a long-standing Ruby bug, https://bugs.ruby-lang.org/issues/2718,
which describes the issue and has reproducer code attached.
This is the reproducer code that we used: https://gist.github.com/agrare/d9484884bd297b1615814128129cfc5c
Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1385038
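The underlying mechanism can be illustrated without DRb at all. The sketch below is not the reproducer linked above. It uses a plain `UNIXSocket.pair` as a hypothetical stand-in for one pooled connection, and shows that a socket opened before `fork` is usable from both parent and child, so the remote end sees a single connection with two writers. It assumes a platform where `fork` is available.

```ruby
require 'socket'

# Hypothetical stand-in for one pooled connection: a socket pair opened
# before the fork, the way a DRb call in the parent leaves a socket open.
def shared_connection_demo
  client, server = UNIXSocket.pair

  pid = fork do
    # The child inherited `client`, so this write travels over the very
    # same connection the parent is still holding.
    client.write("from child\n")
    exit!(0) # leave immediately, skipping handlers inherited from the parent
  end
  Process.wait(pid)

  client.write("from parent\n")

  # The "server" end cannot tell the two writers apart.
  [server.gets, server.gets].map(&:chomp)
ensure
  client&.close
  server&.close
end

p shared_connection_demo
```

With real DRb traffic the same sharing applies to replies: whichever process reads from the socket first receives the next reply, whether or not it sent the matching request. That is consistent with the wrong-message and marshal errors described in this PR.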