Rolling updates hang on K8s with rabbitmq:3.11.0 #6034
ansd added a commit that referenced this issue on Oct 6, 2022:
TODOs:
* Continue to support OTP < 25.1
* Is it right to use the same connection instance, or do we need to create a new connection_id?
See #6034
ansd added a commit that referenced this issue on Oct 6, 2022:
TODOs:
* Continue to support OTP < 25.1
* Is it right to use the same connection instance, or do we need to create a new connection_id?
* Do not use sys:get_state/1
See #6034
ansd added a commit that referenced this issue on Oct 6, 2022
ansd added a commit that referenced this issue on Oct 7, 2022
ansd added a commit that referenced this issue on Oct 7, 2022:
Commit erlang/otp@9274e89, present in OTP >= 25.1, expects a different nodedown / nodeup message format: the connection ID needs to be included. Fixes #6034
This was referenced Oct 7, 2022
mergify bot pushed a commit that referenced this issue on Oct 10, 2022:
Commit erlang/otp@9274e89, present in OTP >= 25.1, expects a different nodedown / nodeup message format: the connection ID needs to be included. Fixes #6034
(cherry picked from commit 47fe6e6)
mergify bot pushed a commit that referenced this issue on Oct 10, 2022:
Commit erlang/otp@9274e89, present in OTP >= 25.1, expects a different nodedown / nodeup message format: the connection ID needs to be included. Fixes #6034
(cherry picked from commit 47fe6e6)
(cherry picked from commit bfcdeec)
# Conflicts:
# deps/rabbit/src/rabbit_node_monitor.erl
The RabbitMQ “global hang workaround” as described in rabbitmq-server/deps/rabbit/src/rabbit_node_monitor.erl (lines 271 to 285 in 238995b) breaks from OTP 25.1 onwards because the messages being sent in rabbitmq-server/deps/rabbit/src/rabbit_node_monitor.erl (lines 334 to 337 in 238995b) no longer match what global expects, due to the following commit: erlang/otp@9274e89

This means rolling updates of RabbitMQ on Kubernetes with image rabbitmq:3.11.0, which contains Erlang 25.1.1, will sometimes get stuck.

The reasoning behind the "global hang workaround" is further described in #5438. In short, a combination of the following issues leads us to rely on that workaround:
* global:sync/0 early on boot
* global bug (fixed in Erlang by setting the parameter prevent_overlapping_partitions to true)
* prevent_overlapping_partitions to true, as described in Revert "Set kernel param prevent_overlapping_partitions to true" #5483

To reproduce, trigger a rolling update a few times on kindest/node:v1.25.2 using a 5 node RabbitMQ cluster with image rabbitmq:3.11.0-management.

The logs show the following before the node gets stuck:
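For the reproduction steps described above (repeated rolling updates on a kind cluster), a minimal sketch of how the loop could be scripted is shown here. The StatefulSet name, namespace, number of rounds, and timeout are assumptions for illustration and do not come from the issue; adjust them to however the 5 node cluster was deployed.

```shell
#!/bin/sh
# Sketch only. The StatefulSet name, namespace and timeout below are
# hypothetical values for illustration; they are not taken from the issue.
# A matching kind cluster could be created beforehand with:
#   kind create cluster --image kindest/node:v1.25.2

STS="${STS:-rabbitmq-server}"   # hypothetical StatefulSet name
NS="${NS:-default}"             # hypothetical namespace
TIMEOUT="${TIMEOUT:-15m}"       # how long to wait before treating it as stuck
KUBECTL="${KUBECTL:-kubectl}"   # overridable, e.g. "echo" for a dry run

# Restart the pods of the StatefulSet one by one and wait for completion.
rolling_update() {
  "$KUBECTL" rollout restart "statefulset/$STS" --namespace "$NS" || return 1
  "$KUBECTL" rollout status "statefulset/$STS" --namespace "$NS" --timeout="$TIMEOUT"
}

# Repeat the rolling update; stop at the first one that does not finish.
repeat_rolling_updates() {
  rru_rounds="${1:-5}"
  rru_i=1
  while [ "$rru_i" -le "$rru_rounds" ]; do
    echo "rolling update $rru_i/$rru_rounds"
    if ! rolling_update; then
      echo "rolling update $rru_i/$rru_rounds did not complete within $TIMEOUT" >&2
      return 1
    fi
    rru_i=$((rru_i + 1))
  done
}
```

After sourcing the file, `repeat_rolling_updates 5` runs five rolling updates in a row and returns non-zero as soon as one of them fails or times out, which is the point at which a node may be stuck. Setting `KUBECTL=echo` prints the commands instead of running them.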