server: fix panic if heartbeat reset happens for GC'd node #23383

Merged (1 commit) on Jun 20, 2024

Conversation

tgross
Member

@tgross tgross commented Jun 18, 2024

When setting up the timer for heartbeat invalidation, there's no control that allows us to remove that timer when the node is GC'd. If the GC window is narrow enough, it's possible to GC a node that has a waiting heartbeat timer. In this case, we hit a bug where querying for the node returns nil and this is incorrectly handled when checking for disconnect/reconnect state. Fix this bug by correctly handling a nil node and allowing the Node.Update RPC to fire normally (which then errors correctly).

Fixes: #23376
Ref: https://hashicorp.atlassian.net/browse/NET-10109

@tgross tgross added the type/bug, theme/crash, backport/ent/1.6.x+ent (changes are backported to 1.6.x+ent), backport/ent/1.7.x+ent (changes are backported to 1.7.x+ent), and backport/1.8.x (backport to 1.8.x release line) labels Jun 18, 2024
@tgross tgross added this to the 1.8.2 milestone Jun 18, 2024
@tgross tgross marked this pull request as ready for review June 18, 2024 19:40
Contributor

@pkazmierczak pkazmierczak left a comment

LGTM!

@tgross tgross merged commit ee48bdd into main Jun 20, 2024
19 checks passed
@tgross tgross deleted the b-heartbeat-reset-panic branch June 20, 2024 14:05

github-actions bot commented Jan 3, 2025

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 3, 2025
Labels
backport/ent/1.6.x+ent, backport/ent/1.7.x+ent, backport/1.8.x, theme/crash, type/bug
Development

Successfully merging this pull request may close these issues.

panic during node heartbeat reset
3 participants