RUST-856 Fix race between server selection and server monitoring #460

Merged
8 changes: 5 additions & 3 deletions src/sdam/monitor.rs
@@ -122,13 +122,15 @@ impl HeartbeatMonitor {
None => break,
};

// subscribe to check requests before performing the check in case one comes in
// after the check completes
let mut topology_check_requests_subscriber =
topology.subscribe_to_topology_check_requests();

if self.check_server(&topology, &server).await {
topology.notify_topology_changed();
}

let mut topology_check_requests_subscriber =
Contributor Author

My theory is that the following sequence happens:

  • check completes
  • monitor notifies that topology changed
  • operation sleeping in server selection gets notification and quickly fails server selection, requesting another check
  • monitor only now subscribes to check requests, missing the one that was just sent
  • monitor sleeps until heartbeatFrequencyMS is hit

This change subscribes to check requests before the check runs, so any request that comes in during or after the check is recorded too, which should hopefully fix this case.
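To illustrate the ordering issue outside the driver, here is a minimal std-only sketch. It assumes a subscriber only sees requests made after it subscribed; `CheckRequests`, `Subscriber`, `subscribe`, and the two `*_ordering_sees_request` functions are hypothetical stand-ins, not the driver's actual types or API.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for the topology's check-request channel.
#[derive(Default)]
struct CheckRequests {
    // Incremented each time an operation asks for an immediate check.
    generation: AtomicU64,
}

impl CheckRequests {
    fn request_check(&self) {
        self.generation.fetch_add(1, Ordering::SeqCst);
    }
}

/// Hypothetical subscriber: it remembers the generation at subscription
/// time, so requests made before subscribing are invisible to it.
struct Subscriber {
    requests: Arc<CheckRequests>,
    seen: u64,
}

fn subscribe(requests: &Arc<CheckRequests>) -> Subscriber {
    Subscriber {
        requests: Arc::clone(requests),
        seen: requests.generation.load(Ordering::SeqCst),
    }
}

impl Subscriber {
    fn has_pending_request(&self) -> bool {
        self.requests.generation.load(Ordering::SeqCst) > self.seen
    }
}

/// Old ordering: the check finishes and the notification triggers a new
/// check request before the monitor has subscribed.
fn old_ordering_sees_request() -> bool {
    let requests = Arc::new(CheckRequests::default());
    requests.request_check(); // failed server selection requests a check
    let subscriber = subscribe(&requests); // too late, request already sent
    subscriber.has_pending_request()
}

/// New ordering: the monitor subscribes first, then checks and notifies.
fn new_ordering_sees_request() -> bool {
    let requests = Arc::new(CheckRequests::default());
    let subscriber = subscribe(&requests); // subscribed before the check
    requests.request_check(); // request arrives after the check completes
    subscriber.has_pending_request()
}
```

Under these assumptions, `old_ordering_sees_request()` returns `false` (the monitor would sleep for the full heartbeat interval), while `new_ordering_sees_request()` returns `true` (the monitor would wake and check again right away).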

topology.subscribe_to_topology_check_requests();

// drop strong reference to topology before going back to sleep in case it drops off
// in between checks.
drop(topology);