Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DRIVERS-2972: add message requirement to ServerDescription.error #1729

Merged
merged 22 commits into from
Feb 4, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -226,7 +226,8 @@ Fields:
field in the server's hello or legacy hello response, in the case that the server reports an address different from
the address the client uses.

- (=) `error`: information about the last error related to this server. Default null.
- (=) `error`: information about the last error related to this server. Default null. MUST contain or be able to produce
a string describing the error.

- `roundTripTime`: the duration of the hello or legacy hello call. Default null.

Expand Down Expand Up @@ -485,7 +486,13 @@ removed once the primary is checked.
#### error

If the client experiences any error when checking a server, it stores error information in the ServerDescription's error
field.
field. The message contained in this field MUST contain the substrings detailed in the table below when the
ServerDescription is changed to Unknown in the circumstances outlined.

| circumstance | error substring |
| ---------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------- |
| RSPrimary with a stale electionId/setVersion is discovered | `'primary marked stale due to electionId/setVersion mismatch, <stale tuple> is stale compared to <max tuple>'` |
| New primary is elected/discovered | `'primary marked stale due to discovery of newer primary'` |

#### roundTripTime

Expand Down Expand Up @@ -871,7 +878,8 @@ if serverDescription.maxWireVersion >= 17: # MongoDB 6.0+
topologyDescription.maxSetVersion = serverDescription.setVersion
else:
# Stale primary.
# replace serverDescription with a default ServerDescription of type "Unknown"
# The error field MUST include the substring "primary marked stale due to electionId/setVersion mismatch"
replace serverDescription with a default ServerDescription of type "Unknown"
checkIfHasPrimary()
return
else:
Expand All @@ -889,7 +897,8 @@ else:
)
):
# Stale primary.
# replace serverDescription with a default ServerDescription of type "Unknown"
# The error field MUST include the substring "primary marked stale due to electionId/setVersion mismatch"
replace serverDescription with a default ServerDescription of type "Unknown"
checkIfHasPrimary()
return

Expand All @@ -906,6 +915,7 @@ for each server in topologyDescription.servers:
if server.address != serverDescription.address:
if server.type is RSPrimary:
# See note below about invalidating an old primary.
# the error field MUST include the substring "primary marked stale due to discovery of newer primary"
replace the server with a default ServerDescription of type "Unknown"

for each address in serverDescription's "hosts", "passives", and "arbiters":
Expand All @@ -921,9 +931,10 @@ checkIfHasPrimary()
```

A note on invalidating the old primary: when a new primary is discovered, the client finds the previous primary (there
should be none or one) and replaces its description with a default ServerDescription of type "Unknown." A multi-threaded
client MUST [request an immediate check](server-monitoring.md#requesting-an-immediate-check) for that server as soon as
possible.
should be none or one) and replaces its description with a default ServerDescription of type "Unknown". Additionally,
the `error` field of the new `ServerDescription` object MUST include a descriptive error explaining that it was
invalidated because the primary was determined to be stale. A multi-threaded client MUST
[request an immediate check](server-monitoring.md#requesting-an-immediate-check) for that server as soon as possible.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we expand the scope of this ticket to include more info in other cases where we reset a server to "unknown" besides stale primary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I think that makes sense to do here. Seems like the only other place we do that is in the handleError function, so I'll take a look at that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So there are two places we do this in our handleError logic, but in both cases, the handleError function takes in an error as a parameter. Do we still want to be testing against the errors here?


If the old primary server version is 4.0 or earlier, the client MUST clear its connection pool for the old primary, too:
the connections are all bad because the old primary has closed its sockets. If the old primary server version is 4.2 or
Expand All @@ -934,6 +945,8 @@ See [replica set monitoring with and without a primary](#replica-set-monitoring-
If the server is primary with an obsolete electionId or setVersion, it is likely a stale primary that is going to step
down. Mark it Unknown and let periodic monitoring detect when it becomes secondary. See
[using electionId and setVersion to detect stale primaries](#using-electionid-and-setversion-to-detect-stale-primaries).
Drivers MAY additionally specify whether this was due to an electionId or setVersion mismatch as described in the
[ServerDescripion.error section](#error).

A note on checking "me": Unlike `updateRSWithPrimaryFromMember`, there is no need to remove the server if the address is
not equal to "me": since the server address will not be a member of either "hosts", "passives", or "arbiters", the
Expand Down Expand Up @@ -1921,16 +1934,6 @@ oversaw the specification process.

## Changelog

- 2024-11-11: Removed references to `getLastError`

- 2024-11-04: Make the description of `TopologyDescription.servers` consistent with the spec tests.

- 2024-08-16: Updated host b wire versions in `too_new` and `too_old` tests

- 2024-08-09: Updated wire versions in tests to 4.0+.

- 2024-05-08: Migrated from reStructuredText to Markdown.

- 2015-12-17: Require clients to compare (setVersion, electionId) tuples.

- 2015-10-09: Specify electionID comparison method.
Expand Down Expand Up @@ -2012,6 +2015,19 @@ oversaw the specification process.

- 2024-01-17: Add section on expected client close behaviour

- 2024-05-08: Migrated from reStructuredText to Markdown.

- 2024-08-09: Updated wire versions in tests to 4.0+.

- 2024-08-16: Updated host b wire versions in `too_new` and `too_old` tests

- 2024-11-04: Make the description of `TopologyDescription.servers` consistent with the spec tests.

- 2024-11-11: Removed references to `getLastError`

- 2025-01-22: Add error messages when a new primary is elected or a primary with a stale electionId or setVersion is
discovered.

______________________________________________________________________

[^1]: "localThresholdMS" was called "secondaryAcceptableLatencyMS" in the Read Preferences Spec, before it was superseded
Expand Down
1 change: 1 addition & 0 deletions source/server-discovery-and-monitoring/tests/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -71,6 +71,7 @@ following keys:

- type: A ServerType name, like "RSSecondary". See [ServerType](../server-discovery-and-monitoring.md#servertype) for
details pertaining to async and multi-threaded drivers.
- error: An optional string that must be a substring of the message on the `ServerDescription.error` object
- setName: A string with the expected replica set name, or null.
- setVersion: absent or an integer.
- electionId: absent, null, or an ObjectId.
Expand Down

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Original file line number Diff line number Diff line change
Expand Up @@ -63,7 +63,8 @@ phases: [
"a:27017": {

type: "Unknown",
setName:
setName:,
error: "primary marked stale due to discovery of newer primary"
},

"b:27017": {
Expand Down

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Original file line number Diff line number Diff line change
Expand Up @@ -63,7 +63,8 @@ phases: [
"a:27017": {
type: "Unknown",
setName: ,
electionId:
electionId: ,
error: "primary marked stale due to discovery of newer primary"
},
"b:27017": {
type: "RSPrimary",
Expand Down Expand Up @@ -100,7 +101,8 @@ phases: [
"a:27017": {
type: "Unknown",
setName: ,
electionId:
electionId:,
error: "primary marked stale due to electionId/setVersion mismatch"
},
"b:27017": {
type: "RSPrimary",
Expand Down

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Original file line number Diff line number Diff line change
Expand Up @@ -63,7 +63,8 @@ phases: [
"a:27017": {
type: "Unknown",
setName: ,
electionId:
electionId:,
error: "primary marked stale due to discovery of newer primary"
},
"b:27017": {
type: "RSPrimary",
Expand Down Expand Up @@ -100,7 +101,8 @@ phases: [
"a:27017": {
type: "Unknown",
setName: ,
electionId:
electionId:,
error: "primary marked stale due to electionId/setVersion mismatch"
},
"b:27017": {
type: "RSPrimary",
Expand Down

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Original file line number Diff line number Diff line change
Expand Up @@ -36,7 +36,8 @@ phases: [
"a:27017": {
type: "Unknown",
setName: ,
electionId:
electionId:,
error: "primary marked stale due to discovery of newer primary"
},
"b:27017": {
type: "RSPrimary",
Expand Down Expand Up @@ -99,6 +100,7 @@ phases: [
"a:27017": {
type: "Unknown",
setName: ,
error: "primary marked stale due to electionId/setVersion mismatch",
electionId:
},
"b:27017": {
Expand Down

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Original file line number Diff line number Diff line change
Expand Up @@ -36,7 +36,8 @@ phases: [
"a:27017": {
type: "Unknown",
setName: ,
electionId:
electionId:,
error: "primary marked stale due to discovery of newer primary"
},
"b:27017": {
type: "RSPrimary",
Expand Down Expand Up @@ -99,6 +100,7 @@ phases: [
"a:27017": {
type: "Unknown",
setName: ,
error: "primary marked stale due to electionId/setVersion mismatch",
electionId:
},
"b:27017": {
Expand Down

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Original file line number Diff line number Diff line change
Expand Up @@ -61,7 +61,8 @@ phases: [
"a:27017": {
type: "Unknown",
setName: ,
electionId:
electionId:,
error: "primary marked stale due to discovery of newer primary"
},
"b:27017": {
type: "RSPrimary",
Expand Down

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Original file line number Diff line number Diff line change
Expand Up @@ -61,7 +61,8 @@ phases: [
"a:27017": {
type: "Unknown",
setName: ,
electionId:
electionId:,
error: "primary marked stale due to discovery of newer primary"
},
"b:27017": {
type: "RSPrimary",
Expand Down

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Original file line number Diff line number Diff line change
Expand Up @@ -62,7 +62,8 @@ phases: [
"a:27017": {
type: "Unknown",
setName: ,
electionId:
electionId:,
error: "primary marked stale due to discovery of newer primary"
},
"b:27017": {
type: "RSPrimary",
Expand Down Expand Up @@ -99,7 +100,8 @@ phases: [
"a:27017": {
type: "Unknown",
setName: ,
electionId:
electionId:,
error: "primary marked stale due to electionId/setVersion mismatch"
},
"b:27017": {
type: "RSPrimary",
Expand Down

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Loading
Loading