Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[mongo] Improve messaging&tags on MongoDB replset member state events #2409

Closed
wants to merge 1 commit into from
Closed

[mongo] Improve messaging&tags on MongoDB replset member state events #2409

wants to merge 1 commit into from

Conversation

antifuchs
Copy link
Contributor

This change summarized a lot of changes that improve how leadership election events look in the mongodb service check:

  • The event title nowrefers to the explicit hostname if checking localhost
  • The title now calls the member state by its official name and gives an the expanded state description in the body.
  • Events are only generated if there is a previous, known, state. This elimilates a lot of events being generated whenever the dd-agent is restarted.
  • Events now come with a set of tags featuring the previous and current states in a form that can unambiguously be queried, as well as the replSet in question.

This change also eliminates the duplicate replica set state table that got introduced in a previous PR.

Testing

We're running this updated check in production at the moment. It generates events that we can easily overlay on top of graphs using queries like tags:member_status:primary,previous_member_status:secondary (this finds re-elections where the primary is different from before).

@antifuchs antifuchs changed the title [mongo] Improve messaging on MongoDB replset member state events [mongo] Improve messaging&tags on MongoDB replset member state events Apr 7, 2016
@yannmh yannmh self-assigned this Apr 15, 2016
@yannmh yannmh added this to the 5.7.4 milestone Apr 15, 2016
@yannmh yannmh added the checks label Apr 15, 2016
return 'Down'
elif state == 9:
return 'Rollback'
return self.REPLSET_MEMBER_STATES[state][1]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for the get_state_name method, can we consider the case where state does not exist in REPLSET_MEMBER_STATES ?

if state in 
  pass
else:
  return 'UNKNOWN'

Even though, this should not happen (as opposed to get_state_name when last_state==-1 ), I think that would make the check less error prone to MongoDB updates.

Copy link
Contributor Author

@antifuchs antifuchs Apr 20, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, that would be desirable. I'll fix this up.

(Actually, I'm going to pull out these methods to make them testable)

@yannmh
Copy link
Member

yannmh commented Apr 15, 2016

Thanks a lot @antifuchs. Your changes look great: I simply added a few nitpicks ✏️.

We are planning a 5.7.4 release with various improvements on the MongoDB check. I added your changes to the milestone 🚢

@yannmh
Copy link
Member

yannmh commented Apr 20, 2016

Would you mind rebasing your changes on the last master branch state please ? I merged various changes on the check, which seem to conflict with your PR 😕.
It should be pretty straightforward to fix, let me know if you are busy I'll do it :).

@antifuchs
Copy link
Contributor Author

@yannmh - the two fixes are in, and I rebased against master. Please let me know if the two changes are ok as-is or if you prefer them squashed into one.

@yannmh
Copy link
Member

yannmh commented Apr 20, 2016

Squashed changes would be perfect 😄

This change summarized a lot of changes that improve how leadership
election events look in the mongodb service check:

* The event title nowrefers to the explicit hostname if checking
  localhost
* The title now calls the member state by its official name and gives an
  the expanded state description in the body.
* Events are only generated if there is a previous, known, state. This
  elimilates a lot of events being generated whenever the dd-agent is
  restarted.
* Events now come with a set of tags featuring the previous and current
  states in a form that can unambiguously be queried, as well as the
  replSet in question.
@antifuchs
Copy link
Contributor Author

Squashed! (:

@yannmh
Copy link
Member

yannmh commented Apr 21, 2016

@antifuchs, I added a small commit on top of yours to fix the tests and restore a variable which got removed by accident in the rebase.
Let's move to #2432 😄

@yannmh yannmh closed this Apr 21, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants