Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Events/event source node #8918

Merged
merged 2 commits into from
Sep 23, 2020
Merged

Events/event source node #8918

merged 2 commits into from
Sep 23, 2020

Conversation

drewbailey
Copy link
Contributor

@drewbailey drewbailey commented Sep 17, 2020

Source node registration and deregistration events from the FSM

This PR updates the FSM to pass in the message type to apply functions. A Context is added to state store methods to allow retrieving the msg type during event creation.

The bodies of the two events here, NodeRegisration and NodeDeregistration may change some, but depending on feedback to the FSM / state store this is how all events will be generated

example upsert node with context

fill in writetxnwithctx

ctx passing to handle event type creation, wip test

node deregistration event

drop Node from registration event
@drewbailey drewbailey force-pushed the events/event-source-node branch from 7991cc2 to 24d8ef2 Compare September 22, 2020 12:48
@drewbailey drewbailey marked this pull request as ready for review September 22, 2020 14:31
@drewbailey drewbailey requested a review from tgross September 23, 2020 12:58
Copy link
Member

@tgross tgross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My two comments here somewhat contradict each other, but that's mostly because I'm not sure about the design here using context. Can you maybe explain a bit some, uh, context (😀) around why we want to do it that way?

}

// DeleteNode deregisters a batch of nodes
func (s *StateStore) DeleteNodeCtx(ctx context.Context, index uint64, nodes []string) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By splitting this method out into one version with ctx and the other without, we're not forcing all consumers of this state store function to update and thread their changes through the event store. That's maybe convenient for tests but seems like it could be a source of bugs later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once these changes land in master, I was thinking DeleteNodeCtx would go away and DeleteNode or any other state store func would get a context as its first argument. I refrained from doing so here mainly because of tests, there are 252 invocations for UpsertNode in test files and since this PR is going into a long lived feature branch I just honestly didn't want to deal with all the potential merge conflicts 😅

If this route is agreed upon by everyone I'll go back and comment / TODO it being refactored away

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I totally missed this wasn't going to master, so 👍 on that approach.

defer metrics.MeasureSince([]string{"nomad", "fsm", "register_node"}, time.Now())
var req structs.NodeRegisterRequest
if err := structs.Decode(buf, &req); err != nil {
panic(fmt.Errorf("failed to decode request: %v", err))
}

ctx := context.WithValue(context.Background(), state.CtxMsgType, reqType)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell there's a 1:1 mapping between message types and n.apply* functions. So we're creating a context and threading the message type from raft -> fsm -> state store. There's only ever 1 parameter and it's the message type, which the apply function already needs to know. What do we get by creating a context rather than tacking on the msgType as a new parameter lower down the stack?

@drewbailey
Copy link
Contributor Author

I was wary of a context approach at first as well since it seems like needless indirection going from FSM apply func -> state store. We could change the parameter to take a message type.

In practice, state store methods that Write are only ever called from the FSM, but there are tons of invocations in tests and I wasn't sure about tacking on fake message types or making tests use proper message types when setting up tests vs just having a context.Background(). Context also felt open ended and could be used for tracing or log correlation down the road.

@drewbailey
Copy link
Contributor Author

merging into event-stream to unblock other work streams but definitely still open to feedback / alternative approaches!

@drewbailey drewbailey merged this pull request into event-stream Sep 23, 2020
@drewbailey drewbailey deleted the events/event-source-node branch September 23, 2020 14:52
drewbailey added a commit that referenced this pull request Sep 23, 2020
* Node Register/Deregister event sourcing

example upsert node with context

fill in writetxnwithctx

ctx passing to handle event type creation, wip test

node deregistration event

drop Node from registration event

* node batch deregistration
Comment on lines +209 to +218
s.StopEventPublisher()
close(s.abandonCh)
}

// StopStopEventPublisher calls the cancel func for the state stores event
// publisher. It should be called during server shutdown.
func (s *StateStore) StopEventPublisher() {
s.stopEventPublisher()
}

Copy link
Member

@schmichael schmichael Sep 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just call Abandon in tests? Or perhaps I'm misunderstanding the point of this wrapper?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StopEventPublisher will be used during server shutdown, the state store tests can call Abandon(), when I had Abandon() being called during fsm/server shutdown it caused weird leadership flapping https://app.circleci.com/pipelines/github/hashicorp/nomad/11813/workflows/611f362f-009a-497f-a0e2-60cc25977fb6/jobs/100908

drewbailey added a commit that referenced this pull request Sep 25, 2020
* Node Register/Deregister event sourcing

example upsert node with context

fill in writetxnwithctx

ctx passing to handle event type creation, wip test

node deregistration event

drop Node from registration event

* node batch deregistration
drewbailey added a commit that referenced this pull request Sep 28, 2020
* Node Register/Deregister event sourcing

example upsert node with context

fill in writetxnwithctx

ctx passing to handle event type creation, wip test

node deregistration event

drop Node from registration event

* node batch deregistration
drewbailey added a commit that referenced this pull request Sep 29, 2020
* Node Register/Deregister event sourcing

example upsert node with context

fill in writetxnwithctx

ctx passing to handle event type creation, wip test

node deregistration event

drop Node from registration event

* node batch deregistration
drewbailey added a commit that referenced this pull request Sep 29, 2020
* Node Register/Deregister event sourcing

example upsert node with context

fill in writetxnwithctx

ctx passing to handle event type creation, wip test

node deregistration event

drop Node from registration event

* node batch deregistration
drewbailey added a commit that referenced this pull request Oct 1, 2020
* Node Register/Deregister event sourcing

example upsert node with context

fill in writetxnwithctx

ctx passing to handle event type creation, wip test

node deregistration event

drop Node from registration event

* node batch deregistration
drewbailey added a commit that referenced this pull request Oct 2, 2020
* Node Register/Deregister event sourcing

example upsert node with context

fill in writetxnwithctx

ctx passing to handle event type creation, wip test

node deregistration event

drop Node from registration event

* node batch deregistration
@tgross tgross added this to the 0.13 milestone Oct 2, 2020
drewbailey added a commit that referenced this pull request Oct 5, 2020
* Node Register/Deregister event sourcing

example upsert node with context

fill in writetxnwithctx

ctx passing to handle event type creation, wip test

node deregistration event

drop Node from registration event

* node batch deregistration
drewbailey added a commit that referenced this pull request Oct 7, 2020
* Node Register/Deregister event sourcing

example upsert node with context

fill in writetxnwithctx

ctx passing to handle event type creation, wip test

node deregistration event

drop Node from registration event

* node batch deregistration
drewbailey added a commit that referenced this pull request Oct 8, 2020
* Node Register/Deregister event sourcing

example upsert node with context

fill in writetxnwithctx

ctx passing to handle event type creation, wip test

node deregistration event

drop Node from registration event

* node batch deregistration
drewbailey added a commit that referenced this pull request Oct 12, 2020
* Node Register/Deregister event sourcing

example upsert node with context

fill in writetxnwithctx

ctx passing to handle event type creation, wip test

node deregistration event

drop Node from registration event

* node batch deregistration
drewbailey added a commit that referenced this pull request Oct 13, 2020
* Node Register/Deregister event sourcing

example upsert node with context

fill in writetxnwithctx

ctx passing to handle event type creation, wip test

node deregistration event

drop Node from registration event

* node batch deregistration
drewbailey added a commit that referenced this pull request Oct 14, 2020
* Node Register/Deregister event sourcing

example upsert node with context

fill in writetxnwithctx

ctx passing to handle event type creation, wip test

node deregistration event

drop Node from registration event

* node batch deregistration
fredrikhgrelland pushed a commit to fredrikhgrelland/nomad that referenced this pull request Oct 22, 2020
* Node Register/Deregister event sourcing

example upsert node with context

fill in writetxnwithctx

ctx passing to handle event type creation, wip test

node deregistration event

drop Node from registration event

* node batch deregistration
@github-actions
Copy link

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 17, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants