Update mutable state consistency check logic #2747
) error {
	if !*shardOwnershipAsserted {
		*shardOwnershipAsserted = true
		return shardContext.AssertOwnership(ctx)
@samarabbas @paulnpdev @yiminc
Added a new shard and persistence API to assert shard ownership.
Will add additional unit tests once this approach is accepted.
	namespaceID string,
	workflowID string,
) (string, error) {
	shardOwnershipAsserted := false
The usage of this pass-by-reference parameter is not very clear. Could you add a comment explaining how it is used and why it needs to be passed by reference?
added
nit: would it be better for readability to use sync.Once?
sync.Once should be considered when multiple goroutines race to initialize something.
The logic here is per goroutine: call this function at most once per request.
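For readers following the thread, here is a minimal sketch of the "at most once per request" pattern being discussed. It is not the PR's code: `ownershipAsserter`, `assertOwnershipOnce`, `countingShard`, and `assertionsPerRequest` are hypothetical names, and only the `AssertOwnership` method name is taken from the snippet above. It also assumes that later calls within the same request simply return nil.

```go
package main

import "context"

// ownershipAsserter is a hypothetical stand-in for the shard context;
// only the AssertOwnership method name comes from the snippet under review.
type ownershipAsserter interface {
	AssertOwnership(ctx context.Context) error
}

// assertOwnershipOnce calls AssertOwnership at most once per request.
// The flag lives in the request handler and is passed by pointer, so every
// retry inside the same request sees that the assertion already happened.
// No locking is needed because the flag is never shared across goroutines.
func assertOwnershipOnce(ctx context.Context, shard ownershipAsserter, asserted *bool) error {
	if *asserted {
		return nil // assumption: later calls in the same request are no-ops
	}
	*asserted = true
	return shard.AssertOwnership(ctx)
}

// countingShard is a test double that records how often it was asked.
type countingShard struct{ calls int }

func (s *countingShard) AssertOwnership(context.Context) error {
	s.calls++
	return nil
}

// assertionsPerRequest simulates one request that goes through `attempts`
// retries and returns how many ownership assertions were actually issued.
func assertionsPerRequest(attempts int) int {
	shard := &countingShard{}
	asserted := false // one flag per request, owned by this goroutine
	for i := 0; i < attempts; i++ {
		_ = assertOwnershipOnce(context.Background(), shard, &asserted)
	}
	return shard.calls
}
```

A plain `*bool` is enough here because the flag's scope is a single request on a single goroutine; `sync.Once` would add synchronization that nothing in this flow needs.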
if currentWorkflowTaskRunning || scheduleID < msBuilder.GetNextEventID() {
	break
}
is this logic replaced by the inline predicate at line 327?
yes
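As an illustration of what "replaced by the inline predicate" can look like, the sketch below packages the removed break condition as a closure over `scheduleID`. This is an assumption about the shape, not the PR's actual code: `mutableState`, `HasWorkflowTask`, `workflowTaskPredicate`, and `fakeState` are hypothetical, and only `GetNextEventID` and the condition itself come from the snippet above.

```go
package main

// mutableState is a hypothetical, minimal view of workflow mutable state.
// GetNextEventID appears in the reviewed snippet; HasWorkflowTask is an
// assumed helper standing in for the running-workflow-task lookup.
type mutableState interface {
	GetNextEventID() int64
	HasWorkflowTask(scheduleID int64) bool
}

// workflowTaskPredicate returns a closure that reports whether the loaded
// state is usable for the given scheduleID. It is the old loop's break
// condition, unchanged, packaged as a predicate.
func workflowTaskPredicate(scheduleID int64) func(mutableState) bool {
	return func(ms mutableState) bool {
		return ms.HasWorkflowTask(scheduleID) || scheduleID < ms.GetNextEventID()
	}
}

// fakeState is a test double for the sketch.
type fakeState struct {
	nextEventID int64
	runningTask int64 // schedule ID of the running task; 0 means none
}

func (s fakeState) GetNextEventID() int64 { return s.nextEventID }

func (s fakeState) HasWorkflowTask(id int64) bool {
	return s.runningTask != 0 && s.runningTask == id
}
```

With `scheduleID` 10, a state whose next event ID is 12 passes, while a state whose next event ID is 8 and has no matching running task fails and would be treated as stale.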
	return nil
}

func BypassMutableStateConsistencyPredicate(
I see this is used in most places, including RespondActivityTaskCompleted, where the action of updateWorkflowActionFunc still relies on scheduleID >= mutableState.GetNextEventID() to detect a stale cache. Do we still need those? I think the intention is to rely on the vclock to detect staleness in those cases?
> I see this is used in most places, including RespondActivityTaskCompleted, where the action of updateWorkflowActionFunc still relies on scheduleID >= mutableState.GetNextEventID() to detect a stale cache. Do we still need those?
Currently only respond workflow task completed uses a customized predicate; I will probably update the other places after this huge PR.
> I think the intention is to rely on the vclock to detect staleness in those cases?
Yes, but there will be a transition period for forward/backward compatibility.
There is one exception: respond activity by ID. This set of APIs will probably continue to use the existing checks.
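To make the predicate discussion concrete, here is a rough sketch of a loader that takes a consistency predicate and reloads once when the cached state fails it. All names (`consistencyPredicate`, `bypassPredicate`, `stateLoader`, `snapshot`, `errStaleState`) are hypothetical, and the reload-once behavior is an assumption for illustration. The shard clock and ownership checks from the PR are not modeled.

```go
package main

import "errors"

// mutableState is a hypothetical, minimal view of workflow mutable state.
type mutableState interface {
	GetNextEventID() int64
}

// consistencyPredicate reports whether a loaded state is fresh enough to use.
type consistencyPredicate func(ms mutableState) bool

// bypassPredicate accepts any state; callers using it rely on some other
// mechanism (not modeled here) to detect staleness.
func bypassPredicate(mutableState) bool { return true }

var errStaleState = errors.New("mutable state is stale after reload")

// stateLoader pairs a cached state with a function that reads a fresh copy.
type stateLoader struct {
	cached  mutableState
	load    func() mutableState // assumed to read from persistence
	reloads int
}

// get returns the cached state if it satisfies the predicate. Otherwise it
// reloads once and re-checks, failing if the fresh copy is still rejected.
func (l *stateLoader) get(isConsistent consistencyPredicate) (mutableState, error) {
	if l.cached != nil && isConsistent(l.cached) {
		return l.cached, nil
	}
	l.cached = l.load()
	l.reloads++
	if !isConsistent(l.cached) {
		return nil, errStaleState
	}
	return l.cached, nil
}

// snapshot is a trivial mutableState used to exercise the loader.
type snapshot struct{ nextEventID int64 }

func (s snapshot) GetNextEventID() int64 { return s.nextEventID }
```

A caller with a customized predicate, such as `scheduleID < ms.GetNextEventID()`, forces a reload when the cache lags behind; a caller passing `bypassPredicate` always gets the cached copy and must detect staleness some other way.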
* Utilize shard clock or shard ownership API for consistency check
  * Workflow task:
    * Start
    * Completion
    * Failure
  * Activity task:
    * Start
    * Heartbeat
    * Completion
    * Failure
    * Cancellation
  * Child workflow:
    * Start
    * Completion
* Utilize shard ownership API for consistency check
  * Delete workflow
  * Terminate workflow
  * RequestCancel workflow
  * SignalWorkflowExecution
  * RemoveSignalMutableState
  * ResetWorkflowExecution
  * RefreshWorkflowTasks
This reverts commit cf9e3c8.
What changed?
Why?
see #2743
How did you test it?
Updated tests
Potential risks
N/A
Is hotfix candidate?
N/A