-
Notifications
You must be signed in to change notification settings - Fork 152
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't return update handles until desired stage reached #2066
Changes from 5 commits
f03eeb9
8085c83
d085d54
95dc2f3
ca23272
e66446c
6e9cb0c
13c3bb9
0a7499e
17a0460
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -23,7 +23,19 @@ | |
import io.temporal.api.enums.v1.UpdateWorkflowExecutionLifecycleStage; | ||
|
||
public enum UpdateWaitPolicy { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hmm in python looks like we changed the name to There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 👍 And the docs here about what each enum means only apply to starting an update and maybe should move to there (but maybe not). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. K, I'm down to change the name There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Docs wise, weirdly, even the proto APIs mention nothing about what the stages mean beyond as input to requests. That would be good to fix. |
||
/** Update request waits for the update to be accepted by the workflow */ | ||
/** | ||
* Update request waits for the update to be until the update request has been admitted by the | ||
* server - it may be the case that due to a considerations like load or resource limits that an | ||
* update is made to wait before the server will indicate that it has been received and will be | ||
* processed. This value does not wait for any sort of acknowledgement from a worker. | ||
*/ | ||
ADMITTED( | ||
UpdateWorkflowExecutionLifecycleStage.UPDATE_WORKFLOW_EXECUTION_LIFECYCLE_STAGE_ADMITTED), | ||
|
||
/** | ||
* Update request waits for the update to be accepted (and validated, if there is a validator) by | ||
* the workflow | ||
*/ | ||
ACCEPTED( | ||
UpdateWorkflowExecutionLifecycleStage.UPDATE_WORKFLOW_EXECUTION_LIFECYCLE_STAGE_ACCEPTED), | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -334,8 +334,17 @@ public <R> StartUpdateOutput<R> startUpdate(StartUpdateInput<R> input) { | |
.setRequest(request) | ||
.build(); | ||
Deadline pollTimeoutDeadline = Deadline.after(POLL_UPDATE_TIMEOUT_S, TimeUnit.SECONDS); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Shouldn't the deadline be in the loop? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Arguably it doesn't need to be set at all There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I unset this in the most recent commit - but, I'm not sure having a super long timeout by default is what we want to do? OTOH I don't have a firm reason why not I suppose. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I have no strong opinion so long as it's always longer than server's by enough to let server return an empty response on its timeout There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We should treat start update as a long poll, hence the long timeout There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Moved |
||
UpdateWorkflowExecutionResponse result = | ||
genericClient.update(updateRequest, pollTimeoutDeadline); | ||
|
||
// Re-attempt the update until it is at least accepted, or passes the lifecycle stage specified | ||
// by the user. | ||
UpdateWorkflowExecutionResponse result; | ||
do { | ||
result = genericClient.update(updateRequest, pollTimeoutDeadline); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you make sure down below in There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Wait, isn't the Python one doing that when it passes There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's easy to change Java to not do this and just default to max timeout for getResult calls, but, not sure that's the right thing to do. (I committed it so we can see what I mean - works fine, but, seems like maybe not right? At minimum what python is saying the doc vs. what it does is either inconsistent, or the loop is not needed, or not the same as what I've just done here) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok, I need to update that Python doc to remove that last sentence (I fixed logic but forgot about docs). We are no longer using timeout/exceptions to drive the loop. Just need to remove the idea that deadline exceeded means something special in the start/poll loop. Let all RPC exceptions bubble out as they always would and change the code to only care about the successful result instead of the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, that's done now. All it's doing is just interpreting the failure code into the right exception type which makes sense to me. |
||
} while (result.getStage().getNumber() < input.getWaitPolicy().getLifecycleStage().getNumber() | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do we set a default for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Per @drewhoskins-temporal's latest requirements, we want wait-for-stage to be a required field for start. Also, we should call it "wait-for-stage" IMO to match Python and future SDKs (or if we don't like that term, we should call it something else and be consistent across SDKs with what it is called). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think the latest requirements for start were to, if the wait stage is COMPLETED, after ACCEPTED you switched to polling for response before returning from the start call. Can you confirm at least from the user perspective that occurs? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 👍 Sorry I missed that |
||
&& result.getStage().getNumber() | ||
< UpdateWorkflowExecutionLifecycleStage | ||
.UPDATE_WORKFLOW_EXECUTION_LIFECYCLE_STAGE_ACCEPTED | ||
.getNumber()); | ||
|
||
if (result.hasOutcome()) { | ||
switch (result.getOutcome().getValueCase()) { | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -177,16 +177,6 @@ public interface QueryableWorkflow { | |
void mySignal(String value); | ||
} | ||
|
||
@WorkflowInterface | ||
public interface SimpleWorkflowWithUpdate { | ||
Comment on lines
-180
to
-181
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This was unused |
||
|
||
@WorkflowMethod | ||
String execute(); | ||
|
||
@UpdateMethod | ||
String update(String value); | ||
} | ||
|
||
@WorkflowInterface | ||
public interface WorkflowWithUpdate { | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, i have read this a few times and I am not sure logic trying to accomplish?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since before the handle is returned from start when user says complete, there might be a result from polling already and if there is we want to use that, otherwise try it - but then we need to wipe that result in case getResult gets called again
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm this seems racy, if you have two concurrent calls isn't is possible for
pollCall=null
if two threads interleave in the right way?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes it is, I didn't think about this being called concurrently, too used to Rust.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simplified this (works better as a cache now too)