Support for "workflow chain IDs" in all APIs that are able to address the "latest" run #2691
Comments
Response from Slack: This should be pretty straightforward to do. The server already keeps track of the
Response from Slack: I agree that this might not be hard to do but I don’t think it’s the semantic we want to promulgate. I realize that it’s possible to use the existing APIs in this fashion, but (with the acknowledgement that I’m not the most experienced user from the client side), I think this pattern promotes complexity instead of discouraging it. Basically, I think that the correct model is for the user to change the workflow ID at the end of a chain. So each Workflow is exactly one chain (in the parlance of macrogreg’s question). Broadly speaking I think that we would be better served by directing our users to compose simpler semantic constructs to achieve complexity, and only adding new semantics like this “chain_start_run_id” when the win is huge. If the user follows the pattern that I’m proposing here, then he gets exactly the semantic macrogreg asked for using our existing API. Please tell me if I am missing something.
Response from Slack: I completely agree that it would be conceptually much simpler if we simply disallowed having multiple workflow chains with the same workflow id. The problem discussed in this thread would not exist. Moreover, we would not need to explain the confusing part about the workflow-id uniquely describing a running chain, but not really uniquely describing a chain in general, because there may be multiple chains with the same workflow id as long as only one of them is running. But, at some point, for some reason (probably a good one, but either way, it's a done deal now) we decided that we wanted to support a workflow-id-reuse-policy that allowed creating new workflow chains with an id that was used previously. One of our central promises is that Temporal-based software is easy to use and robust. I am not sure how frequently the workflow-id-reuse-policy is used in practice. But in combination with a very powerful feature of being able to interact with a workflow that finished in the past (query, result, ...), the workflow-id-reuse-policy does break the clean and simple "workflow id can be used to address a workflow" assumption. The desired outcome of the strategy described in this thread is to solve a problem that, when it occurs, is hard for our users to understand and diagnose. That is something we strive to avoid for our users. And, I think, the proposed solution is conceptually clean and simple with the help of an SDK that supports it. But perhaps there are other approaches to achieve this outcome that can work as well? So, overall, I agree. It is easier and cleaner to discourage reusing workflow-ids. But if we believe that that feature has value and is not a historical mistake that we wish we had not made, then we need to have some way of protecting people from the caveat of the "chain overflow" described in this thread. 😃
Response from Slack: The configured default already disallows workflow ID reuse. Personally, I think it would be sufficient to provide a warning about this issue on the page where we tell users how to change the default. There’s only so far you can go to prevent someone from shooting themselves in the foot.
Response from Slack: This is incorrect. The default ID reuse policy is "allow duplicate".
Response from Slack: The "safety" proposed is, essentially, transparent / for free to the SDK users. What would be the drawback of having it?
Response from Slack: I agree that if we disallowed duplicate workflows with the same ID, that would be much simpler. But the reality is that we allow it via the ID reuse policy, and there are many use cases that need that feature. We cannot take it away. With that, I think the proposed solution is reasonable.
Response from Slack: Fun fact: we are not taking away the ID reuse policy; we are adding more options to it: #2608 😄
Response from Slack: The ability to reuse IDs is very important for many user-facing scenarios.
We discussed this over Slack. Adding an issue here so that we can keep track. While this is not (yet) time-critical, I would like to design the .NET SDK under the assumption that this eventually gets implemented before we release production-ready versions of the SDK.
Below, I copy a slightly edited version of the Slack conversation for context and records.
x x x x x x x x x x x x
Hey folks, I am trying to understand the feasibility of the following:
Consider the public server APIs that operate on a particular Workflow Run, e.g. `QueryWorkflow`, `TerminateWorkflowExecution`, and many others. These APIs tend to take a `WorkflowExecution`, which is a tuple of (`workflow_id`, `run_id`). Also, for most (all?) such APIs the `run_id` may be omitted. In such cases the invocation will apply to the most recent run that carries the specified `workflow_id`.

This is the situation today. Please correct me if I am wrong. 😃
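To make the current addressing rule concrete, here is a minimal sketch in Python. It is not server code: `FakeServer`, `start_run` and `resolve` are hypothetical names invented for illustration, and the model only captures the rule described above (an omitted `run_id` selects the most recent run with the given `workflow_id`).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Run:
    workflow_id: str
    run_id: str


class FakeServer:
    """Hypothetical in-memory stand-in for the server; not the real implementation."""

    def __init__(self) -> None:
        self._runs: list[Run] = []  # ordered oldest to newest

    def start_run(self, workflow_id: str, run_id: str) -> None:
        self._runs.append(Run(workflow_id, run_id))

    def resolve(self, workflow_id: str, run_id: Optional[str] = None) -> Run:
        """Return the addressed run; an omitted run_id means the most recent run."""
        candidates = [r for r in self._runs if r.workflow_id == workflow_id]
        if run_id is not None:
            candidates = [r for r in candidates if r.run_id == run_id]
        if not candidates:
            raise LookupError(f"no run found for workflow_id={workflow_id!r}")
        return candidates[-1]


server = FakeServer()
server.start_run("W1", "R18")
server.start_run("W1", "R42")
print(server.resolve("W1").run_id)         # R42
print(server.resolve("W1", "R18").run_id)  # R18
```

Note that this model has no notion of chains at all, which is exactly the gap the two questions below are about.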
Now two questions (first one may have been asked before).
(1)
Could those APIs be extended such that instead of specifying the `run_id`, the user could specify the `chain_start_run_id` (meaning the `run_id` of the first, i.e. the oldest, run in the execution chain)? Then the API would apply to the most recent (i.e. the newest) run in the chain specified by the `chain_start_run_id`. If the chain finishes at some point and a new chain with the same workflow id is started, then invocations where (`run_id`, `chain_start_run_id`) is specified would not "flow". They would continue to refer to the finished chain.

The purpose of this is hopefully clear: A chain represents a workflow with one or more runs (caused by retries, continue-as-new continuations, ...). Once such a chain finishes, the workflow logically concludes. A new chain is a completely new workflow (with the same workflow id). Typically, a user who interacts with a specific workflow does not want to switch to interacting with a new workflow without noticing.
I assume that the answer to this part of the question is Yes.
There is even a corresponding PR for the API.
(There, `chain_start_run_id` is called `first_execution_run_id`, but the name does not matter at this stage. For the current discussion just the concept, not the term, is critical, so I'll temporarily stick to `chain_start_run_id` for brevity/clarity. In fact, I would love to coin the term "workflow chain id", as it is such an important concept that it deserves its own name. But, again, this terminology is not in scope here. 😃)

Either way, that PR does not really solve the issue completely. The SDKs not only need to be able to supply the `chain_start_run_id`, they also need to know it. Thus:

(2)
Now the second (related) question: Can all those APIs be extended so that their return payload includes the `chain_start_run_id` of the chain that contains the run to which the call in fact applied?

For example:
A user calls `SignalWorkflowExecution(workflowId="W1", runId=null, ...)`.
This means "send a signal to the latest (= most recent) run with the workflow id `"W1"`".

Now, the server will determine the latest run with that workflow id and deliver the signal to that particular run. (Let's assume that the `run_id` of that run was `"R42"`.)

After that, the user probably wants to continue interacting with "the workflow" that was affected by that signal. On a technical level "the workflow" is a chain of workflow runs. They likely want to continue interacting with the "latest" run only as long as the "latest" run is still a part of the same chain as the run `"R42"` was. If the chain finishes and a new run is started with the same workflow id, that run is no longer part of the same logical workflow. Then the user likely does not want to interact with that run in the same session.

How can we enable the user to avoid unwillingly "overflowing" beyond the end of the workflow chain?
We ensure that `SignalWorkflowExecution(..)` includes the `chain_start_run_id` in its return payload. Then, after starting the set of interactions as described above, the client knows the `chain_start_run_id` of the chain that contained the run with `run_id="R42"`. (Let's assume that `chain_start_run_id` was `"R18"`.) So, all subsequent invocations would include that information.

E.g., to send another signal, the user invokes
`SignalWorkflowExecution(workflowId="W1", chain_start_run_id="R18", runId=null, ...)`
which means "send signal to the latest run in the chain with `chain_start_run_id="R18"`".

If the user wanted to address the actually latest run, without restricting the call to the same workflow chain as they interacted with previously, they would simply no longer include the `chain_start_run_id`.

Problem solved. :)
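The W1 / R42 / R18 scenario can be played through with a small Python model. This is only a sketch of the proposed semantics, not an existing API: `FakeServer`, `start_chain`, `continue_chain`, `signal` and `SignalResponse` are hypothetical names, the run id `"R77"` for the later chain is made up, and the model does not track whether a chain has finished (a new chain is simply started on top of the old one).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Run:
    workflow_id: str
    run_id: str
    chain_start_run_id: str  # run_id of the first run in this run's chain


@dataclass(frozen=True)
class SignalResponse:
    run_id: str              # the run that actually received the signal
    chain_start_run_id: str  # the chain that run belongs to


class FakeServer:
    """Hypothetical in-memory model of the proposal; not the real server."""

    def __init__(self) -> None:
        self._runs: list[Run] = []  # ordered oldest to newest

    def start_chain(self, workflow_id: str, run_id: str) -> None:
        self._runs.append(Run(workflow_id, run_id, chain_start_run_id=run_id))

    def continue_chain(self, workflow_id: str, run_id: str) -> None:
        # A retry or continue-as-new: the new run inherits the chain of the latest run.
        latest = self._latest(workflow_id, None)
        self._runs.append(Run(workflow_id, run_id, latest.chain_start_run_id))

    def _latest(self, workflow_id: str, chain_start_run_id: Optional[str]) -> Run:
        for run in reversed(self._runs):
            if run.workflow_id != workflow_id:
                continue
            # No chain given: any run matches. Chain given: only runs of that chain.
            if chain_start_run_id in (None, run.chain_start_run_id):
                return run
        raise LookupError("no matching run")

    def signal(self, workflow_id: str,
               chain_start_run_id: Optional[str] = None) -> SignalResponse:
        run = self._latest(workflow_id, chain_start_run_id)
        return SignalResponse(run.run_id, run.chain_start_run_id)


server = FakeServer()
server.start_chain("W1", "R18")
server.continue_chain("W1", "R42")
first = server.signal("W1")      # unpinned: reaches R42, which is in chain R18
server.start_chain("W1", "R77")  # later, a new chain reuses the same workflow id
pinned = server.signal("W1", chain_start_run_id=first.chain_start_run_id)
unpinned = server.signal("W1")
print(first.run_id, first.chain_start_run_id)  # R42 R18
print(pinned.run_id, unpinned.run_id)          # R42 R77
```

The pinned call keeps referring to the old chain, while the unpinned call "overflows" into the new one, which is the behaviour the proposal is meant to make avoidable.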
This may sound a little complicated, but I believe once you think it through, it appears quite straightforward. And, of course, we do not actually expect users to deal with the complexity. Language SDKs will store the `chain_start_run_id` in whatever object they use to refer to a workflow and to invoke APIs on it (e.g. to send a signal to a workflow).

So, my question to the server team is: how hard / feasible is it to extend the APIs in the manner described? Is it something we can reasonably tackle?
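On the SDK side, the bookkeeping could look roughly like the following Python sketch. It assumes the server returns the `chain_start_run_id` as asked for in question (2); `WorkflowHandle`, `SignalCall` and `stub_send` are hypothetical names and do not correspond to any existing SDK type.

```python
from typing import Callable, Optional, Tuple

# Hypothetical transport: (workflow_id, chain_start_run_id) -> (run_id, chain_start_run_id).
SignalCall = Callable[[str, Optional[str]], Tuple[str, str]]


class WorkflowHandle:
    """Hypothetical SDK-side handle that pins itself to the first chain it touches."""

    def __init__(self, workflow_id: str, send_signal: SignalCall) -> None:
        self.workflow_id = workflow_id
        self.chain_start_run_id: Optional[str] = None  # unknown until the first reply
        self._send_signal = send_signal

    def signal(self) -> str:
        run_id, chain_id = self._send_signal(self.workflow_id, self.chain_start_run_id)
        if self.chain_start_run_id is None:
            self.chain_start_run_id = chain_id  # remember the chain for later calls
        return run_id


# Stub server: records each request and always answers with run R42 in chain R18.
requests = []


def stub_send(workflow_id: str, chain_start_run_id: Optional[str]) -> Tuple[str, str]:
    requests.append((workflow_id, chain_start_run_id))
    return ("R42", "R18")


handle = WorkflowHandle("W1", stub_send)
handle.signal()
handle.signal()
print(requests)  # [('W1', None), ('W1', 'R18')]
```

The first call goes out unpinned; every later call through the same handle carries the chain id learned from the first reply, so the user never has to handle it explicitly.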
Thank you!
x x x x x x x x x x x x
Below is a minimally edited record of the Slack conversation about this topic between a few people.