This issue replaces or concludes a number of previously expressed ideas, aiming to reduce complexity and make it more obvious where we stand right now. Replaced are:
The present concept is to think of a recomputation as a three-step process, where each step can be represented as a node in a CWL workflow:
Provision: Establish the environment required for a computation
Compute: Run a computation
Extract: Pick relevant outputs and present them as outputs of the computation in a particular context
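To make the three-step split concrete, here is a minimal Python sketch of how the steps could compose. It is not a CWL workflow and not an existing datalad API: the `Recomputation` class, the type aliases, and the word-count example are all hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical types for illustration; none of these names come from datalad or CWL.
Environment = Dict[str, str]  # provisioned inputs: name -> content
Outputs = Dict[str, str]      # produced outputs: name -> content


@dataclass(frozen=True)
class Recomputation:
    """Chains the three steps; each step is an independently swappable callable."""
    provision: Callable[[], Environment]
    compute: Callable[[Environment], Outputs]
    extract: Callable[[Outputs], Outputs]

    def run(self) -> Outputs:
        env = self.provision()    # Provision: establish the environment
        raw = self.compute(env)   # Compute: run the computation
        return self.extract(raw)  # Extract: pick the relevant outputs


def provision_example() -> Environment:
    # Stand-in for checking out a dataset version.
    return {"input.txt": "a b c"}


def compute_wordcount(env: Environment) -> Outputs:
    # Produces more than the caller may want (a result and a log).
    words = env["input.txt"].split()
    return {"count.txt": str(len(words)), "log.txt": "done"}


def extract_count(raw: Outputs) -> Outputs:
    # Keeps only the output that matters in this particular context.
    return {"count.txt": raw["count.txt"]}


result = Recomputation(provision_example, compute_wordcount, extract_count).run()
```

Because each step is a separate value, the same `provision_example` could feed a different compute step, and the same `compute_wordcount` output could be passed through a different extract filter.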
Each step needs critical information that must be stored and supplied. All steps also have different scopes:
Provision: the exact same parameterization can yield suitable inputs for more than one computation
Compute: the exact same compute specification can be combined with a broad range of inputs and yield different outputs
Extract: One and the same compute output can be filtered in many ways to yield desired outputs in a particular context
The steps also differ in whether their values are fixed or variable for a particular recomputation:
Provision: exact for reproducing (I want the same) vs. variable for reevaluation (I want to see how different it is, e.g. datalad rerun --onto)
Compute: recompute exactly vs. recompute the same specification with a new version of the tool
Extract: output filters may need to be adjusted, mostly together with a change in the compute specification or implementation, to continue to deliver the same output (e.g., after a name/location change)
Taken together, these requirements determine where and how the parameters of all three steps can be stored and, importantly, how they need to be referenced. In general, this means that we want to be able to identify every parameter set in two ways simultaneously: by precise version (exact parameters) and by concept (or latest version).
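One way to picture the dual addressing is a small in-memory store that keys each parameter set by a content hash (precise version) and also tracks the history of versions per concept name (latest version). This is only a sketch under stated assumptions: `ParameterStore`, its methods, and the `compute/wordcount` concept name are hypothetical, and an actual implementation would more likely build on Git commits and branches than on a Python dictionary.

```python
import hashlib
import json
from typing import Any, Dict, List


class ParameterStore:
    """Hypothetical store: look up parameter sets by exact version or by concept."""

    def __init__(self) -> None:
        self._by_version: Dict[str, Dict[str, Any]] = {}
        self._history: Dict[str, List[str]] = {}

    def add(self, concept: str, params: Dict[str, Any]) -> str:
        # Canonical JSON so that equal parameters always hash to the same version.
        canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
        version = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
        self._by_version[version] = json.loads(canonical)
        self._history.setdefault(concept, []).append(version)
        return version

    def by_version(self, version: str) -> Dict[str, Any]:
        # Precise reference: exactly these parameters, for reproduction.
        return self._by_version[version]

    def latest(self, concept: str) -> Dict[str, Any]:
        # Conceptual reference: whatever was most recently recorded for this concept.
        return self._by_version[self._history[concept][-1]]


store = ParameterStore()
v1 = store.add("compute/wordcount", {"tool": "wc", "tool_version": "1.0"})
v2 = store.add("compute/wordcount", {"tool": "wc", "tool_version": "2.0"})

exact = store.by_version(v1)                 # reproduce: the same parameters as before
current = store.latest("compute/wordcount")  # reevaluate: the newest parameters
```

Referencing by `v1` pins the recomputation to the original parameters, while referencing by concept name follows updates such as a new tool version.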
TODO:
anticipatory walkthrough for the use case "recompute git-annex key"