This is about the second half of #10 -- a datalad-based data sink or data capture helper. See #12 for the other half.
Purpose
Accept data (files) from some source/location and inject them into a datalad dataset (as a new commit, on some branch, under some file name(s)), and optionally push the dataset modification to a remote or service that accepts a (serialized) dataset update.
Target use cases
Point to a (local) repository clone, and capture any (subset of the) modifications in it
Obtain a dataset from some datalad-compatible source, populate it with content from some other location (e.g. workflow output) under some given name(s), and optionally push it back to the remote
Provenance capture
It would be useful to be able to ingest provenance information on the dataset modifications:
where is the content coming from
what created it (e.g., workflow execution)
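One way to picture the two provenance items above is a small record that travels with the commit. The following is only a sketch: the `ProvenanceRecord` class, its field names, the marker line, and the idea of embedding the record in the commit message are all assumptions made for illustration, not something this proposal has settled on.

```python
import json
from dataclasses import asdict, dataclass

# Hypothetical marker separating the human-readable summary from the record.
MARKER = "=== provenance ==="


@dataclass(frozen=True)
class ProvenanceRecord:
    """Hypothetical container for the two items listed above."""
    source: str   # where the content is coming from
    creator: str  # what created it (e.g., a workflow execution)


def render_commit_message(summary: str, record: ProvenanceRecord) -> str:
    """Append the provenance record as JSON below the commit summary."""
    payload = json.dumps(asdict(record), indent=1, sort_keys=True)
    return f"{summary}\n\n{MARKER}\n{payload}\n"


def parse_provenance(message: str) -> ProvenanceRecord:
    """Recover the record from a message written by render_commit_message()."""
    _, _, payload = message.partition(MARKER + "\n")
    return ProvenanceRecord(**json.loads(payload))
```

Keeping the record machine-readable would let a later tool recover it from the history, whichever storage location is eventually chosen.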
API
(1) dataset to add changes to (URL, identifier, similar to remake-provision)
(2) branch to commit modifications to
(3) some mode switch to define the nature of the modification
incremental: add new, replace existing
explicit: the incoming content is the sole content for the new version; all previous content that is absent from it is removed
(4) options to declare where/how to deposit the update at a remote
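The difference between the two modes in (3) can be sketched as a pure function over file names. This is a minimal illustration and not an implementation: the function name `plan_capture` and the returned keys are hypothetical, and a real command would work on the dataset's worktree rather than on plain sets.

```python
def plan_capture(existing, incoming, mode):
    """Decide what happens to each file name under the given mode.

    existing: file names present in the previous dataset version
    incoming: file names delivered to the capture helper
    mode:     "incremental" or "explicit" (as described in (3))
    """
    existing, incoming = set(existing), set(incoming)
    if mode == "incremental":
        # Add new, replace existing; previous content is never removed.
        removed = set()
    elif mode == "explicit":
        # Incoming content is the sole content of the new version.
        removed = existing - incoming
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return {
        "add": sorted(incoming - existing),
        "replace": sorted(incoming & existing),
        "remove": sorted(removed),
    }
```

For example, with `a.txt` and `b.txt` already in the dataset and `b.txt` and `c.txt` incoming, both modes add `c.txt` and replace `b.txt`, but only the explicit mode removes `a.txt`.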
remake-capture will likely use remake-provision whenever it is not operating on an already existing local repository, so (1) and (2) would need to be aligned between the two commands.
(4) is included so that remake-provision and remake-capture are the only two datalad "nodes" needed to make an arbitrary workflow system datalad-compatible. Of course, (4) could also be a dedicated execution of datalad push. That is a different trade-off and subject to further discussion.
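To make the four parameters concrete, here is a rough sketch of what a command-line surface could look like. Every option name below (`--branch`, `--mode`, `--push-to`) is a placeholder invented for this example; nothing here is an agreed interface. Leaving `--push-to` unset corresponds to the alternative of running datalad push as a separate step.

```python
import argparse


def build_parser():
    """Hypothetical argument parser for remake-capture; names are placeholders."""
    parser = argparse.ArgumentParser(prog="remake-capture")
    # (1) dataset to add changes to (URL or other identifier)
    parser.add_argument("dataset", help="dataset URL or identifier")
    # (2) branch to commit modifications to
    parser.add_argument("--branch", default=None,
                        help="branch to commit to (default: current branch)")
    # (3) mode switch defining the nature of the modification
    parser.add_argument("--mode", choices=("incremental", "explicit"),
                        default="incremental")
    # (4) where to deposit the update; omitted means no push is attempted
    parser.add_argument("--push-to", default=None,
                        help="remote to push the update to")
    return parser
```

If (1) and (2) are to be aligned with remake-provision, the same two arguments would presumably be declared identically there.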