Export Histories into Research Objects #3088

Closed
Tracked by #12399
bgruening opened this issue Oct 25, 2016 · 9 comments

Comments

@bgruening
Member

bgruening commented Oct 25, 2016

@dannon and I talked yesterday about a better and more structured way of exporting and reusing Galaxy histories. Exporting a history as a Research Object (http://www.researchobject.org) might be a good solution.

@lparsons
Contributor

lparsons commented Nov 2, 2016

Many of my users would also love a way to export a history without the data (sort of like a workflow). In a way, "Create workflow from history" kind of does this, but it's often a bit broken and doesn't save the names/etc. Basically, having a way to retain history items, but remove the underlying dataset, would be very useful for archival purposes. One could retain the input data and some final output datasets, but remove the unnecessary intermediate files without losing the history of what was done.
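The selection rule described above (keep the inputs and the final outputs, drop the intermediates) can be sketched roughly as follows. This is only an illustration: `HistoryItem` and `datasets_to_keep` are hypothetical names, not Galaxy's data model or API, and the sketch assumes each history item records which other items it was computed from.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class HistoryItem:
    """Hypothetical stand-in for a history item; not Galaxy's actual model."""
    name: str
    inputs: List[str] = field(default_factory=list)  # names of items this one was derived from


def datasets_to_keep(items: List[HistoryItem]) -> Set[str]:
    """Return the names of items whose underlying data should be retained.

    An item is kept if it is an input (derived from nothing) or a final
    output (no other item was derived from it). Everything else is an
    intermediate whose data could be dropped while its history entry stays.
    """
    consumed = {name for item in items for name in item.inputs}
    keep = set()
    for item in items:
        is_input = not item.inputs
        is_final = item.name not in consumed
        if is_input or is_final:
            keep.add(item.name)
    return keep


history = [
    HistoryItem("reads.fastq"),
    HistoryItem("trimmed.fastq", inputs=["reads.fastq"]),
    HistoryItem("aligned.bam", inputs=["trimmed.fastq"]),
    HistoryItem("counts.tabular", inputs=["aligned.bam"]),
]

kept = datasets_to_keep(history)
dropped = sorted(item.name for item in history if item.name not in kept)
print(sorted(kept))  # data retained
print(dropped)       # data removed, history entry retained
```

In this example only `reads.fastq` and `counts.tabular` would keep their data; the two intermediate items would remain in the history as records of what was done.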

@nsoranzo
Member

Export of histories as BagIt bags was implemented in #7367, but it was just a first step, as explained in #4345 (comment).

@HadleyKing
Contributor

HadleyKing commented Jul 25, 2019

This is something we are looking at with BioCompute, using the Galaxy history AND Galaxy workflow JSON outputs. I have been searching but have not been able to find specific documentation about how the key:value pairs are generated and what they mean, for either of those objects. Does this exist?

@stain

stain commented Jun 11, 2020

Now it would make sense as a BagIt to just add a ro-crate-metadata.json file according to RO-Crate - this could at least describe the workflow JSON and the derivation.

Adding BioCompute IEEE 2791 would also make sense as I don't think it has a packaging at the moment.

Further work could look at the history provenance and model it according to CWLProv with timestamps etc - but that would be more detailed and kept in a separate PROV document - we would have to decide which flavour, e.g. PROV-XML vs PROV-JSON vs JSON-LD vs Turtle.
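As a rough illustration of the first suggestion, a minimal `ro-crate-metadata.json` describing a workflow file could be generated along these lines. The structure follows the RO-Crate 1.1 layout (a metadata descriptor, a root dataset, and one entity per file); the function name, the file names, and the history name are assumptions made up for this sketch, and it does not attempt to model the derivation or any provenance.

```python
import json

RO_CRATE_CONTEXT = "https://w3id.org/ro/crate/1.1/context"
RO_CRATE_PROFILE = "https://w3id.org/ro/crate/1.1"


def build_ro_crate_metadata(history_name, workflow_file, dataset_files):
    """Return a minimal RO-Crate 1.1 metadata document as a dict.

    All arguments are illustrative: a display name for the exported
    history, the relative path of the workflow JSON, and the relative
    paths of any exported datasets.
    """
    parts = [workflow_file, *dataset_files]
    graph = [
        # Metadata descriptor: identifies this file and points at the root dataset.
        {
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": RO_CRATE_PROFILE},
            "about": {"@id": "./"},
        },
        # Root dataset: the crate itself, listing everything it contains.
        {
            "@id": "./",
            "@type": "Dataset",
            "name": history_name,
            "hasPart": [{"@id": path} for path in parts],
        },
        # The workflow JSON, typed as a computational workflow.
        {
            "@id": workflow_file,
            "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"],
            "name": "Galaxy workflow",
        },
    ]
    # Plain file entities for the exported datasets.
    graph.extend({"@id": path, "@type": "File"} for path in dataset_files)
    return {"@context": RO_CRATE_CONTEXT, "@graph": graph}


metadata = build_ro_crate_metadata(
    "Example history",            # hypothetical history name
    "workflow.json",              # hypothetical workflow file name
    ["datasets/output.tabular"],  # hypothetical dataset path
)
print(json.dumps(metadata, indent=2))
```

Where exactly this file should sit inside the bag, and how history provenance would be linked from it, are left open here.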

@stain

stain commented Jun 11, 2020

See also #9077. We suggested this as a topic for the BCC CoFest - hoping @HadleyKing @nsoranzo et al will join!

@HadleyKing
Contributor

> Now it would make sense as a BagIt to just add a ro-crate-metadata.json file according to RO-Crate - this could at least describe the workflow JSON and the derivation.
>
> Adding BioCompute IEEE 2791 would also make sense as I don't think it has a packaging at the moment.
>
> Further work could look at the history provenance and model it according to CWLProv with timestamps etc - but that would be more detailed and kept in a separate PROV document - we would have to decide which flavour, e.g. PROV-XML vs PROV-JSON vs JSON-LD vs Turtle.

Right now #9077 creates a BioCompute IEEE 2791 compliant JSON via the API, mostly from the workflow invocation. There is a download feature, and I am finishing up support for some basic editing and modification via a UI, all of which @nsoranzo and I will be presenting as a lightning talk, demo, and poster at BCC2020.

Looking through all of the material (RO, CWL) you have linked to, @stain, I see many correlations as well as very similar edits to the same places in the Galaxy code. We should definitely see where we can overlap our existing and future efforts. Thanks for the invite, I will be there!

@bgruening
Member Author

bgruening commented Jun 3, 2022

I just wanted to make us all aware of #13920, which has a lot of the stuff in it that we need to get this ticket closed.

@HadleyKing I guess it would be nice to use this infrastructure as well for BCO.

@davelopez @ieguinoa I guess we should coordinate with @jmchilton on how to proceed and how we can make RO imports and exports rock-solid in 22.09.

@HadleyKing
Contributor

@syntheticgio @skeeney01 @kee007ney
This could also tie in nicely with the BCO class definition implementation. There is a lot to unpack here, but this could be a good use case to start with.

@mvdbeek
Member

mvdbeek commented Jul 21, 2023

Implemented in #14595

mvdbeek closed this as completed Jul 21, 2023