Workflow RO-Crate 1.0 suggestions #22

Merged
merged 25 commits into from
Nov 18, 2021


Member

@stain stain commented Nov 12, 2021

Content negotiation would be via https://w3id.org/workflowhub/workflow-ro-crate/1.0 (see https://github.com/perma-id/w3id.org/tree/master/workflowhub) which seems to work:

(base) stain@xena:~/src/about/Workflow-RO-Crate/1.0-DRAFT$ curl -H "Accept: application/ld+json;profile=https://w3id.org/ro/crate" https://w3id.org/workflowhub/workflow-ro-crate/1.0
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>302 Found</title>
</head><body>
<h1>Found</h1>
<p>The document has moved <a href="https://about.workflowhub.eu/Workflow-RO-Crate/1.0/ro-crate-metadata.json">here</a>.</p>
<hr>
<address>Apache/2.4.29 (Ubuntu) Server at w3id.org Port 443</address>
</body></html>
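
The HTML body above is only the 302 response that curl prints when it is not asked to follow redirects. A minimal Python sketch of the same request, using only the standard library, might look like this. The helper names `build_request` and `resolve` are illustrative and not part of any WorkflowHub tooling:

```python
import urllib.request

PROFILE_PID = "https://w3id.org/workflowhub/workflow-ro-crate/1.0"
ACCEPT = "application/ld+json;profile=https://w3id.org/ro/crate"


def build_request(url=PROFILE_PID, accept=ACCEPT):
    """Build a GET request that asks for the RO-Crate JSON-LD representation."""
    return urllib.request.Request(url, headers={"Accept": accept})


def resolve(url=PROFILE_PID, accept=ACCEPT):
    """Follow redirects (urlopen does this by default) and return the final URL."""
    with urllib.request.urlopen(build_request(url, accept)) as response:
        return response.geturl()
```

Calling `resolve()` needs network access, and what it returns depends on the redirect rules currently deployed at w3id.org.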

Now I think we may need to discuss several of these items before releasing this.

You can preview this at https://github.com/workflowhub-eu/about/blob/workflow-ro-crate-1.0/Workflow-RO-Crate/1.0-DRAFT/index.md

stain added 15 commits November 12, 2021 11:30
Also CWL 1.2 has abstract CWL which we want
I set this as a SHOULD to not force the FormalParameters - not yet supported in WorkflowHub.
I don't think we should be picky about the filename,
as anyone consuming a zip can simply look for ro-crate-metadata.json inside
and thus determine it is an RO-Crate.

I think it is true you want the ro-crate-metadata.json straight in the root, and
do not like somefolder/ro-crate-metadata.json?

.jsonld -> .json
moved signposting to the profile
moved signposting to the profile
@stain
Member Author

stain commented Nov 12, 2021

assuming authors are Finn, Alan, Stian
"@type": [
"File",
"SoftwareSourceCode",
"HowTo"
Contributor


Shouldn't this be ComputationalWorkflow? Since this file is the actual workflow, not the abstract CWL.

Member Author


You are right, this example should perhaps be amended to have both a ComputationalWorkflow (which could be Nextflow) and an abstract CWL HowTo.

@simleo
Contributor

simleo commented Nov 18, 2021

@stain should we have explicit types for the workflow languages (instead of #cwl, #galaxy, etc.), like you suggested for Workflow Testing RO-Crate test service / engine types? I remember starting with things like #jenkins and #planemo, trying to do the same as Workflow RO-Crate, but then we changed to https://w3id.org/ro/terms/test#JenkinsService and https://w3id.org/ro/terms/test#PlanemoEngine after your suggestion. Since Workflow Testing RO-Crate is derived from Workflow RO-Crate I think they should be consistent in this respect.

@simleo
Contributor

simleo commented Nov 18, 2021

Related to the above: since https://w3id.org/ro/terms/test#PlanemoEngine should be a well-defined entity we ended up putting the version in the referencing entity instead:

        {
            "@id": "test/test1/sort-and-change-case-test.yml",
            "@type": [
                "File",
                "TestDefinition"
            ],
            "conformsTo": {"@id": "https://w3id.org/ro/terms/test#PlanemoEngine"},
            "engineVersion": ">=0.70"
        },
        {
            "@id": "https://w3id.org/ro/terms/test#PlanemoEngine",
            "@type": "SoftwareApplication",
            "name": "Planemo",
            "url": {"@id": "https://github.com/galaxyproject/planemo"}
        }
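
To illustrate how a consumer could read this layout, here is a small sketch that pairs each TestDefinition with the engine it points to and with the version constraint held on the referencing entity. The function name `engine_requirements` is hypothetical; the entities are the ones from the snippet above.

```python
PLANEMO = "https://w3id.org/ro/terms/test#PlanemoEngine"

graph = [
    {
        "@id": "test/test1/sort-and-change-case-test.yml",
        "@type": ["File", "TestDefinition"],
        "conformsTo": {"@id": PLANEMO},
        "engineVersion": ">=0.70",
    },
    {
        "@id": PLANEMO,
        "@type": "SoftwareApplication",
        "name": "Planemo",
        "url": {"@id": "https://github.com/galaxyproject/planemo"},
    },
]


def engine_requirements(graph):
    """Map each TestDefinition @id to (engine name, engineVersion)."""
    by_id = {entity["@id"]: entity for entity in graph}
    result = {}
    for entity in graph:
        types = entity.get("@type", [])
        if isinstance(types, str):
            types = [types]
        if "TestDefinition" not in types:
            continue
        # The engine entity is shared; the version lives on the test definition.
        engine = by_id.get(entity.get("conformsTo", {}).get("@id"), {})
        result[entity["@id"]] = (engine.get("name"), entity.get("engineVersion"))
    return result
```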

The use case I had in mind was multiple tests needing different engine versions, since this would not be valid:

        {
            "@id": "https://w3id.org/ro/terms/test#PlanemoEngine",
            "@type": "SoftwareApplication",
            "name": "Planemo",
            "url": {"@id": "https://github.com/galaxyproject/planemo"},
            "version": "0.70"
        },
        {
            "@id": "https://w3id.org/ro/terms/test#PlanemoEngine",
            "@type": "SoftwareApplication",
            "name": "Planemo",
            "url": {"@id": "https://github.com/galaxyproject/planemo"},
            "version": "0.71"
        }

I think the same would happen in the Workflow RO-Crate case. What if there's a secondary workflow that's written in the same language as the main workflow, but in a different version? You can't repeat the same workflow language entity with two different versions, can you?
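
The clash described here can be detected mechanically. A minimal sketch, assuming the `@graph` has already been loaded as a list of dicts; `conflicting_ids` is an illustrative helper, not an existing RO-Crate API:

```python
from collections import defaultdict

PLANEMO = "https://w3id.org/ro/terms/test#PlanemoEngine"

invalid_graph = [
    {"@id": PLANEMO, "@type": "SoftwareApplication", "name": "Planemo", "version": "0.70"},
    {"@id": PLANEMO, "@type": "SoftwareApplication", "name": "Planemo", "version": "0.71"},
]


def conflicting_ids(graph):
    """Return {@id: [entities]} for every @id that is described more than once."""
    seen = defaultdict(list)
    for entity in graph:
        seen[entity["@id"]].append(entity)
    return {eid: entities for eid, entities in seen.items() if len(entities) > 1}
```

Running `conflicting_ids(invalid_graph)` reports the Planemo entity with both of its competing descriptions, which is the situation a second workflow in a different language version would also create.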

@stain
Member Author

stain commented Nov 18, 2021

Agreed, @simleo - rather than everyone defining their own new #cwl etc. they can now use the PID from this profile.

However, I don't want to use #cwl inside the profile itself, as that would expand to the versioned URL of the profile; the language identifiers should survive across profile versions.

So I will change the languages to be under the https://w3id.org/workflowhub/workflow-ro-crate# namespace which would resolve to the current profile crate.
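
A small sketch of why the relative form is a problem: a fragment-only identifier is resolved against the document it appears in, so it inherits that document's version. The base URL used here is an assumption (the location the 1.0 PID redirected to earlier in this thread), and both helper names are illustrative.

```python
from urllib.parse import urljoin

# Assumed base: where the versioned profile crate's metadata file is served.
VERSIONED_BASE = (
    "https://about.workflowhub.eu/Workflow-RO-Crate/1.0/ro-crate-metadata.json"
)
# Versionless namespace proposed in the comment above.
STABLE_NAMESPACE = "https://w3id.org/workflowhub/workflow-ro-crate#"


def local_id(fragment, base=VERSIONED_BASE):
    """Resolve a relative '#fragment' identifier against the crate's base URL."""
    return urljoin(base, "#" + fragment)


def stable_id(fragment):
    """Build the identifier under the versionless namespace."""
    return STABLE_NAMESPACE + fragment
```

Under these assumptions `local_id("cwl")` changes whenever the profile is published at a new versioned location, while `stable_id("cwl")` stays the same.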

I'm not sure what the implications of this are for https://github.com/seek4science/seek/blob/master/config/default_data/workflow_classes.yml and the corresponding [match_from_metadata algorithm](https://github.com/seek4science/seek/blob/workflowhub/app/models/workflow_class.rb#L43).

For now I think the languages should still also be explicit in the crate, which means the algorithm would still be able to match on identifier etc. Longer term, perhaps the Ruby code would have the Workflow Crate loaded for dereferencing licenses and languages.

@stain
Member Author

stain commented Nov 18, 2021

On the problem with versions, I think you would need a second contextual entity. It is then harder to mint an @id, as most workflow systems are not good at defining PIDs for their releases, or even at having a page about their language syntax.

Now this profile itself does not say you need a version on the programmingLanguage, but https://www.researchobject.org/ro-crate/1.1/workflows.html#workflow-runtime-and-programming-language does.

Ideally some kind of prov:specializationOf the versionless PID? We've not done that kind of versioning of entities or crates in RO-Crate yet. See https://practicalprovenance.wordpress.com/2016/05/07/tracking-versions-with-pav/ for possibilities.

I tried for instance to improve on the URL for Knime, since https://www.knime.com/ is a company and not a programming language - but they don't have a page about the .knwf format.

@stain
Member Author

stain commented Nov 18, 2021

I also fixed the odd nesting in the old examples, which had @id on identifier and url. As these are defined to have the data type URL, they should be flat strings (the identifier is the string itself, not whatever typed object you point at).

But I think that algorithm currently only tries to go into @id of a nested object.
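
Until that algorithm is updated, a consumer could accept both shapes. This is a hedged sketch in Python rather than the actual SEEK Ruby code; the helper names and the sample entity values are illustrative only.

```python
def as_url_string(value):
    """Return a URL string whether `value` is flat or nested under @id."""
    if isinstance(value, dict):
        return value.get("@id")
    return value


def normalise_language(entity):
    """Copy a language entity with `identifier` and `url` as flat strings."""
    flat = dict(entity)
    for key in ("identifier", "url"):
        if key in flat:
            flat[key] = as_url_string(flat[key])
    return flat


# Illustrative entity mixing the old nested style and the new flat style.
old_style = {
    "@id": "#cwl",
    "@type": "ComputerLanguage",
    "identifier": {"@id": "https://w3id.org/cwl/"},
    "url": "https://www.commonwl.org/",
}
```

`normalise_language(old_style)` leaves the flat `url` untouched and unwraps the nested `identifier`, so matching code only ever has to compare strings.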

@stain stain merged commit 3cf2d15 into master Nov 18, 2021
@stain stain deleted the workflow-ro-crate-1.0 branch November 18, 2021 12:05
@stain
Member Author

stain commented Nov 18, 2021

https://w3id.org/workflowhub/workflow-ro-crate/1.0 now live.

RO-Crates previews:

In the Profile Crate I changed to use http://schema.org/Guide as discussed at the hackathon. I'm not sure about the PID entity, which is needed so as not to violate the "MUST end in /" requirement of https://www.researchobject.org/ro-crate/1.1/root-data-entity.html#direct-properties-of-the-root-data-entity, which also does not permit an additional @type. To be fixed in RO-Crate 1.2.
