adding location and command for the various forms of components #7

Open
pennyl67 opened this issue Jun 18, 2017 · 8 comments

Comments

@pennyl67

pennyl67 commented Jun 18, 2017

In the demo experiment, we have been using the resourceIdentifier for the location and/or command used to run components and workflows, but this is not safe.

The proposal is therefore to use "componentLoc" for this purpose, with the following changes:

  • change cardinality to "unbounded"
  • move "command" (optional) under it
  • change distributionURL to distributionLocation and make it xs:string.
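The three changes above could be sketched in XSD roughly as follows. This is only an illustration, not the actual schema: the namespace URI, the `componentLocType` type name, and the `componentInfo` parent element are hypothetical, and `componentDistributionForm` is typed as a plain string here as a stand-in for whatever controlled vocabulary the real schema uses.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: namespace URI, type name and parent element are hypothetical -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:ms="http://example.org/ms"
           targetNamespace="http://example.org/ms"
           elementFormDefault="qualified">

  <xs:complexType name="componentLocType">
    <xs:sequence>
      <!-- xs:string is a placeholder for the real list of distribution forms -->
      <xs:element name="componentDistributionForm" type="xs:string"/>
      <!-- renamed from distributionURL and relaxed to xs:string -->
      <xs:element name="distributionLocation" type="xs:string"/>
      <!-- command moved under componentLoc and kept optional -->
      <xs:element name="command" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>

  <!-- hypothetical parent, shown only to carry the cardinality change -->
  <xs:element name="componentInfo">
    <xs:complexType>
      <xs:sequence>
        <!-- unbounded: one entry per distribution form of the same component -->
        <xs:element name="componentLoc" type="ms:componentLocType"
                    maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Typing the location as `xs:string` is what lets non-URL values such as Maven coordinates or Docker image names fit in the same element.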

In this way we can handle the following examples:
a) components at GitHub:

<ms:componentLoc>
   <ms:componentDistributionForm>sourceCode</ms:componentDistributionForm>
   <ms:distributionLocation>https://github.com/dkpro/dkpro-core/tree/master</ms:distributionLocation>
</ms:componentLoc>

b) components at Maven

<ms:componentLoc>
   <ms:componentDistributionForm>sourceAndExecutableCode</ms:componentDistributionForm>
   <ms:distributionLocation>mvn:de.tudarmstadt.ukp.dkpro.core-gpl:de.tudarmstadt.ukp.dkpro.core.stanfordnlp-gpl:1.8.0</ms:distributionLocation>
   <ms:command>de.tudarmstadt.ukp.dkpro.core.stanfordnlp.StanfordNamedEntityRecognizer</ms:command>
</ms:componentLoc>

c) docker images

<ms:componentLoc>
   <ms:componentDistributionForm>dockerImage</ms:componentDistributionForm>
   <ms:distributionLocation>lappsgrid/galaxy</ms:distributionLocation>
</ms:componentLoc>

d) components dockerised and in galaxy

<ms:componentLoc>
   <ms:componentDistributionForm>dockerImage</ms:componentDistributionForm>
   <ms:distributionLocation>docker id for funding mining</ms:distributionLocation>
   <ms:command>Foufoulas command </ms:command>
</ms:componentLoc>
<ms:componentLoc>
   <ms:componentDistributionForm>wrappedInGalaxy</ms:componentDistributionForm>
   <ms:distributionLocation>services.openminted.eu</ms:distributionLocation>
   <ms:command>funding-mining</ms:command>
</ms:componentLoc>

f) web services, their dockerised images and wrapped in Galaxy

<ms:componentLoc>
   <ms:componentDistributionForm>webService</ms:componentDistributionForm>
   <ms:distributionLocation>nlp.ilsp.gr</ms:distributionLocation>
   <ms:command>tagger</ms:command>
</ms:componentLoc>
<ms:componentLoc>
   <ms:componentDistributionForm>dockerImage</ms:componentDistributionForm>
   <ms:distributionLocation>docker id</ms:distributionLocation>
   <ms:command>endpoint</ms:command>
</ms:componentLoc>
<ms:componentLoc>
   <ms:componentDistributionForm>wrappedInGalaxy</ms:componentDistributionForm>
   <ms:distributionLocation>services.openminted.eu</ms:distributionLocation>
   <ms:command>Galaxy id</ms:command>
</ms:componentLoc>

Could @antleb @courado and @reckart check and let me know if I've missed something?

@reckart
Member

reckart commented Jun 18, 2017

a) points to the source code repo but would never be actually usable to download/build/invoke a component. Pointing to "master" is usually a bad idea since it is a moving target. Better would be to point to a release ZIP. IMHO there is a difference between the repo location and a source distribution. Cf.

c) is lacking version information - afaik it is usually provided by appending ":" and the version number to the image name
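As a rough illustration of the point about c), a versioned entry might look like the sketch below. The tag `1.0.0` is a made-up placeholder and not a known tag of the `lappsgrid/galaxy` image.

```xml
<ms:componentLoc>
   <ms:componentDistributionForm>dockerImage</ms:componentDistributionForm>
   <!-- image name, then ":" and the version tag; "1.0.0" is a hypothetical tag -->
   <ms:distributionLocation>lappsgrid/galaxy:1.0.0</ms:distributionLocation>
</ms:componentLoc>
```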

I don't understand why a difference is made between c) and d).

In f), do we really need separate entries for the dockerized/galaxy versions?

@reckart
Member

reckart commented Jun 18, 2017

Should there be an e)?

@pennyl67
Author

For a) Thanks @reckart for the explanations; these will come in handy as tips in the guidelines. Just keep in mind that all of these were meant only as examples.
The idea for source/exe/... forms of distribution is to let providers decide how they want to share their components and where to get them from, in addition to the Maven coordinates.
On the other hand, some forms (e.g. the wrapped in galaxy versions) should be automatically added to the metadata by the OpenMinTeD platform.
c) I couldn't find any specs for these; I'll add to the guidelines.
c) and d): indeed no difference; e) also a mistake - has to do with the various scenarios on how components come into the registry, so disregard.
f) they are not separate entries: the same metadata record for one component with two different distribution forms; keeping them distinct allows running the dockerised image with some other workflow system, if one wants to

@reckart
Member

reckart commented Jun 19, 2017

wrt f) and d): I still don't see what the difference is between "a docker image" and "a galaxy docker image" - why should it not be possible to run a docker image in another workflow system instead of Galaxy?

@pennyl67
Author

Of course it's possible, but (at least the way it was done for the demo) the galaxy id was required for running the workflows; so, just as the docker id is stored in the metadata, I need to provide for the same info for the galaxy id. The examples are meant to check that this can be done for the various scenarios - they don't all have to be filled in.

@reckart
Member

reckart commented Jun 20, 2017

Why do we need separate IDs? In which way does a "docker" and a "galaxyDocker" docker image differ?

@pennyl67
Author

Sorry, I mixed up a couple of examples: one for a dockerised component/web service and a separate one for workflows. The galaxy id is required only for workflows that are composed of more than one component (a.k.a. docker image).

@reckart
Member

reckart commented Jun 21, 2017

Ok, I think I got it. Then I would propose galaxyWorkflow or workflow instead of wrappedInGalaxy for these.
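
If the rename were adopted, the workflow entry from example f) might read as in the sketch below. `galaxyWorkflow` is only the value proposed in this comment, not an agreed vocabulary term, and `Galaxy id` remains the placeholder used in the original example.

```xml
<ms:componentLoc>
   <!-- "galaxyWorkflow" is the proposed replacement for "wrappedInGalaxy" -->
   <ms:componentDistributionForm>galaxyWorkflow</ms:componentDistributionForm>
   <ms:distributionLocation>services.openminted.eu</ms:distributionLocation>
   <ms:command>Galaxy id</ms:command>
</ms:componentLoc>
```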
