Suggestion: Update the CodeMeta Metadata Block to add some more structure for machine actionability #10859
Comments
If you're willing to produce an updated tsv, I'd be happy to look at it! On a related note, as of Dataverse 6.4, you'll be able to designate the "type" of a dataset as software. Please see:
There's a task under IQSS/dataverse-pm#174 to support CodeMeta, and I just added a subtask to look at this issue and consider upgrading to v3 of CodeMeta first. Pull requests welcome, of course! 😄 ❤️
@doigl @poikilotherm and others, as I work on this issue, I'm wondering if I should promote codemeta.tsv as it exists now, in 6.4, in tests and explanations of the feature, or if I should use computational_workflow.tsv, which, as far as I know, doesn't have any planned updates. Basically, I'll pick one or the other to explain the feature of associating a dataset type such as "software" with a metadata block such as CodeMeta or Computational Workflow. I'm a little nervous about promoting CodeMeta much in its current form, since it sounds like it's likely to change. So maybe I'll go with Computational Workflow. 🤷
@pdurbin: sorry for the late answer and the missing pull request so far (too many other things on my plate). Wouldn't Computational Workflow be a good metadata block for workflows and CodeMeta a good one for software? But I have to admit that I do not really have a clear understanding of the difference between the two types, workflow and software. The main changes in version 3 are, as far as I know, the review/reviewBody/reviewAspect fields, a start and end date, the hasSourceCode/isSourceCodeOf relations, and the renaming of continuousIntegration and embargoEndDate. While it would be really great to be able to link to external reviews for software and for data (perhaps in the form of badges), I would not put this feature in the software metadata/CodeMeta block, because it is important for datasets (and workflows?) as well. The relations between source code and application could be implemented in "Related Materials" in the Citation block, if we also had the relation types there. And so far, we do not have a use case for a start and end date for software. What do you think, @pdurbin, @poikilotherm?
@doigl thanks, yes, CodeMeta for software makes sense, of course. I'm playing with the
Overview of the Suggestion
Currently, the fields memoryRequirements, processorRequirements and storageRequirements are just free-text fields, which makes it difficult to use them in an automated process that provides the right resources for running a Jupyter notebook or a container. Adding subfields with controlled vocabularies to these fields would make it easier to differentiate between types and to identify the right amount of a resource such as memory.
Also, as @poikilotherm mentioned, the CodeMeta schema is now available in version 3, and it could be worth checking whether we also want to add some of the new fields (for example, code reviews) to the metadata block.
What kind of user is the suggestion intended for?
(Example users roles: API User, Curator, Depositor, Guest, Superuser, Sysadmin)
User, Sysadmin
What inspired this idea?
Two different things:
What existing behavior do you want changed?
Add structured subfields and controlled vocabularies, at least for the fields memoryRequirements, processorRequirements and storageRequirements. Make the memoryRequirements field multiple to allow different types of memory. We are also open to discussing changes to other fields and to adding new version 3 fields to the block (do we need software reviews?).
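To illustrate what "structured subfields with controlled vocabularies" could look like for memoryRequirements, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the subfield names (`memory_type`, `amount`, `unit`), the vocabulary terms and the helper `required_bytes` are hypothetical and are not taken from the existing codemeta.tsv or from the CodeMeta standard.

```python
from dataclasses import dataclass
from enum import Enum


class MemoryType(Enum):
    # Hypothetical controlled vocabulary; the real terms would be defined in the tsv.
    RAM = "RAM"
    GPU = "GPU memory"


# Hypothetical unit vocabulary, mapped to bytes (binary multiples).
UNIT_TO_BYTES = {"MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}


@dataclass(frozen=True)
class MemoryRequirement:
    """One value of a multiple, compound memoryRequirements field (assumed layout)."""
    memory_type: MemoryType
    amount: float
    unit: str

    def in_bytes(self) -> int:
        # A KeyError here means the unit is not in the controlled vocabulary.
        return int(self.amount * UNIT_TO_BYTES[self.unit])


def required_bytes(reqs, memory_type):
    """Largest requirement of the given type, or 0 if none is declared."""
    return max((r.in_bytes() for r in reqs if r.memory_type is memory_type), default=0)


# Example: a dataset declaring two different kinds of memory.
requirements = [
    MemoryRequirement(MemoryType.RAM, 16, "GiB"),
    MemoryRequirement(MemoryType.GPU, 8, "GiB"),
]
ram_needed = required_bytes(requirements, MemoryType.RAM)
```

With values in this shape, a launcher for a notebook or container could read the number directly instead of parsing free text such as "at least 16 gigs of RAM".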
Any brand new behavior do you want to add to Dataverse?
A CodeMeta export would also be interesting: it would recombine the structured subfields into single values so that the output stays compatible with the CodeMeta standard. We would also have to adjust our GitHub Action that imports the information from codemeta files in Git repositories into Dataverse datasets.
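A rough Python sketch of the export idea, under stated assumptions: the structured input layout (`type`, `amount`, `unit`), the separator and the helper `to_codemeta_text` are all hypothetical, and this is not how any existing Dataverse exporter works. It only shows how structured values could be flattened back into the plain text values that a codemeta.json file would carry.

```python
import json


def to_codemeta_text(entries):
    """Join structured entries into one free-text value (assumed format)."""
    return "; ".join(f"{e['amount']} {e['unit']} {e['type']}" for e in entries)


# Hypothetical structured metadata as it might be stored in the block.
structured = {
    "memoryRequirements": [
        {"type": "RAM", "amount": 16, "unit": "GiB"},
        {"type": "GPU memory", "amount": 8, "unit": "GiB"},
    ],
    "storageRequirements": [{"type": "disk", "amount": 50, "unit": "GiB"}],
}

# Minimal CodeMeta-style document; only the requirement fields are filled in.
codemeta = {
    "@context": "https://w3id.org/codemeta/3.0",
    "@type": "SoftwareApplication",
}
for field, entries in structured.items():
    codemeta[field] = to_codemeta_text(entries)

print(json.dumps(codemeta, indent=2))
```

The import direction (from a codemeta file into structured subfields) is the harder one, because free text cannot always be split reliably into type, amount and unit.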
Any open or closed issues related to this suggestion?
Are you thinking about creating a pull request for this issue?
Help is always welcome, is this idea something you or your organization plan to implement?
We would be happy to provide a suggestion for an updated tsv of the CodeMeta block, but we would also be very interested in the opinions and requirements of the community, and perhaps especially of @jggautier and @pdurbin.