Suggestion: Update the CodeMeta Metadata Block to add some more structure for machine actionability #10859

doigl · 2024-09-18T13:57:39Z

Overview of the Suggestion
Currently, the fields memoryRequirements, processorRequirements and storageRequirements are just free-text fields, which makes it difficult to use them in an automated process that provides the right resources for running a Jupyter notebook or a container. Adding subfields with controlled vocabularies to these fields would make it easier to differentiate between different types and to identify the right amount of resources, such as memory.

Also, as @poikilotherm mentioned, the CodeMeta schema is now available in version 3, and it could be worth checking whether we also want to add some of the new fields (code reviews) to the metadata block.

What kind of user is the suggestion intended for?
(Example users roles: API User, Curator, Depositor, Guest, Superuser, Sysadmin)
User, Sysadmin

What inspired this idea?
Two different things:

We have a dataset where a user tried to add two different types of memory requirements to a piece of research software (RAM and GPU memory), and we expect this to happen more often in the future.
We want to connect our Dataverse instance to a JupyterHub as an external tool to allow interactive exploration of published Jupyter notebooks. In this process, we have to decide which resources the machine that runs the notebook should provide.
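As a rough illustration of the second point, this is how structured requirements could drive the choice of machine. The profile names and sizes are made up for this sketch, and nothing here is an existing Dataverse or JupyterHub API; it only shows why numeric, typed values are easier to act on than free text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Profile:
    """A hypothetical server profile; names and sizes are invented."""
    name: str
    ram_gb: float
    gpu_memory_gb: float = 0.0


# Hypothetical profiles an instance might offer.
PROFILES = [
    Profile("small", ram_gb=4),
    Profile("large", ram_gb=32),
    Profile("gpu", ram_gb=32, gpu_memory_gb=16),
]


def pick_profile(
    ram_gb: float,
    gpu_memory_gb: float = 0.0,
    profiles: List[Profile] = PROFILES,
) -> Optional[Profile]:
    """Return the smallest profile that meets both requirements, or None."""
    candidates = [
        p for p in profiles
        if p.ram_gb >= ram_gb and p.gpu_memory_gb >= gpu_memory_gb
    ]
    # Prefer profiles without a GPU, then the one with the least RAM.
    return min(candidates, key=lambda p: (p.gpu_memory_gb, p.ram_gb), default=None)
```

With a free-text value such as "16 GB RAM, 8 GB on the GPU", the same decision would first need error-prone text parsing.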

What existing behavior do you want changed?
Add structured subfields and controlled vocabularies at least for the fields memoryRequirements, processorRequirements and storageRequirements. Allow multiple values for the memoryRequirements field to support different types of memory. We are also open to discussing changes to other fields and to adding new version 3 fields to the block (do we need software reviews?).
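A minimal sketch of what such a structured, repeatable memoryRequirements entry could look like. The subfield names (type, amount, unit) and the vocabulary values are assumptions for illustration only; they are not part of the current CodeMeta block or of the CodeMeta standard.

```python
from dataclasses import dataclass
from enum import Enum


class MemoryType(Enum):
    """Assumed controlled vocabulary for the kind of memory."""
    RAM = "RAM"
    GPU = "GPU memory"


class Unit(Enum):
    """Assumed controlled vocabulary for units (decimal, in bytes)."""
    MB = 10**6
    GB = 10**9
    TB = 10**12


@dataclass(frozen=True)
class MemoryRequirement:
    memory_type: MemoryType
    amount: float
    unit: Unit

    def __post_init__(self):
        if self.amount <= 0:
            raise ValueError("amount must be positive")

    def in_bytes(self) -> int:
        """Normalize the value so entries can be compared."""
        return int(self.amount * self.unit.value)


def parse_entry(memory_type: str, amount: str, unit: str) -> MemoryRequirement:
    """Validate one entry; unknown vocabulary terms raise ValueError or KeyError."""
    return MemoryRequirement(MemoryType(memory_type), float(amount), Unit[unit])
```

Because the field is repeatable, a dataset could carry one entry for RAM and another for GPU memory, which is exactly the case described above.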

Any brand new behavior do you want to add to Dataverse?
A CodeMeta export that recombines the structured fields so that the output stays compatible with the CodeMeta standard would also be interesting. We would also have to adjust our GitHub Action that imports the information from codemeta files in Git repositories into Dataverse datasets.
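The export step could be sketched roughly as follows: structured entries are joined back into the single text value that CodeMeta expects for memoryRequirements. The entry keys (type, amount, unit), the separator, and the function names are assumptions made for this sketch, not an existing Dataverse exporter.

```python
import json


def flatten_memory_requirements(entries):
    """Join structured entries into one free-text string."""
    return "; ".join(
        f"{e['type']}: {e['amount']} {e['unit']}" for e in entries
    )


def to_codemeta(name, entries):
    """Build a minimal CodeMeta-style JSON-LD document (only a few keys shown)."""
    return {
        "@context": "https://w3id.org/codemeta/3.0",
        "@type": "SoftwareSourceCode",
        "name": name,
        "memoryRequirements": flatten_memory_requirements(entries),
    }


if __name__ == "__main__":
    doc = to_codemeta(
        "example-software",  # hypothetical name
        [
            {"type": "RAM", "amount": 16, "unit": "GB"},
            {"type": "GPU memory", "amount": 8, "unit": "GB"},
        ],
    )
    print(json.dumps(doc, indent=2))
```

The import direction (from a codemeta file into structured fields) is harder, because existing free-text values do not follow a fixed pattern.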

Any open or closed issues related to this suggestion?

Include CodeMeta schema out of the box #7844

Are you thinking about creating a pull request for this issue?
Help is always welcome, is this idea something you or your organization plan to implement?
We would be happy to provide a suggestion for an updated tsv of the codemeta block, but would also be very interested in the opinion and the requirements of the community, and perhaps especially from @jggautier and @pdurbin

pdurbin · 2024-09-18T15:30:10Z

We would be happy to provide a suggestion for an updated tsv of the codemeta block

If you're willing to produce an updated tsv, I'd be happy to look at it!

On a related note, as of Dataverse 6.4, you'll be able to designate the "type" of a dataset as software. Please see:

dataset types (software, workflow, etc.) - initial support #10694

pdurbin · 2024-10-21T15:56:29Z

There's a task under IQSS/dataverse-pm#174 to support CodeMeta and I just added a subtask to look at this issue and consider upgrading to v3 of CodeMeta first. Pull requests welcome, of course! 😄 ❤️

pdurbin · 2024-10-28T20:15:18Z

@doigl @poikilotherm and others, as I work on this issue...

Implement datasetType metadata block support (at global level) #10519

... I'm wondering if I should promote codemeta.tsv as it exists now, in 6.4, in tests and explanations of the feature or if I should use computational_workflow.tsv, which, as far as I know, doesn't have any planned updates.

Basically, I'll pick one or the other to explain the feature of associating a dataset type such as "software" with a metadata block such as CodeMeta or Computational Workflow.

I'm a little nervous about promoting CodeMeta much in its current form, since it sounds like it's likely to change. So maybe I'll go with Computational Workflow. 🤷

doigl · 2024-10-29T07:15:57Z

@pdurbin: sorry for the late answer and the missing pull request so far (too many other things on my plate). Wouldn't Computational Workflow be a good metadata block for workflows and CodeMeta a good one for software? But I have to admit that I do not really have a clear understanding of the difference between the two types, workflow and software.

The main changes in version 3 are - as far as I know - the review/reviewBody/reviewAspect fields, a start and end date, the hasSourceCode/isSourceCodeOf relations and the renaming of continuousIntegration and embargoEndDate.
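For the two renamed fields, an upgrade of existing records could be sketched as a simple key mapping. The old names used here (contIntegration and embargoDate) are an assumption based on the CodeMeta 3.0 release notes and should be checked against the official crosswalk before being relied on; the function name is made up.

```python
# Assumed v2 -> v3 renames; verify against the CodeMeta 3.0 release notes.
RENAMED_IN_V3 = {
    "contIntegration": "continuousIntegration",
    "embargoDate": "embargoEndDate",
}


def upgrade_keys(doc: dict) -> dict:
    """Return a copy of a flat metadata dict with renamed keys replaced."""
    return {RENAMED_IN_V3.get(key, key): value for key, value in doc.items()}
```

Keys that were not renamed pass through unchanged, so the function can be applied to records that are already in the version 3 form.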

While it would be really great to be able to link to external reviews for software and for data (perhaps in the form of badges), I would not put this feature in the software metadata/codemeta block, because it is important for datasets (and workflows?) as well.

The relations between source code and application could be implemented in the "Related Materials" field in Citation, if relation types were also available there.

And so far, we do not have a use case for a start and end date for software.

What do you think, @pdurbin, @poikilotherm?

pdurbin · 2024-10-29T15:58:11Z

@doigl thanks, yes, CodeMeta for software makes sense, of course. I'm playing with the codeMeta20 block right now. One thing I observe about both codeMeta20 and computationalworkflow is that both have fields with displayoncreate set to TRUE, which other metadatablocks don't have.

doigl added the Type: Suggestion an idea label Sep 18, 2024

pdurbin mentioned this issue Oct 21, 2024

GREI 2: HDV Task - Improve Dataverse Biomedical Metadata Support IQSS/dataverse-pm#174

Open

14 tasks
