
JOSS review: version numbers and github releases #2

Closed
bmcfee opened this issue Aug 22, 2019 · 13 comments


bmcfee commented Aug 22, 2019

(Note: I'll be opening a bunch of issues relating to JOSS review.)

The review form asks whether the version number of the software matches the one reported in the GitHub release. There do not appear to be any release tags for this project, and the repository (repositories?) do not contain version identifiers.


faroit commented Aug 27, 2019

I think this is a bit tricky to address and we would love to have your comments on it. As you pointed out, there are currently no version identifiers whatsoever. This is mainly because I didn't want to confuse users with too many different versions all over the place. Currently we are facing the following situation:

The pre-trained models are hosted and versioned on Zenodo. E.g. the umx model is deposited here, and I think it is best to use releases for code that matches the pre-trained models. In the model files we state which commit was used to train the model; in this particular case it was 43372f7. This is an old version of the repo which had significant shortcomings with respect to documentation and to data loaders other than musdb18, so we don't want to tag this v1.0.0. However, the most recent version is still compatible with respect to the model and the weights. Now, if we update the model in the future, thus introducing an incompatible change, we will retrain the model and tag a new release.
Furthermore, torch.hub suggests using tags when performing torch.hub.load(), so the release tags should match there as well (a short sketch follows below).
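For illustration, a minimal sketch of how a tagged release could then be loaded through torch.hub; the repository path and the entrypoint name `umx` are assumptions for illustration, not the project's confirmed hub API:

```python
import torch

# torch.hub accepts a "repo_owner/repo_name:ref" string, so pointing
# users at a release tag pins both the code and the matching weights.
# The repo path and entrypoint name ('umx') are illustrative guesses.
separator = torch.hub.load('sigsep/open-unmix-pytorch:v1.0.0', 'umx')
```

This way the tag cited in the paper, the Zenodo deposit, and the torch.hub entrypoint would all resolve to the same snapshot of the code.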

Therefore it would be best to tag the pytorch version as v1.0.0 once this JOSS review is nearly complete, leaving the nnabla and tensorflow versions (for which no pre-trained models are planned) either untagged or tagged as 0.1-dev.

Thanks for your input on this.


bmcfee commented Aug 27, 2019

Yeah, that's indeed trickier than the typical OSS package release. I do think it's important to have versioning for citation and replication purposes, and your proposed scheme seems good to me. It's maybe worth looping in @arokem for a second opinion on how this aligns with the spirit of the version requirements in JOSS submissions.


arokem commented Aug 28, 2019

For the purpose of the paper, clarity about which version of each sub-module is referenced is important, and archival versions of each would need to be created. I think this means creating a release/tag and Zenodo archive for each of them when the paper is close to acceptance, and then pointing from your README here to those DOIs. Then, once the paper is accepted, you create one more archive for this repo, including that README, with the DOIs pointing to the archival versions of each sub-module. Does that make sense?


faroit commented Aug 28, 2019

@arokem yes, I think I understand.

In detail: JOSS is okay with one particular submodule (namely the tensorflow reimplementation) being tagged and archived without being fully functional (e.g. a 0.1-dev release), while the release tag of the JOSS submission matches our main/lead implementation. We will make sure this is explicitly mentioned in the README of this repo as well as in the actual paper. Is this sufficient?


arokem commented Aug 29, 2019

Yes. I think this is fine as long as it's clear (both in the docs here, as well as in the paper) what works and what doesn't.


hagenw commented Aug 29, 2019

I'm also a little bit puzzled by combining different implementations/sub-modules into one submission for a software paper. I see the reasoning, as the goal of all sub-modules is to implement the same functionality.

What I cannot see at the moment is why it is sufficient for the paper when only one of those implementations is ready (torch). In my opinion it would make more sense to focus the paper on the torch implementation, or to submit it when all implementations are ready.

To quote from https://joss.theoj.org/about#submitting:

JOSS submissions must:
[...]

  • Be feature-complete (no half-baked solutions) and be designed for maintainable extension (not one-off modifications).

@faroit: do you also plan to maintain all three different implementations, or will you most likely benchmark them and in the end continue with the best one?


faroit commented Aug 29, 2019

@hagenw thanks for your input.

I agree that a feature-complete submission for all three implementations would be optimal. But at least for tensorflow we don't see the benefit of releasing a 1.x version when so many things will change with 2.0 (which also concerns audio data loading, an important aspect for us).

@faroit: do you also plan to maintain all three different implementations, or will you most likely benchmark them and in the end continue with the best one?

We see the current submission as addressing the scientific community first, for which we think the pytorch version is best suited. Therefore we will only release pre-trained models for pytorch.
However, we plan to maintain all three implementations as far as framework updates are concerned. We might also choose to use a tensorflow version of the model later for deployment on an interactive website. The core model will always be kept in sync, though.

In the end we want researchers to be able to cite only one DOI when they use any version of open-unmix. For this to happen I see the following options:

A) We might be better off with an arxiv submission focusing on the technical details of the model, withdrawing this submission. Then we could make separate JOSS submissions for each implementation. This, however, means that researchers would probably cite only the arxiv version, which we don't think is optimal.

B) We proceed with the submission with the following changes:

  • remove the tf stub submodule and remove it from the main section of the paper (but keep it in the outlook section)
  • wait for the nnabla version to be tested (that will happen this week), though it will not be feature-complete compared to the pytorch version
  • tag and release both the pytorch and nnabla versions, with the nnabla version not getting a 1.0 tag.

C) Same as B), but also remove nnabla.

@arokem @hagenw @bmcfee please tell me what you think.


hagenw commented Aug 30, 2019

I guess the issue comes from intermixing software and journal publications, which is of course the goal of JOSS.

  • From a journal perspective the software should be feature-complete, otherwise the publication could seem strange, as if it were just a declaration of intent to write some software.
  • From the perspective of a researcher who develops open source software and hopes that the community picks it up, an early release of the paper is crucial, as she/he can then get credit for the work in the form of citations.

I don't think this can easily be solved other than by getting rid of the paper/citations madness, but I will definitely not start that discussion here ;)

For me C) would be a good solution, but I would also be fine with proposal B) if that is favored by the other reviewers.


bmcfee commented Aug 30, 2019

Options B or C sound fine to me.

It seems like the pytorch implementation is the first-class artifact, and IMO it would be enough on its own to merit a JOSS publication. Ports/implementations in other frameworks seem to me like extensions or iterations, which are obviously important to users, but not directly relevant to the publication here. The story might be different if the focus were on cross-framework implementation (e.g. ONNX) rather than a specific model and application, but that's clearly not the focus of this work.


faroit commented Aug 30, 2019

@hagenw @bmcfee thanks!

We went with C):

  • I updated the README.md of this project
  • the paper is updated to refer to the non-pytorch versions as "ports" without pre-trained models
  • the pytorch version is tagged and released as v1.0.0

I would propose to still keep the utilities as submodules here, since they are a mandatory requirement for running open-unmix and we don't plan to submit them to JOSS at this point. I am open to removing them as well if you think that makes more sense.
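Since the deposited model files record the commit they were trained with (see above), a downloaded checkpoint can be cross-checked against the tagged code. A minimal sketch, where the file name `umx.pth` and the `commit` key are hypothetical placeholders; the actual deposit may keep this metadata in a sidecar file instead:

```python
import torch

# Load the checkpoint on CPU to inspect its metadata. Both the file
# name and the 'commit' key are hypothetical placeholders here.
state = torch.load('umx.pth', map_location='cpu')
print(state.get('commit'))  # e.g. '43372f7', the commit used for training
```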


bmcfee commented Aug 30, 2019

Thanks @faroit (and all other commenters). I think this resolves the issue, but I'll hold off on closing it out pending commentary from @hagenw and/or @arokem.


hagenw commented Sep 1, 2019

@bmcfee the solution is totally fine with me.


arokem commented Sep 1, 2019

Yep. Looks good!

bmcfee closed this as completed Sep 1, 2019