[r] Enumeration of Pros and Cons and different options to provide builds of the R tiledbsoma package #1427

Closed
mlin opened this issue May 27, 2023 · 29 comments

@mlin
Member

mlin commented May 27, 2023

Currently our readme has above-the-fold instructions to install R tiledbsoma from r-universe, which rebuilds for every update to our main branch.

As the package stabilizes, we want the default version that end users (and dependent packages like cellxgene.census) install to be a more controlled release version, perhaps one that's been vetted more thoroughly than every single merge to main necessarily is. Basically, the version end users get by following our above-the-fold instructions should only change once we've decided it's suitable to release.

In the long term releasing on CRAN would naturally accomplish this, but we'd like a short/medium-term solution too since it might be some months before we feel ready to publish on CRAN.

A couple possibilities:

  1. Copy the release version tarballs/binaries from r-universe to some other self-hosted repo (like on gh-pages using drat or similar), and change readme instructions to install from that.
  2. Reconfigure r-universe to build from a non-main branch like release or r-release, which we would only push to at appropriate times. Keep current instructions to install from r-universe and suggest some other way like install_github to get the bleeding edge.

In either case, also define and document the release process.

@mlin
Member Author

mlin commented May 27, 2023

@eddelbuettel @mojaveazure @aaronwolen It would be helpful to toss out a couple possible solutions, BUT: if you think we will be best served (and most consistent with R community practices) to just keep using the head of main, then please say so! This past week was frustrating because a new breaking change would already have been merged by the time I updated cellxgene.census to handle the last one, but, I also recognize this is just a phase we are getting through right now. cc @pablo-gar @ebezzi @atolopko-czi @bkmartinjr

@mlin mlin changed the title [Feature request] [r] R tiledbsoma package versioning & archive [r] controlled release process+repo for R tiledbsoma package May 31, 2023
@mlin
Member Author

mlin commented May 31, 2023

Revised this ticket with a simpler scope per realtime discussion with @eddelbuettel and @johnkerl. Original, outdated description follows here.


Currently we publish tiledbsoma through r-universe which builds constantly from our main branch (and does not retain historical versions).

In the long term, tiledbsoma should become stable enough to publish on CRAN or Bioconductor with infrequent updates.

But while the package is still maturing in the short & medium term, we'd like to find a "goldilocks" way for dependent packages like cellxgene.census to pin a specific tiledbsoma version they've been tested against -- meaning there would exist some package repository archiving historical tiledbsoma versions, that we can push to more frequently as needed.

It would be ideal (but not essential) for the repository to serve binaries for common platforms.

@johnkerl johnkerl changed the title [r] controlled release process+repo for R tiledbsoma package [r] Controlled release process+repo for R tiledbsoma package Jun 5, 2023
@eddelbuettel
Contributor

Related to #1340 and #1412

@eddelbuettel
Contributor

Also related to open PRs / branches

These issues are, at least as I understand them, tied and would benefit from getting 1.0 out. There is no API change involved; it is about how we produce the packages most easily / consistently. So might this be a question better addressed post 1.0, so that we should remove the '1.0rc-r' tag here?

@mlin
Member Author

mlin commented Jun 9, 2023

@eddelbuettel I would think that a natural part of releasing 1.0 would be deciding how and where to actually release it. Imagine we've tagged 1.0rc-r; how would we tell somebody to install that version rather than whatever is the current tip of main?

The r-universe build is great, it seems to me like we just need a more-controlled process to update it.

@eddelbuettel
Contributor

eddelbuettel commented Jun 9, 2023

My feeling is that this will be a lot simpler once we are at 1.0 and beyond (as we will likely release C++ and Py and R concurrently under one release version number), and maybe we should not worry too much about anything else before then, as we all think we are reasonably close.

Concrete proposals welcome, otherwise I do not have too much of an idea of what you consider a more controlled process. Always happy to chat, now or tomorrow or next week.

@mlin
Member Author

mlin commented Jun 9, 2023

OK, I'd propose the following:

  1. Create a new protected branch on this repo, named r-universe
  2. Update https://github.com/TileDB-Inc/tiledb-inc.r-universe.dev/blob/master/packages.json to make r-universe track that branch instead of main (example)
  3. Update https://github.com/single-cell-data/TileDB-SOMA/blob/main/apis/r/README.md to (i) document that the "release process" consists of advancing that branch and checking that the r-universe build succeeds, and (ii) for those users who do want the bleeding edge (main branch), suggest the appropriate one-line install_github() command
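For illustration, the two install paths described in step 3 might look like the sketch below. The repository URL, GitHub repo name, and `apis/r` subdirectory are taken from links elsewhere in this thread; listing a CRAN mirror second and passing `ref = "main"` explicitly are assumptions, not the wording of the actual README.

```r
# Sketch only: URLs and the subdirectory come from links in this thread.

# Released version, served by r-universe. A CRAN mirror is listed second
# (an assumption) so that dependencies hosted on CRAN can be resolved.
install.packages(
  "tiledbsoma",
  repos = c("https://tiledb-inc.r-universe.dev", "https://cloud.r-project.org")
)

# Bleeding edge from the main branch. The R package lives in the apis/r
# subdirectory of the repository, hence the subdir argument.
# install.packages("remotes")  # if remotes is not already installed
remotes::install_github(
  "single-cell-data/TileDB-SOMA",
  subdir = "apis/r",
  ref = "main"
)
```

With this split, the first command only changes what it installs when the tracked branch is advanced, while the second always follows main.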

@eddelbuettel
Contributor

eddelbuettel commented Jun 9, 2023

That would work, but I am not sure I would recommend that. It takes the ability of using r-universe to test interim releases away, and "hacks" r-universe into something it is not. That said, this is a joint project and if this is what you and/or CZI think is best I will not stand in the way.

As an alternative, it is very straightforward to host a repository with binary and release artifacts. I can expand on that. It is the same "mechanics" of what underpins access to r-universe.

But my recommendation is straightforward, based on both twenty+ years of uploading to CRAN as a developer and now also several years with the tiledb R package: we should send the package to CRAN. This delivers increased visibility, and it ensures development consistency as well as continued rolling quality and integration checks, while avoiding having to run a manual release repo.

@mlin
Member Author

mlin commented Jun 12, 2023

@eddelbuettel We're agreed that tiledbsoma should be published on CRAN, but:

  • what do you think is a suitable timeline for that?
  • might that timeline get delayed by build/compatibility changes they might request?
  • thereafter, there's a limited cadence (and significant latency) of releasing, right?

Subject to learning more about those constraints, I think the suggested r-universe approach would be beneficial in the short term at least, while we'll still be iterating on the releases at a steady tempo (but not so frequent as every merge to main). I wouldn't see it as "hacking" r-universe to use the branch feature it provides, and testing interim versions is why I suggest adding the suitable install_github() incantation to the readme.

@eddelbuettel
Contributor

eddelbuettel commented Jun 12, 2023

Well -- If it were just me I would have uploaded it months ago. The code is in a public repo, it is visible, it tests and builds fine and we have been very, very clear about "it is not 1.0 yet and we may break an API". Yet none of that prohibits a CRAN upload. Thousands of packages are at CRAN that "are not 1.0 yet". (There is precious little material out there that formally defines anything; the CRAN Repository Policy is one.) But there simply is no "build/compatibility changes request" by them. If you upload a package to CRAN and either it changes or its dependents change, you are asked to keep up. No more, no less.

So it all depends on what we as a team think is best, and how I can possibly help you accommodate changes from the census side of consuming the package. As discussed and shown, one approach may be an intermediate repo to host source packages -- and binary builds if you think that is what you want to do. Or if you think you need to point r-universe at different branches, you can certainly do so. It is our code repo and we can do whatever we want.

@mlin
Member Author

mlin commented Jun 18, 2023

Summary of discussion from Fri meeting:

  • CRAN is the eventual destination, but their limitations on release cadence make it less suitable in the short term.
  • It is NOT necessary to archive all historical versions of the package; serving just the newest "released" version will suffice.
  • So the suggested r-universe solution would still have just one entry in the TileDB-Inc packages.json, either tracking a release branch or...
  • @mojaveazure also recalled r-universe can track GitHub Releases by setting branch to *release.

I can think of minor pros and cons to those last two options, so I don't have a strong preference between them.

@eddelbuettel
Contributor

Regarding

  • Point 1: It is worth repeating that that is the "asymptotic behavior": new packages can and do release as fixes are needed, so this is a non-issue; most people also tend to think that "release early, release often" has merit and we follow this in our release process (but it is also fine to wait for 1.0 or whatever other milestone makes the team as a whole happy)
  • Points 3 and 4: That should work, but I have yet to see a live packages.json doing it and have not yet heard back from @jeroen, who runs this, about a question lobbed in DM; I will follow up once I know more. "Manually" (or via action) folding releases into a fixed branch that we point at should work equally well

@eddelbuettel eddelbuettel changed the title [r] Controlled release process+repo for R tiledbsoma package [r] Enumeration Pros and Cons and different options to provide builds of the R tiledbsoma package Jun 18, 2023
@eddelbuettel eddelbuettel changed the title [r] Enumeration Pros and Cons and different options to provide builds of the R tiledbsoma package [r] Enumeration of Pros and Cons and different options to provide builds of the R tiledbsoma package Jun 18, 2023
@eddelbuettel
Contributor

I changed the title to stress that this evolved into the sub-aspect of how and where we would provide release tarballs and builds. The process of making releases remains somewhat unaffected by this and is under fine control.

@mlin
Member Author

mlin commented Jun 18, 2023

@eddelbuettel Are we agreed on pursuing the short-term r-universe plan? (unless/until we hit some big problem with it, of course)

To be clear, this is not merely a discussion/enumeration ticket. Our above-the-fold installation instructions should lead users to a version we decided to release, instead of the tip of the main branch.

@eddelbuettel
Contributor

@mlin Yes, as discussed in Friday's meeting. But as I stated in #1427 (comment), I am waiting to learn from @jeroen what the simpler approach (tracking a release tag rather than a branch) looks like, which we assume to be feasible as remotes supports the access pattern.

@jeroen
Contributor

jeroen commented Jun 18, 2023

You can set the branch field in the registry to any reference, including a tag or a sha or a special value *release which tracks the "latest" release on github. See https://ropensci.org/blog/2021/06/22/setup-runiverse/#pro-tip-tracking-custom-branches-or-releases
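As a rough sketch, a registry entry in packages.json that tracks the latest GitHub release could look like the fragment below. The `package`, `url`, `subdir`, and `branch` fields follow the r-universe registry format described in the linked post; the specific values are assumptions pieced together from this thread, not a copy of the actual TileDB-Inc or chanzuckerberg registry files.

```json
[
  {
    "package": "tiledbsoma",
    "url": "https://github.com/single-cell-data/TileDB-SOMA",
    "subdir": "apis/r",
    "branch": "*release"
  }
]
```

Replacing `"*release"` with a branch name, a tag, or a commit sha would pin the build to that reference instead.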

@jeroen
Contributor

jeroen commented Jun 18, 2023

You may also be interested in this post (which links to the one above) about managing versioned universes: https://ropensci.org/blog/2023/05/31/runiverse-snapshots/

@eddelbuettel
Contributor

eddelbuettel commented Jun 18, 2023

Perfect, and thanks!

Kudos also to @mojaveazure who must have read that post as he had suggested just that value (which I hadn't seen so I had to bug you for confirmation / docs). And will peruse 'managed version universes' which could be what we desired.
(Edit: Oh I had seen that when you wrote about it / tweeted it but AFAIUI @mlin is after something else: less of a snapshot of an entire universe repo but more of a selection of multiple binary versions of a package (which is non-obvious to do for R as R doesn't really do repos that way).)

(We had been setting branch as well as sub-dir (needed in that repo) but for "various reasons" we will now flip from branch=main to branch=*release. Thanks for running the service, it is coming in rather handily for us.)

@eddelbuettel
Contributor

@mlin
Member Author

mlin commented Jun 18, 2023

AFAIUI @mlin is after something else: less of a snapshot of an entire universe repo but more of a selection of multiple binary versions of a package

@eddelbuettel Though we did discuss that initially, to once more repeat:

  • It is NOT necessary to archive all historical versions of the package; serving just the newest "released" version will suffice.

So as far as I understand, to conclude this ticket we just need to add one line to your packages.json, create a suitable branch (if using the branch approach), and update the README.md.

@eddelbuettel
Contributor

eddelbuettel commented Jun 18, 2023

Please see chanzuckerberg/chanzuckerberg.r-universe.dev#6 which adds a line in the universe file for the CZI org pointing at the CZI r-universe.

So I concur that this ticket could be closed.

@mlin
Member Author

mlin commented Jun 18, 2023

@eddelbuettel Yes, thank you for that PR; as we discussed on Friday, if for some reason we cannot get on the same page in the collaborative project, then on the CZI side we will set up our r-universe to control the released tiledbsoma version in just the way we desire. But I do not understand the apparent reluctance to completing the following steps:

@eddelbuettel
Contributor

There is no functional difference between two r-universes: they feed equivalent install.packages() calls. Your concern appeared (at least to my reading and understanding) to stem from use with the cellxgene.census package, so I aimed to help by suggesting a change in its corresponding r-universe. As you plan to update / alter a README anyway, why not point to that one?

We can surely alter other r-universe settings as well. So maybe a PR is the next step.

@mlin
Member Author

mlin commented Jun 19, 2023

@eddelbuettel Indeed, a key difference between the two r-universes is that https://github.com/single-cell-data/TileDB-SOMA/blob/main/apis/r/README.md directs users to one and not the other. So, if you anticipate you'd approve a future PR by me to change that readme to point users to the chanzuckerberg r-universe (which I'll reconfigure) instead of TileDB-Inc's, then I think at last -- we have a plan.

@eddelbuettel
Contributor

Yes. Changing a README once seems easier than frequently changing a branch or other plans we have moved on from.

Moreover doing so would move all moving parts into your corner which should also allow you to just take care of it as you see fit. Sounds like a plan?

@aaronwolen
Member

I agree with @mlin that:

Our above-the-fold installation instructions should lead users to a version we decided to release, instead of the tip of the main branch.

I think we should just update the tiledb-inc r-universe to track tiledbsoma releases. It's already set up and we've already shared the install snippet pointing to https://tiledb-inc.r-universe.dev with interested users.

@eddelbuettel
Contributor

FYI this was now merged in the PR at the other r-universe chanzuckerberg/chanzuckerberg.r-universe.dev#6

If I read the page correctly it also just built per the page https://chanzuckerberg.r-universe.dev/builds

@aaronwolen
Member

That's fine. We should still update our r-universe for the benefit of existing users already in the habit of installing from there.

@mlin
Member Author

mlin commented Jun 30, 2023

Merged #1502 and broke out #1515
