Define a release policy #532
Comments
I respectfully disagree. I've personally encountered multiple frustrations with setting up the dev environment (#438, #467, #523); even after getting it all working initially and making a couple of contributions, things broke down when I tried to set up a fresh dev environment on the same machine. This suggests to me that the installation / contribution workflow is not tested thoroughly enough, and that it's causing pain to would-be contributors (and I think not just me; cf. #521 (comment)). Don't get me wrong: there has been progress here; I like the move to nox and away from webpack. But after half a dozen PRs over the last year, I still feel very hesitant to "open a quick PR" to fix or improve something in this theme, because I'm not confident that it will actually be "quick", or that I'll be able to do things like update dependency versions in my dev environment and have things still work.
@drammock - I meant that it would be unlikely for somebody to introduce a change that breaks something really important without it being noticed by somebody first, because we are testing things automatically. This is (imo) what we should aspire to, so that we do not feel the need to do lots of manual checks. I agree with you that the contribution process itself is still way too complicated. I'm not sure what you're suggesting as it pertains to release policy, though. Are you saying that we should be more conservative and raise more barriers to making releases?
Being able to install a dev environment for the theme on Windows is really important: without that, no Windows users can easily contribute to this theme (nor can they easily contribute to the documentation of MNE-Python, or of other packages that use the theme). Installation of the dev environment on Windows was broken when the switch to sphinx-theme-builder was merged, without being noticed by anybody first. (The first "person" to notice was the MNE-Python CI job that makes sure our docs can be built on Windows --- and that was discovered only because we were proactively testing pydata-sphinx-theme's current main to see if recent changes worked well for our docs.) What I'm saying is that perhaps the theme should have CI jobs that would have caught this before #514 was merged, and that until such checks are in place, I disagree with the part of your rationale for frequent releases that I quoted before. If you had released today, it would have broken MNE-Python CIs on main, and we would have had to pin to an earlier version of the theme until #523 gets sorted out. Pinning versions is not such a huge deal in the grand scheme of things, but it costs non-zero downstream developer time and CI time, potentially for many other theme users besides just MNE-Python. A CI job on this repo that caught the Windows install problem would prevent that.
To perhaps make my point more succinctly:
I agree, but would add "... as long as you also have robust automated testing." What I'm arguing above is that the testing is not yet sufficiently robust to warrant a strategy of "release early, release often".
@drammock - that's a great insight; I appreciate you sharing your experience here. I think we should aim for testing infrastructure that is robust enough to catch major regressions. That said, it's going to be hard to do this in some cases for the JavaScript stuff, unless we also set up our own JavaScript testing infrastructure, which I don't think we have the developer expertise to maintain :-/ So, a couple of questions there:
CI jobs that run on Windows and macOS, and that test setting up a dev environment and building a static copy of the docs. It looks like you've just started that in #533 (great!). Beyond that, the main problem I see is that a lot of the "testing" here is about making sure Sphinx doesn't error out when trying to render all the different kinds of objects / markup; there isn't a subsequent step that checks whether the rendering actually matches what was expected. Instead, it relies on people looking at the rendered docs and visually checking that the results look right. I think something like Selenium allows screenshotting and pixel-level comparisons (as well as scripted tests of things like menu behavior), but that's rapidly getting into high-hanging-fruit territory. Once set up, I think (?) it would be fairly low-maintenance, but there's a large start-up cost in developer time.
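[Editor's note: as a concrete illustration of the cross-platform suggestion above, here is a minimal sketch of what such a GitHub Actions job might look like. The workflow file name and the `docs` nox session name are assumptions for illustration, not the project's actual configuration.]

```yaml
# .github/workflows/dev-env.yml -- hypothetical file name.
# Sets up a contributor environment from scratch on each OS and builds the
# docs, so breakage like the Windows install problem surfaces in CI.
name: Test dev environment setup

on: [push, pull_request]

jobs:
  build-docs:
    strategy:
      fail-fast: false  # report every OS, not just the first failure
      matrix:
        os: [ubuntu-latest, windows-latest, macos-latest]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: "3.10"
      # Mirror the documented contributor setup as closely as possible.
      - run: pip install nox
      # "docs" is an assumed nox session name -- substitute the real one.
      - run: nox -s docs
```

The value here is less in the specific steps than in running them from a clean checkout on every OS, which is exactly the scenario that broke.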
It's hard to know where to draw the line. I'm inclined to say that "cosmetic" changes (like your recent PR to fix padding for site titles without a logo) should be fine... but changing the font size, padding, or flex behavior on main content divs could also be called "cosmetic" in some sense.

Maybe it would work to have a roster of a dozen projects that all use the theme, and a policy that at least N of them need to test any "major" change on their own sites before it gets merged. GitHub's "require N PR approvals before merge" mechanism seems a natural fit here (with the understanding that approval should be preceded by the aforementioned test-on-your-own-project's-site step). But PR approval requires being a member/maintainer, so that approach points to having several more maintainers than you currently have; I'm not sure how folks at pydata feel about that. There might even be a way to create a CI job that automatically opened PRs into other projects, with the sole change in those PRs being "install the theme from such-and-such a fork/branch and build the docs". That way the other projects' maintainers would only need to look at the resulting artifacts. You'd want to coordinate with their maintainers, of course (e.g., for MNE-Python we'd want the PR to skip the CI jobs that test our codebase and only run the doc-building job, so it would need some per-project configuration).

Ultimately though, I'm not sure how to define a heuristic for which changes are "small" ones that don't need that level of QC, and which are "big" ones that should be slowed down. It might need to be decided by maintainers in a triaging process when new issues/PRs get opened.
The best suggestion I can come up with for minimizing effort is to spread the burden around. If you could convince one dev from each of the ten biggest theme consumers to join as a maintainer, you'd have a pool of people to ask for PR reviews, or who could cut a release for you. A second idea would be to have the triage/tagging process in place for "big" vs. "small" changes, and to add an extra release step for cases where "big" changes have been merged (e.g., before releasing, actually read through the kitchen-sink page carefully to make sure things are rendering correctly). Or maybe the suggestion above about creating PRs into other projects to test how the theme changes affect them happens only at release time, instead of on every PR or even every "big" PR. That approach could keep releases with only "small" changes quick and easy.
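[Editor's note: a lighter-weight variant of the "PRs into other projects" idea, sketched here as an assumption rather than anything proposed in the thread, would be for the theme repo to fire a `repository_dispatch` event at release time and let each downstream project build its docs against the candidate ref. The event type, payload field, and build command below are all hypothetical.]

```yaml
# Hypothetical workflow in a downstream project (e.g. MNE-Python).
# The theme repo would send the dispatch event via the GitHub REST API
# whenever a release candidate is cut.
name: Build docs against a theme candidate

on:
  repository_dispatch:
    types: [theme-prerelease]  # hypothetical event type

jobs:
  build-docs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: "3.10"
      # "ref" is a hypothetical payload field naming the theme branch/tag to test.
      - run: |
          pip install -r doc/requirements.txt
          pip install "git+https://github.com/pydata/pydata-sphinx-theme@${{ github.event.client_payload.ref }}"
      # Assumed doc-build entry point -- substitute the project's real command.
      - run: make -C doc html
```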
@damianavila and I brainstormed this a little bit today. We came up with a few ideas that could help out here as well. Most of them are focused on automating things, because those seem like the most realistic to accomplish, given that we do not have a lot of ongoing dedicated resources.
Maybe also the pydata mailing list?
@damianavila that's a good idea - though I am less knowledgeable about how to send an email via a GitHub action than I am about sending a tweet via an action, haha. I think we should lean heavily towards things we can automate, under the assumption that the more manual steps are needed, the less likely they are to happen - or the less likely a release will happen at all.
It is probably not a simple thing to send an automated email via a GitHub action... I give you that 😉
👍
A quick thought on release candidates, given the recent tension that popped up around an IPython release that shipped a feature some people disagreed with: I am not so sure that having a release candidate will surface many issues, unless we know that communities will actually use it. In the recent IPython case, betas and release candidates were offered, but nobody actually used them or provided feedback, so issues only came up when a proper release happened anyway. So I am wondering: if we started using release candidates with this theme, would people actually use them and provide feedback? Do others have experience / intuition about this? The risk we'd take is adding an extra manual / toil step to our release process, with limited benefit.
We (MNE-Python) have a CI job testing against RCs or even nightlies, but at present I think we don't do that for doc building. I think we'd be willing to add a CI job for that, though I'd have to run it by the team. I don't think anyone would have time to review the builds visually, but in principle we could at least test against RCs whether the build succeeds or fails.
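[Editor's note: here is a minimal sketch of what that pass/fail check could look like as a scheduled downstream job, assuming the theme publishes release candidates to PyPI; the requirements path and build command are assumptions.]

```yaml
# Hypothetical nightly job in a downstream project: install the theme's
# latest pre-release from PyPI and check only that the doc build succeeds.
name: Docs vs. theme pre-releases

on:
  schedule:
    - cron: "0 4 * * *"  # once a day

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: "3.10"
      # --pre allows pip to pick up published release candidates.
      - run: |
          pip install -r doc/requirements.txt
          pip install --pre --upgrade pydata-sphinx-theme
      # Assumed doc-build entry point; the job's only signal is pass/fail.
      - run: make -C doc html
```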
We should encourage the communities to use them! And ask our communities to set up some automated testing, to at least catch big things when a release candidate is published. One alternative is a rolling-release process where we continuously (or on a frequent basis) publish new releases... but I would not bet on that sort of system for a "parental" theme like this one.
Background
In #529 we had some mismatch in our expectations around when to release. We should set a repository policy for releases, so that we have a shared understanding and expectations around when to make them.
Proposal
I propose we follow practices like this:
- We mark any issue that should block a release with a `block-release` tag added.
- If there are no issues with a `block-release` label, then we allow and encourage any maintainer to make a release at any time.
Rationale
Minimizing the "delta" between releases, and automating them as much as possible, is generally good modern software development practice. It ships improvements more quickly and ensures our users benefit from new development. It also minimizes the "toil" associated with creating a new release, which is generally not a fun maintenance process. Given that we have reasonable testing infrastructure, we can generally be confident that major problems are not being introduced.
Using an issue label for `block-release` will allow us to be explicit about when an issue is so important that we should block a release on it. If we don't make this explicit, it may create frustration and confusion between maintainers who have differing ideas of when an issue should prevent a release.
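[Editor's note: to make the proposal concrete, here is a rough sketch (not an agreed-upon implementation) of how the `block-release` gate could be automated in a tag-triggered release workflow, using the `gh` CLI to check for open blocking issues before publishing; the workflow name and the PyPI token secret are assumptions.]

```yaml
# Hypothetical release workflow: refuse to publish while any open issue
# carries the block-release label, then build and upload to PyPI.
name: Release

on:
  push:
    tags: ["v*"]

jobs:
  check-blockers:
    runs-on: ubuntu-latest
    steps:
      # gh is preinstalled on GitHub runners; GITHUB_TOKEN can read issues.
      - name: Fail if any open block-release issue exists
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          count=$(gh issue list --repo "$GITHUB_REPOSITORY" \
            --label block-release --state open --json number --jq 'length')
          if [ "$count" -gt 0 ]; then
            echo "Release blocked: $count open block-release issue(s)." >&2
            exit 1
          fi

  publish:
    needs: check-blockers
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: "3.10"
      - run: pip install build && python -m build
      # Assumes a PYPI_API_TOKEN secret is configured for the repo.
      - uses: pypa/gh-action-pypi-publish@release/v1
        with:
          password: ${{ secrets.PYPI_API_TOKEN }}
```

Making the gate a required job for publishing means "any maintainer may release at any time" stays true mechanically: pushing a tag either releases or fails loudly with the list of blockers.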
Feedback?
I suspect that @jorisvandenbossche has different ideas about some of the above - so I would love to know what he thinks. Perhaps we can iterate and agree on a policy, and then encode it in our developer guidelines.