-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Upgrade lint infra & add a few more conservative built-in checks #2286
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! A few minor suggestions and questions.
default_language_version: | ||
python: python3 | ||
|
||
default_stages: [commit] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are these new defaults needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default_language_version
to ensure Python 3 is used on hooks that don't specify it, but I think its redundant nowadays? Doesn't hurt to have it, but probably doesn't hurt to remove it either?
The default_stages
ensures that pre-commit run --hook-stage manual
only runs the manual
hooks (i,e. codespell
, and any others we may add); otherwise, it would run all hooks and there wouldn't be a way to run only manual hooks, which often comes in handy.
@CAM-Gerlach can this be split into multiple PRs please. Distinct changes I identified
Some of these could be merged (1 and 3), but these deserve individual conversations, and I don't want to loose a thread by intermixing everything here. In general I would only favour "omnibus" PRs where there's going to be lots of pinging, but I don't think that's a massive concern here. If you massively disagree, let me know and I'll give a review of everything here. I do have a few concerns though on some things, so I'm marking this PR as draft for now. Thanks / sorry! |
I would would have considered splitting it into multiple PRs in hindsight, but I did put a lot of effort into trying to neatly separate the work into atomic commits for each of those changes, to the extent practical (for example, the PEP and link fixes belong with the respective checks that flagged them), and there is a large amount of cross-dependencies between changes, both physical (merge conflicts between similar lines) and logical (the fixes must be merged with the checks that catch them; PEP 12 must be updated before we add a check to enforce it, etc.). Unfortunately, GitHub doesn't natively allow for easily chaining PRs (much less more complex dependency relationships between them), so I'd have to make them one by one or manually rebase each of them as previous ones were merged. Furthermore, I wanted to be sure I could test the entirety of the changes integrated together to ensure no unforseen interdependent issues emerged, as can occasionally happen with pre-commit checks. I did try to minimize the number of PEPs touched, but this would mean at least a half dozen authors would get pinged again, some multiple times, not to mention pinging all the PEP editors for every PR, and fragmenting the existing discussion here—especially given that Brett and Guido have both indicated on multiple occasions recently that they are having trouble keeping up with everything going on in the PEPs repo these days due to the volume of notifications, I don't think that's a trivial concern. To note, while it does do a few different related things, it is still only around 100-odd lines of changes over 14 files (relative to #2209 , which had ≈5000 lines of changes over nearly 400 files) so I don't think its unmanageable to review in one place with separate review comment threads (and GitHub has a feature now that can flag unresolved conversations right above the merge button, and, if enabled, can even block merging on it). I did add a few of the existing built-in checks, but was very conservative while doing so and didn't add anything that flagged more than a few existing files in the repo, or added more than a total of a few seconds to the CI lint check run (which is by far the fastest runner). I'll certainly keep it in mind for future PRs (and the issues/PRs I'm planning are all much more tightly focused one specific changes), but given the extra noise, effort and difficulties, would you be able to just share your concerns here and we can discuss them? |
Short reply as on mobile (timezones) I see the point on dependencies. Can we at least split out general updating vs PEP 12 vsthe new role checker? that could be three sequential PRs. I think number of changes and "type" of changes are orthoganal -- it can be hard to review if there are a lot of either. But maybe I'm just less experienced. I also don't want to hide a substantive PEP update in an "infra" PR. (Taking Jelle's thumbs up on my original comment as that he agrees with at least the high level of my ask there). Will see what your thoughts are and act accordingly tomorrow (Saturday). A |
Sorry, meant to reply last night (err, morning), but fell asleep in my chair. Honestly, I didn't like having the PEP 12 changes as part of this PR either; while the update isn't intended to be that substantive, just implementing the part of what we already agreed on in #2130 that's relevant to linking PEPs (and RFCs) specifically, and update it accurately reflect the current state of the repo after the implementation changes in #2208 and #2209 (which would conflict even more so with the requested linting check), I'll manually rebase @hugovk 's requested changes in, move it and its corresponding check to a followup PR, and keep the baseline linting infra updates here. |
6760f4d
to
765bd41
Compare
d9d6411
to
b884f77
Compare
Okay @AA-Turner @hugovk @JelleZijlstra , I manually rebased in all of Hugo's suggested changes to the appropriate commits and shaved off the last two (the PEP 12 changes about the PEP links and the corresponding check) to another branch for a followup PR once this one is merged. I also split up the first two commits more cleanly as I'd originally intended, with the first including all the descriptive text changes on the Pre-Commit checks and GitHub workflows, with a few new additions to more clearly describe and disambiguate the Sphinx and Docutils-based CI checks (as its rather non-obvious and confusing from their current names), and the second that actually bumps the versions and makes a few other minor functional tweaks. Looking forward to your reivews and feedback, @AA-Turner and @JelleZijlstra . Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this draft PR be set as ready for review?
Since it was @AA-Turner who set it to avoid it getting merged before he went ahead and reviewed it due to him having "a few concerns though on some things", I wanted to give him the opportunity to elaborate on them before I did so. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'twas I, and here is the promised review. (Take care of yourself by the way, falling asleep at your desk isn't great!).
Precis:
- Branches -> commit hashes -- I must have missed this discussion. Personally I think wayback machine links convey better the idea that it is code or whatever at a particular point in time
- New checks -- there are a load of these, and if I'm honest I might be being a cumudgeon, but the PEPs repo is fundamenetally of text documents -- I'm not sure I see lots of value creating infrastructure amongst ancilliary files, which would only ever be meant to be an ancilliary example (the live example one imagines would be in CPython, type checkers, packaging tools, etc).
- clockutils.py -- the GH interface shows that the rwx bits changed -- I don't think this is needed for a historical file (
python clockutils.py
is but seven characters more, or fewer if you have python aliased to py) - GHA files -- I think the renaming is premature at this stage (deploy-gh-pages is intentionally named not to connote official status as a Sphinx build), and some of the other changes aren't needed (not every step needs a name!)
A
@@ -38,6 +38,7 @@ pep-0009.txt @warsaw | |||
pep-0010.txt @warsaw | |||
pep-0011.txt @brettcannon | |||
pep-0012.rst @brettcannon @warsaw | |||
pep-0012/ @brettcannon |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not going to comment on each of these individually -- I don't think any of these are required, though? If a PEP author cares enough about ancilliary files (as I believe you intend to do with your metadata one), he can add himself here manually.
I think I'd also prefer these were in an "ancilliary files" section instead of inline if we did do this, but as above I'd prefer we didn't
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not going to comment on each of these individually -- I don't think any of these are required, though? If a PEP author cares enough about ancilliary files (as I believe you intend to do with your metadata one), he can add himself [sic] here manually.
The only changes (other than adding myself as co-maintainer of the GitHub actions subdir, since I maintain the lint action, and have background experience in this area) were fixing two minor syntax issues, updating it to reflect the move of the confusingly-named/located scan_ops.py
file to follow PEP 1 like all the others (and make clear its not infra), and adding Brett as owner of the PEP 12 template, as creator and PEP co-owner (which appeared to be a simple oversight).
Other than that, all of the other subdirs and various ancillary files were assigned to their PEP author/sponsor—and per the intent, if not the explict letter of PEP 1, PEP authors/sponsors should be responsible for and notified about the files specific to their PEPs (as is their designated role per PEP 1), and that should be clear to others from the GitHub UI when one is modified, unless there is good cause in a particular circumstance to explicitly opt out.
@brettcannon is this okay with you?
I think I'd also prefer these were in an "ancilliary files" section instead of inline if we did do this, but as above I'd prefer we didn't
Changing the current organization scheme is a much bigger change than this PR and is out of scope here, which just makes a few tweaks to fix a couple apparent oversights of auxiliary files getting out of sync with PEPs—a problem which not keeping the two together would tend to exacerbate.
he can add himself
Just FYI, I can't speak for BrE, but using "he" to refer to persons of unspecified gender (not all PEP authors are men) is considered pretty objectionable (and rather archaic/uncommon) in modern, idiomatic AmE and on this repo; see #855 , #2214 , #2258
@@ -1,13 +1,15 @@ | |||
name: Build | |||
name: Docutils Build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this churn is warranted -- the workflow will be retired soon (TM), and all that need to care about which is Docutils-only and which is Sphinx already know
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think its really fair to dismiss as mere "churn" a change that clarifies names that are actively confusing, at least to me as a PEP editor (and one quite familiar with the build system) and likely to others as well. In particular here, there's no obvious indication as to whether the Build
check refers to the Docutils build or, as might logically be assumed, the Sphinx build (given it is the latter that is invoked by the file named build.py
). I only figured out what they actually referred to after digging through the various files, and I still have to remind myself sometimes.
I'm not sure how one could argue that being clearer and more descriptive is a bad thing, so given it isn't retired yet, while I welcome your suggestions on revised or alternative names, if you feel it isn't worthwhile to spend time on it the most logical course of action would be to just use it for now, rather than both of us wasting time debating (or reverting) it.
runs-on: ubuntu-latest | ||
|
||
steps: | ||
- uses: actions/checkout@v2 | ||
- name: Check out repo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style -- I never name the checkout action step
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...Except you did here on the other workflow in this repo, complete with an emoji even 😝
I added it there for consistency with the others, which all had names and had name
as their first key. Its not terribly important, but neither does it do any harm, and it would be a non-zero amount of work (and rebasing) to revert it,
@@ -1,9 +1,10 @@ | |||
name: Deploy to GitHub Pages | |||
name: Sphinx Build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer not to do this now. Fixing up everything on preview build + etc is on my to do list, but I want outcome of 676 & backend provider done first. We may entirely delete this if RTD is used, for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, I don't think we should let the perfect be the enemy of the good and actively reject a change that clarifies something causing active confusion right now (potentially not only for PEP editors, but authors as well), just because we might change it at some point in the future. At least in the context of running on each PR, which is likely how 95% of people are going to actually see it (especially those besides you), calling the whole workflow "Deploy to GitHub pages" when that is actively confusing as that is the one thing it doesn't do (PEP authors might assume, for instance, that it actually deploys their PEPs live somewhere), and also not being clear what build system is involved (one might assume the same one in the Build
job because of the very confusing names).
I find it difficult to argue that clarifying this, even imperfectly, is not an improvement over the status quo, and if you feel its not worth spending more effort on, the least-effort course of action for all involved is going with the above for now.
|
||
on: [push, pull_request, workflow_dispatch] | ||
|
||
jobs: | ||
deploy-to-pages: | ||
name: Build & deploy to GitHub Pages |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need a name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is clearer, cleaner and more descriptive in the UI, so I don't see how it is a net harm. If you'd like to suggest a different name, feel free of course.
files: '^pep-\d+\.txt|\.rst$' | ||
types: [text] | ||
|
||
# Manual codespell check |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're going to change all the "name" fields from here down, please can we say "Validate the PEP 'X' field"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These names need to be as short as practical while still conveying the key information, since they get printed to the often very limited width of the user's terminal in quick succession, and by general convention, are framed as titles/fragments rather than complete sentences. Therefore, including extraneous and redundant words like "the" is not really necessary here.
@@ -1,5 +1,5 @@ | |||
#!/usr/bin/env python3 | |||
# http://legacy.python.org/dev/peps/pep-0465/ | |||
# https://www.python.org/dev/peps/pep-0465/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was "legacy" legacy at the point this was written? If so I think we should use a wayback machine link.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope; I looked up the history on this and it was added during a period when the main site was hosted on the new system but some things, like the PEPs, were still on the old one and in the process of transitioning. This is simply a link to the PEP, so since I was moving the file anyway, it makes sense to use the most up to date link as of now.
@@ -308,7 +308,7 @@ References | |||
.. [2] tp_call/PyObject_Call calling convention | |||
https://docs.python.org/3/c-api/typeobj.html#c.PyTypeObject.tp_call | |||
.. [3] Using PY_VECTORCALL_ARGUMENTS_OFFSET in callee | |||
https://github.com/markshannon/cpython/blob/vectorcall-minimal/Objects/classobject.c#L53 | |||
https://github.com/markshannon/cpython/blob/815cc1a30d85cdf2e3d77d21224db7055a1f07cb/Objects/classobject.c#L53 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this guaranteed to exist forevermore? I'd prefer to use a wayback machine link over this at the time it was written.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, at least pretty forevermore. GitHub describes them as permanent links and permalinks:
A big downside of the Wayback Machine is it's much slower to load than GitHub.
And there doesn't appear to be any Wayback Machine link for that page, let alone for when the PEP was written.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the link, sounds reasonable. (I'd probably prefer that authors used tags or similar going forwards, but for "fixing" previous this may be the least worst solution).
A
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd probably prefer that authors used tags or similar going forwards, but for "fixing" previous this may be the least worst solution.
Tags are mutable and can change or be deleted, and require the user to either find a suitable tag (if one exists), or (if they control the repo) create one, which is far more effort than simply pressing the y
key. That's why always using proper permalinks is the standard, best practice approach for links that are, well, permanent. So long as authors are using inline links correctly (which we will make sure they will, going forward) what is the downside to explicitly specifying a commit hash?
@@ -97,8 +97,8 @@ but also on the value of some class attributes, like the ``BinOp`` | |||
example above. The Visitor pattern is insufficiently flexible for | |||
this: it can only select based on the class. | |||
|
|||
For a complete example, see | |||
https://github.com/gvanrossum/patma/blob/master/examples/expr.py#L231 | |||
See a `complete example |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you only change the link here. See preference for wayback above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I normally would have, but it makes the link much longer, "messier" and harder to read if as plain text, both in the source and in the rendered HTML, so I made it a standard inline link to perhaps better preserve the spirit, if not the literal text of the author's intent. I don't think Guido would object to that, but we can ask him if you really think its needed.
Welp, I said I wouldn't propose non-critical changes that were related to your work again and leave that to the PEP Editors so as to avoid a repeat of last time—but I didn't exactly count on the three of us being made PEP Editors, heh. I just really wish we didn't have to get so caught up in bikeshedding over seemingly every small detail (most of which boil down to personal preference) on routine infra maintenance PRs—that costs us both far more valuable time and creates much more "noise" and churn than any of the changes here, and I'm sure both of us have more important things to move on to instead (and I know I can be guilty of it too). But given this time I'm responsible for most of the infra involved, it wouldn't be fair to you or anyone to just decide it isn't worth dealing with, close this and put the onus on you to go implement everything your way instead.
It seems you already resolved this with Hugo above, so I'll collapse this. Commit hashes are immutable (well, outside of extraordinary and very deliberate action), so the only way the links break is if the repo owner either perma-deletes their repo (which would again require very unusual and deliberate action), moves it and creates another repo with the same name but missing that commit, or GitHub itself changes up its link format and doesn't provide redirects (which would break the world). Unlike modifying a file on a branch, all of these are highly unusual and require very purposeful action on the part of the repo owner, which on PEPs is generally either repos the PEP authors control, or major maintstream projects; this is about as likely as something happening to it in the Wayback machine's archive (a takedown, data loss, deletion of old archives, a technical problem, URL change, going offline, etc). Furthermore, let's put things into perspective—as I discussed with Hugo above, we're talking specifically about links to specific line numbers on VCS systems, which are far more mutable than a file on a branch, much less a commit hash. Using Wayback Machine links for just these cases but not links to whole files doesn't really make sense, so we'd want to enforce it at least on all VCS links,. But wait, why are we enforcing it only for VCS links? Surely a link to a commit hash on GitHub is much more likely to be stable than any arbitrarily link to a website, which break all the time. So it only makes sense to enforce it for all links (or at least many/most of them...but how to detect that). But that opens a whole other can of worms... And in any case, using permalinks for online VCS systems is a well accepted and used best practice, whereas using Wayback Machine links for the same isn't really seen outside of some parts of Wikipedia. Besides the greatly decreased usability of the resulting displayed site, it isn't nearly as reasonable and much more burdensome to require all PR authors to use the Wayback Machine instead of standard VCS links (or all links) and many more cases that would need an exception, making things more painful for PEP readers, authors and editors alike. If you want to propose it, that's fine and we can discuss it in an appropriate venue, but it is a much bigger than than proposed here, and thus well out of scope for this PR.
As mentioned, I tried to be pretty conservative and only enabled a limited number of built-in checks after careful investigation that were not just style-related and had an articulatable benefit for this repo, not enabling any that required fixing a few existing files in the repo, none that required dealing with any existing false positives throughout all past PEPs (except for the new PEP link check you specifically requested), and not adding any code checkers, linters or formatters (beyond simply verifying valid AST).
It only added the execute bit, following the standard convention that files with shebangs should be marked executable, and the converse, which all the other existing files in the repo respect, and which the new checks enforce (catching instances like on your recent PR, given this is easy to miss). Its just a mode change, so it doesn't create extra noise in the git blame, and is a smaller change than (and avoids the small but perpectual maintenance cost of) an explicit exception.
I didn't rename the files, I only tweaked their display names in the UI. If you have specific suggestions for revised or alternate names I'd be happy to hear them, but the current names are very confusing, at least to me as a PEP editor (and one quite familiar with the build system), to the point of being actively misleading (I only fully realized what they actually referred to after digging through the files, and I still have to remind myself):
It clearly and descriptively document what they do (and make the UI a bit more polished), as well as be consistent in the format of each workflow step, and between workflows. So if a maintainer is willing to do the work to add it as part of more substantive and needed changes, it adds more churn and wasted effort to ask them to remove them without a clear rationale for why they are harmful. |
If we still can't agree on things, I suggest that @hugovk step in and give his take rather than just us go back and forth debating, since I'm sure we both have much better things to do and he's someone we both know and trust, someone who's been considerably involved in both of our work on this repo, and who has a background in both of these areas. This PR already has nearly as many comments as it has lines of code changed, so I don't want to drag this on much longer (especially since it is blocking my next PR, to add the PEP 12 changes and corresponding check you requested). |
Co-authored-by: CAM Gerlach <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all of these changes, and sorry for not checking this PR earlier! While I might have preferred a series of more focused PRs, these changes all look reasonable and useful.
Thanks; I would have liked to have waited for @AA-Turner 's response, though I can address any remaining substantive concerns on the followup PR.
Just to make sure it didn't get lost in the discussion, I did split off the PEP 12 changes and the corresponding new checks, which I will rebase and follow up with now, leaving this one to just be bumping the various versions, enabling a few built-in checks and doing a bit of general maintenance cleanup. My future PRs will be much more focused on specific changes on specific PEPs, as generally discussed in advance. |
Opened as #2291 |
In #2209 , @AA-Turner batch-converted existing PEPs to use explicit:pep:
and:rfc::
roles to reference other PEPs and RFCs, rather than relying on hacky, fragile and limited implicit links, or hardcoding URLs (often in footnotes). However, some users may still be tempted to do the later, whether due to the limitations of GitHub's rendered previews (as saw in #2282 ), or simply being unaware of the role's existence. Ensuring we always reference PEPs with the appropriate roles rather than hard links is simpler, more concise and more reliable, ensures they are formatted correctly and (with the new build system proposed in PEP-0676) allows us to do things like automatically including the PEP title in the link titletext, as users have requested.Therefore, as originally prompted by @AA-Turner 's suggestion on #2282 , this PR adds a cheap Pre-Commit pygrep check for direct hardcoded URLs to other PEPs and RFCs. This avoids regressions in future PEPs (and spot-fixes the couple remaining instances not caught by that PR) and allowing authors and reviewers to easily spot where this occurs and replace it with the simpler, more concise, more reliable and more functional roles. Furthermore, this PR substantively updates PEP-0012 to reflect this new style, explaining how to use the roles and giving examples, as well as updating outdated content related to the old PEP link footnotes deprecated in #2130 .This was deferred to another PR at the request of @AA-Turner
Also, this piggybacks on an existing branch that upgrades of the linting infra, adds a handful of relevant new built-in checks (for a few common errors, format validity and that VCS links are permalinks, to ensure they don't break or point to something other than intended down the road) and fixes a small number of related issues. Like all checks (aside from codespell), these run in the CI and can optionally be run (on-demand or automatically) by contributors locally, identifying and fixing issues right as they commit.
Add new pre-commit check for bare/direct PEP and RFC linksDeferred by request of @AA-TurnerDescribe usingDeferred:pep:
and:rfc:
roles in PEP 12, and update/remove outdated references section contentUpdate a couple spots that still used the deprecated PEP/RFC referencesDeferred