Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft script to find diverging links #1966

Merged
merged 9 commits into from
Oct 7, 2024
Merged

Conversation

Carreau
Copy link
Collaborator

@Carreau Carreau commented Aug 29, 2024

See Quansight-Labs/czi-scientific-python-mgmt#88

Incomplete, in particular we should handle relative and anchor linsks, starting with #, and .

$ python tools/divergent_links.py docs/_build/html

@Carreau Carreau changed the title Draft script to find divergin links Draft script to find diverging links Aug 29, 2024
Copy link

Coverage report

This PR does not seem to contain any modification to coverable code.

@Carreau Carreau marked this pull request as ready for review September 12, 2024 09:20
@Carreau
Copy link
Collaborator Author

Carreau commented Sep 12, 2024

Not perfect, but in a usable state

See Quansight-Labs/czi-scientific-python-mgmt#88

Incomplete, in particular we should handle relative and anchor links,
starting with #, and .

$ python tools/divergent_links.py docs/_build/html
@trallard trallard added tag: accessibility Issues related to accessibility issues or efforts kind: maintenance Improving maintainability and reducing technical debt labels Sep 19, 2024
@Carreau
Copy link
Collaborator Author

Carreau commented Sep 20, 2024

As a note from yesterday's meeting,

This just prints group of diverging link:

  • content of the a tag, href, and which page ti was found on.

As there are too many valid reasons to have diverging links, it does not fail, and is not in CI.

Reasons for diverging links are for example: next, previous, view on github ...

Copy link
Collaborator

@gabalafou gabalafou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a nice tool. Should we add some instructions in our docs so that theme adopters can also use it?

I have a few review questions and comments, could you look into those?

My biggest overall question though is whether we need to look for links with same name but different URLs across pages (versus evaluating a single page at a time for such links). In the case of same name-different URL links, the issue is most pressing when such links occur on a single page. It puts up a barrier for someone who uses speech commands (like "go to [link name]"). Imagine you have a long article that mentions various APIs and the word "API" gets linked to two different pages. Then when the user says "go to API", the speech command software has to pull up a secondary menu so that the user can clarify which URL they actually want to go to. But this is much less of a problem for such links that occur on different pages.

PS. While it is true that one of the WCAG criteria is for UI elements that are repeated across pages on the site to be consistently labeled, that's a bit of a different concern than the one I raised. In that case, it's like, if you have a "contact us" link in the same place across all the pages on your site, then it should NOT say "contact us" on one page but then "send us feedback" on another.

docs/community/topics/dependencies-js.md Outdated Show resolved Hide resolved
docs/community/topics/manual-dev.md Outdated Show resolved Hide resolved
tools/divergent_links.py Show resolved Hide resolved
tools/divergent_links.py Show resolved Hide resolved
tools/divergent_links.py Show resolved Hide resolved
tools/divergent_links.py Show resolved Hide resolved
tools/divergent_links.py Outdated Show resolved Hide resolved
tools/divergent_links.py Outdated Show resolved Hide resolved
tools/divergent_links.py Outdated Show resolved Hide resolved
Comment on lines 1 to 4
"""This script help checking divergent links.

That is to say, links to the same page,
that have different titles.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My terminology was terrible (divergent versus convergent) but you got my definition backwards. A divergent link is same name, different URLs.

So this got me a bit confused when I reviewed the PR.

I think the code in this PR actually detects divergent links (same name, different URLs), in which case the comment is wrong.

Copy link
Collaborator

@gabalafou gabalafou Sep 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PS. Please let's ditch my terminology. It's terrible. Nobody is going to be able to remember which side of divergent or convergent they are on. We should probably stick with names like same-name-different-URL links and same-URL-different-name links, unless we can think of something creative and memorable.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe, less of a mouthful... name-consistent links (same name, different URLs) versus name-inconsistent links (different names, same URL)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I don't like that either because it makes it sound like one is bad (inconsistent) and the other good (consistent) whereas neither of them are good. So maybe, URL-inconsistent links versus name-inconsistent links.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reworded.

@Carreau
Copy link
Collaborator Author

Carreau commented Sep 30, 2024

I don't think we need this script to be perfect, as mentioned in the title of the PR this is mostly a draft script as it is not part of CI, so I think we should not strive for perfection, but something that works and we can refine, until we are at a state were we are both happy with the content of the docs and the output/content of the script. Otherwise we'll spend an eternity tweaking that with enomous changes instead of progressive updates that are easier to review.

@Carreau
Copy link
Collaborator Author

Carreau commented Sep 30, 2024

I added also a --all options so that one can check each page independently or all the pages at once.

Copy link
Collaborator

@gabalafou gabalafou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating @Carreau.

My one concern right now is that if we merge this PR to the repo, we will essentially be committing orphaned code. The code added in this PR isn't called anywhere else. It's not referenced in the docs. It's not exactly clear how it is intended to be used.

I don't really know what we do next with this. Do we add it to one of the Sphinx built-time hooks? Do we add a sentence or two about the tool in the accessibility section of the docs?

tools/divergent_links.py Outdated Show resolved Hide resolved
tools/divergent_links.py Outdated Show resolved Hide resolved
tools/divergent_links.py Outdated Show resolved Hide resolved
tools/divergent_links.py Outdated Show resolved Hide resolved
gabalafou
gabalafou previously approved these changes Oct 3, 2024
Copy link
Collaborator

@gabalafou gabalafou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We chatted about this in the October 3 czi sci py team sync and outlined some next steps:

  1. Use the script on the PST docs
  2. If useful, add it as a CI job to PST
  3. Further down the road, if it proves to be useful for us, we can think about ways to make it useful for theme adopters

@drammock
Copy link
Collaborator

drammock commented Oct 3, 2024

My one concern right now is that if we merge this PR to the repo, we will essentially be committing orphaned code. The code added in this PR isn't called anywhere else. It's not referenced in the docs. It's not exactly clear how it is intended to be used.

FWIW other repos I work on have a tools/ folder at the repo root, full of things just like this (scripts that are useful for maintainers, not really part of the package, excluded from the MANIFEST, etc). So I don't see that as inherently problematic. That said, the functionality added here could be of general interest / utility for theme users. So long term I'd aim to either (1) add it to the public API of the theme (e.g., from pydata_sphinx_theme import check_link_consistency) or (2) make it a small separate package and host it in the scientific-python organization.

@gabalafou
Copy link
Collaborator

Thanks for those thoughts @drammock

@Carreau
Copy link
Collaborator Author

Carreau commented Oct 3, 2024

The goal is to have it run in CI and refine. int just that right now there are more diverging links that needs to be updated. My hope is that with this merged, every few days someone can send and update to the script and docs, until it runs without divergences.

Then we can start to think about packaging it and making it stable/configurable. Otherwise I can keep pushing on this PR, but that means I have to find the time to fix everything ; and I would prefer to get user feedback on the Usage .

drammock
drammock previously approved these changes Oct 3, 2024
Copy link
Collaborator

@drammock drammock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just one clarification. +1 for merge

tools/divergent_links.py Outdated Show resolved Hide resolved
@Carreau Carreau dismissed stale reviews from drammock and gabalafou via 20f93bf October 4, 2024 08:18
@gabalafou gabalafou self-requested a review October 7, 2024 11:23
@drammock
Copy link
Collaborator

drammock commented Oct 7, 2024

Ok let's try it out!

@drammock drammock merged commit 5135b8f into pydata:main Oct 7, 2024
25 checks passed
@Carreau Carreau added this to the Empty milestone. milestone Nov 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind: maintenance Improving maintainability and reducing technical debt tag: accessibility Issues related to accessibility issues or efforts
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants