Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix latest version parsing for crates with hyphens in their name #147

Merged
merged 1 commit into from
Feb 14, 2022

Conversation

rrbutani
Copy link
Contributor

Crates with hyphens have their version extracted from the DOM (when viewing the latest version of a crate and adding it to the extension's index) incorrectly.

For example, adding the latest embedded-hal release to the extension's index results in this:

Screen Shot 2022-02-11 at 2 00 39 AM

This in turn causes the extension to produce invalid docs.rs links.


This snippet (which itself addressed fallout from rustdoc changing its version output) is the problematic bit:

if (crateVersion === 'latest') {
let versionText = document.querySelector('.nav-container a.crate-name>.title').textContent;
crateVersion = versionText.split('-')[1];
}

Updating this to take the last element after splitting on - instead of the second fixes this case but I think this leaves other edge cases unhandled.

For example, cargo and friends allow for pre-release versions which are allowed to have hyphens (i.e. 0.0.1-my-extremely-unstable-release). While it's unlikely that the docs.rs "latest" link for a crate will redirect to one of these, it is still possible – docsrs will search stable, unyanked releases first but will fall back to pre-releases.

The wasi crate is one such example of this (no "stable" releases as of this writing, pre-release version has a hyphen in it: 0.11.0+wasi-snapshot-preview1).


Reverting to the previous method (grabbing the version from the sidebar) and changing the query to 'nav.sidebar .version' is general enough to support pages generated before and after the rustdoc version change without being too general (and potentially picking up things in user-added HTML snippets) I think. This is the change I have implemented.

The downside to this approach is that it doesn't work on rustdoc output that predates the addition of the version in the sidebar; since docs.rs doesn't rebuild docs for older releases this can be a real concern for older stable crates that haven't had a release in a while.


Another approach is to snoop through some of the relative links on the page and to extract the version from the relative URLs there. There doesn't seem to be an obvious thing in the DOM to go after and we're definitely still susceptible to changes in rustdoc this way; I'm not sure if this is worth doing.

Yet another option is to pick an approach based on the rustdoc version in rustdoc-vars (i.e. document.querySelector("#rustdoc-vars").getAttribute("data-rustdoc-version")). This could help a little but it's worth noting that it itself is a relatively recent addition to the rustdoc HTML output, I think.

If either of the above or some other approach are preferable, please let me know; I'm happy to update this PR.

@Folyd
Copy link
Member

Folyd commented Feb 11, 2022

Hi, @rrbutani. Thanks for the PR. ❤️

The main purpose why commit 4e84385 changed the CSS selector is because nav.sidebar .version only works in the root HTML page. For example, you can't get the crate version on this page: https://docs.rs/embedded-hal/latest/embedded_hal/trait.Capture.html. However, the user can click the Add to Rust Search Extension button on any page of that crate. The .nav-container a.crate-name>.title is a crate-wide selector, hence I picked it.

We have to say the librustdoc changes rapidly and frequently, it's hard to keep pace with it. The messy condition checks in add-search-index.js are the proof. According to my experience, we should follow those steadiness things as possible. The .nav-container is just a header of docs.rs, which I believe is changed infrequently than librustdoc. Therefore, I still prefer the current way.

Undoubtedly, the versionText.split('-')[1] is a stupid bug. 😅 Would you mind fixing this?

@rrbutani
Copy link
Contributor Author

rrbutani commented Feb 11, 2022

The main purpose why commit 4e84385 changed the CSS selector is because nav.sidebar .version only works in the root HTML page.

Oh, I see; thanks for the context!

The .nav-container is just a header of docs.rs, which I believe is changed infrequently than librustdoc. Therefore, I still prefer the current way.

I agree; I think this also has the added benefit of being the same across all documentation output, regardless of it's generation date.

Undoubtedly, the versionText.split('-')[1] is a stupid bug. 😅 Would you mind fixing this?

I think a regex that picks out -[0-9]+.[0-9]+.[0-9]+.* would do the trick (crate names can end with numbers but can't have dots) and handle all the cases identified but I'd still rather avoid this if possible.


@Folyd How about going after the crate title (space separated crate name and version) in the menu?

embedded-hal drop down menu screenshot

document.querySelector(".nav-container #crate-title").textContent.trim().split(" ")[1] gets this, I think; is that acceptable?

rrbutani added a commit to rrbutani/rust-search-extension that referenced this pull request Feb 11, 2022
(huhu#147 (comment))

This lets us avoid needing to separate out the crate name and version ourselves and uses
the more stable and uniform docs.rs generated DOM elements.
@rrbutani
Copy link
Contributor Author

rrbutani commented Feb 11, 2022

@Folyd I've updated the PR to match ^

Copy link
Member

@Folyd Folyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is another file that needs fixing:

if (crateVersion === 'latest') {
let versionText = document.querySelector('.nav-container a.crate-name>.title').textContent;
crateVersion = versionText.split('-')[1];
}

Crates with hyphens have their version extracted from the DOM (when
viewing the latest version of a crate and adding it to the extension's
index) incorrectly.

This in turn causes the extension to produce invalid docs.rs links.

----

[This snippet](huhu@4e84385#diff-dc9969d9ec58ceb09765359c0caa6852a087b462d98bb9a7e45f1ac75c79b066L12-R14)
(which itself addressed[fallout](huhu@7483ba3#diff-dc9969d9ec58ceb09765359c0caa6852a087b462d98bb9a7e45f1ac75c79b066R12-R15)
from `rustdoc` [changing its version output](rust-lang/rust@6a5f8b1#diff-40a0eb025da61717b3b765ceb7fab21d91af3012360e90b9f46e15a4047946faL1768-L1776))
is the problematic bit.

Updating the logic linked above to take the _last_ element after
splitting on `-` instead of the second fixes this case but I think this
leaves _other_ edge cases unhandled.

For example, `cargo` and friends allow for [pre-release versions which
are allowed to have hyphens](https://semver.org/#spec-item-9) (i.e.
`0.0.1-my-extremely-unstable-release`). While it's unlikely that the
docs.rs "latest" link for a crate will redirect to one of these, it is
still possible – `docsrs` will [search stable, unyanked releases _first_
but *will* fall back to pre-releases](https://github.com/rust-lang/docs.rs/blob/dad5863093535004623df9e7d3789a11502313a5/src/web/mod.rs#L341-L368).
The [`wasi` crate](https://docs.rs/wasi/latest/wasi/) is one such
example of this (no "stable" releases as of this writing, pre-release
version has a hypen in it: `0.11.0+wasi-snapshot-preview1`).

Reverting to the previous method (grabbing the version from the sidebar)
and changing the query to `'nav.sidebar .version'` is general enough to
support pages generated before and after the `rustdoc` version change
without being _too_ general (and potentially picking up things in
user-added HTML snippets). This is the change this commit implements.

The downside to this approach is that it doesn't work on `rustdoc`
output that predates the addition of the version in the sidebar; since
docs.rs [doesn't rebuild docs for older releases](rust-lang/docs.rs#464)
this can be a real concern for older stable crates that haven't had a
release in a while.

---

Another approach is to snoop through some of the relative links on the
page and to extract the version from the relative URLs there. There
doesn't seem to be an obvious thing in the DOM to go after and we're
definitely still susceptible to changes in `rustdoc` this way; I'm not
sure if this is worth doing.

Yet another option is to pick an approach based on the `rustdoc` version
in [`rustdoc-vars`](https://github.com/rust-lang/rust/blob/502d6aa47b4118fea1e326529e71b25a99b0d6c5/src/librustdoc/html/templates/page.html#L147)
(i.e. `document.querySelector("#rustdoc-vars").getAttribute("data-rustdoc-version")`).
This could help a little but it's worth noting that it itself is a
relatively recent addition to the `rustdoc` HTML output, I think.
@Folyd Folyd merged commit dafa48a into huhu:master Feb 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants