fix(core): prevent 404 when accessing /page.html #7184

Merged
slorber merged 13 commits into main from jc/fix-normalize-location
Apr 22, 2022

Conversation

Josh-Cena
Collaborator

@Josh-Cena Josh-Cena commented Apr 17, 2022

Motivation

Because we use trailingSlash: false in production, https://docusaurus.io/docs/installation.html serves the correct HTML file, but the router doesn't recognize the location because it only normalizes paths with an /index.html suffix. I think we can extend this to .html extensions in general.
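
As an illustration of the idea in the Motivation, here is a minimal sketch of extending the suffix normalization from /index.html to any .html extension. The helper name stripHtmlSuffix is hypothetical and this is not the actual Docusaurus implementation; it only shows the kind of pathname rewrite being proposed.

```typescript
// Hypothetical helper for illustration only; not Docusaurus' real code.
// Strips a trailing "/index.html" or ".html" so the client-side router
// sees the same pathname the route was registered under.
function stripHtmlSuffix(pathname: string): string {
  // The optional "/index" group covers the case that was already handled;
  // the bare ".html" case is the proposed extension.
  const stripped = pathname.trim().replace(/(?:\/index)?\.html$/, '');
  // An empty result means the input was "/index.html", i.e. the site root.
  return stripped || '/';
}

console.log(stripHtmlSuffix('/docs/installation.html')); // "/docs/installation"
console.log(stripHtmlSuffix('/docs/index.html')); // "/docs"
console.log(stripHtmlSuffix('/index.html')); // "/"
```

Paths without an .html suffix pass through unchanged, so the rewrite only affects locations the router would otherwise fail to match.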

Have you read the Contributing Guidelines on pull requests?

Yes

Test Plan

Interestingly, yarn serve seems to be happy with /docs/installation.html. I don't know if it has anything to do with cleanUrls or related settings. Anyway, I simply forced trailingSlash: false in the preview, and https://deploy-preview-7184--docusaurus-2.netlify.app/docs/installation.html is now accessible.

@Josh-Cena Josh-Cena added the "pr: bug fix" label (This PR fixes a bug in a past release.) Apr 17, 2022
@facebook-github-bot facebook-github-bot added the "CLA Signed" label (Signed Facebook CLA) Apr 17, 2022
@netlify

netlify bot commented Apr 17, 2022

[V2]

Name Link
🔨 Latest commit 2b82e53
🔍 Latest deploy log https://app.netlify.com/sites/docusaurus-2/deploys/6262b12eb5e018000867ce11
😎 Deploy Preview https://deploy-preview-7184--docusaurus-2.netlify.app

@github-actions

github-actions bot commented Apr 17, 2022

⚡️ Lighthouse report for the changes in this PR:

Category Score
🟠 Performance 70
🟢 Accessibility 100
🟢 Best practices 92
🟢 SEO 100
🟢 PWA 90

Lighthouse ran on https://deploy-preview-7184--docusaurus-2.netlify.app/

@github-actions

github-actions bot commented Apr 17, 2022

Size Change: +356 B (0%)

Total Size: 802 kB

Filename Size Change
website/build/assets/css/styles.********.css 107 kB +194 B (0%)
website/build/assets/js/main.********.js 606 kB +152 B (0%)
ℹ️ View Unchanged
Filename Size Change
website/.docusaurus/globalData.json 50.1 kB +10 B (0%)
website/build/index.html 38.8 kB 0 B

compressed-size-action

@Josh-Cena Josh-Cena changed the title from "fix(core): normalize /path.html to /path to prevent 404" to "fix(core): normalize /path.html location to /path to prevent 404" Apr 17, 2022
Collaborator

@slorber slorber left a comment

2 important things to consider:

@Josh-Cena
Collaborator Author

Josh-Cena commented Apr 20, 2022

> Many legacy doc sites moving to Docusaurus want to preserve older canonical urls for SEO.

That's the entire purpose of this PR. Before, the .html extension would make the page not render, because Netlify serves the HTML file without a redirect, but the router doesn't understand this location.

> This page renders better than before but produces FOUC after hydration

Do you have a repro? It looks the same to me even under throttling.

@Simek
Contributor

Simek commented Apr 21, 2022

Refs:

@Josh-Cena
Collaborator Author

> I even recommended some users to use slug: /xyz.html for site migration purposes

TBF, that looks like a horrible pattern. You shouldn't encode the extension in the location. We should have made this fix long ago, then, if people have to commit to such horrible solutions.

@slorber
Collaborator

slorber commented Apr 21, 2022

> > I even recommended some users to use slug: /xyz.html for site migration purposes
>
> TBF, that looks like a horrible pattern. You shouldn't encode the extension in the location. We should have made this fix long ago, then, if people have to commit to such horrible solutions.

There is no strict rule here.

Some users really want the links to contain the extension, and the canonical URL to contain it too, for SEO and migration reasons.

If you have an existing doc site and all the links contain the extension, and all the canonical URLs known by google contain the extension, then when upgrading you want to minimize the SEO impact and migrate to Docusaurus while keeping this extension in site links and canonical URLs.

Allowing /xyz to be served from /xyz.html is nice but it's not what users want to ensure a proper SEO migration.

That's why I'd like this PR to contain a dogfood doc with slug: "xyz.html", to ensure it's still possible to force the canonical URL and linking of the page to still contain the extension

@Josh-Cena
Collaborator Author

Okay, I see

@slorber
Collaborator

slorber commented Apr 21, 2022

> > This page renders better than before but produces FOUC after hydration
>
> Do you have a repro? It looks the same to me even under throttling.

This link is good enough to reproduce: https://deploy-preview-7184--docusaurus-2.netlify.app/docs/installation.html

Just hit refresh, and throttle your CPU if needed

video: https://cln.sh/RNi85L

This doesn't happen on https://deploy-preview-7184--docusaurus-2.netlify.app/docs/installation

@Josh-Cena
Collaborator Author

Okay, I see what you mean. PendingNavigation is not capturing the route correctly and letting the component update, leading to react-loadable's loading view being displayed. Might be because the location is changed twice. Going to take a look later; I already messed up in 49a9fe2 😁

@Josh-Cena
Collaborator Author

@slorber I've tested. Using slug: dummy.html works as expected: https://deploy-preview-7184--docusaurus-2.netlify.app/tests/docs/dummy.html. You can see that the canonical URL still has the .html extension, and so do the links pointing to it. normalizeLocation also wouldn't do anything in this case.

Do you think I need to fix the flash of loading screen in this PR, or should we figure that out later, considering it's much more minor compared to the 404 page before?

@Josh-Cena Josh-Cena changed the title from "fix(core): normalize /path.html location to /path to prevent 404" to "fix(core): prevent 404 when accessing /page.html" Apr 22, 2022
@Josh-Cena Josh-Cena requested a review from slorber April 22, 2022 11:30
@slorber
Collaborator

slorber commented Apr 22, 2022

If it's easy to fix in this PR let's do this now, otherwise we can merge

Similarly, does it make sense to also ensure this URL works? https://deploy-preview-7184--docusaurus-2.netlify.app/tests/docs/dummy => should be quite easy and similar to current code?

Does all this work with trailingSlash: true?
(not a big deal if it's not the case)

@@ -18,8 +20,16 @@ export default function normalizeLocation<T extends Location>(location: T): T {
};
}

// If the location was registered with an `.html` extension, we don't strip it
// away, or it will render to a 404 page.
const matchedRoutes = matchRoutes(routes, location.pathname);
Collaborator

also handle matchRoutes(routes, location.pathname + ".html"); ?

Collaborator Author

I wouldn't do that for now, because it makes less sense.

/docs/installation.html is a physical file that exists and can be served, so using this as a location makes sense. If the route is /docs/installation, we should try to normalize /docs/installation.html to that.

However, if both the route and the physical file are /docs/installation.html, then /docs/installation doesn't really make sense: it's nowhere to be found. Moreover, I'm worried that this would lead to more edge cases, especially around trailing slash and index.html.
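
The asymmetry described in this reply can be sketched as follows. This is a simplified illustration under stated assumptions, not the merged code: the real change calls matchRoutes(routes, location.pathname), whereas here a hypothetical set of registered paths (registeredPathsSketch) and a hypothetical function (normalizeLocationSketch) stand in for the route table so the example is self-contained.

```typescript
// Simplified location shape; the real code works on the router's Location type.
type LocationSketch = {pathname: string; search: string; hash: string};

// Hypothetical stand-in for the route table (an assumption for this sketch).
const registeredPathsSketch: ReadonlySet<string> = new Set([
  '/docs/installation', // regular doc, served from /docs/installation.html
  '/tests/docs/dummy.html', // doc that opted into `slug: dummy.html`
]);

function normalizeLocationSketch<T extends LocationSketch>(
  location: T,
  registered: ReadonlySet<string>,
): T {
  // A route registered with the `.html` extension is left untouched;
  // stripping the extension here would turn a valid page into a 404.
  if (registered.has(location.pathname)) {
    return location;
  }
  // Otherwise, map the physical file name back to the route path.
  const pathname =
    location.pathname.trim().replace(/(?:\/index)?\.html$/, '') || '/';
  return {...location, pathname};
}

const at = (pathname: string): LocationSketch => ({pathname, search: '', hash: ''});

// "/docs/installation.html" is normalized to the registered route.
console.log(normalizeLocationSketch(at('/docs/installation.html'), registeredPathsSketch).pathname);
// "/tests/docs/dummy.html" is itself a registered route, so it is kept.
console.log(normalizeLocationSketch(at('/tests/docs/dummy.html'), registeredPathsSketch).pathname);
// "/tests/docs/dummy" is deliberately NOT mapped to "/tests/docs/dummy.html".
console.log(normalizeLocationSketch(at('/tests/docs/dummy'), registeredPathsSketch).pathname);
```

The sketch only strips suffixes and never appends one, which mirrors the reasoning above: a path with .html corresponds to a physical file, while the reverse mapping has no file behind it.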

website/docusaurus.config.js (outdated review thread, resolved)
@Josh-Cena
Collaborator Author

> Does all this work with trailing slash: true?

We can test this before merging, when I revert the trailingSlash config change.

@Josh-Cena
Collaborator Author

Josh-Cena commented Apr 22, 2022

Everything works as expected @slorber

The PendingNavigation looks slightly non-trivial. I'd need to investigate it more.

@slorber slorber merged commit c4e92c8 into main Apr 22, 2022
@slorber slorber deleted the jc/fix-normalize-location branch April 22, 2022 15:44