Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Markdown content with custom slug fails to resolve relative links ("error Docusaurus found broken links!") #6210

Closed
4 of 7 tasks
Tracked by #632
Domiii opened this issue Dec 28, 2021 · 16 comments
Closed
4 of 7 tasks
Tracked by #632
Labels
bug An error in the Docusaurus core causing instability or issues with its execution closed: working as intended This issue is intended behavior, there's no need to take any action.

Comments

@Domiii
Copy link

Domiii commented Dec 28, 2021

Have you read the Contributing Guidelines on issues?

Prerequisites

  • I'm using the latest version of Docusaurus.
  • I have tried the npm run clear or yarn clear command.
  • I have tried rm -rf node_modules yarn.lock package-lock.json and re-installing packages.
  • I have tried creating a repro with https://new.docusaurus.io.
  • I have read the console error message carefully (if applicable).

Description

yarn build fails with error Docusaurus found broken links! if a markdown docs page with custom slug has a relative link in it.

Steps to reproduce

  • Add classic preset with docs enabled, like so:
      docs: {
        routeBasePath: '/',
        path: 'content'
    }
  • Add two docs:
    • content/01-ch1/01-sec1.mdx:
      ---
      slug: '/'
      ---
      
      hi - this is the landing page!
      
      More good stuff in [sec2](./sec2).
    • content/01-ch1/02-sec2.mdx:
      This is chapter1 -> section2.    

Expected behavior

Relative paths should not error out, even in case of custom slug.

Actual behavior

  • Run yarn build:
    error Docusaurus found broken links!
    Please check the pages of your site in the list below, and make sure you don't reference any path that does not exist.
    Note: it's possible to ignore broken links with the 'onBrokenLinks' Docusaurus configuration, and let the build pass.
    Exhaustive list of all broken links found:
    - On source page path = /ch1:
      -> linking to ./sec2 (resolved as: /sec2)
    

NOTE1: This does not seem to be a problem with a slug referring to a different path. It still errors out if slug: /ch1/

NOTE2: It stops complaining, if I use the absolute path /ch1/sec2.

Your environment

Reproducible demo

No response

Self-service

  • I'd be willing to fix this bug myself.
@Domiii Domiii added bug An error in the Docusaurus core causing instability or issues with its execution status: needs triage This issue has not been triaged by maintainers labels Dec 28, 2021
@Domiii Domiii mentioned this issue Dec 28, 2021
28 tasks
@Josh-Cena
Copy link
Collaborator

Relative links are only file paths when they end in .md or .mdx—we don't try to guess if you mean file paths or URL paths if the extension is not present because ./sec2 is also a valid URL. You should use full Markdown path: ./02-sec2.md

@Josh-Cena Josh-Cena added closed: working as intended This issue is intended behavior, there's no need to take any action. and removed status: needs triage This issue has not been triaged by maintainers labels Dec 28, 2021
@Domiii
Copy link
Author

Domiii commented Dec 28, 2021

@Josh-Cena If this is not supposed to work at all, why do relative links work if they are placed in a file without custom slug?

Also, just to clarify: the problem is linking from a custom slug file, not to it.

@Josh-Cena
Copy link
Collaborator

If you don't have a custom slug, then the two files are served with the same base path: /ch1/sec1 and /ch1/sec2. When you write a URL like ./sec2 your browser resolves that to /ch1/sec2. This is not a file path.

@Domiii
Copy link
Author

Domiii commented Dec 28, 2021

I am not sure I understand what you are saying. The /ch1/sec2 path (or ./sec2 from within the same parent path) exists and works fine. In fact, all paths work just fine.

The problem is only with sec1's complaining about the path not existing, even though I can link to it from any other file just fine.

@Josh-Cena
Copy link
Collaborator

Josh-Cena commented Dec 28, 2021

Sorry, was having dinner, didn't explain it clear enough.

In Docusaurus, there are two types of paths: file paths and URL paths. File paths are what exist on your system; URL paths are what your site's visitors see in their browser.

Without extra configuration, they usually map very intuitively:

  • docs/ch1/sec1.md => /docs/ch1/sec1
  • docs/ch1/sec2.md => /docs/ch1/sec2

What slug does is that it manually sets the URL.

  • docs/ch1/sec1.md (slug: /) => /docs
  • docs/ch1/sec2.md => /docs/ch1/sec2

A key observation is that these two path systems are isomorphic: .. goes one level above, . stays on the current level, ./slug goes to a slug subpath. If you are given a path, you have no way to tell if it's a file path or URL path.

So, how should you link pages? As implied in my previous response in #6208, Docusaurus wants to prioritize using file paths over URLs in general, because editors can help you update file references during refactoring, but URLs are a downstream feature. Therefore, in ch1/01-sec1.md, if you write [link](./02-sec2.md), Docusaurus understands that this is a file path, and automatically converts that to a URL path for you. It can do that because the docs plugin keeps track of the file path => URL mapping, and it knows that 02-sec2.md file eventually becomes the /ch1/sec2 URL.

However, if you write [link](./sec2), does Docusaurus know that this is a file path? No, because <a href="./sec2">link</a> is, in fact, a common pattern to link to another page, compared to <a href="./02-sec2.md">link</a> which is very unlikely to happen (.md files aren't present in the build output; they are converted to .html files). Therefore, Docusaurus sees this as a URL and leaves it as-is.

So, what page does <a href="./sec2">link</a> link to? We go back to your page's URL. Because you have set slug: /, docs/ch1/sec1.md is now mapped to http://example.com/docs. By the argument that URL and file paths are isomorphic in behavior, ./sec2 will be resolved as http://example.com/sec2 by the browser (remember: all this happens on the client side!), which is, well, a non-existent path.

How should you solve it? If you want to use file paths, explicitly tell Docusaurus that it's a file path by adding the extension: [link](./02-sec2.md). If you want to use URL paths, make sure it resolves correctly: [link](/docs/ch1/sec2).

Does that clear up some doubts?

@Josh-Cena
Copy link
Collaborator

Also, you could ask in our Discord server first if there's anything not working: https://discord.com/invite/docusaurus

We can help you figure out if it's a bug first, so we don't clog the GitHub issue tracker.

@Domiii
Copy link
Author

Domiii commented Dec 28, 2021

Again, @Josh-Cena. Your help has been much appreciated!

I will heed your advice and take future issues to Discord first. Thanks for the invite!

@Domiii
Copy link
Author

Domiii commented Dec 28, 2021

@Josh-Cena Again, thanks for the elaborate answer!

I guess, explaining how files and URL resolution works was not necessary. The main piece of information I was missing seems to be this:

What slug does is that it manually sets the URL.

  • docs/ch1/sec1.md (slug: /) => /docs

That implies that the slug config override just wipes default URL resolution. In order to fix my problems, I had to change all relative paths of the folder, to be relative to the root instead. Meaning that ./sec2 needs to be changed to ./ch1/sec1.

Some final thoughts:

To my mind, relative URL resolution should be relative to the file, not relative to the slug override. This current approach is somewhat non-intuitive.
I understand that file and URL resolution is different from another, and what I'm asking for is basically partially mixing the two.

Maybe, as a middleground, slug changing relative link behavior should be mentioned on https://docusaurus.io/docs/docs-introduction?

@Josh-Cena
Copy link
Collaborator

Josh-Cena commented Dec 28, 2021

URLs will never be relative to the file. When your file has been turned to HTML pages, how does the browser know what the original path is like? You are missing the key part here: file path is server-side, URL path is client-side. The server knows about the client because it builds the client, but the client is ignorant about the server. That's what I mean by saying that "URL is downstream".

Also remember that FS structure tells you nothing about the URL structure. Files that happen to be in the same folder can end up with drastically different URL. Can you even provide a well-defined heuristic of what "URL relative to the file" is like? If you can (in the form of pseudocode or some test cases) from there we can make another feature request.

And that's exactly the reason why I said you should use MD links instead of URL links. Changing the routes can lead to unexpected broken links.

@Domiii
Copy link
Author

Domiii commented Dec 28, 2021

As I said: I agree, that fixing the behavior is not an easy feat. My argument does not go against the difficulty of the technical implementation. As I said: I agree that this is not easy to change.

But from a documentation editor perspective, it is odd:

[a](b/c/d) behaves one way if slug is set, and another if it is not set.

I still think, It would be best if it would at least be mentioned in the slug documentation, no?

I guess, your argument is that, in general, relative urls should not be used. I can now agree with that. Using file-based linking also has the advantage that VSCode (or other IDEs) can easily track and fix them when files change.

If that is the case, maybe there should be a togglable warning for people who are as stupid as myself :D

@Josh-Cena
Copy link
Collaborator

[a](b/c/d) behaves one way if slug is set, and another if it is not set.

I hope you understand what I had been emphasizing above. b/c/d is not a file path, it's a URL path and it's up to the browser to decide what it resolves to. You should expect it to behave differently when slug is set because the routing has been changed.

URL links, relative or absolute, can still be useful in many cases because file links currently only work when you link files within one plugin. This might be fixed in the future.

I still think, It would be best if it would at least be mentioned in the slug documentation, no?

Mmm, we can. I've actually explained to multiple people about how routing works. I kind of added some details here: https://docusaurus.io/docs/next/docs-introduction#docs-only-mode but ultimately I think it would be useful to have a centralized place where we talk about all things related to routing: routeBasePath, conflicting routes, slugs, file path => URL path, relative / absolute URLs... If you want to take the first step, PR welcome

@slorber
Copy link
Collaborator

slorber commented Dec 28, 2021

Agree that users generally struggle with our routing system. The docs-only mode also makes it even more confusing as you have to combine routeBasePath + slug to set a doc as your homepage. I've seen recently a live stream where a user struggled, also encountered the "onBrokenLinks" error after changing a slug, and wasn't sure how to fix it.

In this case, however, we didn't invent anything, and relative links are relative to the current page they appear in, it's just how the web works. Just write 2 HTML files locally in different folders and test for yourself.

That's also why we tend to recommend using relative file paths:

https://docusaurus.io/docs/next/docs-markdown-features

image

@slorber
Copy link
Collaborator

slorber commented Dec 28, 2021

I guess, your argument is that, in general, relative urls should not be used. I can now agree with that.

This is not a guess, it is actually a documented recommendation, see above. Is there a better place where we should recommend this? I feel it's in the right place, and you just didn't read the whole doc. It's hard to make everything intuitive and in the end you have to read the doc 🤪

If that is the case, maybe there should be a togglable warning for people who are as stupid as myself :D

We have a system to detect broken links preventing you to deploy a broken site, isn't it covering this use-case?

@Domiii
Copy link
Author

Domiii commented Dec 28, 2021

@slorber About my "togglable warning" recommendation - the idea is to show a warning if it is a relative link under "certain circumstances". E.g., in my case I should not try to link to my docs content pages, unless I link the file directly. But I can see that there might be a lot more obscure cases that would not be easily detected automatically.

Maybe, to act as an early warning system - it could become a blanket warning in case of any links that the "server-side URL parser" fails to be able to link to a file?
NOTE: I am not quite sure I understand the entire system. My guess is that the server parses all URLs and tries to convert them if they match certain patterns, e.g. they don't directly map to a file. That is my description of what @Josh-Cena mentioned here:

in ch1/01-sec1.md, if you write link, Docusaurus understands that this is a file path, and automatically converts that to a URL path for you.

I guess, this would be an extension of onBrokenLinks: but instead of triggering only for broken links, it would trigger if the link is relative and cannot be directly mapped to a file.... maybe? It could even trigger in dev mode, from what I understand, which is always much appreciated (instead of having to wait until build before all the problems are shown).

@slorber
Copy link
Collaborator

slorber commented Dec 29, 2021

We already emit a warning for all links ending with .md or .mdx that we couldn't resolve, and there's even a config onBrokenMarkdownLinks: https://docusaurus.io/docs/api/docusaurus-config#onbrokenmarkdownlinks

All other links are left untouched and added as is to the final HTML, so it's your responsibility to be sure your own links work.

We can't add warnings everywhere, because in the end, using relative links is something legit to do, that some users currently use in production. Do we really want to show them new unwanted warnings while their system is working? I don't know, but if you have a warning in mind, please give a very concrete example of what to display to the user, under which circumstances.

Please try to be more specific, using terms like "under certain circumstances" is vague and not very helpful. If you want a new warning in Docusaurus, please create a repro repository highlighting a very concrete problem that you want us to solve.

@Domiii
Copy link
Author

Domiii commented Dec 29, 2021

@slorber I agree. My last comment was not sufficiently concise, and also tried to over-engineer things.

Ultimately, it is probably easiest to go with Josh's idea:

I think it would be useful to have a centralized place where we talk about all things related to routing: routeBasePath, conflicting routes, slugs, file path => URL path, relative / absolute URLs

And then, crucially, add a link to the onBroken*Links warning/error message, and it would probably fix the problem of people (like myself) not knowing what they did wrong :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug An error in the Docusaurus core causing instability or issues with its execution closed: working as intended This issue is intended behavior, there's no need to take any action.
Projects
None yet
Development

No branches or pull requests

3 participants