Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support setting a base for absolute links #452

Open
norswap opened this issue Jan 10, 2022 · 13 comments
Open

Support setting a base for absolute links #452

norswap opened this issue Jan 10, 2022 · 13 comments
Labels
enhancement New feature or request

Comments

@norswap
Copy link

norswap commented Jan 10, 2022

By default, lychee doesn't seem to check absolute links (/-prefixed), except when setting the --base argument (this is weird by itself, since --base should be for the resolution of absolute link as per documentation).

It would be great to:

  • check absolute links by default, using the directory that lychee is run from as the root
  • have a --root argument that works like --base but for absolute links
@norswap norswap changed the title Jan 10, 2022
@mre mre transferred this issue from lycheeverse/lychee-action Jan 10, 2022
@mre
Copy link
Member

mre commented Jan 10, 2022

Hi, we decided to not check absolute links by default, because it could generate a lot of false-positives when assuming the current directory as a root. For instance, many static site generators put files into a subfolder like public or dist and in this case, assuming using the directory that lychee is run from as the root would not work.

--base can be a base url (e.g. https://example.com) or a directory path (acting like the --root you proposed). Setting both would be confusing. When to use the url and when the path when encountering an absolute link?

@norswap
Copy link
Author

norswap commented Jan 11, 2022

Hey Mathias! Thanks for the prompt answer!

There might a bit of a confusion, my suggestion is not about path vs URLs. Here's what I propose:

  • both --base and --root can be set to either a path or an url
  • when encoutering a relative path, prepend the value of --base
  • when encountering an absolute path (starting with /, but not http://, file://, etc), prepend the value of --root
  • only check absolute paths if the --root option is provided

What do you think?

(oh and sorry for posting this in the wrong repo, I had both open and was distracted!)

@dkarlovi
Copy link

I've created a similar issue for another project: wjdp/htmltest#184

Basically, if you're generating for a specific subdirectory, it doesn't work as expected, for example:

# imagine a generator here which creates all URLs like <a href="/something">
$ ./my-generator --base=https://example.com --out=./public
$ lychee --offline --base ./public ./public
🔍 1464 Total ✅ 1269 OK 🚫 0 Errors 💤 195 Excluded

works as expected.

# imagine a generator here which creates all URLs like <a href="/some/path/something">
$ ./my-generator --base=https://example.com/some/path --out=./public
$ lychee --offline --base ./public ./public
🔍 1464 Total ✅ 16 OK 🚫 1253 Errors (HTTP:1253) 💤 195 Excluded

It's trying to find files at ./public/some/path which doesn't exist, since the content of the ./public folder will be deployed to that location as is.

If I hack it with:

$ mkdir public/some
$ ln -s ../../public public/some/path
$ ll -d public/some/path
lrwxrwxrwx 1 dalibor.karlovic dalibor.karlovic 12 sij  19 09:54 public/some/path -> ../../public

now it works since paths will be evaluated like ./public/some/path/images/logo.svg via the symlink.

It would be great if either base or root could be set when running in local mode to say

The content of this folder will be deployed to https://example.com/some/path, resolve links you find in it like this (treat them like offline links):

https://example.com/some/path/images/logo.svg => ./public/images/logo.svg
/some/path/images/logo.svg => ./public/images/logo.svg

@mre
Copy link
Member

mre commented Jan 31, 2022

Just revisited this issue and your suggestions make sense. We can use the examples for unit tests. I agree that there needs to be a separation between --base and --root as they don't serve the same purpose. --offline can probably deprecated once we have --root.

@mre mre added the enhancement New feature or request label Feb 4, 2022
@dkarlovi
Copy link

@mre don't know if this is a new issue, so I'll shortly discuss it here since it's (IMO) related to the feature described here, which doesn't (yet) exists:

if you're deploying to most static site hosting platforms, they support the index file, where the URL https://example.com/foo/bar will route to ./public/foo/bar/index.html. If you're validating URLs in files generated with this new --root param, the validation process should allow for this, namely

https://example.com/foo/bar  => ./public/foo/bar/index.html
https://example.com/foo/bar/index.html  => ./public/foo/bar/index.html
/foo/bar/index.html  => ./public/foo/bar/index.html
/foo/bar => ./public/foo/bar/index.html

@mre
Copy link
Member

mre commented Mar 3, 2022

Your examples are for lychee --root ./public, right?

BASE_URL=https://example.com/some/path ./my-generator --out=./public

You mentioned two examples in the other issue that we can use as tests:

https://example.com/some/path/images/logo.svg => ./public/images/logo.svg
/some/path/images/logo.svg => ./public/images/logo.svg

I think all of this can be done with the current proposal (?). So if we have support for both --base and --root, we should be able to express all cases -- unless I'm missing something here. So, no new issue needed I think. Correct me if I'm wrong.

@dkarlovi
Copy link

dkarlovi commented Mar 3, 2022

@mre OK, I'll create another issue to discuss the missing index file resolving, thanks!

@mre
Copy link
Member

mre commented Nov 28, 2022

Quick note to self:
we should call it --base-url and --root-path to be more explicit about the purpose of these options.

@vanbroup
Copy link

@mre any chance to get these --base-url and --root-path options prioritised?

It's currently difficult to get lychee working properly on a GitHub pages repository as GitHub pages publishes to a subdirectory unless you configure a custom domain.

@george-gca
Copy link

@vanbroup I just now figured how to solve it for me.

In my case, we have a template site that when deploying to GitHub pages create absolute links internally. The template have a parameter in _config.yml that is baseurl, which is used when creating the urls. In the template demo it is set to baseurl: /al-folio. After building the jekyll template the directory _site/ is created, which is the one that will be deployed to GitHub pages.

In it we have, for example, a page with path _site/projects/4_project/index.html that internally links to images with path /al-folio/assets/img/11.jpg. This absolute path /al-folio/ actually is the _site/ path in the final version, meaning it will point to _site/assets/img/11.jpg.

I am running lychee on the _site/ directory before sending it to GitHub pages, but when testing for this image link, for example, by default it tests it as file:///home/runner/work_site/al-folio/_site/projects/4_project/al-folio/assets/img/11.jpg instead of as file:///home/runner/work_site/al-folio/_site/assets/img/11.jpg. To force this behavior I give lychee the following parameters:

--offline --remap '_site(/?.*)/assets/(.*) _site/assets/$2' --verbose --no-progress '_site/**/*.html'

The remap argument helps me to use regex to fix the urls. You can check the GitHub action.

@mre
Copy link
Member

mre commented Jan 15, 2024

Great workaround.
In fact, I think --remap would work for most (all?) the cases that --base-url and --root-path would be used for.
@vanbroup, can you try to use --remap for your case?
Of course, I still think we should have --base-url and --root-path at some point, because it is a very common operation.

@dkarlovi
Copy link

It's nice the new remap feature is so flexible, but I agree it would be beneficial to provide a simpler UX for this. 👍

@mre
Copy link
Member

mre commented Dec 18, 2024

--root-dir got released with lychee 0.18.0 now thanks to @trask.
If anyone finds the time, we'd be thankful for some feedback. It should fix most problems mentioned in this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

5 participants