Preventing 301 redirects on URLs with no trailing slashes (Netlify) #9207

Closed
lloydh opened this issue Oct 18, 2018 · 67 comments
Labels: help wanted, type: bug, type: upstream

@lloydh
Contributor

lloydh commented Oct 18, 2018

Summary

URLs with no trailing slash on sites hosted by Netlify lead to an immediate 301 redirect to the page with a trailing slash.

foo.com/bar --> foo.com/bar/

This has a performance cost and implications for SEO.

Is there a Netlify configuration that resolves these URLs without redirecting?

Relevant information

While this question is specific to Netlify, I did a quick review of other Gatsby sites featured in the Showcase and saw the same behaviour in many, but not all cases, for example:

Hopper /company - 301 redirect (Netlify)
Impossible Foods /mission - 301 redirect (unknown)
Cajun Bow Fishing /bows - 301 redirect (Netlify)
Braun /shavers-for-men - 200 no redirect (unknown)

Environment (if relevant)

Same behaviour in Gatsby v1 and v2.
I'm using gatsby-plugin-remove-trailing-slashes and gatsby-plugin-netlify.
Within the project all Links point to the non-trailing slash version.

@Yurickh
Contributor

Yurickh commented Oct 18, 2018

I'm not really familiar with how Netlify runs the static site, but I know that this is the default behaviour for express when serving static folders.
I'm not sure what performance costs you're referring to here. Can you link me to a reference? I would love to read more about that.

@lloydh
Contributor Author

lloydh commented Oct 18, 2018

The performance cost is the synchronous delay for the first byte of useful data caused by the redirect.

Using the Hopper example above, visiting https://www.hopper.com/company takes 150-300ms for the redirect, before any page data is received. On cellular connections with high latency it can add up to 1s.

@Yurickh
Contributor

Yurickh commented Oct 18, 2018

I see.

From my experience with static sites, we always redirect to the /-ending url whenever it represents a folder (with an implicit index.html), not a file (like shavers-for-men.html). If you don't, the static resolution will simply fail.

This is also what the "pretty url" option on netlify does.

I guess the best approach here is to use the /-ending url as the canonical one, unless you have really solid reasons to do otherwise.

EDIT: Note that my position here is nowhere near an official position from the gatsby team. This is solely my personal opinion on the subject.

@lloydh
Contributor Author

lloydh commented Oct 20, 2018

@Yurickh I agree specifying the /-ending urls as canonical is a pragmatic option but it does seem like advocating a schizophrenic url scheme for something that was easily solved with nginx / apache but not today's popular static site hosts.

If this really is the best approach then perhaps gatsby-plugin-remove-trailing-slashes should have this as an option (at least mentioned in the docs). Alternatively gatsby-plugin-canonical-urls could have an option for trailing slashes. I haven't found any other plugins that can set canonical meta tags.

I did come across #9025 but to be honest I'm surprised this hasn't been a bigger issue for a lot of folks given the popularity of /-less Gatsby sites.

@kakadiadarpan
Contributor

@lloydh did you have a chance to look at Netlify's Redirects documentation?

@kakadiadarpan added the "type: question or discussion" label Oct 23, 2018
@luukdv

luukdv commented Oct 26, 2018

Same issue here. Turned on 'Pretty URLs' in Netlify, but when I visit a page and remove the trailing slash in the address bar afterwards, I land on the non-trailing variant.

Maybe there should be an option (or by default?) in gatsby-plugin-canonical-urls to enforce a trailing slash. Right now it's not really a canonical, since the current pathname is used (https://github.com/gatsbyjs/gatsby/blob/master/packages/gatsby-plugin-canonical-urls/src/gatsby-browser.js#L9). @kakadiadarpan what do you think?
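
To make the idea concrete, here is a minimal sketch of what "enforce a trailing slash" could mean when building the canonical URL. The `canonicalUrl` helper is hypothetical and is not part of gatsby-plugin-canonical-urls; it only illustrates normalizing the pathname instead of using the current one as-is.

```javascript
// Hypothetical helper, not part of gatsby-plugin-canonical-urls.
// Builds a canonical URL that always ends in a trailing slash,
// regardless of which variant the visitor requested.
function canonicalUrl(siteUrl, pathname) {
  // Drop trailing slashes from the site URL so we never emit "//".
  const base = siteUrl.replace(/\/+$/, '');
  // Leave file-like paths (e.g. /feed.xml) untouched.
  const lastSegment = pathname.split('/').pop();
  if (lastSegment.includes('.')) {
    return base + pathname;
  }
  // Collapse any trailing slashes, then add exactly one.
  return base + pathname.replace(/\/+$/, '') + '/';
}

console.log(canonicalUrl('https://foo.com', '/bar'));  // https://foo.com/bar/
console.log(canonicalUrl('https://foo.com', '/bar/')); // https://foo.com/bar/
```

With something like this, both /bar and /bar/ would advertise the same canonical URL.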

@Yurickh
Contributor

Yurickh commented Oct 26, 2018

Just to clarify, the 'Pretty URLs' option in netlify WILL redirect you to the trailing slash variant:

> In addition to forwarding paths like /about to /about/ (a common practice in static sites and single page apps), it will also rewrite paths like /about.html to /about/.

@denull0

denull0 commented Nov 2, 2018

We are having exactly the same problem, and we are now getting penalties from Google. Is there a solution?

@lloydh
Contributor Author

lloydh commented Nov 2, 2018

> Just to clarify, the 'Pretty URLs' option in netlify WILL redirect you to the trailing slash variant

This is true but I haven't noticed any difference in behaviour with or without "Pretty URLs"; /foo redirects to /foo/ in either case. The desired behaviour is to rewrite slashless urls to resolve without redirecting.

Making /foo/ canonical is a workaround for the SEO penalties but AFAIK a full solution would only be possible in Netlify's routing layer… or by abandoning Netlify in favour of custom url rewriting …or by adopting slash/ urls. It's an unfortunate situation.

@gatsbot

gatsbot bot commented Jan 27, 2019

Old issues will be closed after 30 days of inactivity. This issue has been quiet for 20 days and is being marked as stale. Reply here or add the label "not stale" to keep this issue open!

@gatsbot added the "stale?" label Jan 27, 2019
@abohannon
Contributor

I'm not having this issue with Netlify specifically, but my UTM query params are being deleted for this same reason. /mypage?utm_source=google becomes /mypage/ and this is causing tracking issues.
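
For what it's worth, whichever layer issues the redirect has to copy the query string onto the target itself. A minimal sketch of that, using Node's built-in `URL` class; the `withTrailingSlash` name is made up for illustration and does not come from Gatsby or any host:

```javascript
// Hypothetical helper: compute the redirect target for a slashless URL
// while carrying the query string and hash across unchanged.
function withTrailingSlash(href) {
  const url = new URL(href);
  if (!url.pathname.endsWith('/')) {
    url.pathname += '/';
  }
  return url.toString();
}

console.log(withTrailingSlash('https://foo.com/mypage?utm_source=google'));
// https://foo.com/mypage/?utm_source=google
```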

@gatsbot

gatsbot bot commented Feb 9, 2019

Hey again!

It’s been 30 days since anything happened on this issue, so our friendly neighborhood robot (that’s me!) is going to close it.

Please keep in mind that I’m only a robot, so if I’ve closed this issue in error, I’m HUMAN_EMOTION_SORRY. Please feel free to reopen this issue or create a new one if you need anything else.

Thanks again for being part of the Gatsby community!

@gatsbot gatsbot bot closed this as completed Feb 9, 2019
@dja

dja commented Mar 4, 2019

This is definitely still an issue, and we’ve experienced SEO penalties as well. Additionally, turning off Netlify’s pretty URLs feature seems to result in errors stating “Missing resources for /” or “Missing resources for /slash/”. We’ve tried solutions recommended here: #11524 but haven’t had any luck.

@0505gonzalez

Experiencing this issue as well.

@0505gonzalez

@dja Are you also on Netlify or are you using Github pages?

@dja

dja commented Mar 6, 2019 via email

@0505gonzalez

@dja I'm on github pages, but the issue you're facing might be the same as mine. I've opened a new ticket and plan to create a PR shortly: #12364

@0505gonzalez

TLDR: Github pages (and probably Netlify) add trailing forward slashes to folders. If you have something like /public/somepage/index.html and you visit https://yourpage.com/somepage, Github (and probably Netlify) will add a trailing slash because somepage is a directory.
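
A rough model of that behaviour, as a sketch only: the `resolve` function and the order of its checks are assumptions for illustration, not how GitHub Pages or Netlify are actually implemented, and real hosts may resolve conflicts differently.

```javascript
// Simplified model of a directory-style static host. This is an assumption
// for illustration, not the actual GitHub Pages or Netlify implementation.
function resolve(files, requestPath) {
  const fileSet = new Set(files);

  // 1. Exact file match (e.g. /style.css): serve it.
  if (fileSet.has(requestPath)) {
    return { status: 200, file: requestPath };
  }
  // 2. Trailing slash: serve the directory index if there is one.
  if (requestPath.endsWith('/')) {
    const index = requestPath + 'index.html';
    return fileSet.has(index) ? { status: 200, file: index } : { status: 404 };
  }
  // 3. A sibling .html file can be served directly, with no redirect.
  if (fileSet.has(requestPath + '.html')) {
    return { status: 200, file: requestPath + '.html' };
  }
  // 4. The path names a directory with an index: redirect to the slash form.
  if (fileSet.has(requestPath + '/index.html')) {
    return { status: 301, location: requestPath + '/' };
  }
  return { status: 404 };
}

const files = ['/index.html', '/somepage/index.html', '/other.html'];
console.log(resolve(files, '/somepage'));  // { status: 301, location: '/somepage/' }
console.log(resolve(files, '/somepage/')); // { status: 200, file: '/somepage/index.html' }
```

In this model the 301 only appears in step 4, which is why the output layout (directory plus index.html versus a flat .html file) decides whether a slashless URL redirects.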

@dja

dja commented Mar 6, 2019 via email

@0505gonzalez

@dja That's what I'm currently trying to figure out in the other github issue I opened

@himynameistimli
Contributor

himynameistimli commented Apr 3, 2019

@0505gonzalez were you able to figure out a solution for this issue?

Just landed here as we're having the same problems with the url parameters getting lost in the redirect from the version without the trailing slash to the version with the trailing slash.

Only difference is that we're on S3 + Cloudfront.

We might look into using Lambda@Edge to handle the redirect unless we can figure out a way to get it to work with gatsby.

Update:

For our case, we implemented the fix from https://www.ximedes.com/2018-04-23/deploying-gatsby-on-s3-and-cloudfront/ with the following lambda js function:

const querystring = require('querystring');

exports.handler = (event, context, callback) => {
    const request = event.Records[0].cf.request;

    /* Parse the request query string into a JavaScript object.
       Note: toLowerCase() lowercases parameter values as well as keys. */
    const params = querystring.parse(request.querystring.toLowerCase());
    const sortedParams = {};
    const uri = request.uri;

    /* Sort the parameter keys so equivalent query strings become identical */
    Object.keys(params).sort().forEach(key => {
        sortedParams[key] = params[key];
    });

    /* Map directory-style URIs to their index.html */
    if (uri.endsWith('/')) {
        request.uri += 'index.html';
    } else if (!uri.includes('.')) {
        request.uri += '/index.html';
    }

    /* Replace the request query string with the normalized version */
    request.querystring = querystring.stringify(sortedParams);

    callback(null, request);
};

I'm still trying to figure out what's the best thing to do here, because I think as-is, there's a negative impact to our SEO just because now we're delivering the same page for the trailing and non-trailing slash version. Likely I will add a permanent redirect for the trailing slash version here too.
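
A rough sketch of what that permanent redirect might look like as a Lambda@Edge handler. It assumes the slashless URL is the canonical one and that this check runs before any index.html rewrite; it is untested against a real CloudFront distribution, so treat it as a starting point only.

```javascript
// Sketch of a CloudFront Lambda@Edge request handler.
// Assumption: slashless URLs are canonical, so /foo/ is redirected to /foo.
function handler(event, context, callback) {
  const request = event.Records[0].cf.request;
  const uri = request.uri;

  // Redirect /foo/ to /foo, keeping the query string (e.g. UTM parameters).
  if (uri.length > 1 && uri.endsWith('/')) {
    const target = uri.replace(/\/+$/, '') || '/';
    const location = request.querystring
      ? `${target}?${request.querystring}`
      : target;
    callback(null, {
      status: '301',
      statusDescription: 'Moved Permanently',
      headers: {
        location: [{ key: 'Location', value: location }],
      },
    });
    return;
  }

  // Everything else passes through unchanged.
  callback(null, request);
}

exports.handler = handler;
```

Because the query string is appended to the Location header, parameters such as utm_source should survive the redirect.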

If you're not using AWS/Cloudfront, I think you'd be able to accomplish this with Cloudflare Workers.

@0505gonzalez

@himynameistimli I did find a solution, proposed a code change in another thread. But seems like it might not be accepted so have not created a PR.

The gist of it:

  • Hosting on GitHub Pages. GitHub follows the directory structure when serving files. E.g. if you hit /somepage, it will redirect to /somepage/ because it's a directory (the actual file is /somepage/index.html).
  • My proposed solution was for Gatsby to generate /somepage.html instead of /somepage/index.html

@garethgd

Still experiencing this trailing slash redirect with or without Netlify's pretty URL option enabled.

@haroldangenent removed the "stale?" label Apr 17, 2019
@decimoseptimo

@KyleAMathews

> @himynameistimli I did find a solution, proposed a code change in another thread. But seems like it might not be accepted so have not created a PR.
>
> The gist of it:
>
>   • Hosting on GitHub Pages. GitHub follows the directory structure when serving files. E.g. if you hit /somepage, it will redirect to /somepage/ because it's a directory (the actual file is /somepage/index.html).
>   • My proposed solution was for Gatsby to generate /somepage.html instead of /somepage/index.html

@KyleAMathews this is an SEO danger.
This subject isn't explained at all in https://www.gatsbyjs.org/docs/gatsby-link/
If someone decides to use URLs without trailing slashes, a couple of days later the Google SERP becomes full of 301 redirects for your site.

@stldo

stldo commented Aug 6, 2020

I made a plugin from code that I normally use on websites hosted on Netlify. It creates a .html file for each page, which disables the 301 redirect for paths without trailing slashes. It works very well with simple websites; I hope it helps someone. More info can be found on the plugin page.
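
To give an idea of the approach, here is a minimal post-build sketch. It is not the plugin's actual code; the `writeHtmlSiblings` name and the assumption that the build output lives in `public/` are only for illustration.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical post-build step (not the plugin's actual code): for every
// <dir>/index.html below the build root, write a sibling <dir>.html copy.
function writeHtmlSiblings(rootDir, dir = rootDir) {
  const created = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    if (!entry.isDirectory()) continue;
    const subDir = path.join(dir, entry.name);
    const indexFile = path.join(subDir, 'index.html');
    if (fs.existsSync(indexFile)) {
      const sibling = subDir + '.html';
      fs.copyFileSync(indexFile, sibling);
      created.push(path.relative(rootDir, sibling));
    }
    // Recurse so nested pages such as /blog/post/ are covered too.
    created.push(...writeHtmlSiblings(rootDir, subDir));
  }
  return created;
}
```

Calling `writeHtmlSiblings('public')` after `gatsby build` would leave both public/about/index.html and public/about.html in place, so a host that serves /about from about.html has no directory redirect to perform.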

@ayZagen

ayZagen commented Aug 27, 2020

This also happens with an nginx server.

EDIT: This issue has nothing to do with Gatsby; it is a web server misconfiguration. I checked the output files, and it seems Gatsby creates a directory for each page with an index.html in it. So I had to change my nginx URL resolution as follows:

location / {
  try_files $uri $uri/index.html $uri.html =404;
}

The $uri/index.html entry resolves the correct file without a redirect. If it is missing, or if $uri/ is used instead (as in most nginx config examples), nginx issues a redirect to the trailing-slash URL. This is also stated in the nginx documentation:

> In response to a request with URI equal to this string, but without the trailing slash, a permanent redirect with the code 301 will be returned to the requested URI with the slash appended.

I have not used Netlify, but I believe the same thing would apply to it.

@lyxious

lyxious commented Sep 7, 2020

> This also happens with an nginx server.
>
> EDIT: This issue has nothing to do with Gatsby; it is a web server misconfiguration. I checked the output files, and it seems Gatsby creates a directory for each page with an index.html in it. So I had to change my nginx URL resolution as follows:
>
> location / {
>   try_files $uri $uri/index.html $uri.html =404;
> }
>
> The $uri/index.html entry resolves the correct file without a redirect. If it is missing, or if $uri/ is used instead (as in most nginx config examples), nginx issues a redirect to the trailing-slash URL. This is also stated in the nginx documentation:
>
> > In response to a request with URI equal to this string, but without the trailing slash, a permanent redirect with the code 301 will be returned to the requested URI with the slash appended.
>
> I have not used Netlify, but I believe the same thing would apply to it.

This is not the case, even if you have a blank nginx config, any attempt to access a valid directory will result in a 301 with the trailing slash. It's a common misconception that it is related to the try_files.
The documentation is also poor surrounding this, the location documentation makes it seem like the 301 redirect is only for routes that are proxied.

@atkinson

atkinson commented Sep 9, 2020

This is a bug in the Netlify UI.

Here's a fix: https://community.netlify.com/t/remove-trailing-slash-redirect-for-gatsby-gatsby-cloud-netlify-website/20976/8

@ayZagen

ayZagen commented Sep 9, 2020

> This is not the case, even if you have a blank nginx config, any attempt to access a valid directory will result in a 301 with the trailing slash. It's a common misconception that it is related to the try_files.
> The documentation is also poor surrounding this, the location documentation makes it seem like the 301 redirect is only for routes that are proxied.

Nothing was said about try_files; this is about how file resolution works in nginx. try_files is just a helper directive, and that code is only an example showing my usage and solution. I agree with you that the documentation is poor. The redirect to a directory with a trailing slash is performed by an undocumented module named ngx_http_static_module. If you want to disable that behaviour, you need to compile nginx yourself. I doubt anyone here would go to that much trouble for it.

@lyxious

lyxious commented Sep 17, 2020

> > This is not the case, even if you have a blank nginx config, any attempt to access a valid directory will result in a 301 with the trailing slash. It's a common misconception that it is related to the try_files.
> > The documentation is also poor surrounding this, the location documentation makes it seem like the 301 redirect is only for routes that are proxied.
>
> Nothing was said about try_files; this is about how file resolution works in nginx. try_files is just a helper directive, and that code is only an example showing my usage and solution. I agree with you that the documentation is poor. The redirect to a directory with a trailing slash is performed by an undocumented module named ngx_http_static_module. If you want to disable that behaviour, you need to compile nginx yourself. I doubt anyone here would go to that much trouble for it.

OP's question is about why foo.com/bar redirects to foo.com/bar/. Your initial answer suggests it has to do with the try_files config, but it doesn't. The documentation you provided also has nothing to do with why this request returns a 301. The location documentation, with respect to 301, concerns requests processed by specific *_pass directives, which are independent of try_files.

@wardpeet wardpeet self-assigned this Sep 17, 2020
@wardpeet
Contributor

Hey, sorry I don't have an update right now but I'll read over this thread again and see if I can get some action items out of it and create a task list so y'all can help us fix these issues 🙏

@mlenser

mlenser commented Sep 21, 2020

> This is a bug in the Netlify UI.
>
> Here's a fix: https://community.netlify.com/t/remove-trailing-slash-redirect-for-gatsby-gatsby-cloud-netlify-website/20976/8

This is indeed the case.

Here is what your Netlify config should look like:
image

Disabling optimization at the top level apparently turns on the pretty URLs, even though it visually looks like that isn't the case:
image

So don't check the checkbox next to "Disable asset optimization"

@LekoArts added the "status: needs more info" and "type: upstream" labels Sep 22, 2020
johnnyoshika added a commit to jobcast/jobcast-www that referenced this issue Sep 30, 2020
@flackjap

> > This is a bug in the Netlify UI.
> > Here's a fix: https://community.netlify.com/t/remove-trailing-slash-redirect-for-gatsby-gatsby-cloud-netlify-website/20976/8
>
> This is indeed the case.
>
> Here is what your Netlify config should look like:
> image
>
> Disabling optimization at the top level apparently turns on the pretty URLs, even though it visually looks like that isn't the case:
> image
>
> So don't check the checkbox next to "Disable asset optimization"

I've just lost 3 hours because of this.

Madness.

@ghost

ghost commented Oct 25, 2020

@jlengstorf Do you know if this is intended behaviour ?

@Shanonbaker

slash

@jlengstorf
Contributor

@alvinometric I'm not sure — I've sent this over to our UI team for review. it does look like if this isn't a bug, it could do with some clarification

@leomelzer

Thanks for all the helpful comments, which led us in the right direction.

If it still doesn't work after applying the settings from @mlenser's comment (#9207 (comment)), make sure to check your netlify.toml for pretty_urls.

Settings in the toml take precedence, see the docs: https://docs.netlify.com/configure-builds/file-based-configuration/#deploy-contexts

> UI settings are overridden if a netlify.toml file is present in the root folder of the repo and there exists a setting for the same property/redirect/header in the toml file.

@ascorbic removed the "status: needs more info" label Nov 19, 2020
@LekoArts
Contributor

LekoArts commented Nov 27, 2020

Since multiple people have reported this as a bug in Netlify's UI / the behavior being the result of a misconfigured hosting, I'll close this one here as resolved (and not an issue with Gatsby). Please follow the linked issues to see how/when it's resolved. Thanks for providing your context and solutions here for future Google users (hello 👋 ).

If you see this issue on another platform than Netlify, please create a new issue with a reproduction -- as this issue here is specific to Netlify, it's resolved.

@jon-sully

jon-sully commented Dec 29, 2020

Hey 👋🏻

I know this issue lapsed and got closed but I really think it's important to recap on a couple of things here. The impetus for this issue is that a) there's a disconnect between Gatsby's routing defaults and Netlify's routing configurations, and b) there are serious SEO penalties in play if a Gatsby site doesn't have the trailing-slash / no-trailing-slash issue solved, since a site serving the same content on both URLs (duplicate content) gets knocked on SEO pretty hard. Technically this isn't Gatsby's fault, but as it pertains to all Gatsby users hosting on Netlify, it does seem like a major issue.. or a major risk at the very least.

Solving this problem by disabling "Pretty URLs" in the (yes, awfully borked / painful UX'd) Netlify Asset Optimization panel can open your site up to the duplicate content issue since content may be available at both the un-slashed and the slashed version of your URL path. It's important too to note that if a Gatsby site is available on both /test and /test/ but 'fixes itself', you may just be seeing the Gatsby runtime adjust your address bar via the Browser History API - the super important part has nothing to do with what happens when Gatsby actually runs in the browser - it's the part where Netlify is serving the same content on multiple URLs - the slash and the non-slash paths.

This is fixable and there is a way to get everything working smoothly and on a unified path / slash structure, but it's not disabling 'Pretty URLs'. The tl;dr: is that Netlify really works best / has biases toward using the trailing slash, and unified content pathing on Netlify requires the trailing slash. I elaborated on this in another Gatsby GH thread here:

#27889 (comment)

But I would definitely urge folks to carefully check (from a CLI HTTP tool preferably) which paths (slash and/or no-slash) are resolving to their content on their sites. If both the slash and no-slash paths are resolving to your content, your SEO will hurt for it.
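
If you would rather script that check than run it by hand, a small sketch might look like the following. The `auditTrailingSlash` helper is hypothetical; `fetchStatus` is whatever function you supply that returns the HTTP status for a URL without following redirects.

```javascript
// Hypothetical audit helper. `fetchStatus` is any async function that takes a
// URL and resolves to the HTTP status code WITHOUT following redirects.
async function auditTrailingSlash(fetchStatus, origin, pagePath) {
  // Normalize to the slashless form, then probe both variants.
  const bare = pagePath.replace(/\/+$/, '');
  const [noSlash, slash] = await Promise.all([
    fetchStatus(origin + bare),
    fetchStatus(origin + bare + '/'),
  ]);
  // Both variants answering 200 means the same content lives at two URLs.
  const duplicate = noSlash === 200 && slash === 200;
  return { noSlash, slash, duplicate };
}
```

On recent Node versions, something along the lines of `url => fetch(url, { method: 'HEAD', redirect: 'manual' }).then(res => res.status)` should work as `fetchStatus`, though I have not verified that against every host. A result with `duplicate: true` is the case to worry about.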

Hope that helps 😕

@yanneves
Contributor

Given gatsbyjs.com itself resolves HTTP 200 with or without trailing slash (duplicate content), it seems this issue is a bit of an afterthought.

$ curl -I https://www.gatsbyjs.com/plugins/
HTTP/2 200
$ curl -I https://www.gatsbyjs.com/plugins
HTTP/2 200

I noticed this issue like others here when I migrated from WordPress to Gatsby, where the previous WordPress configuration stripped trailing slashes. The only way I can see to avoid duplicate content is to cave to the 301 redirects and introduce trailing slashes. Is the SEO penalty for redirecting existing pages like this still relevant? It could be this is a lingering idea in SEO that no longer matters. But otherwise it's a dangerous assumption in Gatsby.

This appears to be baked into the directory structure of the static generated site:

public/
├── index.html
├── some-other-page
│   └── index.html
└── some-page
    └── index.html

A browser would interpret that as example.com/some-page/ and it would be hacky to force it to remove the trailing slash. Generated content would need to respect a configuration option to instead output the following when we want to remove trailing slash:

public/
├── index.html
├── some-other-page.html
└── some-page.html

The above would then be interpreted by the browser as example.com/some-page.

Do we know if Gatsby core is strictly expecting a directory structure for pages? There may be assumptions elsewhere that these are always output as directories. If we can identify those assumptions (or the lack of) configuring a non-trailing slash output like above would solve this issue.

@krzysieqq

Any updates?

@MakowskiHubert

The same problem with the nginx server and solved by #9207 (comment)

@hazem3500
Contributor

hazem3500 commented Jul 30, 2021

If you want your website to have no trailing slashes and still work with Netlify, you can use gatsby-plugin-netlify and add the following to gatsby-node.js:

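// Strip a single trailing slash, leaving the root path untouched
const replacePath = path => (path === `/` ? path : path.replace(/\/$/, ``))

exports.onCreatePage = ({ page, actions }) => {
  const { createRedirect } = actions
  // Normalize first so a path that already ends in "/" does not yield "//"
  const path = replacePath(page.path)
  if (!path.includes('.html') && path !== '/') {
    createRedirect({ fromPath: `${path}/`, toPath: path, isPermanent: true })
  }
}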

This will redirect all trailing-slash paths to their non-slash counterparts with a 301 status code.
Note: if you are also using the createPages Gatsby Node API, you'll need to add it there as well:

exports.createPages = async ({ actions, graphql }) => {
  const { createPage, createRedirect } = actions
  // ...
  pages.forEach(page => {
    // ...
    createRedirect({ fromPath: `${page.path}/`, toPath: page.path, isPermanent: true })
  })
}

@WhiteHoodHacker

By disabling "Pretty URLs" in Netlify and ending up with duplicate content at trailing-slash URLs and non-trailing-slash URLs, wouldn't a simple fix be to add a <link> tag with the canonical URL using the preferred scheme? Plenty of sites serve duplicate content fine with this.

@wojtekidd

> I'm not having this issue with Netlify specifically, but my UTM query params are being deleted for this same reason. /mypage?utm_source=google becomes /mypage/ and this is causing tracking issues.

Is anyone else still having this problem? There has to be a way for a self-hosted Gatsby site to accept UTM parameters.
I tried to do it this way in gatsby-node.js:

createRedirect({ fromPath: '/banana', toPath: '/shop?utm_source=banana&utm_medium=podcast&utm_campaign=banana', isPermanent: true });
