All search results on Google (only mobile) now have "?no-cache=1" appended #9355

Closed
hackhat opened this issue Oct 24, 2018 · 25 comments · Fixed by #9907
Labels
type: bug An issue or pull request relating to a bug in Gatsby

Comments

@hackhat

hackhat commented Oct 24, 2018

I'm on mobile so I can't add much info, but all my search results on Google now have ?no-cache=1 appended. I don't know why it's so messed up. I now have a lot of duplicate content indexed and I hope I won't get blocked or similar.

What is going on?

Thank you


@hackhat
Author

hackhat commented Oct 24, 2018

Anything that I can do to fix it quickly before I lose my Google ranking?

@hackhat
Author

hackhat commented Oct 24, 2018

I can see from Google web console that this is happening

@hackhat
Author

hackhat commented Oct 24, 2018

Only happens if you come from Google on a mobile. Works fine on desktop

@hackhat
Author

hackhat commented Oct 24, 2018

I think Google somehow indexed most of my pages with ?no-cache, and these started to replace my pages without ?no-cache because they rank better. If I'm on mobile and I select "Desktop site", I get results without ?no-cache in Google.

hackhat changed the title from "All search results on Google now have ?no-cache=1 appended" to "All search results on Google (only mobile) now have "?no-cache=1" appended" on Oct 24, 2018
@piotrkwiecinski
Contributor

@hackhat There is a way to exclude no-cache in Google Webmaster tools: https://support.google.com/webmasters/answer/6080550?hl=en&authuser=1&ref_topic=6080547

Also implementing canonical urls should help too.
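
As a rough illustration of the canonical URL idea, the sketch below strips the `no-cache` parameter from a URL and builds the matching `<link rel="canonical">` tag. The helper names `canonicalUrl` and `canonicalLinkTag` and the `example.com` domain are made up for this example; they are not part of Gatsby or any plugin.

```javascript
// Hypothetical helpers for illustration only; not a Gatsby or plugin API.

// Build the canonical form of a URL by dropping the `no-cache` query
// parameter (and any fragment), leaving other parameters untouched.
function canonicalUrl(siteUrl, pathAndQuery) {
  const url = new URL(pathAndQuery, siteUrl);
  url.searchParams.delete("no-cache");
  url.hash = "";
  return url.toString();
}

// Render the tag that tells crawlers which URL is the preferred one.
function canonicalLinkTag(siteUrl, pathAndQuery) {
  return `<link rel="canonical" href="${canonicalUrl(siteUrl, pathAndQuery)}" />`;
}

console.log(canonicalLinkTag("https://example.com", "/blog/post/?no-cache=1"));
```

With a tag like this on every page, both `/blog/post/` and `/blog/post/?no-cache=1` point crawlers at the same preferred URL, which is the usual way to signal duplicates.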

@zslabs
Contributor

zslabs commented Oct 29, 2018

I've seen some mutterings within this repo around no-cache, but was wondering if there was a more official place I could go to find out the main purpose of it. Since this URL param is inherently seen as a duplicate URL/content in Google's eyes, and would need to be specifically excluded through Webmaster Tools for every single Gatsby website, is there something we can do to prevent the use of this in general?

I may not be understanding the core concept; and the parameter itself has caused a bit of a headache for us internally as of late.

cc @KyleAMathews

@KyleAMathews
Contributor

It's a workaround to get gatsby-plugin-offline working. I agree it's not a great solution. @davidbailey00 could you look into other options for checking if a page has been seen?

@zslabs
Contributor

zslabs commented Oct 29, 2018

@KyleAMathews Appreciate the response; we're not using that plugin currently, but sounds like it's just kinda baked-in there in case it is. Do you happen to know of any workarounds to disable that entirely, or is that "too core" to touch at the moment until an alternative solution is figured out?

@KyleAMathews
Contributor

It's in core right now

@zslabs
Contributor

zslabs commented Oct 29, 2018

Gotcha. While I'm not as well versed in that part of the app in particular, if there is any headway on testing out a different solution, I'll be ready to give things a whirl on my end!

@DSchau
Contributor

DSchau commented Oct 29, 2018

@hackhat we're sorry to hear this! Is there any way you could let us know if you're able to reproduce this? I just tried with gatsbyjs.org and wasn't able to replicate. I also checked out Google Analytics logs, and didn't really see any issues, definitely not ?no-cache=1 on all of our logs!

Could be an outdated gatsby-plugin-offline plugin issue, perhaps?

@vtenfys
Contributor

vtenfys commented Oct 30, 2018

Hi @hackhat @zslabs, I'm the one responsible for the ?no-cache=1 stuff so just to let you know, I'll be checking this out ASAP.

The purpose of this query is to prevent service workers gobbling up pages on your site which aren't generated by Gatsby - e.g. if I visit Netlify CMS /admin/ with a service worker installed, Gatsby will show a 404 because the SW returns a basic offline shell for all pages, and Gatsby can't find a page called /admin/ which it generated (even though it exists on the server, it wasn't created by Gatsby).

So to work around this, we check the HTTP status code for that page if Gatsby can't find it, and then unless it's a 404, we redirect there appending ?no-cache=1 to prevent the SW handling it again. (If it's a 404 then we just show the Gatsby 404 page as usual.)

However, what's happening here is that Gatsby isn't loading the resources properly for a page, and therefore thinks that it's a page not generated by Gatsby, just like Netlify CMS. So if we were to look in the Googlebot console we'd probably see an error saying "Found ?no-cache=1 while attempting to load page directly" - this is logged whenever Gatsby detects the problem just described, and should never be logged on a working app.
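
The decision flow described in the last few paragraphs can be sketched roughly as below. This is not Gatsby's actual source: the function name `decideNavigation`, its input fields, and the action names are all hypothetical, and the exact point at which the error message fires is an assumption made for illustration.

```javascript
// Hypothetical sketch of the described behaviour; NOT Gatsby's real code.
// `isGatsbyPage` stands in for "Gatsby found (and loaded) this page", and
// `status` for the HTTP status code the server returns for the path.
function decideNavigation({ pathname, search, isGatsbyPage, status }) {
  // Normal case: the page was generated by Gatsby, so render it.
  if (isGatsbyPage) {
    return { action: "render" };
  }

  // The server does not know the page either: show Gatsby's 404 page.
  if (status === 404) {
    return { action: "render-404" };
  }

  const params = new URLSearchParams(search);

  // Assumption: if we are already on a ?no-cache=1 URL and still end up
  // here, something went wrong (for example, resources failed to load).
  if (params.get("no-cache") === "1") {
    return {
      action: "report-error",
      message: "Found ?no-cache=1 while attempting to load page directly",
    };
  }

  // The page exists on the server but not in Gatsby (e.g. /admin/):
  // reload it with ?no-cache=1 so the service worker leaves it alone.
  params.set("no-cache", "1");
  return { action: "redirect", to: `${pathname}?${params.toString()}` };
}

console.log(decideNavigation({ pathname: "/admin/", search: "", isGatsbyPage: false, status: 200 }));
```

The failure mode in this issue corresponds to `isGatsbyPage` wrongly being false for a real Gatsby page: the server answers 200, so the crawler is redirected to a `?no-cache=1` URL and indexes it.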

Until now I thought we'd ironed out all edge cases and prevented this from happening, but from the looks of things there are still some cases where it might occur - which might be difficult to debug if it's only problematic with Googlebot. Hopefully it'll also be problematic in some desktop browsers so we can actually tell what's going wrong rather than just guessing - last time I checked in Chrome there were no issues, and we're currently working on adding tests to prevent this sort of thing.

@zslabs
Contributor

zslabs commented Oct 30, 2018

👋 @davidbailey00 Thanks for jumping in and the rundown of how that feature works!

Googlebot certainly doesn't make it easy, does it?! We also ran into some weirdness with its mobile bot not being able to read our pages because we were using the CSS vh unit for certain hero images https://stackoverflow.com/questions/38103636/fetch-as-google-googlebot-desktop-not-rendering-page-correctly 👈 That was a fun one.

You're correct though; desktop seems to be working pretty consistently, but I'm wondering if there's something going on with mobile that we just can't see. We're hosting the site on Netlify and not really doing anything out of the ordinary there, but I've included our gatsby info if that helps shed some light on how things are put together on our end.

  System:
    OS: macOS 10.14
    CPU: x64 Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
    Shell: 2.7.1 - /usr/local/bin/fish
  Binaries:
    Node: 10.12.0 - /usr/local/bin/node
    Yarn: 1.12.1 - ~/.yarn/bin/yarn
    npm: 6.4.1 - /usr/local/bin/npm
  Browsers:
    Chrome: 70.0.3538.77
    Firefox: 62.0.3
    Safari: 12.0
  npmPackages:
    gatsby: 2.0.34 => 2.0.34
    gatsby-image: 2.0.18 => 2.0.18
    gatsby-plugin-canonical-urls: 2.0.7 => 2.0.7
    gatsby-plugin-catch-links: 2.0.6 => 2.0.6
    gatsby-plugin-feed: 2.0.9 => 2.0.9
    gatsby-plugin-google-tagmanager: 2.0.6 => 2.0.6
    gatsby-plugin-manifest: 2.0.7 => 2.0.7
    gatsby-plugin-netlify: 2.0.3 => 2.0.3
    gatsby-plugin-netlify-cms: 3.0.5 => 3.0.5
    gatsby-plugin-node-fields: 1.0.0 => 1.0.0
    gatsby-plugin-react-helmet: 3.0.1 => 3.0.1
    gatsby-plugin-sass: 2.0.2 => 2.0.2
    gatsby-plugin-sharp: 2.0.10 => 2.0.10
    gatsby-plugin-sitemap: 2.0.2 => 2.0.2
    gatsby-plugin-twitter: 2.0.7 => 2.0.7
    gatsby-remark-autolink-headers: 2.0.9 => 2.0.9
    gatsby-remark-copy-linked-files: 2.0.6 => 2.0.6
    gatsby-remark-custom-blocks: 2.0.1 => 2.0.1
    gatsby-remark-embed-video: 1.4.0 => 1.4.0
    gatsby-remark-images: 2.0.5 => 2.0.5
    gatsby-remark-prismjs: 3.0.3 => 3.0.3
    gatsby-remark-relative-images-v2: 0.1.5 => 0.1.5
    gatsby-remark-relative-links: 0.0.1 => 0.0.1
    gatsby-remark-responsive-iframe: 2.0.6 => 2.0.6
    gatsby-source-filesystem: 2.0.6 => 2.0.6
    gatsby-transformer-remark: 2.1.11 => 2.1.11
    gatsby-transformer-sharp: 2.1.7 => 2.1.7
    gatsby-transformer-yaml: 2.1.4 => 2.1.4
  npmGlobalPackages:
    gatsby-cli: 2.4.3

Would you be able to expand on what "resources" could cause a page to fail and show that cache? I've seen some oddities where (even on desktop) there's some type of decoding issue with the JSON payload that is "the content" of the page (/static/d/244/path---privacy-06-e-82f-UbMjLGBJ3CMBK5a0x1kvn6Bq8c.json), although the page itself shows up fine. When visiting the file, it says it can't be loaded, but automatically refreshes and shows the correct content -- which is pretty weird and extremely inconsistent. And all this while never showing the no-cache param. Not sure if that's useful. We're using the yaml transformer for a lot of our static content coming from Netlify, aside from remark for the blog posts, tutorials, etc.

Thanks again for digging into this!

@hackhat
Author

hackhat commented Nov 5, 2018

@davidbailey00 If you add your website to analytics.moz.com campaign it will also be redirected to no-cache, so frustrating. I'm hosting on cloudflare + s3 and again nothing fancy here as well.

@vtenfys
Contributor

vtenfys commented Nov 5, 2018

@DSchau and I have decided to remove the ?no-cache=1 URL parameter entirely, in favour of better docs which explain to users how to blacklist non-Gatsby pages. Hopefully this will be available by next week. Sorry for all the problems this has caused.
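
To make the blacklist idea concrete, here is a minimal sketch of how a list of path patterns could decide whether the service worker should serve its offline shell for a navigation. The names `navigationBlacklist` and `shouldServiceWorkerHandle` and the `/admin` pattern are assumptions for illustration; the thread does not show the configuration Gatsby ended up documenting.

```javascript
// Hypothetical example; the pattern list and function name are made up.
// Paths matching any pattern are left to the server instead of the SW.
const navigationBlacklist = [
  /^\/admin(\/|$)/, // e.g. a Netlify CMS admin UI that Gatsby did not generate
];

function shouldServiceWorkerHandle(pathname, blacklist = navigationBlacklist) {
  // Handle the navigation only if no blacklist pattern matches the path.
  return !blacklist.some((pattern) => pattern.test(pathname));
}

console.log(shouldServiceWorkerHandle("/admin/"));
console.log(shouldServiceWorkerHandle("/blog/"));
```

Compared with the status-code check plus redirect, a static list needs no extra request and never changes the URL, at the cost of users having to list their non-Gatsby paths themselves.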

@hackhat
Author

hackhat commented Nov 5, 2018

@davidbailey00 excellent decision, it seemed to be a bit tricky and with many edge cases. Thank you very much for your support.

@zslabs
Contributor

zslabs commented Nov 12, 2018

👋 @davidbailey00 Just wanted to check in to see if there was anything I could help test. Thanks again for looking into this!

@DSchau
Contributor

DSchau commented Nov 12, 2018

@zslabs is there more info on your end about this issue? We actually solved this previously, as there was an issue in the source code: arrow functions weren't being transpiled, and Chrome 41 (what Googlebot uses) doesn't support them. You shouldn't see many of the no-cache hits in your normal flow!
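
One rough way to check whether a built bundle still contains arrow functions is a text scan like the one below. This is only a heuristic sketch: a regular expression cannot parse JavaScript, so it can report false positives (for example `=>` inside a string or comment), and the function name `looksLikeUntranspiledArrow` is made up for this example.

```javascript
// Heuristic only: matches `) =>` or `identifier =>`, the two common shapes
// of an arrow function head. It does not understand strings or comments.
const ARROW_PATTERN = /(\)|[A-Za-z_$][\w$]*)\s*=>/;

function looksLikeUntranspiledArrow(source) {
  return ARROW_PATTERN.test(source);
}

console.log(looksLikeUntranspiledArrow("var f = function (x) { return x; };"));
console.log(looksLikeUntranspiledArrow("items.map(item => item.id)"));
```

A hit is a hint to look at which dependency produced that code, since a browser without arrow function support will fail to parse the whole script.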

@zslabs
Contributor

zslabs commented Nov 12, 2018

@DSchau Oh, interesting! Did you have the commit for that handy? I've been holding off on upgrading a few deps until this was resolved, so wanted to see where we were at currently.

@DSchau
Contributor

DSchau commented Nov 12, 2018

@zslabs it was not an issue on our end, but rather with a node_modules lib making its way into the browser untranspiled!

To clarify, is this at all related to any Gremlin stuff?

@zslabs
Contributor

zslabs commented Nov 12, 2018

Partly - our SEO guy mentioned he was still seeing these sporadically after that other issue was fixed, so I was hoping a fresh look at the no-cache feature might shed some light on a different (or better) way to handle those.

@DSchau
Contributor

DSchau commented Nov 12, 2018

Yeah - we've seen it very infrequently on our end on gatsbyjs.org, as well. I believe @davidbailey00 is working on a PR that will avoid the no-cache stuff, so we'll just keep you in the loop on that? Should be something to see very soon here!

@zslabs
Contributor

zslabs commented Nov 12, 2018

Sure thing, sounds great! Thanks as always.

@DSchau
Contributor

DSchau commented Nov 12, 2018

Not a problem :) Thanks for using Gatsby! We're excited to get this fixed and all the issues ironed out :)

@vtenfys
Contributor

vtenfys commented Nov 12, 2018

> I believe @davidbailey00 is working on a PR that will avoid the no-cache stuff, so we'll just keep you in the loop on that? Should be something to see very soon here!

Yeah that's right, I'm working on the new approach right now actually - hopefully it should be ready later today or tomorrow!

kakadiadarpan added the "type: bug" label on Nov 14, 2018

7 participants