Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(sw): use NetworkFirst strategy for page-data #24940

Merged
merged 3 commits into from
Jun 18, 2020
Merged

fix(sw): use NetworkFirst strategy for page-data #24940

merged 3 commits into from
Jun 18, 2020

Conversation

nibtime
Copy link
Contributor

@nibtime nibtime commented Jun 11, 2020

Fixes #24930

@nibtime nibtime requested a review from a team as a code owner June 11, 2020 23:31
@gatsbot gatsbot bot added the status: triage needed Issue or pull request that need to be triaged and assigned to a reviewer label Jun 11, 2020
@nibtime
Copy link
Contributor Author

nibtime commented Jun 15, 2020

I just found out that app-data.json is not considered in caching strategies. According to its function described in the official docs, it should be cached with NetworkFirst as well. Without explicit configuration, it will get StaleWhileRevalidate from here:

{
// Add runtime caching of various other page resources
urlPattern: /^https?:.*\.(png|jpg|jpeg|webp|svg|gif|tiff|js|woff|woff2|json|css)$/,
handler: `StaleWhileRevalidate`,
},

@pvdz pvdz added status: needs core review Currently awaiting review from Core team member topic: plugins-PWA Issues related to PWA: the gatsby-plugin-offline and gatsby-plugin-manifest plugins type: bug An issue or pull request relating to a bug in Gatsby and removed status: triage needed Issue or pull request that need to be triaged and assigned to a reviewer labels Jun 16, 2020
@pvdz
Copy link
Contributor

pvdz commented Jun 16, 2020

Relevant history around this: #19342

Bottom line, the change of handler is non-trivial and probably really depends on user preference. We have to pick a default that makes sense for most.

The upside to NetworkFirst is that the data is always up to date, according to the caching rules anyways. The downside is that it triggers an extra, blocking, round trip when the resource is stale.

Also, note that this setting can be changed by a user. The docs could need a little updating around this.

If we really wanted to change the handler setting we would first need to properly investigate the impact across the board.

For now we would like to keep the handler as is. However, the addition of app-data is good. Could you please update the PR to add app-data with StaleWhileRevalidate? Thank you!

@pvdz pvdz added status: awaiting author response Additional information has been requested from the author and removed status: needs core review Currently awaiting review from Core team member labels Jun 16, 2020
Copy link
Contributor

@pvdz pvdz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Please revert the change to handler, but do add the app-data.json section, with its handler also set to StaleWhileRevalidate)

@nibtime
Copy link
Contributor Author

nibtime commented Jun 16, 2020

Hi @pvdz

I set StaleWhileRevalidate for both page-data and app-data in c85695a. However, I do not think it is the right default.

I looked into #19342 and of course StaleWhileRevalidate would be beneficial, I just don't think that it is applicable for page-data and app-data.

@KyleAMathews describes the possible implications there and those are exactly the kinds of problems I ran into. Updated Markdown data was missing, and uncaught unexpected TypeErrors were thrown from intermixing outdated with updated code.

I got those same errors reported from several internal users in a field test scenario (only with as much information as "text missing" and "white page") who all have visited the site before. I also observed those issues on my mobile devices where I did not clear up application caches.

This kind of stuff is very hard to reproduce (it's kind of possible like I described in #24930) and very detrimental to UX. Since I use NetworkFirst for both page-data and app-data via the plugin configuration I wasn't able to reproduce any of the problems.

I can hardly imagine that anybody wants their users to see missing content or white pages on a regular basis after code updates, so I think this should definitely be reconsidered, since it kind of breaks things out-of-the-box. At least the docs should be updated and mention those kinds of problems and explain the implication of caching strategies in more detail.

@21c-HK
Copy link

21c-HK commented Jun 16, 2020

Quoting @pvdz:

Relevant history around this: #19342

Bottom line, the change of handler is non-trivial and probably really depends on user preference. We have to pick a default that makes sense for most.

That can't be the right default and in this case, it cannot depend on user preference if it is causing caching bugs in the real world. The quoted issue you referenced above changed the handler from originally NetworkFirst to StaleWhileRevalidate, which is the cause of these extremely hard to reproduce caching bugs as explained above by @nibtime that ruin the user experience (by suggesting to the user that the web site is completely broken). The current configuration even contradicts the official documentation:

Similar to HTML files, the JSON files in the public/page-data/ directory should never be cached by the browser. In fact, it’s possible for these files to be updated even without doing a redeploy of your site. Because of this, browsers should be instructed to check on every request if the data in your application has changed.

Issue #19342 basically kills any kind of caching reliability for web sites in production, so this cannot be the right default. Gatsby feels so fast because of its aggressive caching and preloading, which is only employable if you can expect it to be reliable, but this PR makes Gatsby's caching unreliable.

@pvdz
Copy link
Contributor

pvdz commented Jun 18, 2020

At least the docs should be updated and mention those kinds of problems and explain the implication of caching strategies in more detail.

A PR on that is welcome (this is for the learning team to process). I agree this could be exposed better and described more clearly.

I also agree with the inconsistency pointed out by @21c-HK on caching json files. This may be the case of outdated docs or it may be the case of incorrect implementation. I'm not sure tbh.

However, I do not think it is the right default.
That can't be the right default and in this case, it cannot depend on user preference if it is causing caching bugs in the real world.

I appreciate those thoughts. I do understand your point and am not saying we're not going to change it. If and when we do, we want to make sure all implications are considered. A double blocking round trip could have severe impact on site perf. If we can prevent that by other means then that might have a preference. But other means have a cost too, so it's an equation we have to solve.

My suggestion on this would be to open a discussion issue and hash out the pro's, cons, and alternative solutions. Answer the question: Why should AND why should this not be changed? Then weigh the cost of each option.

I'm going to merge this PR. Thank you for making the changes and the PR in the first place 💜

Copy link
Contributor

@pvdz pvdz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💜

@pvdz pvdz merged commit d427663 into gatsbyjs:master Jun 18, 2020
@21c-HK
Copy link

21c-HK commented Jun 18, 2020

This may be the case of outdated docs or it may be the case of incorrect implementation. I'm not sure tbh.

I am pretty sure that the current implementation (even after the merge) is incorrect with StaleWhileRevalidate (see my explanation below).

A double blocking round trip could have severe impact on site perf. If we can prevent that by other means then that might have a preference.

I am not arguing that there might not be a better caching strategy or implementation, but StaleWhileRevalidate is causing bugs after merging the PR (#19342) that was intended to improve performance and that problematic PR just 6 months old. It was NetworkFirst before that PR. The explanation in the current documentation is clear and consistent with regards to caching the page data and app data (see my earlier quote), so StaleWhileRevalidate cannot be correct. You might not notice the caching bugs that this caching strategy causes because these bugs are extremely hard to reproduce if you don't know exactly how they are caused, but those bugs are still there.

My suggestion on this would be to open a discussion issue and hash out the pro's, cons, and alternative solutions. Answer the question: Why should AND why should this not be changed? Then weigh the cost of each option.

What is confusing to me is that this was not done for that problematic PR, which is causing bugs that went unnoticed until now. In any case, the cost of these caching bugs is so high that they outway any possible performance benefit, IMO.

@KyleAMathews had doubts about that PR back then because he anticipated these problems:

The big downside to this PR is that now page data won't be updated until the second time you visit a page as the browser will first get the cached page (however old it is) & then while the SW gets the newer version, the browser won't.

Also... on data changing shape. With the webpack compilation hash now not being in the page-data.json — how do we know what version of page data we're working with? Someone could have cached a version weeks ago and then the page component changed & we'd still try to use it which would throw an error...

This PR might need reverted.

I hope @KyleAMathews can weigh in because he probably knows what would be the best solution.

@21c-HK
Copy link

21c-HK commented Jun 19, 2020

Issue #24930 is not fixed despite the commit and merge, because only NetworkFirst actually fixes the reported bug. You would notice this if you tried to reproduce it.

@21c-HK
Copy link

21c-HK commented Jul 9, 2020

I clicked on a link to https://www.gatsbyjs.org/ today, which showed a blank (black) page. So there you have it. This bug even shows up on your own web site.

pragmaticpat pushed a commit to pragmaticpat/gatsby that referenced this pull request Apr 28, 2022
* fix(sw): use NetworkFirst strategy for page-data

Fixes gatsbyjs#24930

* fix(sw): NetworkFirst strategy for app-data.json

* fix(sw): StaleWhileRevalidate for page-data and app-data
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: awaiting author response Additional information has been requested from the author topic: plugins-PWA Issues related to PWA: the gatsby-plugin-offline and gatsby-plugin-manifest plugins type: bug An issue or pull request relating to a bug in Gatsby
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[gatsby-plugin-offline]: Wrong default caching strategy for page-data
3 participants