make cache file writing sync and hashed by query #346

Merged
thescientist13 merged 12 commits into master from bug/issue-345-large-cache-json-files on May 15, 2020

Conversation

thescientist13
Member

Related Issue

resolves #345

Summary of Changes

  1. Write files by query, not by page
  2. And write them synchronously to avoid overwriting

For the other issues mentioned in #345, I will split them off into their own issues.
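The two changes in the summary can be sketched roughly as follows. This is a minimal illustration rather than the code in this PR: `writeQueryCache` and the sample query are hypothetical, and it uses Node's built-in `fs` instead of whatever helper the project uses for creating directories.

```javascript
const crypto = require('crypto');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: one cache file per query, written synchronously.
function writeQueryCache(outputDir, query, data) {
  // Name the file after the query (not the page), so pages that run the
  // same query share a single cache file instead of each writing a copy.
  const hash = crypto.createHash('md5').update(query).digest('hex');
  const target = path.join(outputDir, `${hash}-cache.json`);

  fs.mkdirSync(outputDir, { recursive: true });
  // A synchronous write completes before the next call in this process
  // starts, so two writes cannot interleave their output in one file.
  fs.writeFileSync(target, JSON.stringify(data));

  return target;
}

// Usage: two "pages" running the same query end up with one file.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'query-cache-'));
const sampleQuery = 'query { graph { title } }'; // illustrative query only
writeQueryCache(dir, sampleQuery, { graph: [{ title: 'Home' }] });
writeQueryCache(dir, sampleQuery, { graph: [{ title: 'Home' }] });
console.log(fs.readdirSync(dir));
```

Note that this only rules out interleaved writes within a single Node process; separate processes would still need some other coordination.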

@thescientist13 thescientist13 added the labels bug (Something isn't working), P0 (Critical issue that should get addressed ASAP), and Content as Data on May 7, 2020
@thescientist13
Member Author

Ugh... getting these stupid invariant errors in the Netlify preview environment. :/
[Screenshot: Screen Shot 2020-05-06 at 8 29 42 PM]

I don't know if it's related to this change though...?

Had a thought on it here #331 (comment)

@hutchgrant
Member

hutchgrant commented May 7, 2020

> Ugh... getting these stupid invariant errors in the Netlify preview environment. :/

Your menu query for the side menu is not being cached. Check the cached files and you'll see there's only a navigation menu query, config, and graph queries.

Notice how our testing didn't catch this? Because it relies on mock graph.

@hutchgrant
Member

hutchgrant commented May 7, 2020

When using:

const md5 = crypto.createHash('md5').update(query).digest('hex');

I saw the following error:

SyntaxError: /media/skynet/DATA/workspace/evergreen/greenwood/public/plugins/2bc8e256a25844b37c22af93673c67e3-cache.json: Unexpected token d in JSON at position 2830

It's not being caught. I was only able to catch it by switching back to async fs.writeFile and fs.mkdirs.

If you switch back to:

const md5 = crypto.createHash('md5').update(cache).digest('hex');

while still using synchronous writes as per this PR, it works fine, no errors.

@thescientist13
Member Author

thescientist13 commented May 8, 2020

> Your menu query for the side menu is not being cached. Check the cached files and you'll see there's only a navigation menu query, config, and graph queries.

Oof... 🤦‍♂️

> Notice how our testing didn't catch this? Because it relies on mock graph.

Yeah, noticed that too. Will see if I can do something about that somehow.

> It's not being caught. I was only able to catch it by switching back to async fs.writeFile and fs.mkdirs.

Not quite sure I understand, but you are saying you saw the JSON parse error when testing this branch and hashing by query? That's a bummer, I ran it a bunch of times on my local machine.

So this might be a bit more complex of an issue, but I think since every page gets serialized in parallel, all these GraphQL queries are happening in parallel as well. I think it would be similar to what we did here for #271?

So the most surefire way would be to make it all synchronous then. Or I guess try the file locking approach? Other options?
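On the "other options" question, one possibility (not what this PR ends up doing) is an exclusive-create write, where only the first writer of a given cache file succeeds and later writers skip it. A minimal sketch, assuming the contents for a given cache file name are always identical so it does not matter which writer wins; `writeCacheOnce` is a hypothetical name:

```javascript
const fs = require('fs');

// Hypothetical helper, not part of this PR: create the cache file only if it
// does not exist yet. The 'wx' flag makes the write fail with EEXIST otherwise.
function writeCacheOnce(target, contents) {
  try {
    fs.writeFileSync(target, contents, { flag: 'wx' });
    return true; // this call created the file
  } catch (err) {
    if (err.code === 'EEXIST') {
      return false; // another writer got there first; keep its file
    }
    throw err; // anything else is a real error
  }
}
```

The trade-off is that a stale file from an earlier build would never be refreshed, so this assumes the output directory is cleaned before each build.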

@hutchgrant
Member

hutchgrant commented May 8, 2020

The reason you aren't creating a side menu cache is simple: you're using the same data to generate the hash, and therefore it only writes one file for two different (albeit similar) queries!

Menu query for side and navigation:

query ($name: String, $route: String, $order: MenuOrderBy) {
  menu(name: $name, pathname: $route, orderBy: $order) {
    item {
      label
      link
      __typename
    }
    children {
      item {
        label
        link
        __typename
      }
      children {
        item {
          label
          link
          __typename
        }
        __typename
      }
      __typename
    }
    __typename
  }
}

The fix is to include the query's variables to make it unique for the separate menus:

const md5 = crypto.createHash('md5').update(query + JSON.stringify(variables)).digest('hex');

The downside to this is that we get a unique menu for each pathname as well, so every page generates a menu cache. Try the above and then take a look at the cache files in the /public/plugins folder. You'll see 7 cache files, 4 of which are for each unique pathname and page within that directory.
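To make both the fix and its downside concrete, here is a small sketch comparing the two hashing strategies. The shortened query and the variable values (`navigation`, `side`, the routes, `index_asc`) are made up for illustration and are not taken from the project:

```javascript
const crypto = require('crypto');

// Hash of the query text only: identical for every menu.
function queryOnlyKey(query) {
  return crypto.createHash('md5').update(query).digest('hex');
}

// Hash of the query text plus its variables: unique per menu *and* per route.
function cacheKey(query, variables) {
  return crypto.createHash('md5').update(query + JSON.stringify(variables)).digest('hex');
}

// Shortened stand-in for the shared menu query above.
const menuQuery = 'query ($name: String, $route: String, $order: MenuOrderBy) { menu(name: $name, pathname: $route, orderBy: $order) { item { label link } } }';

// Hypothetical variable values.
const navigation = { name: 'navigation', route: '/', order: 'index_asc' };
const sideDocs = { name: 'side', route: '/docs/', order: 'index_asc' };
const sideAbout = { name: 'side', route: '/about/', order: 'index_asc' };

// Same query text, so the query-only key cannot tell the menus apart.
console.log(queryOnlyKey(menuQuery) === queryOnlyKey(menuQuery)); // true

// With variables, the navigation and side menus get different files...
console.log(cacheKey(menuQuery, navigation) !== cacheKey(menuQuery, sideDocs)); // true

// ...but so does the same side menu on every distinct pathname.
console.log(cacheKey(menuQuery, sideDocs) !== cacheKey(menuQuery, sideAbout)); // true
```

One caveat with this approach: `JSON.stringify` is sensitive to key order, so the same variables passed in a different order would produce a different file name.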

@thescientist13 thescientist13 changed the title from "make file writing sync" to "make cache file writing sync" on May 9, 2020
@thescientist13 thescientist13 changed the title from "make cache file writing sync" to "make cache file writing sync and hashed by query" on May 9, 2020
Member Author

@thescientist13 thescientist13 left a comment

> menu query for side and navigation:

ah, right, of course! two menu queries. thanks, now it all makes sense

> downside to this is we get a unique menu for each pathname as well. Therefore every page generates a menu cache.

Hmm, yeah. I think I fixed this as well? One thing that helps make this cleaner is that I refactored the shelf to use the "prop" from page-template.js for the route and to use that within the updated lifecycle. So now GraphQL will get called using this.page; pretty neat!

Now, we still have the multiple fetch calls due to over-rendering, but all the file sizes are now better.
[Screenshot: Screen Shot 2020-05-09 at 2 47 28 PM]

However, although this seems to be working locally, on Netlify I sometimes seem to get a shelf that just won't load for a page. Maybe once every 5 clicks? Just keep clicking the header from left to right, and back. Eventually one will be blank, with no errors.

[Screenshot: Screen Shot 2020-05-09 at 5 31 29 PM]

Getting closer though. 🤞

if (changedProperties.has('page') && this.page !== '' && this.page !== '/') {
await this.fetchShelfData();

this.expandRoute(window.location.pathname);
Member Author

I'm thinking I would even like to go so far as to just pass the full route from page-template.js. Then the shelf would have access to the full path via "props", and that could be used here instead of window.location.pathname.

It just could help keep the shelf a little more self-contained.

console.log('ENTER updated - changedProperties', changedProperties);
console.log('updated - this.page', this.page);
if (changedProperties.has('page') && this.page !== '' && this.page !== '/') {
await this.fetchShelfData();
Member

You're awaiting fetchShelfData(), but it doesn't return a promise. Maybe just return the query (which is a promise) from it, and use it like this:

const response = await this.fetchShelfData();
this.shelfList = response.data.menu.children;
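A self-contained sketch of the difference, outside of LitElement. `fakeQuery` is a hypothetical stand-in for the GraphQL client call and `ShelfModel` is an illustrative class, not the actual shelf component:

```javascript
// Hypothetical stand-in for the GraphQL client; resolves with a menu response.
function fakeQuery(route) {
  return Promise.resolve({
    data: { menu: { children: [{ item: { label: 'Docs', link: route } }] } }
  });
}

class ShelfModel {
  constructor() {
    this.shelfList = [];
  }

  // Problem case: the query runs, but nothing is returned, so a caller that
  // awaits this method gets `undefined` and does not wait for the query.
  fetchShelfDataNoReturn(route) {
    fakeQuery(route);
  }

  // Suggested shape: return the promise so the caller can await the response.
  fetchShelfData(route) {
    return fakeQuery(route);
  }

  async update(route) {
    const response = await this.fetchShelfData(route);
    this.shelfList = response.data.menu.children;
    return this.shelfList;
  }
}

// Usage
const shelf = new ShelfModel();
shelf.update('/docs/').then((list) => console.log(list.length));
```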

Member Author

Still happening. I guess I should just revert back to the connectedCallback / window.location.pathname approach then?

Not sure if we'll need to start recommending / documenting certain patterns around this stuff then, though I would hope we don't have to be any different than SPA development. For example, unpredictable lifecycles could be a concern, or you may just need to guard the initial values of your variables.
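As a rough idea of what "guarding initial values" could look like, the condition from the shelf's updated() can be pulled into a small pure function. This is only a sketch: `shouldFetchShelf` is a hypothetical name, a plain `Map` stands in for the `changedProperties` map, and the extra `typeof` check is an assumption added here to cover the case where the property has not been set yet.

```javascript
// Hypothetical guard: only fetch when `page` actually changed and holds a
// usable route. `changedProperties` is a Map of property name -> old value.
function shouldFetchShelf(changedProperties, page) {
  return changedProperties.has('page')
    && typeof page === 'string' // assumption: also skip an undefined page
    && page !== ''
    && page !== '/';
}

// Usage
const changed = new Map([['page', undefined]]);
console.log(shouldFetchShelf(changed, '/docs/')); // true
console.log(shouldFetchShelf(changed, '/'));      // false
console.log(shouldFetchShelf(new Map(), '/docs/')); // false
```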

@@ -26,12 +26,14 @@ class PageTemplate extends LitElement {
}

updated() {
console.log('ENTER page template updated', window.location.pathname);
this.route = window.location.pathname;
Member

My understanding is this lifecycle is run after the component has been rendered. I tested using the connectedCallback() to set the this.page variable from the page-template and had better results (I wasn't able to break it). Maybe that's just chance, I don't know.

Member Author

this.page was actually undefined at that point, so updated seemed like a good way to make sure the value is "ready".

@thescientist13 thescientist13 self-assigned this May 11, 2020
Member Author

@thescientist13 thescientist13 left a comment

@hutchgrant

> My understanding is this lifecycle is run after the component has been rendered. I tested using the connectedCallback() to set the this.page variable from the page-template and had better results (I wasn't able to break it). Maybe that's just chance, I don't know.

💥

I didn't realize you meant the page-template.js! Thought you meant the shelf. It's working now though!

Will remove the console logs and commented out code. Nice, thanks for the help on this. The async kid saves the day again. 🏆

Member Author

@thescientist13 thescientist13 left a comment

@hutchgrant
Unless there are any objections, I would like to get this approved / merged soonish (today / tomorrow) now that the disappearing shelf issue is fixed, so we can cut a 0.5.1 release ASAP.

We can then work on the other related issues we've uncovered in parallel, but this one is certainly the most egregious due to the increased cache.json size coupled with the over-fetching (although to combat the index.js bundle size, I'll be working on #305 as a means to an end on that).

Member Author

@thescientist13 thescientist13 left a comment

Wow, what a huge difference, comparing the Getting Started page in prod vs this preview. Cut it down by 1.1MB!

Before

[Screenshot: Screen Shot 2020-05-13 at 10 36 34 AM]

After

[Screenshot: Screen Shot 2020-05-13 at 10 37 10 AM]

@thescientist13 thescientist13 merged commit 51e0898 into master May 15, 2020
@thescientist13 thescientist13 deleted the bug/issue-345-large-cache-json-files branch May 15, 2020 14:11
Labels
bug Something isn't working CLI Content as Data P0 Critical issue that should get addressed ASAP
Development

Successfully merging this pull request may close these issues.

cache.json is large due to containing multiples copies of the same data
2 participants