"source and transform nodes" is taking a long time with gatsby-source-wordpress #4293

Closed
RobertBolender opened this issue Mar 1, 2018 · 29 comments

@RobertBolender
Contributor

Context: I'm using gatsby-source-wordpress with lots of ACF fields. I have about 1,000 posts of a custom post type, and about 1,800 images in total are currently attached to those posts through a gallery ACF field. I have a bunch of other fields, but I expect the images are far and away the most resource-intensive. I have a few custom taxonomies, but they don't have very many terms.

My current issue is with "source and transform nodes":

success source and transform nodes — 459.213 s

The little command-line spinner just sits there for 459 seconds without any indication of what it's doing.
What can I do to optimize this compile time, specifically the node sourcing?

@KyleAMathews
Contributor

That time is probably mostly downloading images.

Add some logging to createRemoteFileNode in gatsby-source-filesystem and you'll have more visibility there.

I want to add a generic jobs logging framework to core so that plugins like this could update their progress there.
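
A minimal sketch of the kind of logging suggested above, assuming you wrap the function rather than edit its body. `withTiming` and `fakeDownload` are hypothetical names used for illustration; nothing here is part of Gatsby's or gatsby-source-filesystem's API.

```javascript
// Generic timing wrapper: logs when each call to an async function starts
// and how long it took. `withTiming` is a hypothetical helper, not a Gatsby API.
function withTiming(label, fn, log = console.log) {
  let calls = 0;
  return async function timed(...args) {
    const id = ++calls;
    const start = Date.now();
    log(`[${label}] #${id} started`);
    try {
      return await fn.apply(this, args);
    } finally {
      // Runs whether the wrapped call resolved or threw.
      log(`[${label}] #${id} finished in ${Date.now() - start} ms`);
    }
  };
}

// Stand-in for a slow download; a real call would fetch a remote image.
const fakeDownload = (url) =>
  new Promise((resolve) => setTimeout(() => resolve({ url }), 20));

const timedDownload = withTiming("createRemoteFileNode", fakeDownload);
timedDownload("https://example.com/a.jpg");
```

Wrapping keeps the original function untouched, so the logging can be removed again by dropping the wrapper.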

@RobertBolender
Contributor Author

@KyleAMathews good call, that does seem to be what's going on. Gatsby already caches downloaded images, so I'm not sure if there's anything else I can really do to speed that up.

@RobertBolender
Contributor Author

I've added logging to createRemoteFileNode and found that there is still additional time spent during source and transform nodes that happens after all the images are downloaded. Where else should I look for long-running tasks that would happen during this step?

@KyleAMathews
Contributor

How much more time?

@RobertBolender
Contributor Author

10+ minutes in develop mode. I've let it run for a while and then killed it several times, and at some point it will actually complete. The production build is still only taking 9 minutes in total, though.

@KyleAMathews
Contributor

Try editing https://github.com/gatsbyjs/gatsby/blob/master/packages/gatsby/src/bootstrap/page-hot-reloader.js (replace src with dist) in your node_modules folder and log the action whenever CREATE_NODE is called.

What does the logging of the bootstrap process look like so I can see the times?
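
A rough sketch of what such logging could look like, written as a standalone helper so it runs outside Gatsby. It assumes each action has the shape `{ type: "CREATE_NODE", payload: node }` with the node type in `node.internal.type`; that shape is an assumption here, and `createNodeTally` is a hypothetical name.

```javascript
// Tally CREATE_NODE actions by node type and log them sparingly.
// Assumption: action = { type: "CREATE_NODE", payload: node } and the node
// carries its type in node.internal.type.
function createNodeTally(log = console.log) {
  const counts = new Map();
  return {
    record(action) {
      if (!action || action.type !== "CREATE_NODE") return;
      const internal = action.payload && action.payload.internal;
      const nodeType = (internal && internal.type) || "unknown";
      const n = (counts.get(nodeType) || 0) + 1;
      counts.set(nodeType, n);
      // Log the first node of each type, then every 100th, to limit noise.
      if (n === 1 || n % 100 === 0) {
        log(`${new Date().toISOString()} CREATE_NODE ${nodeType} x${n}`);
      }
    },
    counts: () => Object.fromEntries(counts),
  };
}

const tally = createNodeTally();
tally.record({ type: "CREATE_NODE", payload: { internal: { type: "File" } } });
tally.record({ type: "CREATE_NODE", payload: { internal: { type: "File" } } });
tally.record({ type: "DELETE_NODE", payload: {} }); // ignored
console.log(tally.counts());
```

The timestamps in the log lines make it easier to see where the process goes quiet.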

@RobertBolender
Contributor Author

I've added logging to CREATE_NODE in that file but haven't seen it do anything yet. The develop command hasn't finished and never gets to a point where it can live reload.

I also added a log on actions.createNode in gatsby/dist/redux/actions.js and that shows me the file nodes that are created when the images are loaded for the remote file nodes, but nothing during the time that it's hanging.

@RobertBolender
Contributor Author

I'm still looking at the createNode logs.

After the remote file nodes are finished, there is a long wait, followed by wordpress__acf_posts, wordpress__acf_pages, WordPressAcf_image_and_text, and a bunch of other ACF fields.

Maybe the delay is Gatsby trying to infer the data schema of all my flexible content fields? I only have 4 layouts in the single Flexible Content field, on about 36 pages total, not a whole lot.
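
To pin down where a wait like the one described above sits, one option is to timestamp each logged node and look for large gaps afterwards. The sketch below assumes log entries of the form `{ time, label }`; `findGaps` is a hypothetical helper, and the sample values are made up for illustration, not measurements from this issue.

```javascript
// Find long silent gaps between consecutive timestamped log entries.
// `entries` must be sorted by time (milliseconds).
function findGaps(entries, thresholdMs) {
  const gaps = [];
  for (let i = 1; i < entries.length; i++) {
    const waitMs = entries[i].time - entries[i - 1].time;
    if (waitMs >= thresholdMs) {
      gaps.push({
        after: entries[i - 1].label,
        before: entries[i].label,
        waitMs,
      });
    }
  }
  return gaps;
}

// Made-up sample shaped like the sequence described above.
const sample = [
  { time: 0, label: "File" },
  { time: 1500, label: "File" },
  { time: 601500, label: "wordpress__acf_posts" },
  { time: 601600, label: "wordpress__acf_pages" },
];
console.log(findGaps(sample, 60000));
```

Knowing which node types bracket the gap narrows down which code is likely running during the silence.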

@KyleAMathews
Contributor

Schema inference doesn't happen until it reaches that point in the bootstrap process.

@KyleAMathews
Contributor

This might be a good time to get profiling working ;-)

#4218

See what code is running.

@RobertBolender
Contributor Author

Until those profiling changes are merged into the repo, is there anything I can do now to figure out what's taking so long on my build step?

@KyleAMathews
Contributor

There's nothing that needs to be merged other than instructions. That issue links to a gist that walks through how to set up profiling. As I understand it, profiling is generic for any Node app.

There's also this https://medium.com/@paul_irish/debugging-node-js-nightlies-with-chrome-devtools-7c4a1b95ae27
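
As a generic illustration of profiling a Node app, the sketch below uses Node's built-in `inspector` module to capture a CPU profile around a piece of work. It is not the setup from the linked gist, which is not reproduced here; `profileCpu` and `busyWork` are hypothetical names.

```javascript
const inspector = require("inspector");

// Run `work` under the V8 CPU profiler and resolve with the raw profile.
function profileCpu(work) {
  const session = new inspector.Session();
  session.connect();
  // Promise wrapper around the callback-based session.post().
  const post = (method) =>
    new Promise((resolve, reject) =>
      session.post(method, (err, result) => (err ? reject(err) : resolve(result)))
    );
  return post("Profiler.enable")
    .then(() => post("Profiler.start"))
    .then(() => work())
    .then(() => post("Profiler.stop"))
    .then(({ profile }) => {
      session.disconnect();
      return profile;
    });
}

// Hypothetical CPU-bound stand-in for the slow step.
function busyWork() {
  let total = 0;
  for (let i = 0; i < 1e6; i++) total += Math.sqrt(i);
  return total;
}

const captured = profileCpu(busyWork);
captured.then((profile) => {
  console.log(`captured ${profile.nodes.length} profile nodes`);
});
```

Writing the result with `fs.writeFileSync("run.cpuprofile", JSON.stringify(profile))` should produce a file that Chrome DevTools can load to show which code is running.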

@arminnaimi

arminnaimi commented Apr 17, 2018

Hi there! Is it possible to turn off fetching images and saving them locally? It would save a ton of local development time. I am using gatsby-source-contentful. My challenge is that I am working with two different spaces with each having more than 1300 assets. At the moment it takes about 40 minutes to complete "source and transform nodes".

@szimek
Contributor

szimek commented Apr 23, 2018

@arminnaimi Unless you're using some image related plugin, gatsby-source-contentful shouldn't download any images. I've got 6 spaces with ~500 posts and ~1500 images each and it takes ~10 minutes to complete the source and transform nodes step. I created a ticket to speed it up as well: #5079.

@arminnaimi

One thing I am noticing is that when excluding additional locales in the API response from Contentful, the whole build goes down to something more manageable like 40 seconds. The fact that it struggles with locales might be an issue with how effectively GraphQL can handle large JSON responses.

@szimek
Contributor

szimek commented Apr 23, 2018

@arminnaimi That might be it. In my case each space has only 1 locale, which is different for each space.

@KyleAMathews
Contributor

Due to the high volume of issues, we're closing out older ones without recent activity. Please open a new issue if you need help!

@Blumed

Blumed commented Aug 28, 2019

@KyleAMathews I am running into this as well, but I don't need images to be downloaded. We offload our WordPress media to S3 to keep images out of our builds. Is there a way to stop it from downloading them and just use the URL that points to our S3 bucket?

@r1q

r1q commented Sep 3, 2019

I know it's closed, but in my case, after I tried all the solutions and nothing seemed to fix it, the only thing that worked was:

Resizing the terminal window when it gets stuck!

@Frithir
Contributor

Frithir commented Sep 5, 2019

@KyleAMathews @r1q I can't understand how this method works.
Resizing the terminal window lets the build run to success.
How can I do this on Netlify?

@elfatherbrown

I see this exact behaviour. The terminal resizing part is such a weird thing. Perhaps there's some "screen" stuff going on there?

@r1q

r1q commented Sep 10, 2019

Maybe someone from Netlify or iTerm team can help in debugging?

@elfatherbrown

elfatherbrown commented Sep 10, 2019

If it's any help, I can reproduce it in the OSX Terminal and in platformio-ide on Atom, so it's not specific to one terminal. It happens in the Gatsby stages where a terminal spinner character appears. The spinner seems to freeze, and then if one resizes the terminal, things go forward.

Like so:
[image]

I'm monitoring in Activity Monitor to see what node is doing. When things freeze, node is at 0%:

[image]

Then upon resize, it starts going at it again.

[image]

Happens on both gatsby build and develop. OSX Mojave here.

@marciplan

I know it's closed, but in my case, after I tried all the solutions and nothing seemed to fix it, the only thing that worked was:

Resizing the terminal window when it gets stuck!

This fixed it for me, too. How weird. But thank you so much :)

@jmcbee
Contributor

jmcbee commented Oct 6, 2019

Is there an open issue regarding this bug which gets "fixed" by resizing the window?

@shanejones
Contributor

Just adding in something that I've just found.

I found this would hang for a long time when I used iTerm. As a test, I booted up the classic Terminal for Mac, and the process stopped hanging. I wonder if a memory or buffer issue in iTerm is causing the hang?

@pvdz
Contributor

pvdz commented Dec 3, 2020

Is there an open issue regarding this bug which gets "fixed" by resizing the window?

Yeah, there's #27325

@pvdz
Contributor

pvdz commented Dec 3, 2020

I'm going to lock this issue.

gatsbyjs locked and limited conversation to collaborators Dec 3, 2020

@pvdz
Contributor

pvdz commented Dec 3, 2020

@shanejones would love it if you could figure out what might cause it. None of us can repro it. Please continue that discussion in the linked PR.
