Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(gatsby): Instrument partial writes to page data #24808

Merged
merged 25 commits into from
Jun 18, 2020

Conversation

sidharthachatterjee
Copy link
Contributor

@sidharthachatterjee sidharthachatterjee commented Jun 5, 2020

Currently, writing page-data.json for a page is atomic. But this doesn't work for cases where we need to collect other information (upcoming module dependencies from @pieh and static queries from me)

This instruments a mechanism to add a pending update and then flush all pending updates to a page's page-data.json when ready.

This also adds some naive coordination for webpack so we can time flushing correctly.

@sidharthachatterjee sidharthachatterjee added the type: feature or enhancement Issue that is not a bug and requests the addition of a new feature or enhancement. label Jun 5, 2020
@sidharthachatterjee sidharthachatterjee requested a review from a team as a code owner June 5, 2020 11:29
@gatsbot gatsbot bot added the status: triage needed Issue or pull request that need to be triaged and assigned to a reviewer label Jun 5, 2020
@sidharthachatterjee sidharthachatterjee removed the status: triage needed Issue or pull request that need to be triaged and assigned to a reviewer label Jun 5, 2020
packages/gatsby/src/utils/webpack-status.ts Show resolved Hide resolved
packages/gatsby/src/commands/types.ts Outdated Show resolved Hide resolved
packages/gatsby/src/query/index.js Outdated Show resolved Hide resolved
packages/gatsby/src/query/index.js Outdated Show resolved Hide resolved
packages/gatsby/src/commands/develop-process.ts Outdated Show resolved Hide resolved
packages/gatsby/src/utils/page-data.js Outdated Show resolved Hide resolved
packages/gatsby/src/utils/webpack-status.ts Outdated Show resolved Hide resolved
packages/gatsby/src/query/index.js Outdated Show resolved Hide resolved
Copy link
Contributor

@freiksenet freiksenet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM now, deferring to @pieh for final approval.

blainekasten
blainekasten previously approved these changes Jun 17, 2020
Copy link
Contributor

@blainekasten blainekasten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's ship it!

Copy link
Contributor

@ascorbic ascorbic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@sidharthachatterjee
Copy link
Contributor Author

Waiting on a blessing from Master Jedi @pieh

@sidharthachatterjee sidharthachatterjee added the bot: merge on green Gatsbot will merge these PRs automatically when all tests passes label Jun 18, 2020
@gatsbybot gatsbybot merged commit b2bf298 into master Jun 18, 2020
@delete-merged-branch delete-merged-branch bot deleted the feat/page-data-flush branch June 18, 2020 08:25
Comment on lines +148 to +150
store.dispatch({
type: `CLEAR_PENDING_PAGE_DATA_WRITES`,
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sidharthachatterjee we introduced a bug with this (thank @TylerBarnes for raising this issue)

we generate pagesToWrite at the beginning of function (sync), and then the loop is async, which means that some concurrent code can add pending pages/templates while the loop is running and after that we clear all pending data writes which means that pending writes added mid function run are never handled.

What we did with Tyler to debug here is we rerun code used to generate pagesToWrite after the loop and realized that that it doesn't match with one we generated at the beginning of the function.

I think we either need to make this entire flush sync (yikes) or add some listener (e.g. emitter.on("ADD_PENDING_PAGE_DATA_WRITE", ...) that would flip some switch and rerun flush or something before dispatching CLEAR_PENDING_PAGE_DATA_WRITES

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For context this bug makes data updates break for WP due to missing page data but only on some sites. It's not reproducible on the default starter and seemingly needs a more complex site to reproduce

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if that would work - but maybe we should dispatch "clear pending" action immediately after constructing pagesToWrite (as code is sync up to this point), so after the loop we could just check if there is anything pending (meaning it was added since the clear) and then re-run pending? This way we won't need to listen to ADD_PENDING_* actions or make things sync?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe a good use-case for a generator? after each iteration check to see if anything has been added and put it on the end of the processing queue

pragmaticpat pushed a commit to pragmaticpat/gatsby that referenced this pull request Apr 28, 2022
* Add a pending page data reducer

* Add mechanism to flush page data after collecting pending writes

* Linting

* Fix

* Update snapshot

* move to program._ instead of command

* do not emit page data in query queue

* fix check for develop

* add a null check

* Prevent flush races

* Only flush if in queue

* Enqueue instead of flushing

* Linting

* Make enqueue not await

* Reset enqueue

* Update packages/gatsby/src/commands/build.ts

Co-authored-by: Michal Piechowiak <[email protected]>

* Linting

* immediately flush on build

* Add a getter

* Move page data back to TS

* Use named imports for page data

* Use better names

* fix typings and remove async

Co-authored-by: Michal Piechowiak <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bot: merge on green Gatsbot will merge these PRs automatically when all tests passes type: feature or enhancement Issue that is not a bug and requests the addition of a new feature or enhancement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants