Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(graphcache) - Fix API results from being reused incorrectly for queries #1196

Merged
merged 6 commits into from
Dec 4, 2020

Conversation

kitten
Copy link
Member

@kitten kitten commented Dec 4, 2020

This was originally reported by @sarmeyer.

Tracking down what the root culprit and intended fix is was difficult since the effects of this bug were subtle, the cause was hard to find, and this can poorly by fixed by several small issues. Ultimately, tl;dr, the bug can be seen here: https://github.com/FormidableLabs/urql/blob/eee828914fb6083b2aaa3203484ed6b36b7beff5/exchanges/graphcache/src/operations/query.ts#L93-L97

Summary

Fix reusing original query data from APIs accidentally, which can lead to subtle mismatches in results when the API's incoming query results are being updated by the cacheExchange, to apply resolvers. Specifically this may lead to relations from being set back to null when the resolver returns a different list of links than the result, since some null relations may unintentionally exist but aren't related. If you're using relayPagination then this fix is critical.

Investigation

When a result comes back from the API it is both written to the cached, but the cacheExchange also queries it again from the API, to update its data using resolvers. This is a great technique to get any API result from looking the same as if it was directly queried from the cache.

For mutations and subscriptions this is also nice, because we have a special readRoot case that copies values over from the originalData, since not all fields on "root results" are cached, since they're not normalised. This is easily confused with readSelection's concept of data which is the target where we write results too. At some point we must've gotten either confused or assumed that it'd be great for readSelection (which is for normalised data) to also use the original API data as its data input.

This becomes a problem because it's not necessary and can cause bugs. It's not necessary because we're dealing with normalised data, so the data should be completely queryable from the cache and reusing the original data will just cause confusion. It also causes bugs — which is how @sarmeyer discovered this — because if a resolver returns a list of items that in turn have links (i.e. relations) to other entities, then the result's original data may have items in a different order or length. This data will still be used, but if it's set to null then another special case causes the field to be considered null again, whether that's actually correct or not, see: https://github.com/FormidableLabs/urql/blob/eee828914fb6083b2aaa3203484ed6b36b7beff5/exchanges/graphcache/src/operations/query.ts#L499

In this line we check whether prevData === null. This is important because a past selection set may have contained uncached fields but a future one for the same path in the query may not, which would mean that the data would be returned even though it contained uncached fields. This test illustrates why this check exists: https://github.com/FormidableLabs/urql/blob/eee828914fb6083b2aaa3203484ed6b36b7beff5/exchanges/graphcache/src/operations/query.test.ts#L225-L233

But when we use the API result as an input then prevData isn't just the data that Graphcache built up from its cache, it's also the data from the API result that's being reused. And when a list contains unrelated items with null fields then this check will instead cause this relation to become null although the cached data exists.

Another side-effect of this bug is that because API results are reused, some references/objects/instances are overwritten by Graphcache, which is bad. It should never mutate data it gets from the API, but just write it to its cache then create a new query result.

Set of changes

  • Add a reproduction test case
  • Fix some tests that should've warned us about this being a problem 🤦
  • Ensure that readSelection in read always gets a new, empty object to write to.

@kitten kitten requested a review from JoviDeCroock December 4, 2020 19:59
@changeset-bot
Copy link

changeset-bot bot commented Dec 4, 2020

🦋 Changeset detected

Latest commit: 554cbcc

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
@urql/exchange-graphcache Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@kitten kitten changed the title Fix/graphcache reused original data (graphcache) - Fix API results from being reused incorrectly for queries Dec 4, 2020
? readRoot(ctx, rootKey, rootSelect, data)
: readSelection(ctx, rootKey, rootSelect, data);
? readRoot(ctx, rootKey, rootSelect, input || ({} as Data))
: readSelection(ctx, rootKey, rootSelect, {} as Data);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the relevant line for review :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JoviDeCroock Merging as a hotfix; the review should be simple, but @sarmeyer has tested this and it works just fine 🙌

@kitten kitten merged commit 1168694 into main Dec 4, 2020
@kitten kitten deleted the fix/graphcache-reused-original-data branch December 4, 2020 20:54
@github-actions github-actions bot mentioned this pull request Dec 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant