Performance of mutating large collections: arrays vs objects #649

Closed
2 tasks done

Comments

@keriwarr

keriwarr commented Jul 30, 2020

🙋‍♂ Question

Hello!

When I first read the immer performance docs, it didn't occur to me that the choice of Array vs Object for storing data might have a large impact on performance.

I use immer to manage a redux store in which records are keyed by their ID inside of an object. This performs perfectly well when the number of records is in the low thousands. However, we are looking to push our application to support tens or possibly even hundreds of thousands of records. What I've found is that, for sufficiently large collections, updating a single record can take vastly longer if the collection is an object rather than an array.

  1. Is there any known workaround such that I can get array-like performance, but continue to use an object? (Or some other data structure that gives us O(1) retrieval by ID?)
  2. If there is no such workaround, would you accept a PR to the docs clarifying this?

Link to repro

https://codesandbox.io/s/immer-sandbox-g64y4

In this demo, I create an array collection and an object collection of todos (just like in the test:perf code) but each with size 200,000, and then measure the time it takes to update a single record within the collection.
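
For readers who cannot open the sandbox, a measurement of this kind might be sketched roughly as follows. This is not the code from the sandbox: the todo shape, the helper names (`buildCollections`, `timeUpdate`) and the plain-JS baseline updates are assumptions made for illustration. To reproduce the comparison discussed here, the two update callbacks would be replaced with calls to immer's `produce`.

```javascript
// Hypothetical todo shape; the actual sandbox may use different fields.
function buildCollections(size) {
  const array = [];
  const object = {};
  for (let i = 0; i < size; i++) {
    const todo = { id: String(i), todo: "todo " + i, done: false };
    array.push(todo);
    object[todo.id] = { ...todo }; // separate copy so the two collections share nothing
  }
  return { array, object };
}

// Runs `update` once and returns its result plus the elapsed time in milliseconds.
function timeUpdate(update) {
  const start = performance.now();
  const result = update();
  return { result, ms: performance.now() - start };
}

const { array, object } = buildCollections(200000);
const index = 100000;

// Plain-JS baseline for the array: copy the container, replace one record.
const arrayRun = timeUpdate(() => {
  const next = array.slice();
  next[index] = { ...array[index], done: true };
  return next;
});

// Plain-JS baseline for the object: spread the container, replace one record.
const objectRun = timeUpdate(() => ({
  ...object,
  [String(index)]: { ...object[String(index)], done: true },
}));

console.log("[ARRAY] vanilla: " + arrayRun.ms.toFixed(1) + "ms");
console.log("[OBJECT] vanilla: " + objectRun.ms.toFixed(1) + "ms");
```

Timings from a single run like this are noisy, so any real comparison would want several repetitions per case.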


Update: Maps seem to be about an order of magnitude faster than objects but still a few times slower than arrays (I tried out a few different collection sizes). However, I don't think that I would want to take the hit of using an unserializable data structure in redux.

https://codesandbox.io/s/immer-sandbox-09zyb
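
To illustrate the serialization concern with Maps: `JSON.stringify` does not include a Map's entries, so a store holding a Map would need an explicit conversion step wherever state is serialized. The following is only a sketch of that trade-off; the record shape and the helper names `serializeMap` and `deserializeMap` are hypothetical.

```javascript
// Hypothetical records keyed by id in a Map.
const todos = new Map([
  ["a", { id: "a", done: false }],
  ["b", { id: "b", done: true }],
]);

// JSON.stringify ignores Map entries, so the data is silently dropped.
const naive = JSON.stringify(todos);
console.log(naive); // prints {}

// Converting to an array of [key, value] pairs survives the round trip.
function serializeMap(map) {
  return JSON.stringify(Array.from(map.entries()));
}

function deserializeMap(json) {
  return new Map(JSON.parse(json));
}

const restored = deserializeMap(serializeMap(todos));
console.log(restored.get("b").done); // prints true
```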


Environment

We only accept questions against the latest Immer version.

  • Immer version: 7.0.7
  • Occurs with setUseProxies(true)
  • Occurs with setUseProxies(false) (ES5 only)

P.S. thanks for this awesome package <3

@mweststrate
Collaborator

Sorry, didn't really have time to look into it so far. Some things to check:

  1. Turning autofreeze off is no longer faster by default, and is actually slower in many cases (the docs are outdated in this regard). Keeping the state frozen is often faster in the long term, as immer can bail out sooner on frozen objects when looking for changes in the draft.
  2. If your data structure is so large that the immer overhead starts to count, you are always fighting an uphill battle, even if things can be improved (feel free to investigate where the surprising difference is coming from). I'd skip drafting for such a perf-critical large object and instead draft only the individual item you are about to update, then merge it back into the large collection using vanilla JS. For example, you could wrap only the sub-reducers in produce, without using it for the root reducer.
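
The second suggestion could be sketched roughly as below. This is one reading of it, not code from immer or from this thread: `updateRecordById` and `fallbackProduce` are hypothetical names, and the fallback exists only so the sketch runs without immer installed. In an application, immer's `produce` would be passed as the last argument.

```javascript
// Stand-in with the same call shape as immer's produce(base, recipe),
// used only so this sketch runs standalone. It makes a shallow copy,
// so it is only safe for recipes that change top-level fields.
function fallbackProduce(base, recipe) {
  const copy = { ...base };
  recipe(copy);
  return copy;
}

// Hypothetical helper: update one record in an id-keyed collection while
// drafting only that record instead of the whole collection.
function updateRecordById(byId, id, recipe, produceItem = fallbackProduce) {
  const current = byId[id];
  if (current === undefined) return byId; // unknown id: keep the same reference
  const next = produceItem(current, recipe); // draft just this one item
  if (next === current) return byId; // nothing changed
  return { ...byId, [id]: next }; // merge back with vanilla JS
}

const state = {
  a: { id: "a", done: false },
  b: { id: "b", done: false },
};
const nextState = updateRecordById(state, "a", (draft) => {
  draft.done = true;
});
console.log(nextState.a.done, state.a.done); // prints true false
```

With immer this would be called as `updateRecordById(state, "a", recipe, produce)`. The object spread still copies every key of the collection, so this removes the drafting overhead for the large object but not the cost of the copy itself.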

@keriwarr
Author

keriwarr commented Oct 7, 2020

Thanks Michel, I'll drop a comment here when I arrive at a solution I'm happy with.

mweststrate added a commit that referenced this issue Nov 17, 2020
BREAKING CHANGE: always freeze by default, even in production mode. Use `setAutoFreeze(process.env.NODE_ENV !== 'production')` for the old behavior. See #687 (comment) for the rationale. Fixes #649, #681, #687
@aleclarson
Member

🎉 This issue has been resolved in version 8.0.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

@bfelbo

bfelbo commented Oct 31, 2023

This is the top result on Google for Immer arrays vs. objects benchmarking so sharing info here in case it's useful for future readers.

Thanks @keriwarr for the benchmarking scripts. I just tried them with Immer 10 to see if there's still a performance difference between array and object. Seems like there is, but much reduced.

I just changed Immer to 10.0.3 and removed setUseProxies(true), which was failing. Your benchmark now produces:

[ARRAY] immer (proxy) - without autofreeze: 9ms 
[OBJECT] immer (proxy) - without autofreeze: 34ms

and

[OBJECT] immer (proxy) - without autofreeze: 7ms 
[OBJECT] immer (proxy) - with autofreeze: 15ms
[ARRAY] immer (proxy) - without autofreeze: 2ms 
[ARRAY] immer (proxy) - with autofreeze: 24ms

so it seems like it's still worth using arrays w/o autofreezing for large performance-critical applications that need Immer. Hope this is helpful for anyone else looking into Immer performance.
