Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance issues with large collections #1751

Closed
JAndritsch opened this issue May 20, 2016 · 23 comments
Closed

Performance issues with large collections #1751

JAndritsch opened this issue May 20, 2016 · 23 comments

Comments

@JAndritsch
Copy link

JAndritsch commented May 20, 2016

Update: Disregard. Performance with large data sets is not an issue. Read the closing remarks for more info.

I'm seeing a pretty huge performance hit when attempting to dispatch several updates to state when the state contains a very large data set.

Background

I'm developing an image slider that is composed of two separate components: the Slider and the Preview. The Slider renders a list of 10 small thumbnails and allows a user to rapidly scroll through them via keybindings. The Preview displays a larger and higher quality version of the currently active slide in the Slider. The Preview updates as the Slider's active slide changes because they both subscribe to the same piece of state: currentSlideIndex. In addition to this state, the Redux store contains the entire collection of images.

The state object looks something like this:

{
  currentSlideIndex: 0,
  images: [
    { _id: '1234', path: '/path/to/image.jpg', thumb: '/path/to/thumb.jpg' },
   ...etc
  ]
}

I have one parent component, Index, which houses both the Slider and Preview components. The Index component uses React Redux's connect function to map the two pieces of state to properties. These properties are then passed down to both the Slider and Preview components.

Here's a simple example:

class Index extends React.Component {
  render() {
    return (
      <div>
        <Preview {...this.props} />
        <Slider {...this.props} />
      </div>
    );
  }
}

const mapStateToProps = (state) => {
  return {
    images: state.images,
    currentSlideIndex: state.currentSlideIndex
  };
};

export default connect(mapStateToProps)(Index);

When a user changes the current slide, the Slider component dispatches an action to update the currentSlideIndex in the Redux store. This tells the Index component to re-render itself since currentSlideIndex changed, which means the Preview and Slider also re-render.

The problem

I've noticed as my collection of images grows, the performance of the Slider and Preview components degrade pretty dramatically.

At 500 images, things move pretty quickly. To illustrate how fast, imagine that each image represented a single frame in a movie. If I were to press and hold down the right arrow key, the Slider and Preview component would re-render fast enough that it would look like you're actually watching a movie. It's pretty sweet!

However, at 1500 images things begin to lag and get really choppy. Any number larger than that and things go from bad to worse. The store dispatches seem to lag considerably and the animation of the Preview and Slider updates goes from "movie-like" quality to "flip book with pages stuck together" quality.

Just for kicks, I decided to disconnect my Index component from the currentSlideIndex state and instead move that down into the Slider component as an internal piece of state. Instead of dispatching an action on keypress (which would update the Redux store), I changed it to run this.setState({ currentSlideIndex: newIndex }). Obviously keeping this as a piece of state that's not in the Redux store isn't ideal because the Preview component won't automatically update as the current index changes. But just for academic reasons, I decided to see what would happen.

The results are pretty interesting. After moving that piece of state out of the Redux store and into the Slider, I can now scroll seamlessly and with "movie-like" quality when my collection of images contains 7000+ items.

The conclusion/question

It seems that dispatching updates to a Redux store when the state contains a large collection of fairly-sized objects does not perform well. What's puzzling is that the piece of state I'm updating is entirely separate from the images collection. The only piece of data that's changing on store.dispatch() is currentSlideIndex (which is just a simple number), so I don't suspect that my reducer for the images state is at fault because it should just be a simple passthrough.

Is the behavior that I described expected or a known issue, or is there some fundamental thing I'm missing that could address this?

@markerikson
Copy link
Contributor

The first question is how you're rendering the children. Based on that snippet, I'm assuming that the Slider component is taking in the entire array of image objects, and passing an entire image object to each child. That's probably not going to be very performant overall. Also, I'm guessing that you're actually rendering an Image component for every image object in that array, even though most of them wouldn't be visible. (Please note that I'm making a bunch of assumptions here - pointers to actual code would help me be a lot more sure).

Generally speaking, software runs faster when you 1) do less work per thing, and 2) do work for fewer things overall. For the first, Recent perf tests have shown that an optimized pattern for managing lots of children is to have the parent only pass an ID to each child as a prop, and have each child itself be connected and request its own data. That way the parent only re-renders when the number of children changes, and each child only re-renders when its own data changes. The reduction in work per item drastically outweighs the extra work being done in each connected component's mapStateToProps. For the second (and again, I'm making assumptions about behavior), can you try to only render children that are actually in the viewport? That seems like it would also really cut down on the amount of work.

Beyond that, I'll point you to my list of React/Redux perf articles, and in particular, this recent slideset: http://somebody32.github.io/high-performance-redux/, which happens to cover a very similar type of scenario.

Ah... the crossed-out advice is what I get for not reading the issue completely before jumping in to writing a response. That said, the suggestion to connect each individual child still applies.

You will probably need to normalize that state shape. Image entries should be in an object keyed by ID, with an array of image IDs alongside that to represent ordering.

And finally: when it comes to performance, MEASURE MEASURE MEASURE! :) Use the React Perf Tools, browser profiling, maybe some of the component update monitoring or other logging tools listed at https://github.com/markerikson/redux-ecosystem-links/blob/master/devtools.md , etc. Get an idea where the bottleneck actually is.

@JAndritsch
Copy link
Author

Hello and thank you for you quick response! Let me address a few of your questions/assumptions to help clear things up.

The first question is how you're rendering the children. Based on that snippet, I'm assuming that the Slider component is taking in the entire array of image objects, and passing an entire image object to each child.

Actually, there are no child components of the Slider. I'm just slicing the array of images and creating a 10-item viewport of plain <img> tags. Here's a rough sketch of the Slider component:

class Slider extends React.Component {
  constructor() {
    super();
    this.handleKeydown = this.handleKeydown.bind(this);
    this.goToSlide = this.goToSlide.bind(this);
  }

  componentDidMount() {
    this.setupKeybindings();
  }

  componentWillUnmount() {
    this.removeKeybindings();
  }

  goToSlide(index) {
    // This just calls store.dispatch, passing the new index to the Slider reducer so it can update state.
    SliderActions.setCurrentIndex(index);
  }

  makeSlides(start, end) {
    let self = this;
    start = start || 0;
    end = end || 10;
    return this.props.images.map((image, index) => {
      image.index = index;
      return image;
    }).slice(start, end).map((image) => {
      return (
        <div className="slide" key={image._id}>
          <div className="file-thumbnail">
            <img
              src={'file://' + image.thumbnailPath}
              onClick={function(e) { e.preventDefault(); self.goToSlide(image.index); }}
              />
          </div>
        </div>
      );
    });
  }

  setupKeybindings() {
    this.container.addEventListener('keydown', this.handleKeydown);
  }

  removeKeybindings() {
    this.container.removeEventListener('keydown', this.handleKeydown);
  }

  handleKeydown(e) {
    let newIndex;
    let keyCode = e.which;

    switch (keyCode) {
      case left:
        newIndex = Math.max(this.props.currentSlideIndex - 1, 0);
        this.goToSlide(newIndex);
        break;
      case right:
        newIndex = Math.min(this.props.currentSlideIndex + 1, this.props.images.length - 1);
        this.goToSlide(newIndex);
        break;
      default:
    }
  }

  render() {
    let self = this;
    return (
      <div
        id="slider"
        className="file-list"
        ref={function(node) { self.container = node; }}
        tabIndex="0">
        {this.makeSlides(this.props.currentSlideIndex, this.props.currentSlideIndex + 10)}
      </div>
    );
  }
}

export default Slider;

Next, here is the basic outline of the Preview component:

class FilePreview extends React.Component {
  shouldComponentUpdate(nextProps) {
    return nextProps.currentSlideIndex !== this.props.currentSlideIndex;
  }

  currentFile() {
    return this.props.images[this.props.currentSlideIndex];
  }

  render() {
    let imagePreview;
    let currentFile = this.currentFile();

    imagePreview = <div></div>;
    if (currentFile) {
      imagePreview = (
        <div className="image-preview-container">
          <img
            id="preview"
            className="preview"
            src={'file://' + currentFile.thumbnailPath}
            />
        </div>
      );
    }
    return (
      <div>
        {imagePreview}
      </div>
    );
  }
}

export default FilePreview;

And finally, just to be complete, here is the altered Slider component after I change it to update currentSlideIndex as a local piece of state rather than publishing the update to the Redux store via store.dispatch():

class Slider extends React.Component {
  constructor() {
    super();
    this.handleKeydown = this.handleKeydown.bind(this);
    this.goToSlide = this.goToSlide.bind(this);

    // No longer receiveing this as a prop from Index. Still getting images from index via props, 
    // however.
    this.state = {
      currentSlideIndex: 0
    };
  }

  componentDidMount() {
    this.setupKeybindings();
  }

  componentWillUnmount() {
    this.removeKeybindings();
  }

  goToSlide(index) {
    this.setState({ currentSlideIndex: index });
    // SliderActions.setCurrentIndex(index);
  }

  makeSlides(start, end) {
    let self = this;
    start = start || 0;
    end = end || 10;
    return this.props.images.map((image, index) => {
      image.index = index;
      return image;
    }).slice(start, end).map((image) => {
      return (
        <div className="slide" key={image._id}>
          <div className="file-thumbnail">
            <img
              src={'file://' + image.thumbnailPath}
              onClick={function(e) { e.preventDefault(); self.goToSlide(image.index); }}
              />
          </div>
        </div>
      );
    });
  }

  setupKeybindings() {
    this.container.addEventListener('keydown', this.handleKeydown);
  }

  removeKeybindings() {
    this.container.removeEventListener('keydown', this.handleKeydown);
  }

  handleKeydown(e) {
    let newIndex;
    let keyCode = e.which;

    switch (keyCode) {
      case left:
        newIndex = Math.max(this.state.currentSlideIndex - 1, 0);
        this.goToSlide(newIndex);
        break;
      case right:
        newIndex = Math.min(this.state.currentSlideIndex + 1, this.props.images.length - 1);
        this.goToSlide(newIndex);
        break;
      default:
    }
  }

  render() {
    let self = this;
    return (
      <div
        id="slider"
        className="file-list"
        ref={function(node) { self.container = node; }}
        tabIndex="0">
        {this.makeSlides(this.state.currentSlideIndex, this.state.currentSlideIndex + 10)}
      </div>
    );
  }
}

export default Slider;

Again, even though though this component is still being given the entire 1500 (or 7000) item collection, it re-renders blazingly fast when calling this.setState as opposed to invoking store.dispatch.

You will probably need to normalize that state shape. Image entries should be in an object keyed by ID, with an array of image IDs alongside that to represent ordering.

The shape of each image in state is normalized (for the most part). Each image object contains a unique ID and has roughly the same attributes (minus a few fields of exif metadata).

And finally: when it comes to performance, MEASURE MEASURE MEASURE! :) Use the React Perf Tools, browser profiling, maybe some of the component update monitoring or other logging tools listed at https://github.com/markerikson/redux-ecosystem-links/blob/master/devtools.md , etc. Get an idea where the bottleneck actually is.

Based on my observations, it seems that the bottleneck is store.dispatch(). Regardless of what data I send through, calling that method with a state tree that contains a large collection is abysmally slow.

I'll take a look at some of those tools you suggested so I can throw a bit more science at this problem and hopefully come back with some data that may help us understand what's happening.

@JAndritsch
Copy link
Author

JAndritsch commented May 20, 2016

Well, just ran the Chrome profiler and took a look at where the browser was spending a good chunk of time. Are you ready for this (because this is ridiculous)?

PropType validation!

One of the pieces of data I left out of my previous snippets was this:

Slider.propTypes = {
  images: React.PropTypes.arrayOf(
    React.PropTypes.shape({
      thumbnailPath: React.PropTypes.string
    })
  ).isRequired,
  currentSlideIndex: React.PropTypes.number.isRequired
};

After removing that, all the performance sadness I was seeing disappeared. I suspect this is due to the props being passed down from the Index component to the Slider component every time store.dispatch() is invoked. That would explain why changing to this.setState made it seem like things were faster.

I think this warrants a re-quote of your previous suggestion:

And finally: when it comes to performance, MEASURE MEASURE MEASURE! :)

Closing this issue due to PEBKAC. Thanks again for your help!

@JAndritsch
Copy link
Author

Here is the profiler results...for science!

screen shot 2016-05-20 at 5 45 07 am

@nvartolomei
Copy link

@JAndritsch don't forget about this: Note that for performance reasons propTypes is only checked in development mode.

From Prop Validation | Reusable Components

@JAndritsch
Copy link
Author

@JAndritsch don't forget about this: Note that for performance reasons propTypes is only checked in development mode.

Man, this issue and all the hours that got sucked into performance tuning is the result of skimming documentation instead of fully reading it.

I can't brain. I have the dumbs.

@markerikson
Copy link
Contributor

markerikson commented May 20, 2016

Glad you got that figured out. For what it's worth, I think you can make a few more improvements if you need more speed:

  • The state as you showed it is not completely normalized. Ideally, you'd want something like :
{
    images : {
        byId : {
            "1234" : { _id: '1234', path: '/path/to/image1234.jpg', thumb: '/path/to/thumb1234.jpg' },
            "1235" : { _id: '1235', path: '/path/to/image1235.jpg', thumb: '/path/to/thumb1235.jpg' },
        },
        allImages : ["1234", "1235"],
    },
}
  • Next, if you're only ever showing N images anyway, there's no reason to pass the entire images array into your Slider. Do the slicing in your mapStateToProps. That way, even if PropType validation is taking a while, it's taking a constant amount of time and not growing O(n). Note that if you do that, you'd likely be returning a new array reference from MSTP every time even if the index hasn't changed, so you might want to memoize the operation using a Reselect-based selector function.
  • Third, per my earlier comment, what you could try is breaking it down further. As a rough sketch:
const imageMapState = (state, ownProps) => ({image : state.images.byId[ownProps.imageId]})
const SliderImage = (props) => {
    const {image, index, goToSlide} = props;
    return (
        <div className="slide">
            <div className="file-thumbnail">
                <img
                    src={'file://' + image.thumbnailPath}
                    onClick={e => { 
                        e.preventDefault(); 
                        goToSlide(props.index); 
                    }}
                />
            </div>
        </div>
    );
}

const ConnectedSliderImage = connect(imageMapState)(SliderImage);

const sliderMapState = (state, ownProps) => ({
    const start = state.currentSlideIndex || 0;
    const size = ownProps.size || 10;
    const end = start + size;

    return {
        images: state.images.allImages.slice(start, end),
        currentSlideIndex: state.currentSlideIndex,
        start,
        end,
    };
})

class Slider extends Component {
    // snip
    constructor() {
        super();
        this.goToSlide = this.goToSlide.bind(this);
    }

    goToSlide(index) {
        SliderActions.setCurrentIndex(index);
    }

    render() {
        const {images, currentSlideIndex} = this.props;

        const renderedImages = images.map( (imageId, imageIndex) => (
            <ConnectedSliderImage 
                key={imageId} 
                imageId={imageId} 
                index={currentSlideIndex + imageIndex}
                goToSlide={this.goToSlide}
            />
        ));

        return (
            <div
                id="slider"
                className="file-list"
                ref={function(node) { self.container = node; }}
                tabIndex="0">
                {renderedImages}
            </div>
        );
    }
}

And finally:

I can't brain. I have the dumbs.

Naw :) You had a legitimate perf issue, you just needed to dig a bit further into what was going on. And I'll definitely admit that PropTypes validation wasn't actually anything I was aware could be expensive, either, so I've learned something out of this discussion.

@JAndritsch
Copy link
Author

JAndritsch commented May 20, 2016

Those are excellent suggestions! I especially dig the idea of restructuring state such that images are indexed via database id. That would eliminate the findIndex logic I've got in my reducer and potentially speed things up.

I appreciate the advice. Thanks again!

@gaearon
Copy link
Contributor

gaearon commented May 20, 2016

FAQ-worthy thread.

@markerikson
Copy link
Contributor

Yeah. Also we're at the point where a "Performance Optimizations" recipe page would be useful. Going to be pretty far back in my queue, though.

@JAndritsch
Copy link
Author

Yeah. Also we're at the point where a "Performance Optimizations" recipe page would be useful. Going to be pretty far back in my queue, though.

That would be awesomesauce. I'm definitely willing to help out if needed, although I'm no guru on Redux and performance. Let me know if there's something I can do.

@markerikson
Copy link
Contributor

Well, the general idea would be to comb through several of the better articles and discussions on Redux performance, and pull some of that info into a page that would go into the official docs. The "High Performance Redux" article linked above is a great starting point, as is Dan's answer in http://stackoverflow.com/questions/37264415/how-to-optimize-small-updates-to-props-of-nested-component-in-react-redux/ .

@iam-peekay
Copy link

@markerikson you mentioned that

"Recent perf tests have shown that an optimized pattern for managing lots of children is to have the parent only pass an ID to each child as a prop, and have each child itself be connected and request its own data."

I haven't seen this pattern before. Is this something that you tested and saw better results with?

@markerikson
Copy link
Contributor

@iam-peekay : I happened to start using that pattern myself in my own prototype independently, but more to the point, Dan used it to optimize the Redux vs MobX TodoMVC benchmark that Michael Weststrate did recently. The links I've posted in this discussion demonstrate that approach.

@gaearon
Copy link
Contributor

gaearon commented May 22, 2016

@iam-peekay

These links describe the approach:

@iam-peekay
Copy link

@markerikson @gaearon thank you! super helpful

@JAndritsch
Copy link
Author

JAndritsch commented May 22, 2016

@markerikson @gaearon Would there be any performance degradation as the number of connected components goes up? I'm trying to understand if I should be connecting container components and then passing the properties to the child components that need them or just connecting the individual child components instead.

Edit: I just read the SO post.

It’s better to have more granular connect() on several components in your view hierarchy that each only listen to a relevant slice of the state.

FAQContent.push(this)

@markerikson
Copy link
Contributor

Yeah, technically speaking, notification of subscribers is very much O(n). Not sure if I've seen any specific benchmarks on how expensive that step actually is. I would expect the primary concern to be how expensive calling mapStateToProps is for each subscriber.

That said, given the samples we've seen of the 10K connected list item case, that definitely still seems to be cheaper than re-rendering and diffing all the React components themselves.

I'd say that fewer subscribers and somewhat chunkier props is probably easier to start with, and switch to more subscribers and just passing item IDs if you definitely see you need better perf.

@JAndritsch
Copy link
Author

Hello again!

In an effort to help get that FAQ written I have created a short list of things one could do to help improve performance when using React and Redux. Please let me know if there's anything you would add (or remove) from this list.

  1. Connect components to state at the lowest level possible to prevent cascading renders from parent to children.
  2. Normalize your data and pass collections of IDs instead of collections of large objects to components.
  3. Use ImmutableJS.
  4. Take advantage of "initialProps".
  5. Re-define shouldComponentUpdate() when necessary.
  6. Use React.addons.Perf to find and eliminate wasted renders.
  7. Use Chrome's built-in profiling tools.

Once I get a good list of topics collected then I'll work on explanations and examples for each.

@markerikson
Copy link
Contributor

Looks like a reasonable start. Go ahead and create an issue to track this work. Thanks!

@gaearon
Copy link
Contributor

gaearon commented May 27, 2016

I would add memoization (e.g. Reselect) and per-instance memoized selectors. They are mentioned in http://redux.js.org/docs/recipes/ComputingDerivedData.html.

I also think it’s worth explaining how to think about performance in general. That subscriptions are O(n), that React Redux shallowly compares props, that memoization helps us avoid recalculations, etc.

This was referenced May 30, 2016
@markerikson
Copy link
Contributor

FAQContent.push(this)

Reviewing this thread while doing FAQ updates. As a highly belated note, I am compelled to point out that this should really be FAQContent.concat(thread751). Your example is both mutative and inserting a class reference into the store :)

@raRaRa
Copy link

raRaRa commented Oct 15, 2018

I have a question about performance with large data set and Reselect. Is there a good way to avoid having the Reselect compute (filter, sort, etc.) through all the items every time a new item is added? Because the selector gets both byId and allIds, which are updated every time a new item is added.

My concern is that I have over 10k items, and I'm worried that every time a new item is added, it will cause a slight hiccup in the rendering of the slideshow component. E.g. if it's playing a video, or transitioning between slides.

Before using Redux, I simply filtered out items as they came in, but that meant my data wasn't pure and a browser refresh was required if the slideshow settings were changed, e.g. a different start date was set, or the sort order changed, etc.

Thanks.

EDIT: When testing my selector with 500k items, the selector will take approx 821ms. So whenever a new item is added, I'll experience a hiccup of almost a second.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants