Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve typing performance by splitting attributes in state tree. #12312

Merged
merged 2 commits into from
Dec 20, 2018

Conversation

sgomes
Copy link
Contributor

@sgomes sgomes commented Nov 26, 2018

Description

Attributes have been moved to attributesByClientId in order to reduce the impact of typing throughout the state tree (See #11782).

Together with this, several selectors have been rewritten to take advantage of this separation by depending only on byClientId, which is now a less expensive branch of the state tree.

How has this been tested?

Unit tests and manual verification.

Performance has been tested by recording the quick typing of a short word in a large document and measuring the time taken handling each input event.

Screenshots

Not applicable.

Checklist:

  • My code is tested.
  • My code follows the WordPress code style.
  • My code follows the accessibility standards.
  • My code has proper inline documentation.

@sgomes sgomes added [Status] In Progress Tracking issues with work in progress [Type] Performance Related to performance efforts [Package] Editor /packages/editor User Experience (UX) labels Nov 26, 2018
@sgomes sgomes requested a review from youknowriad November 26, 2018 11:55
@youknowriad youknowriad requested a review from a team November 26, 2018 11:55
@@ -61,12 +62,10 @@ export function AlignmentToolbar( { isCollapsed, value, onChange, alignmentContr
);
}

const getClientIdOnly = memize( ( clientId ) => ( { clientId } ) );
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of using memize here I wonder if we should do a shallow comparison in withBlockEditContext instead as it would benefit more components. (similar to how we do in withSelect)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll remove this optimization for now; it's distracting from the main purpose of the PR anyway.

[ blockId ]: { attributes: flattenedBlocks[ blockId ].attributes },
} ), {} );
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A nice little refactoring was suggested here by @noisysocks #12268 (comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great idea! Done.

...state,
[ action.clientId ]: {
...state[ action.clientId ],
...omit( action.updates, 'attributes' ),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my PR, I did a little optimization here where I avoid generating a new state if the only key in action.updates is attributes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea! Added that in.


case 'UPDATE_BLOCK':
// Ignore updates if block isn't known
if ( ! state[ action.clientId ] ) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should avoid generating a new state here if there's no attributes updates in action.updates

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, another useful early return.

...state,
[ action.clientId ]: {
...state[ action.clientId ],
attributes: { ...action.updates.attributes },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The attributes key is redundant here, we can just remove it and update the state to be { [ clientID ]: attributes } instead of { [ clientID ]: { attributes } }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opted for maintaining the attributes key as a way of having the same general shape in both byClientId and attributesByClientId, but I'm not married to that choice.

I'm happy to change that if you feel strongly about it, but do let me know, as it will involve many changes to the test files.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a blocker for me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opted for maintaining the attributes key as a way of having the same general shape in both byClientId and attributesByClientId, but I'm not married to that choice.

But they're not meant to be the same sort of thing. The former is a block object, the latter (by its name) is exclusively attributes.

What else are we planning to include here? The spread of ...state[ action.clientId ] seems to imply that there might be other non-attribute properties.

Given the name attributesByClientId, I would assume the contents of the object are in-fact attributes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries, I'll make the change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in latest commit.

@@ -439,7 +440,7 @@ export function isEditedPostSaveable( state ) {
* @return {boolean} Whether post has content.
*/
export function isEditedPostEmpty( state ) {
const blocks = getBlocksForSerialization( state );
const blocks = getBlocksForSerializationWithoutAttributes( state );
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For more clarity, maybe this could be refactored to return just the blockNames : getBlocksNamesForSerialization

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed. Also removed the export.

*
* @return {Object} Parsed block object.
*/
export const getBlockWithoutAttributes = createSelector(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not certain why we need all these new selectors getBlockWithoutAttributes getBlocksWithoutAttributes, getBlocksByClientIdWithoutAttributes. I'd like to avoid exposing new selectors in this PR. Is this only for getBlocksForSerializationWithoutAttributes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It gets used directly or indirectly in a bunch of other selectors (such as getBlockNamesForSerialization, getInserterItems, etc.) , but you're right that we don't need to export these new selectors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latest commit removes the new selectors, in exchange for the approach of using the same full selectors everywhere, but changing the dependencies where possible. This way, they won't invalidate on attribute changes, but that's ok if the attributes are not being used.

@@ -1944,7 +2085,7 @@ export const getInserterItems = createSelector(
},
( state, rootClientId ) => [
state.blockListSettings[ rootClientId ],
state.editor.present.blocks,
state.editor.present.blocks.byClientId,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we probably need .order here as well right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I'm not certain

Copy link
Contributor Author

@sgomes sgomes Nov 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I can tell, block order should not affect inserter items, but I could be wrong. Is there someone with more domain-specific knowledge of inserter items that could weigh in?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nested blocks might have some impact on that, but as noted in the other PR, I would love us to explore whether we can skip it here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getInserterItems() relies on .order in two ways:

  1. It uses getClientIdsWithDescendants() which in turn uses .order. This is to set an inserter item's disabled attribute which is true when a block has already been used.
  2. It uses canIncludeReusableBlockInInserter() which in turn uses isAncestorOf() which in turn uses .order This is to stop reusable column blocks from being inserted into themselves which causes an infinite loop.

(1) should be safe to ignore because inserting or removing a block modifies .byClientId as well. (2) I'm less sure about—I think it's safe to ignore because one cannot create that infinite loop scenario by only re-ordering blocks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not too concerned with the performance implications of depending on .order, from what I've seen, so I'll add in that dependency just to be on the safe side. If this were to produce any issues due to the intricate dependencies you're describing, it could get extremely difficult to debug, so it's not worth the risk.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so I'll add in that dependency just to be on the safe side.

It still needs to be added?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had forgotten to push my changes.

@sgomes sgomes force-pushed the update/improve-typing-performance branch 3 times, most recently from a26dce0 to de80620 Compare November 26, 2018 12:56
@sgomes sgomes removed the [Status] In Progress Tracking issues with work in progress label Nov 26, 2018
@sgomes
Copy link
Contributor Author

sgomes commented Nov 26, 2018

All tests should now pass, and the code should be ready for a full review.

Addressed all existing comments and fixed everything except for removing the nesting of attributes inside attributesByClientId. @youknowriad: if you'd still like me to go ahead with that let me know, and I'll create a new commit in this PR for that.

@sgomes
Copy link
Contributor Author

sgomes commented Nov 26, 2018

With the current set of fixes, performance is improved by responding to keyboard input events ~17% faster.

@youknowriad
Copy link
Contributor

Unit tests are passing but the e2e tests are not. I'll see if I can find why.

@sgomes sgomes force-pushed the update/improve-typing-performance branch from 25f39ef to 85ff9fb Compare November 26, 2018 15:05
getBlockDependantsCacheBust( state, clientId ),
state.editor.present.blocks.order[ clientId ],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reasoning here? I think we're depending on order in general and not only order[ clientId ] because the nesting in this state is only one level deep.

Copy link
Contributor Author

@sgomes sgomes Nov 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless I'm mistaken, since getBlockDependantsCacheBustcalls getBlock for each child, we'll recursively depend on the entire set of IDs. Therefore, each getBlock only needs to worry about its own direct children, where it comes to order.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless I'm mistaken, since getBlockDependantsCacheBustcalls getBlock for each child, we'll recursively depend on the entire set of IDs. Therefore, each getBlock only needs to worry about its own direct children, where it comes to order.

Isn't this already encompassed by including getBlockDependantsCacheBust as a dependant? i.e. this line shouldn't be necesary at all?

Copy link
Contributor Author

@sgomes sgomes Nov 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, it's redundant. Removed.

( state ) => [
state.editor.present.blocks,
]
( state, rootClientId ) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should always be state.editor.present.blocks because even if we have a rootClientId we still want to access the attributes and content of the nested blocks. I suspect this change is the culprit breaking the e2e tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, right, because of the nesting! Excellent catch!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modified this to reuse the dependants of getBlock if a rootClientId is provided.

state.editor.present.edits.meta,
state.initialEdits.meta,
state.currentPost.meta,
state.editor.present.blocks,
...mapClientIds( clientIds, ( clientId ) => getBlock( state, clientId ) ),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need to call getBlock here`. I suspect this harm performance more than it helps.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in general the dependencies should be left as simple as possible. Even if state.editor.present.blocks changes often, it's probably more performant to run the selector again instead of doing the less-performant memoization on each call. I have a similar concern for getMultiSelectedBlocks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem. If this affects things too much we can always change it back.

...state,
[ action.clientId ]: {
...state[ action.clientId ],
...omit( action.updates, 'attributes' ),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This value is already assigned above as changes. Avoiding a second omit would be ideal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

return state;
},

attributesByClientId( state = {}, action ) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bikeshedding: The naming of the properties of blocks already doesn't feel very consistent, and I'm worried this moves us further in that direction. If order is also keyed by client IDs, why not name it orderByClientId ? Or conversely, name this one just attributes ? What are the contents byClientId anyways, particularly relevant to the proposed changes of pulling some values out. It makes me wonder if we'd be better served by granularity on all of the properties, rather than some ambiguous "byClientId" block-ish object shape, i.e.

{
	blocks: {
		isValid, # Object<string, boolean>
		name, # Object<string, string>
		attributes, # Object<string, Object>
		order, # Object<string, Array>
	}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is probably a good idea :) each property is mapped by ClientID, making it easier to memoize selectors based on the properties used. Maybe better as a follow-up to keep the size of this PR contained?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe better as a follow-up to keep the size of this PR contained?

I raised because I think the proposed changes shine light on the fact that we might be better served by considering all of these block values in the same way (hash keyed by client ID), which could lead us to some solution for generalizing the reducer implementation rather than copy/paste-ing large parts of it across as we're doing here. I'd worry we'd continue this pattern in some possible future refactor if set as a precedent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to make the change either in this PR or in a follow-up one. Let me know the final decision (when and which of the proposed conventions to go for) and I'll make it happen :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's try to do it here then.

Copy link
Member

@aduth aduth Nov 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a more general comment, following the proposed approach would essentially encode the various properties of a block as separate branches in the state tree. The benefit is performance and separation, as discussed, but this comes at the cost of more difficulty in adding more properties to block in the future (or changing them), I would say.

In my mind, I'd hoped it'd be some way we could programmatically generate the objects (reducers) based on some set of expected keys (isValid, name, attributes, and I suppose originalContent and any others), and that by doing it generically we would avoid any incurred cost in adding additional properties in the future, since it'd just be a key to add to the set.

This would assume that the logic for handling each key is consistent, which I'm not entirely confident is something we can assume (at least for attributes?). In which case, the refactoring may be more difficult and thus potentially not worthwhile.

The big issue for me was in evaluating this part of state, how would you or anyone else answer the prompt "What is represented in the value of byClientId?" It's making less sense over time as we pull out separate properties to be able to call it as anything remotely resembling a block object.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More practically speaking, in my mind I'd had in mind some higher-order reducer akin to onSubKey, specific to handling the reducer actions relevant for blocks, on each block key.

So something roughly like:

BLOCK_KEYS = [ 'name', 'isValid', 'originalContent', 'attributes' /* ? */ ]
blocks: combineReducers( BLOCK_KEYS.map( ( key ) => [ key, onBlockSubKey( key ) ] ) ) )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, that may be doable with a few exceptions here and there. I'll take a look!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If order is also keyed by client IDs, why not name it orderByClientId ? Or conversely, name this one just attributes ?

Even if we're deferring a greater refactor, I'm still inclined to align these two. Thoughts on renaming to just attributes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, no problem! I have no strong opinions there. Added a commit with the change.

@@ -754,6 +766,11 @@ export const getGlobalBlockCount = createSelector(
]
);

const mapClientIds = ( clientIds, fn ) => map(
castArray( clientIds ),
( clientId ) => fn( clientId )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: This line could be just fn ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method doesn't exist anymore.

@@ -754,6 +766,11 @@ export const getGlobalBlockCount = createSelector(
]
);

const mapClientIds = ( clientIds, fn ) => map(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not loving a randomly-placed, un-documented, un-tested utility function located here. Thinking either placement in a utilities file, or placement at top of function to have some arrangement of selectors vs. non-selectors. Is it just handling the coercion of castArray?

I don't see a ton of value in pulling this out as a function vs. just inlining.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed, no longer needed.

@@ -1140,7 +1156,7 @@ const isAncestorOf = createSelector(
return possibleAncestorId === idToCheck;
},
( state ) => [
state.editor.present.blocks,
state.editor.present.blocks.order,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these revisions relevant to the proposed change? Or are these things which are technically separate issues?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, these changes are one of the important changes in this PR as we don't invalidate these selectors unless the inserted blocks (order) change instead of any attribute change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see it would have an important impact on the overall benefit of the memoization; but is it specifically related to pulling out attributes to a separate state path?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, you're right, this one in particular is not directly related.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy enough to create a separate PR for this, but it doesn't seem like it would be outlandish to keep it in this PR either.

@@ -273,6 +300,48 @@ const withBlockReset = ( reducer ) => ( state, action ) => {
return reducer( state, action );
};

/**
* Higher-order reducer which targets the combined blocks reducer and handles
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does "handles" entail? It's not immediately obvious to me as a reviewer now, let alone a future hypothetical maintainer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify a bit: This HoR only updates the attributes of blocks not the "byClientId" part but we need to access "byClientId" to know the name of the blocks as we only want to change the core/block blocks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add a clarification that this HoR exists because it needs access to both branches of the state tree simultaneously.

newState.attributesByClientId = {};

Object.keys( state.byClientId ).forEach( ( blockId ) => {
const block = { ...state.byClientId[ blockId ] };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we create shallow clones?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good question. This must have stayed behind as I was figuring out the reducer. Sorry about that. Fixed!

@sgomes
Copy link
Contributor Author

sgomes commented Nov 26, 2018

Current status: all tests pass, improvement vs. master (at branching point) at 20%.

Copy link
Member

@noisysocks noisysocks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is working well in my testing. Nice work! 😍

I left some mostly stylistic comments.

newState.attributesByClientId = {};

Object.keys( state.byClientId ).forEach( ( blockId ) => {
const block = newState.byClientId[ blockId ];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: _.forEach could simplify this a little as it passes the key to the given callback.

forEach( state.byClientId, ( block, blockId ) => {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I don't think I had ever used lodash's forEach before.

* @return {Function} Enhanced reducer function.
*/
const withSaveReusableBlock = ( reducer ) => ( state, action ) => {
let newState = state;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this variable necessary? We could re-assign state, instead. That would make it a little clearer below on lines 327 and 328 that state.byClientId[ blockId ] is the same object as newState.byClientId[ blockId ].

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, absolutely right. If we're just reassigning, we don't need the extra variable.


Object.keys( state.byClientId ).forEach( ( blockId ) => {
const block = newState.byClientId[ blockId ];
let attributes = { ...state.attributesByClientId[ blockId ] };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't need to clone this object both here and then again on line 333.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed as part of using mapValues.

newState = { ...state };
newState.attributesByClientId = {};

Object.keys( state.byClientId ).forEach( ( blockId ) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Naming this param clientId would be more consistent with how we refer to these strings elsewhere in Gutenberg.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

return state;
}

newState = { ...state };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm honest, I'm having a hard time keeping track of the object cloning and mutation in this function. Are we optimising for performance here? SAVE_REUSABLE_BLOCK_SUCCESS is dispatched relatively rarely, so I'm not sure that optimising this is worth sacrificing readability.

Could we use mapValues, as before?

newState.attributesByClientId = mapValues( newState.attributesByClientId, ( attributes, clientId ) => {
	const { name } = newState.byClientId[ clientId ];
	if ( name === 'core/block' && attributes.ref === id ) {
		return {
			...attributes,
			ref: updatedId,
		};
	}

	return attributes;
} );

Copy link
Contributor Author

@sgomes sgomes Nov 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I over-engineered this reducer as I was figuring out what it was doing, and never went back to see if I could rewrite it in a simpler way. Thanks for that, it's much better! 👍

if ( nextAttributes === state[ action.clientId ].attributes ) {
// Do nothing if only attributes change.
const changes = omit( action.updates, 'attributes' );
if ( keys( changes ).length === 0 ) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Could use _.isEmpty here.

if ( isEmpty( changes ) ) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -1069,6 +1078,15 @@ export const getMultiSelectedBlockClientIds = createSelector(
],
);

const mapMultiSelectedBlockClientIds = ( state, fn ) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the benefit of moving this to its own helper function?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We generally prefer to avoid using abbreviations when naming variables in Gutenberg as they aren't all that descriptive for future maintainers. Something like callback or iteratee would be a better name for fn here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no advantage anymore. This was being used in multiple places, but after other changes, that's no longer the case. Good catch! Moving into getMultiSelectedBlocks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving into getMultiSelectedBlocks.

It still needs to be moved?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like I had forgotten to push my changes.

@@ -1944,7 +2085,7 @@ export const getInserterItems = createSelector(
},
( state, rootClientId ) => [
state.blockListSettings[ rootClientId ],
state.editor.present.blocks,
state.editor.present.blocks.byClientId,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getInserterItems() relies on .order in two ways:

  1. It uses getClientIdsWithDescendants() which in turn uses .order. This is to set an inserter item's disabled attribute which is true when a block has already been used.
  2. It uses canIncludeReusableBlockInInserter() which in turn uses isAncestorOf() which in turn uses .order This is to stop reusable column blocks from being inserted into themselves which causes an infinite loop.

(1) should be safe to ignore because inserting or removing a block modifies .byClientId as well. (2) I'm less sure about—I think it's safe to ignore because one cannot create that infinite loop scenario by only re-ordering blocks.

@sgomes
Copy link
Contributor Author

sgomes commented Nov 27, 2018

Thanks for the review, @noisysocks ! I made the changes you suggested and will now explore separating all block keys into their own branch (not just attributes), as suggested by @aduth.

@sgomes
Copy link
Contributor Author

sgomes commented Nov 27, 2018

I've successfully implemented the changes suggested by @aduth in my local machine, but as a result, performance has absolutely tanked.

I'll spend some time tomorrow trying to figure out why, but in the interest of keeping the PR in a good state, I'm not submitting those changes until I figure out the root cause and whether it's fixable.

@aduth
Copy link
Member

aduth commented Nov 27, 2018

@sgomes Could you put the contents of git diff | pbcopy into a Gist so that others could potentially evaluate in the meantime?

@sgomes
Copy link
Contributor Author

sgomes commented Nov 28, 2018

@sgomes Could you put the contents of git diff | pbcopy into a Gist so that others could potentially evaluate in the meantime?

Here's the WIP diff: https://gist.github.com/sgomes/1b38919a07ba02a0712e390fa6b263cc

@sgomes
Copy link
Contributor Author

sgomes commented Nov 28, 2018

I spent all of yesterday implementing and validating the proposed changes by @aduth, and a considerable amount of time today trying to understand why they don't perform as expected.

My conclusion so far is that getBlock isn't memoizing as expected, but I can't figure out why, as the items that are added to the dependants array stay equal by reference, as far as I can tell.

My suggestion is to review the PR as it is, and leave the proposed changes for later if needed for clarity, which I believe was the main argument. Changes of this nature are very well contained, affecting only the reducers, the selectors, and their corresponding tests, so it doesn't seem to me like too much of an issue to change things twice if needed.

@sgomes
Copy link
Contributor Author

sgomes commented Nov 28, 2018

It appears that rememo doesn't handle native-value (e.g. string, boolean) dependants the way I expected it would; that is, by directly comparing them with their past values, taking into account the getDependant arguments.

It's not clear to me whether this is a bug in rememo or an intentional design decision, but the fact that it tries to record the lastDependents (but does so apparently regardless of the getDependant arguments) leads me to believe it might be a bug. I haven't fully grokked the library code, though, so I could be wrong.

@aduth
Copy link
Member

aduth commented Nov 28, 2018

It appears that rememo doesn't handle native-value (e.g. string, boolean) dependants the way I expected it would; that is, by directly comparing them with their past values, taking into account the dependant function arguments.

Would you be able to put together a minimal example demonstrating the issue? I'm not sure I follow what's unexpected.

FYI, the rememo package is my own, so I'd be glad to help clarify or resolve any issues.

There is some distinct treatment (more an optimization) for objects and arrays as dependants, since these can be used as keys of a WeakMap to avoid explicitly invalidating a cache value.

@sgomes
Copy link
Contributor Author

sgomes commented Nov 28, 2018

Would you be able to put together a minimal example demonstrating the issue? I'm not sure I follow what's unexpected.

No problem, here you go: https://codepen.io/anon/pen/zMmPJo?editors=0011

Let me know if there's anything unclear about the example!

@aduth
Copy link
Member

aduth commented Nov 28, 2018

Thanks for the example @sgomes . On further evaluation, I think this is working as intended, though I would agree that it is quite confusing, and is one of the reasons I was reluctant to provide arguments to the getDependants function (in fact, the very earliest implementation upon which it was inspired did not pass them).

The result of getDependants is effectively a statement of "if any of these values change, then clear the cache". "Clear the cache" in a naive case can mean the entire cache for the selector, but it was "enhanced" (arguably in retrospect 😄 ) to limit the cache invalidation when the dependants are all object-like. I think this explains the difference in behavior you're seeing for object vs. primitive types.

See also: aduth/rememo#1

@sgomes
Copy link
Contributor Author

sgomes commented Nov 28, 2018

I see. That's... problematic. 😅 As far as I can tell that mostly invalidates the state structure you suggested, since it would effectively not be memoizable per-clientId, unless I'm missing something? At least without custom memoization.

blocks: {
  name: { A: 'foo', B: 'bar' },
  attributes: { A: {...}, B: {...} },
  isValid: { A: true, B: true },
  ...
  order: [ ... ]
}

I mean, from what you're saying, per-clientId memoization in getBlock currently only works because all of the dependants in there happen to be non-primitive types, correct?

FWIW, wp-calypso has a simple createSelector library that achieves memoization based on the dependant function parameters, by generating a cache key based on a simple join of the parameters. The restriction there becomes that the parameters should all be primitive types, but that's usually not an issue.

In any case, this all makes me wonder more whether it's possible to keep this PR as-is for the performance benefits, and look into how to achieve the perfect state structure later? 😁

@aduth
Copy link
Member

aduth commented Nov 28, 2018

I mean, from what you're saying, per-clientId memoization in getBlock currently only works because all of the dependants in there happen to be non-primitive types, correct?

Yes, that's correct.

As far as I can tell that mostly invalidates the state structure you suggested, since it would effectively not be memoizable per-clientId, unless I'm missing something? At least without custom memoization.

I suppose this is true, yes.

FWIW, wp-calypso has a simple createSelector library that achieves memoization based on the dependant function parameters, by generating a cache key based on a simple join of the parameters. The restriction there becomes that the parameters should all be primitive types, but that's usually not an issue.

It still destroys the entire cache if dependants references change:

https://github.com/Automattic/wp-calypso/blob/77f67b116e45d1c1cb75e50aae77b01a34fa39c1/client/lib/create-selector/index.js#L102-L104

This is where Calypso's treeSelect is meant to serve to have more granular cache invalidation:

https://github.com/Automattic/wp-calypso/tree/master/client/lib/tree-select
Automattic/wp-calypso#20547

rememo is effectively implemented as an amalgamation of the two, adopting optimizations like those found in treeSelect when it detects being possible.

In any case, this all makes me wonder more whether it's possible to keep this PR as-is for the performance benefits, and look into how to achieve the perfect state structure later? 😁

I'll plan to give this another pass over, but yes, I'd generally be agreeable to a more immediate resolution.

*
* @param {Array} blocks Blocks to flatten.
* @param {*} blocks Blocks to flatten.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was the type changed to a wildcard * ? It's an Array.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's odd, I don't remember changing that. I need to check my IDE settings for JSDoc. Good catch, thanks!

*
* @param {Array} blocks Blocks to flatten.
* @param {*} blocks Blocks to flatten.
* @param {*} transform Transforming function to be applied to each block.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a wildcard *, it's a Function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@@ -1069,6 +1078,15 @@ export const getMultiSelectedBlockClientIds = createSelector(
],
);

const mapMultiSelectedBlockClientIds = ( state, fn ) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving into getMultiSelectedBlocks.

It still needs to be moved?

@@ -1944,7 +2085,7 @@ export const getInserterItems = createSelector(
},
( state, rootClientId ) => [
state.blockListSettings[ rootClientId ],
state.editor.present.blocks,
state.editor.present.blocks.byClientId,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so I'll add in that dependency just to be on the safe side.

It still needs to be added?

return state;
},

attributesByClientId( state = {}, action ) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If order is also keyed by client IDs, why not name it orderByClientId ? Or conversely, name this one just attributes ?

Even if we're deferring a greater refactor, I'm still inclined to align these two. Thoughts on renaming to just attributes?

getBlockDependantsCacheBust( state, clientId ),
state.editor.present.blocks.order[ clientId ],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless I'm mistaken, since getBlockDependantsCacheBustcalls getBlock for each child, we'll recursively depend on the entire set of IDs. Therefore, each getBlock only needs to worry about its own direct children, where it comes to order.

Isn't this already encompassed by including getBlockDependantsCacheBust as a dependant? i.e. this line shouldn't be necesary at all?

];
}
return [
getBlock.getDependants( state, rootClientId ),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is wrong because it will produce a nested array. getDependants returns a newly-generated array every time it's called, so the dependants will always be different. This should either (a) spread ... into the return array or (b) be returned directly as the return value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent catch! Silly mistake; fixed.

// Skip update if nothing has been changed. The reference will
// match the original block if `reduce` had no changed values.
if ( nextAttributes === state[ action.clientId ].attributes ) {
// Do nothing if only attributes change.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does / should this do if the update is for innerBlocks?

Copy link
Contributor Author

@sgomes sgomes Nov 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it valid semantics to modify inner blocks with an UPDATE_BLOCK instead of MOVE_* and other actions? I assumed not, since that possibility isn't covered in the existing code. Or rather, I assumed no special attention would need to be paid to it. Should I add a check to discard any innerBlocks changes?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it valid semantics to modify inner blocks with an UPDATE_BLOCK instead of MOVE_* and other actions?

It's a downside of the "update" action being so generic, but hypothetically it should be able to expect receiving any properties of a block object (the value returned by createBlock).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I think it's something we should respect, we already don't have consideration for a changed innerBlocks from the order subtree, so I think it'd be fine to defer.

};

case 'REPLACE_BLOCKS':
if ( ! action.blocks ) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this was inherited, but looking at it, it seems quite wrong. We call replaceBlocks with an empty array when, for example, removing a paragraph:

// If before content is omitted, treat as intent to delete block.
onReplace( [] );

If I'm reading this right, this behavior would leave lingering state for the removed block?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An empty array is truthy, so it seems to me that this should work correctly, unless I'm missing something?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An empty array is truthy, so it seems to me that this should work correctly, unless I'm missing something?

That's a good point. Makes me wonder why the condition exists at all then 🤷‍♂️

@sgomes
Copy link
Contributor Author

sgomes commented Nov 29, 2018

@aduth I believe I addressed all comments, even though the GitHub UI makes it look otherwise. Let me know if I missed anything!

@sgomes
Copy link
Contributor Author

sgomes commented Nov 29, 2018

E2E tests seem to be consistently timing out. Not sure if this reflects a problem in the code or merely CI issues.

@sgomes
Copy link
Contributor Author

sgomes commented Nov 29, 2018

This time the measured improvement is ~26%.

@aduth
Copy link
Member

aduth commented Dec 3, 2018

There were end-to-end failures across the board last week which have since been resolved. They should be cleared up if you rebase your branch or merge the latest master.

In the meantime, I'll give this another review pass.

@aduth
Copy link
Member

aduth commented Dec 3, 2018

I think there's some reconciliation needed between this and the since-merged #11899, which had introduced a new getBlockAttributes selector.

@sgomes sgomes force-pushed the update/improve-typing-performance branch from 3fbd342 to 0a2fc82 Compare December 4, 2018 11:33
@sgomes
Copy link
Contributor Author

sgomes commented Dec 4, 2018

Squashed all commits to simplify rebasing. Working on that now.

Attributes have been moved to `attributesByClientId` in order to
reduce the impact of typing throughout the state tree.
See #11782.

Review fixes pass #1

Simplify block flattening code in reducers.

Remove alignment toolbar optimization; should be a different PR

Fix minor bug in test

Fix failing getBlockDependantsCacheBust test

Remove new `*WithoutAttributes` selectors.

We'll go with a different approach: use the existing selectors, but
keep the dependencies as they are. The attributes may get stale, but
it doesn't matter if they're not being used.

Change attributesById structure to not have an inner attributes

Simplifying some selector dependencies

Simplify withSaveReusableBlock a bit further and clarify its existence

Reuse constant instead of running omit again

Remove no longer needed mapClientIds

Simplifying reducers after review comments.

Further changes to selectors after review

Fix types in JSDoc

Renaming attributesByClientId to attributes

Further selector fixes after review
@sgomes sgomes force-pushed the update/improve-typing-performance branch from 0a2fc82 to 99f8a2b Compare December 4, 2018 11:51
@sgomes
Copy link
Contributor Author

sgomes commented Dec 4, 2018

Rebased on master and made to work well with getBlockAttributes.

@ellatrix
Copy link
Member

ellatrix commented Dec 4, 2018

@sgomes Will this help with garbage collection at random function calls that I'm seeing in master, sometimes multiple times per key press, other times after several key presses?

Sometimes this adds up to over 14MB and I've seen this to take over 130ms.

screen shot 2018-12-04 at 12 33 57

screen shot 2018-12-04 at 13 01 10

@sgomes
Copy link
Contributor Author

sgomes commented Dec 4, 2018

@iseulde It's possible, but that's hard to answer definitively; garbage collection is somewhat opaque and hard to trace to what's taking place in memory exactly. This PR does reduce the amount of work done when typing (particularly where it comes to selectors), but it's difficult to say whether that translates into any benefits in terms of fewer reallocations taking place.

My advice would be to measure the PR branch versus master several times and see if there seems to be a statistical difference between the two.

},
};

case 'UPDATE_BLOCK_ATTRIBUTES':
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kinda makes me wish we'd normalized UPDATE_BLOCK_ATTRIBUTES to use UPDATE_BLOCK, but it's also not something I think we ought to be addressing here.

state.editor.present.blocks,
]
( state, rootClientId ) => {
if ( ! rootClientId ) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the prior noted observation about dependants caching not occurring per arguments set, I'm not sure this is a change we should want to make. Is it even relevant to the changes, or could it at least be considered separately?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing like measuring :) I couldn't find any meaningful difference between the two, so I reverted back to the simpler code as per your suggestion.

Copy link
Member

@aduth aduth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good 👍 Let's hold off on merging until the code freeze is released.

@SchneiderSam
Copy link

Technically I have no idea, but I wonder if it doesn't make sense to add this PR to the 5.0.1 roadmap, especially since this bug should be solved with 5.0.1:

screenshot_9

If that has nothing to do with it: Sorry ;-)

You are great! Thank you very much for your great work!!!

@aduth
Copy link
Member

aduth commented Dec 11, 2018

Given it is higher risk than other related performance effort, it's been slated for 4.8 (WP 5.0.2) rather than the more-immediate 5.0.1.

@youknowriad youknowriad merged commit 4f6fdbd into master Dec 20, 2018
@youknowriad youknowriad deleted the update/improve-typing-performance branch December 20, 2018 08:34
@youknowriad
Copy link
Contributor

Awesome work on this PR @sgomes Thanks.

youknowriad pushed a commit that referenced this pull request Jan 3, 2019
…2312)

* Improve typing performance by splitting attributes in state tree.

Attributes have been moved to `attributesByClientId` in order to
reduce the impact of typing throughout the state tree.
See #11782.

Review fixes pass #1

Simplify block flattening code in reducers.

Remove alignment toolbar optimization; should be a different PR

Fix minor bug in test

Fix failing getBlockDependantsCacheBust test

Remove new `*WithoutAttributes` selectors.

We'll go with a different approach: use the existing selectors, but
keep the dependencies as they are. The attributes may get stale, but
it doesn't matter if they're not being used.

Change attributesById structure to not have an inner attributes

Simplifying some selector dependencies

Simplify withSaveReusableBlock a bit further and clarify its existence

Reuse constant instead of running omit again

Remove no longer needed mapClientIds

Simplifying reducers after review comments.

Further changes to selectors after review

Fix types in JSDoc

Renaming attributesByClientId to attributes

Further selector fixes after review

* Revert changes to getBlocks dependants
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Package] Editor /packages/editor [Type] Performance Related to performance efforts
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants