Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recycling heuristics have false positives and false negatives #1653

Closed
pygy opened this issue Feb 21, 2017 · 24 comments · Fixed by #2122
Closed

Recycling heuristics have false positives and false negatives #1653

pygy opened this issue Feb 21, 2017 · 24 comments · Fixed by #2122
Assignees
Labels
Type: Bug For bugs and any other unexpected breakage

Comments

@pygy
Copy link
Member

pygy commented Feb 21, 2017

Currently, vnodes are added to a recycling pool if:

  1. they are not a component (typeof vnode.tag === "string")
  2. they are not a fragment or trusted (vnode.domSize == null)
  3. there are no hooks that can modify vnode.dom in vnode.attrs, at the time of removal

This excludes nodes that could be recycled (fragments and components), and includes nodes that may have had problematic vnode.attrs in the past.

These heuristics are only applied to vnode that are detached (those where onbeforeremove can fire). However, once a vnode has joined the pool, its children are recycled as well even if they don't meet those criteria, thus possibly recycling DOM nodes that have been modified in the hooks.

Proposed solution:

  • Add a vnode.reuse = true field.
  • Set it to false for trusted nodes and when a problematic hook fires (by modifying initLifecycle() and friends)
  • Only add nodes that have vnode.reuse to the pool;
  • In updateNodes(), when recycling === true, verify that vnode.reuse === true as well.

AFAICT these heuristics would allow to recycle components, fragments and don't have the holes described above.

@dead-claudia dead-claudia added the Type: Breaking Change For any feature request or suggestion that could reasonably break existing code label Feb 22, 2017
@pygy
Copy link
Member Author

pygy commented Feb 28, 2017

@isiahmeadows I'm a bit confused by the tag, AFAICT, the only thing the proposal would break is buggy behavior.

Unless you refer to the addition of a field to vnode?

@dead-claudia dead-claudia added Type: Enhancement For any feature request or suggestion that isn't a bug fix and removed Type: Breaking Change For any feature request or suggestion that could reasonably break existing code labels Feb 28, 2017
@dead-claudia
Copy link
Member

dead-claudia commented Feb 28, 2017

@pygy I swapped the labels. While I was on the bus, I had a similar idea myself, as a basis for other optimizations. (I'll detail them later in a gist.) It was mostly the part regarding changing pooling, though.

I have a better idea, though, although it's a little more breaking in some places:

  • Add a new optional vnode.state.pooled boolean. This can even be specified as a property on object components, so you don't need oninit to specify it when you're using oncreate and onupdate.
  • Copy this to vnode.pooled after init, documented as read-only.
  • Check vnode.pooled when recycling.
  • If vnode.state.pooled is undefined, keep the current default.

That way, we are using a constant that we can track for our own purposes, and we can keep our appropriate default. Note that performance-wise, it makes no difference.

@dead-claudia
Copy link
Member

Note that we shouldn't pool custom elements by default, since they aren't purely host objects (they do usually execute JavaScript).

@pygy
Copy link
Member Author

pygy commented Feb 28, 2017

I'd add [bug] as well... False negatives in the heuristics for adding an object to the pool lead to slower diff with correct semantics, but false positives are bugs...

vnode.pooled, when non-null, would take precedence on vnode.reuse?
We can skip the vnode.state.pooled transfer, and use a function decorator:

function pooled(status, vnode) {
  vnode.pooled = status
  return vnode
}

Or a pooled(vnode)/ nonPooled(vnode) pair that sets the field to true/false, respectively.

If we get it right , nonPooled should be superfluous.

And good catch re. custom elements.

@dead-claudia
Copy link
Member

dead-claudia commented Feb 28, 2017 via email

@dead-claudia
Copy link
Member

dead-claudia commented Feb 28, 2017 via email

@pygy
Copy link
Member Author

pygy commented Feb 28, 2017

👎 for adding another reserved name on vnode.state (the hooks are sufficient as gotchas)...

I don't understand why you'd want pooled or reuse to be read-only... If we want my original proposal to work, it has to be set to false by the engine when problematic situations are encountered (hooks that allow access to the vnode, etc...). Or do you mean that user code should not mess with it after initialization (in other words, that it is owned by the engine... another 👎 from me if it is on the state, which is the only user-owned object on vnodes)?

@dead-claudia
Copy link
Member

dead-claudia commented Mar 1, 2017

My other idea is this:

Make vnode.reuse a nullable boolean (i.e. boolean, null or undefined). You can then set appropriate defaults about vnode.reuse when you're ready to collect. Of course, that'll result in a boxed boolean, but I'm not sure whether it'll require an allocation or not. (Most engines use special constants for null, undefined, true, and false, but I don't know whether engines specialize for nullable booleans because of this.)

@pygy
Copy link
Member Author

pygy commented Mar 1, 2017

So it would default to null, users could set it to either true or false, and the engine would only mutate it if it were null.

Nodes that have either reuse === null or true would be recycled, whereas nodes with false wouldn't?

What we could have is a pair of boolean fields, one for the engine, and one for users.

The engine flag (reuse) would behave as described in the original post (i.e. be as conservative as needed with the info it has). The user flag (pooled) would default to false which would let the engine do its thing. If set to true the node would be recycled, even if internal heuristics have flagged the node as non-recyclable.

i.e. the logic would be in removeNode: if (vnode.pooled || vnode.reuse) {/*add to the pool*/}. Likewise in updateNode, we'd have if (recycling && (vnode.pooled || old.reuse)) {/*recycle here*/}

@dead-claudia
Copy link
Member

@pygy I'd like to clarify:

  • vnode.reuse is initially set to undefined.
  • oninit is called.
  • Things happen.
  • onremove is called, and the element removed.
  • If vnode.reuse is true or no oncreate/etc. hooks are defined and the node isn't a custom element, reuse the element
  • If vnode.reuse is false, any oncreate/etc. hooks are defined, or the node is a custom element, don't reuse it.

So what would happen is a single reuse: undefined field is added to the vnode (minimal memory hit), and at the end, the recycle check would look closer to this:

function shouldReuse(vnode) {
    if (vnode.reuse != null) return !!vnode.reuse
    if (vnode.dom.tagName.indexOf("-") >= 0 && !isSpecialNode(vnode.dom)) return false
    if (typeof vnode.state.oncreate === "function") return false
    if (typeof vnode.state.onupdate === "function") return false
    // etc...
    return true
}

// Helper for brevity
function isSpecialNode(node) {
    if (node.tagName === "color-profile") {
        return node.namespaceURI === "http://www.w3.org/2000/svg"
    } else if (node.tagName === "annotation-xml") {
        return node.namespaceURI === "http://www.w3.org/1998/Math/MathML"
    } else {
        return false
    }
}

@pygy
Copy link
Member Author

pygy commented Mar 1, 2017

You can't rely on the presence of hooks at removal time. The attrs hooks (and now even the state hooks) may be added/removed at any time. You must set reuse to false when the hooks are called/scheduled (this is all detailed in the original post).

@dead-claudia
Copy link
Member

Oh, good point. We can set the default after oninit is run, then.

@pygy
Copy link
Member Author

pygy commented Mar 2, 2017

I think we can get away with a single, non-nullable boolean field on vnode.

If the engine sets vnode.reuse to false before calling the hooks, the hooks can set it back to true for the pool to process.

	function initLifecycle(source, vnode, hooks) {
		if (typeof source.oninit === "function") source.oninit.call(vnode.state, vnode)
		if (typeof source.oncreate === "function") hooks.push(source.oncreate.bind(vnode.state, vnode))
	}

Would become

	function initLifecycle(source, vnode, hooks) {
		if (typeof source.oninit === "function") source.oninit.call(vnode.state, vnode)
		if (typeof source.oncreate === "function") {
			vnode.reuse = false
			hooks.push(source.oncreate.bind(vnode.state, vnode))
		}
	}

etc...

This would allow the hooks that only read from vnode.dom to mark themselves as 'safe'.

@dead-claudia
Copy link
Member

@pygy Eh...

I still would rather defer resolving the default as late as possible, even if it's via object member. Just in case it's set outside of oncreate and friends.

@pygy pygy self-assigned this Mar 3, 2017
@pygy pygy added the Type: Bug For bugs and any other unexpected breakage label Mar 3, 2017
@pygy
Copy link
Member Author

pygy commented Apr 2, 2017

@lhorie I'm trying to make sense of isRecyclable()

function isRecyclable(old, vnodes) {
	if (old.pool != null && Math.abs(old.pool.length - vnodes.length) <= Math.abs(old.length - vnodes.length)) {
		var oldChildrenLength = old[0] && old[0].children && old[0].children.length || 0
		var poolChildrenLength = old.pool[0] && old.pool[0].children && old.pool[0].children.length || 0
		var vnodesChildrenLength = vnodes[0] && vnodes[0].children && vnodes[0].children.length || 0
		if (Math.abs(poolChildrenLength - vnodesChildrenLength) <= Math.abs(oldChildrenLength - vnodesChildrenLength)) {
			return true
		}
	}
	return false
}

In plain English, it returns true when the new vnode list is closer in length to the pool than to the old vnode list (and if the same goes for the children arrays of the first vnode in each of them).

Simplified:

function test(old, vnodes, pool) {
  return Math.abs(pool - vnodes) <= Math.abs(old - vnodes)
}
test(1, 1, 1) // true
test(0, 1, 1) // true, used in most tests
test(0, 1, 4) // false
test(0, 3, 4) // true

I don't understand the reason for this. It means that a very large pool will be ignored, for example. I also don't understand why it looks at the children of each first element.

o("element nodes are not recycled when the pool is much larger than the new/old vnode lists (non-keyed)", function(){
	var initial = [{tag: "div"}, {tag: "div"}, {tag: "div"}, {tag: "div"}]
	var empty = []
	var next = {tag: "div"}

	render(root, initial)
	render(root, empty)

	o(empty.pool.length).equals(4)
	empty.pool.forEach(function(vnode){vnode.dom.wasRecycled = true})

	render(root, [next])

	o(next.dom.wasRecycled).equals(undefined) // passes, I wish it didn't
})
o("here recycling works:", function(){
	var initial = [{tag: "div"}, {tag: "div"}, {tag: "div"}, {tag: "div"}]
	var empty = []
	var next = {tag: "div"}

	render(root, initial)
	render(root, empty)

	o(empty.pool.length).equals(4)
	empty.pool.forEach(function(vnode){vnode.dom.wasRecycled = true})

	render(root, [next, {tag: "div"}, {tag: "div"}])

	o(next.dom.wasRecycled).equals(true)
})

Edit: couldn't we use something along these lines?

old.pool != null && (vnodes[0].key != null ? true : vnodes.length > old.length) (with more null checks)

@pygy
Copy link
Member Author

pygy commented Apr 2, 2017

75b45a6#diff-000bdfae56251b0715111d2b18c9de3cL349 implements the last suggestion and seems harmless.

@pygy pygy added Type: Bug For bugs and any other unexpected breakage and removed Type: Bug For bugs and any other unexpected breakage Type: Enhancement For any feature request or suggestion that isn't a bug fix labels Apr 4, 2017
pygy added a commit to pygy/mithril.js that referenced this issue Dec 5, 2017
pygy added a commit to pygy/mithril.js that referenced this issue Dec 5, 2017
pygy added a commit to pygy/mithril.js that referenced this issue Dec 5, 2017
pygy added a commit to pygy/mithril.js that referenced this issue Dec 5, 2017
pygy added a commit to pygy/mithril.js that referenced this issue Dec 5, 2017
pygy added a commit to pygy/mithril.js that referenced this issue Dec 9, 2017
@CarlLeth
Copy link

I'm finding it very easy to get into a state where mithril will try to recreate entire portions of the DOM even when literally nothing in the vnode tree has changed. I'm just starting to delve into why but it's clear it has something to do with recycling false positives, possibly due to pools sticking around on vnodes when they shouldn't. It seems like once a vnode is ever recycled, its DOM gets recreated every
render cycle whether it's been touched or not.

I applied pygy's diff to the current npm version (1.1.6) and it did not fix the issue.

I'll post here again if I learn more about what's causing it.

@dead-claudia
Copy link
Member

@CarlLeth If you have code to repro your issue, that'd be perfect. 👍

@CarlLeth
Copy link

CarlLeth commented Jan 24, 2018

Ok I tried to trim down my scenario to the bare bones and I was able to recreate the issue. You'll need to put a breakpoint in the removeNode function to see it happening, since measuring it in-place with an onremove event stops the issue from happening. That could actually be a temporary workaround when problem nodes are identified.

It happens when columns in a table shift around. To see the issue, add a column, then remove the first column. Calls to redraw will hit removeNode several times despite no difference in the vnode tree. Note that "Keyed Table" is the thing I'm actually trying to draw; everything else is there to help diagnose the issue.

https://jsfiddle.net/9myxqwha/2/

Of course it's not actually clear that mithril is recreating any DOM here, but in my situation I think it's responsible for at least a forced reflow.

@dead-claudia
Copy link
Member

That smells like the old tree is being retained when it shouldn't, or something - it's seeing old keys, not new ones.

@CarlLeth
Copy link

I'm not persisting vnodes between render cycles. Just data. The keys are just integers. What makes a key "old"? Shouldn't the vnode corresponding to the same data have the same key?

@pygy
Copy link
Member Author

pygy commented Jan 24, 2018

I'll look into this tomorrow, thanks for the repro @CarlLeth.

@dead-claudia
Copy link
Member

@CarlLeth What I'm seeing is that it's trying to remove the previously removed nodes twice - to take a row of that table, it's seeing [{key: 0}, {key: 3}] when it should be seeing [{key: 3}] on the subsequent redraw.

@pygy Forewarning: this is going to be a beast to test. You'll need to add the raw DOM node to the parent DOM node manually after it gets removed from m.render, then check for its existence after redrawing with the same tree (lacking it). (I hate ugly hacks in tests...)

@pygy
Copy link
Member Author

pygy commented Jan 25, 2018

This is actually fixed in the next branch. The issue is that the pool is merged at the end of the old node list, and pooled nodes are then re-removed then put back in the pool (they were in v1.1.6, at least, in next the onremove phase is correctly skipped for the nodes in the pool).

here are a reduced example: v1.1.6, next.

Keyed nodes can leak memory when recycled, the pool is disabled for them by default in next (hence the node.reuse=true).

Edit: updated the samples.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: Bug For bugs and any other unexpected breakage
Projects
None yet
3 participants