Skip to content

Commit

Permalink
Editor : Fix deadlock caused by garbage collection of Settings
Browse files Browse the repository at this point in the history
The problematic sequence of operations was this :

1. Destroy Editor. But Settings node lives on, because it is a wrapped   RefCounted object and hence requires garbage collection.
2. Start unrelated BackgroundTask, which inadvertently triggers `IECore.RefCounted.collectGarbage()` on a background thread.
3. Settings node is destroyed on background thread by the garbage collection. All plugs are disconnected before destruction, including the `__scriptNode` plug.
4. Disconnections cause cancellation of background tasks associated with
the ScriptNode, via `BackgroundTask::cancelAffectedTasks()`. Although the Settings node has no parent, the ScriptNode
is still found due to the (about to be removed) connection to the
`__scriptNode` plug.
5. `BackgroundTask::cancelAndWait()` never returns, because it is being called from the task's own thread.
6. The UI thread then waits for the task to finish, and we have complete deadlock.

This is worked around by removing the `__scriptNode` plug connection on the main thread at the time the Editor is destroyed.

Why is this only happening now? Because we only introduced the Settings node and the `__scriptNode` plug mechanism recently in 830de76.

But we have always had lots of other Python-derived nodes that require garbage collection, so why weren't _they_ causing problems? Because when they are collected, they will have no parent, and the standard way of finding the ScriptNode for cancellation is to look for a ScriptNode ancestor. The special case using the `__scriptNode` plug only applies to the Settings node.

Longer term it would be good to come up with a better mechanism than the `__scriptNode` plug, but I think this is a sufficient workaround in the meantime.

Fixes #5877
  • Loading branch information
johnhaddon authored and murraystevenson committed Jun 14, 2024
1 parent e40be84 commit 5660eac
Show file tree
Hide file tree
Showing 3 changed files with 63 additions and 0 deletions.
5 changes: 5 additions & 0 deletions Changes.md
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,11 @@ Fixes

- Viewer : Fixed handling of Gaffer `${contextVariable}` references in OpenColorIO variable values. The Viewer now updates the Display Transform appropriately when the value of the context variable changes.

Fixes
-----

- UI : Fixed hangs caused by garbage collection of removed Editors. One common example involved viewing a Catalogue in the NodeEditor after removing the ImageInspector (#5877).

API
---

Expand Down
11 changes: 11 additions & 0 deletions python/GafferUI/Editor.py
Original file line number Diff line number Diff line change
Expand Up @@ -96,6 +96,17 @@ def __init__( self, topLevelWidget, scriptNode, **kw ) :

self.__setContextInternal( scriptNode.context(), callUpdate=False )

def __del__( self ) :

for attr in self.__dict__.values() :
if isinstance( attr, GafferUI.Editor.Settings ) :
# Remove connection to ScriptNode now, on the UI thread.
# Otherwise we risk deadlock if the Settings node gets garbage
# collected in a BackgroundTask, which would attempt
# cancellation of all tasks for the ScriptNode, including the
# task itself.
attr["__scriptNode"].setInput( None )

def scriptNode( self ) :

return self.__scriptNode
Expand Down
47 changes: 47 additions & 0 deletions python/GafferUITest/EditorTest.py
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,8 @@
import unittest
import weakref

import IECore

import Gaffer
import GafferTest
import GafferUI
Expand Down Expand Up @@ -110,5 +112,50 @@ def pythonEditorCreated( editor ) :
self.assertEqual( editorsCreated, [ e1, e2, e3 ] )
self.assertEqual( pythonEditorsCreated, [ e2 ] )

def testIssue5877( self ) :

# This test models the bug reported in #5877. First we have
# an editor that uses a `Settings` node.

script = Gaffer.ScriptNode()
script["node"] = GafferTest.MultiplyNode()

class TestEditor( GafferUI.Editor ) :

def __init__( self, scriptNode, **kw ) :

GafferUI.Editor.__init__( self, GafferUI.Label( "MyEditor" ), scriptNode )

self.__settingsNode = self.Settings( "MyEditor", scriptNode )

editor = TestEditor( script )

# Then we dispose of that editor.

del editor

# Now we run a perfectly innocent background task that just
# wants to compute something perfectly innocently.

def f( canceller ) :

script["node"]["product"].getValue()

# But of course, garbage collection can occur at any time.
# For the purposes of this test we force it to happen explicitly,
# but in the real world it would be triggered by the creation
# of a new wrapped RefCounted instance. In #5877 this was commonly
# an `_ImagesPath` created by the CatalogueUI in a background task
# updating the PathListingWidget for the image listing.
IECore.RefCounted.collectGarbage()

task = Gaffer.BackgroundTask( script["node"]["product"], f )

# This would have deadlocked when `collectGarbage()` triggered the
# destruction of the Settings node, because that would trigger
# cancellation of the BackgroundTask from the task itself. And a task
# waiting on itself would never finish.
task.wait()

if __name__ == "__main__":
unittest.main()

0 comments on commit 5660eac

Please sign in to comment.