Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dynamo UI should not block during concurrent save/run operations- attempt 1 #12564

Merged
merged 7 commits into from
Jan 21, 2022

Conversation

mjkkirschner
Copy link
Member

@mjkkirschner mjkkirschner commented Jan 20, 2022

Purpose

while investigating https://jira.autodesk.com/browse/DYN-4521 I found that the new ExternalFiles implementation causes a UI hang when when called while the graph is executing. (more specifically when the LiveRunner has a lock on its mutexObject field) This is commonly during graph executions.

This is attempt 1 using an existing simple bool field (HomeWorkspaceModel.executingTask) to track if a graph run is in progress. I'm not sure how robust a solution it is, and it's not covering all cases because a client could call some other EngineController or LiveRunner method that grabs this lock - though it will likely work for most cases, is simple, and works in my repro case.

This means that save will still occur during graph execution as before but ExternalFile ref calculation will be skipped if the graph is executing.

Also note I have not yet verified this is the same issue as in Alias. I have verified this fixes the hang in Alias. (well at least the one I experienced during testing).

I've also removed the HeartBeat thread as it appears it was only used for instrumentation and logging uptime to the registry which doesn't appear to be used anywhere.

To test this is quite simple, run a long running graph and hit the save icon.

Declarations

Check these if you believe they are true

  • The codebase is in a better state after this PR
  • Is documented according to the standards
  • The level of testing this PR includes is appropriate
  • User facing strings, if any, are extracted into *.resx files
  • All tests pass using the self-service CI.
  • Snapshot of UI changes, if any.
  • Changes to the API follow Semantic Versioning and are documented in the API Changes document.
  • This PR modifies some build requirements and the readme is updated

Release Notes

Fixes a UI hang when saving at the same time the graph is running.

@mjkkirschner
Copy link
Member Author

⚠️
This doesn't work in some situations - for example see the failing test - It looks like executingTask is set to true AFTER it's set to false, so the state is actually inverted. I think this is because the tests put the scheduler into synchronous mode (like Revit) - I don't think this solution will work well in those contexts.

Maybe it could be adapted with another bool, for example RunEnabled seems to be reset at the correct time... not sure how it will work with different thread scenarios though... leaning towards solution 2 for now.

@mjkkirschner
Copy link
Member Author

RunEnabled seems to work, waiting for tests.

@pinzart90
Copy link
Contributor

I would rather go with this option.
The second one could probably be made a bit more generic and we could try to look at other similar situations

Copy link
Contributor

@pinzart90 pinzart90 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mjkkirschner
Copy link
Member Author

this passed on the serial CI as well.

@mjkkirschner
Copy link
Member Author

spoke with @pinzart90 about this, given the short timeline we agreed that while this PR does not cover all cases, it's sufficient and it's better to have less changes to test.

@aparajit-pratap @sm6srw please prioritize this PR over attempt 2.

I am working to verify this fixes the reported alias hang.

Copy link
Contributor

@pinzart90 pinzart90 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mjkkirschner mjkkirschner merged commit f784c5e into DynamoDS:master Jan 21, 2022
mjkkirschner added a commit to mjkkirschner/Dynamo that referenced this pull request Jan 21, 2022
…empt 1 (DynamoDS#12564)

* attempt 1, this works but seems odd.

* switch executingTask for RunEnabled

* put external files back in wsm

* int to private

* remove usings

* revert change

* revert change

Co-authored-by: michael kirschner <[email protected]>
mjkkirschner added a commit that referenced this pull request Jan 21, 2022
* Dynamo UI should not block during concurrent save/run operations- attempt 1 (#12564)

* attempt 1, this works but seems odd.

* switch executingTask for RunEnabled

* put external files back in wsm

* int to private

* remove usings

* revert change

* revert change

Co-authored-by: michael kirschner <[email protected]>

* remove heartbeat

* remove heartbeat.cs

Co-authored-by: michael kirschner <[email protected]>
// If an execution is in progress we'll have to wait for it to be done before we can gather the
// external file references as this implementation relies on the output values of each node.
//instead just bail to avoid blocking the UI.
if (this is HomeWorkspaceModel homeWorkspaceModel && homeWorkspaceModel.RunSettings.RunEnabled)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If retrieving external files is skipped here due to a graph running, say when saving a graph, then what action will cause this computation to actually run again at a later time?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In other words, how do we ensure that this is scheduled to run after the graph finishes executing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

currently as merged - the user will need to save the graph again, backup save etc while the graph is not running for this computation to run.

We could likely schedule another save if this condition is encountered by using the scheduler. I have some hesitation about this though - if the user tried to manually save their graph as execution starts, and then we schedule another save for directly after - if execution somehow modifies the graph then I guess this could be considered destructive, the example I'm thinking of is element binding, but there could be others... will think about it.

It's also important to note that this issue is only present in sandbox builds (where UI and Scheduler thread are different).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants