Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HYDRA-1457 : Fix UsdStageMap::allStages to not rely on the internal cache when it can be outdated #4128

Open
wants to merge 5 commits into
base: dev
Choose a base branch
from

Conversation

debloip-adsk
Copy link
Collaborator

This fix is notably required in order to get some UFE camera changes working in Maya.

@debloip-adsk debloip-adsk self-assigned this Feb 20, 2025
Copy link
Collaborator

@pierrebai-adsk pierrebai-adsk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IDK if there can be perf implication of calling getAllNames(), but we'll see.

@seando-adsk
Copy link
Collaborator

IDK if there can be perf implication of calling getAllNames(), but we'll see.

I think there is which is one of the reasons we are using the stage cache here. @debloip-adsk It seems to me like the "real" problem could be a missing call to setDirty() to make sure that when allStages() is called it will rebuild the cache first.

Copy link
Collaborator

@ppt-adsk ppt-adsk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any performance implications for this change? How often is this run?

@seando-adsk
Copy link
Collaborator

Are there any performance implications for this change? How often is this run?

That is kind of hard to say. Its available from a python function called getAllStages() and I've seen it being called in a few code places, such as from SceneSegmentHandler::findUsdGatewayItems() and the proxyRenderDelegate.

@debloip-adsk
Copy link
Collaborator Author

debloip-adsk commented Feb 21, 2025

The core problem I'm seeing here is : when/from where would we set the cache dirty? Once the node rename happens, there is a UFE notification and a Maya notification sent, but Maya UFE camera code receives the UFE notification first, and then immediately queries MayaUsd's UsdStageMap through Ufe::SceneSegmentHandler::findGatewayItems before it has received the Maya node rename notification. So MayaUsd has no idea any change has happened before it is queried for updated info. Even if we added support in MayaUsd's UsdStageMap for handling UFE notifications, we'd still depend on the notification order, which doesn't seem super robust either. Is there a way to set the cache dirty through Maya/UFE?

@ppt-adsk
Copy link
Collaborator

The core problem I'm seeing here is : when/from where would we set the cache dirty? Once the node rename happens, there is a UFE notification and a Maya notification sent, but Maya UFE camera code receives the UFE notification first, and then immediately queries MayaUsd's UsdStageMap through Ufe::SceneSegmentHandler::findGatewayItems before it has received the Maya node rename notification. So MayaUsd has no idea any change has happened before it is queried for updated info. Even if we added support in MayaUsd's UsdStageMap for handling UFE notifications, we'd still depend on the notification order, which doesn't seem super robust either. Is there a way to set the cache dirty through Maya/UFE?

Relying on notification order is very, very bad.

  1. Is there a way we can avoid Maya calling Ufe::SceneSegmentHandler::findGatewayItems() on rename, but simply marking it dirty in the caller so that the next time the information is needed a call to Ufe::SceneSegmentHandler::findGatewayItems() will be made? This is not great.
  2. Is there a way somehow in Maya to make sure the UFE callback is somehow first in the notification order? This may be very difficult to achieve.

Ran out of ideas at this point :) If there is any way we can convert the reliance on callbacks in this code to a fixed call order, we should explore that. If changes to Maya are acceptable, maybe there is a way to insert some kind of pre-callback callback :( to "solve" the problem (really push the problem back a little bit and hope that's sufficient).

@pierrebai-adsk
Copy link
Collaborator

Are the UFE camera problems inside MayaUSD itself?

Can that code detect the failure mode and manually call UsdStageMap::setDirty and try again? Or are the failures indistinguishable from normal operations?

@debloip-adsk
Copy link
Collaborator Author

Is there a way we can avoid Maya calling Ufe::SceneSegmentHandler::findGatewayItems() on rename, but simply marking it dirty in the caller so that the next time the information is needed a call to Ufe::SceneSegmentHandler::findGatewayItems() will be made? This is not great.

I think there is, but it would fix the problem only for this specific case, with further changes or new code we might run into the same issue again.

Is there a way somehow in Maya to make sure the UFE callback is somehow first in the notification order? This may be very difficult to achieve.

This doesn't seem super easy, as from the looks of it we'd likely need to make changes to the core Maya system, which seems pretty risky

Are the UFE camera problems inside MayaUSD itself?

The problem is a result of the interaction between both the Maya UFE camera system and MayaUsd's UFE implementation, where Maya queries MayaUsd but MayaUsd returns incorrect data due to relying on a notification order that is not the one happening.

Can that code detect the failure mode and manually call UsdStageMap::setDirty and try again? Or are the failures indistinguishable from normal operations?

The MayaUsd code can't really detect the failure mode without first querying for the update state of things, which is what the current change is already doing, so there wouldn't be much benefit.

I ran some rough performance measurements of the allStages method before and after the change to test the perf difference. I ran the testVP2RenderDelegateCameras test up to and including the cmds.rename('|stage', 'stage2') line (which is when the test failed). This ends up calling allStages a couple times overall. With the change, the average runtime of the method was 0.19475758ms, and without the change, it was 0.01281818ms. Upon closing Maya, the method is also called a few times when there are no stages left, and with the change, the average runtime of those calls was 0.10448333ms, and without the change, it was 0.00118333ms. So it seems there is roughly a base cost of 0.1ms to query the updated state of the data model regardless, and then a varying cost based on how many stages there are and if the cache is outdated or not.

That being said, the practical/observable impact of this is perf difference is hard to tell, so I started two preflights with and without the change so this can be manually tested. I will post the links once they're ready so QA (likely @santosd) can take a look.

My outlook on this right now is the following : if the perf impact of the current change is acceptable, then we go with that, otherwise we would try doing something like Pierre's first suggestion (Is there a way we can avoid Maya calling Ufe::SceneSegmentHandler::findGatewayItems() on rename, but simply marking it dirty in the caller so that the next time the information is needed a call to Ufe::SceneSegmentHandler::findGatewayItems() will be made?)

@seando-adsk
Copy link
Collaborator

seando-adsk commented Feb 25, 2025

Pixar (issue #4125) has reported a performance issue with camera finding which uses findUsdGatewayItems which calls getAllStages - so we need to be careful here to no introduce any performance degradation.

@debloip-adsk
Copy link
Collaborator Author

We have an alternative fix that we can do in Maya itself and that can be included in the PR that implements the changes to UFE cameras, see this commit : https://git.autodesk.com/maya3d/maya/pull/30320/commits/e4f7eaa8ceb095e4b443cbe35ef58c6ea0be630d. It does not fix the root issue with notification order and the allStages method potentially returning outdated data, but it at least fixes the case that we hit in practice.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants