JIL Execution Cleanup and Speed Optimizations Part 1 #12895

Merged
merged 10 commits into from
Jun 13, 2022

@saintentropy saintentropy commented May 18, 2022

Purpose

JIL execution in the VM is responsible for a small but important subset of node execution. While most nodes today are executed via the FFI (Foreign Function Interface) mechanisms, which invoke C# code, some basic utility methods, especially in the Math category, are written in native DesignScript. This type of function is handled by the JILFunctionEndPoint. While many of these enhancements target this execution path, they also yield net improvements to the general execution of all EndPoint types. This is especially true in the POP_Handler, DEP_Handler, SetupExecutive, and RestoreFromCall methods within the VM's Executive. For a specific test graph with a mixture of geometry and mathematical operations, these enhancements reduce overall UpdateGraph run time by 35% (17s to 13s). They also dramatically reduce temporary memory allocation: for this test graph, the temporary memory allocation associated with UpdateGraph went from 11.3 GB to 3.6 GB. In summary, this PR optimizes the execution of functions handled via the JIL EndPoint but has a net improvement for all node types.

Specifically, this PR does the following:

  • Clean up the dependency on the CurrentStackFrame property of RuntimeMemory. This getter creates a new StackFrame object to reference a subset of items at a specific location in the VM's Stack. The CurrentStackFrame property is used 99% of the time from the IsGlobalScope method. The issue is that IsGlobalScope can be called tens of millions of times during a graph execution run, and each call creates a new StackFrame object when it references the CurrentStackFrame property. This optimization removes the need to allocate a temporary StackFrame object when the required data can be read directly from the Stack. It accounts for the majority of the gigabytes of temporary allocation saved in the sample graph described above.

  • Refactor RestoreFromCall so that it does not allocate an empty list until after the required check of runtimeCore.Options.RunMode == InterpreterMode.Expression. Previously this allocation was done repeatedly with no items ever added to the collection. The later Any() check on the list can then be refactored to a null check.

  • Add a fast path to GetGraphNodesAtScope for requests with an invalid ClassIndex and ProcessIndex (i.e., -1). This is another function that can be called millions of times during an UpdateGraph run. Many calls routed through this function look up the same item in the graphNodeMap dictionary. This optimization adds a shortcut for the specific case of the invalid ClassIndex and ProcessIndex that avoids fetching the object from the dictionary. Note that caching the previous lookup would not speed up this function, because its usage typically alternates between values.

  • Refactor UpdateGraph to not allocate a temporary list of GraphNodes.
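To illustrate the first item above, here is a minimal sketch of the difference between answering a "is there a frame?" question through an allocating getter and answering it directly from the frame pointer. The class `RuntimeMemorySketch`, the `Allocations` counter, the `PushFrame` helper, the `int` stack values, and the frame size of 4 are all hypothetical stand-ins chosen for illustration; only the names `StackFrame`, `StackFrameSize`, `FramePointer`, `startFramePointer`, and `CurrentStackFrame` come from this PR, and the real implementation may differ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the VM's StackFrame; not the actual implementation.
public sealed class StackFrame
{
    public const int StackFrameSize = 4;   // assumed size, for illustration only
    public static int Allocations;         // demo counter: how many frames were created
    public readonly int[] Frame;

    public StackFrame(int[] frame)
    {
        Frame = frame;
        Allocations++;
    }
}

// Hypothetical stand-in for RuntimeMemory.
public sealed class RuntimeMemorySketch
{
    private readonly List<int> stack = new List<int>();
    private readonly int startFramePointer = 0;

    public int FramePointer { get; private set; }

    // Pushes one frame's worth of placeholder values onto the stack.
    public void PushFrame()
    {
        for (int i = 0; i < StackFrame.StackFrameSize; i++)
        {
            stack.Add(i);
        }
        FramePointer = stack.Count;
    }

    // "Before" pattern: every read copies a slice of the stack into a new object.
    public StackFrame CurrentStackFrame
    {
        get
        {
            int start = FramePointer - StackFrame.StackFrameSize;
            if (start < startFramePointer)
            {
                return null;
            }
            var frame = new int[StackFrame.StackFrameSize];
            stack.CopyTo(start, frame, 0, StackFrame.StackFrameSize);
            return new StackFrame(frame);
        }
    }

    // Allocates a StackFrame only to compare it against null.
    public bool HasNoFrameAllocating
    {
        get { return CurrentStackFrame == null; }
    }

    // "After" pattern: the same answer from plain integer arithmetic.
    public bool HasNoFrameDirect
    {
        get { return FramePointer - StackFrame.StackFrameSize < startFramePointer; }
    }
}
```

Both properties return the same answer, but only the allocating one increments `StackFrame.Allocations` once a frame exists. When such a check runs tens of millions of times, that per-call allocation is what accumulates into the temporary memory described above.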

Declarations

Check these if you believe they are true

  • The codebase is in a better state after this PR
  • Is documented according to the standards
  • The level of testing this PR includes is appropriate
  • User facing strings, if any, are extracted into *.resx files
  • All tests pass using the self-service CI.
  • Snapshot of UI changes, if any.
  • Changes to the API follow Semantic Versioning and are documented in the API Changes document.
  • This PR modifies some build requirements and the readme is updated

Reviewers

@sm6srw @aparajit-pratap

FYIs

@jasonstratton @mjkkirschner

…rame

(cherry picked from commit 95a8144)
(cherry picked from commit fdd960e)
(cherry picked from commit cd22f79)
(cherry picked from commit 5721a08)
(cherry picked from commit 26c2131)
(cherry picked from commit 08afa41)
@saintentropy
Contributor Author

@jasonstratton @aparajit-pratap Closing #12153 in favor of this. This removes the commit refactoring JILFunctionEndPoint to allow caching of the Interpreter, which also had a failing test. This PR is now exclusively allocation optimizations and should have no impact on execution.

@@ -1422,7 +1428,6 @@ private int UpdateGraph(int exprUID, bool isSSAAssign)
}
gnode.isActive = false;
}
return reachableGraphNodes.Count;
Contributor

Why don't we keep this return statement? That way we can just say return GraphUpdateImpl(...); above.

Contributor Author

Yes, that makes sense. I will update that.

{
frame[destIndex] = Stack[sourceIndex];
}
var stackFrame = new StackFrame(frame);

stackFrame = new StackFrame(frame);
return stackFrame;
Contributor

This is a redundant return as you're returning the stackFrame outside the if block anyway.

Contributor Author

Yep, good catch.

Comment on lines +340 to +348
get
{
if (FramePointer - StackFrame.StackFrameSize >= startFramePointer)
{
return null;
return false;
}

return true;
}
Contributor

Why can't we simply say:

get { return stackFrame == null; }

Contributor Author

@saintentropy saintentropy Jun 9, 2022

I don't want to initialize the stackFrame object unless a StackFrame object is actually needed. That is the big time penalty here. All of these properties essentially copy the implementation from StackFrame to avoid the getter for CurrentStackFrame. On my sample graph that getter is called 21 million times.

@@ -294,32 +294,101 @@ public int CurrentConstructBlockId
}
}


private int fp;
private StackFrame stackFrame;
Contributor

rename to currentStackFrame

@aparajit-pratap aparajit-pratap merged commit 1e0714f into DynamoDS:master Jun 13, 2022