-
Notifications
You must be signed in to change notification settings - Fork 635
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
crash when undoing deletion of function definition code blocks #11220
Conversation
@pinzart I think this is the PR you intended to make. If so, I made this on your behalf so that we can start reviewing it. Or have you made another one I'm unaware of? |
@aparajit-pratap this is the PR I intended to create, so thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! And I do support a follow up refactoring task.
continue; | ||
} | ||
|
||
var cb = CoreUtils.GetCodeBlock(exe.CompleteCodeBlocks, currentLangBlock); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't this still have the potential to be slow if the input size is large? Programs can be of arbitrarily large size can't they?
What about a version of Coreutils.GetCodeBlock
that takes an initial index to check?
I guess eventually we'd want to stop using arrays and start using a sorted or balanced data structure with fast lookup?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The whole issue is that codeBlockIds are not equivalent to the indexes in CompleteCodeBlocks
.
A version of Coreutils.GetCodeBlock
that takes an initial index to check could return wrong data (even though we could guard against invalid indexes)
As I mentioned in the description, I want to refactor these data structs and replace with something more suitable to the access patterns (ex. simple Dictionionary<codeBlockId, data> )
For the first question, yes there is potential to degrade if we have graphs with a large number of codeblocks (hundreds of thousands).
There is an opportunity to optimize the existing changes I made (like moving CoreUtils.GetCodeBlock
outside the for in for loops). I would still leave it as it is even if it gets a bit slower.
Otherwise we would have to change a lot of code to refactor in the current PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the explanation - for a bit further clarification - when you say have graphs with a large number of codeblocks (hundreds of thousands).
you are referring to codeblocks as codeblocks
in the VM/AST right, not codeblock nodes?
IE - a single node might generate many codeblocks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IE - a single node might generate many codeblocks?
Yes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am hoping to do the refactor as soon as possible. Do you think that because of possibility of perf degradation due to this PR would we should include the refactor here as well ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mjkkirschner yes, codeblock
s here refer to a code scope. For example the body of a for
, while
, if
, else
statements are all codeblocks by themselves. Each one of these codeblocks has a symbol table, procedure table, instruction stream, etc.
For example, the following DS code snippet:
def foo(a, b)
{
return [Imperative]
{
list = a..b;
x = [];
count = 0;
for(i in list)
{
x[count] = i;
count = count+1;
}
return x;
}
};
x = true;
l1 = foo(1, 10);
l2 = foo(10, 20);
[Imperative]
{
if(x==true)
{
return l1;
}
else
{
return l2;
}
};
generates the following instruction set. Each instruction is prefixed with a number like [i.1.0]
, which is of the format [associative: a/imperative: i
.codeblock index
.instruction index
]:
def foo:var[]..[](a:var, b:var)
{
==============Start Node==============
[i.1.0]push a
[i.1.1]push b
[i.1.2]push null
[i.1.3]push 0
[i.1.4]push False
[i.1.5]push False
[i.1.6]push 0[depth]
[i.1.7]callr %generate_range
[i.1.8]push _rx
[i.1.9]setexpuid 19[exprId]
[i.1.10]pop list
[i.1.11]setexpuid 20[exprId]
[i.1.12]newarr 0
[i.1.13]pop x
[i.1.14]setexpuid 21[exprId]
[i.1.15]push 0
[i.1.16]pop count
[i.1.17]setexpuid 22[exprId]
[i.2.18]pushb 2
[i.2.19]push null
[i.2.20]pop i
[i.2.21]setexpuid 23[exprId]
[i.2.22]push list
[i.2.23]pop %forloop_expr_0
[i.2.24]setexpuid 24[exprId]
[i.2.25]push 0
[i.2.26]pop %forloop_count_0
[i.2.27]setexpuid 25[exprId]
[i.2.28]pushvarsize %forloop_expr_0
[i.2.29]pop %forloop_key_0
[i.2.30]push %forloop_key_0
[i.2.31]push null
[i.2.32]nq
[i.2.33]cjmp L(-1)
[i.2.34]setexpuid 26[exprId]
[i.3.35]pushb 3
[i.3.36]push %forloop_expr_0
[i.3.37]push %forloop_count_0
[i.3.38]push 0[depth]
[i.3.39]callr ValueAtIndexInForLoop
[i.3.40]push _rx
[i.3.41]pop i
[i.3.42]setexpuid 27[exprId]
[i.3.43]push %forloop_key_0
[i.3.44]push 1
[i.3.45]add
[i.3.46]pop %forloop_key_0
[i.3.47]setexpuid 28[exprId]
[i.3.48]push %forloop_count_0
[i.3.49]push 1
[i.3.50]add
[i.3.51]pop %forloop_count_0
[i.3.52]setexpuid 29[exprId]
[i.3.53]push i
[i.3.54]push count
[i.3.55]push 1[dim]
[i.3.56]setelement x
[i.3.57]setexpuid 30[exprId]
[i.3.58]push count
[i.3.59]push 1
[i.3.60]add
[i.3.61]pop count
[i.3.62]setexpuid 31[exprId]
[i.3.63]retcn
[i.2.64]jmp L1(30)
[i.2.65]retcn
[i.1.66]push x
[i.1.67]retb
[i.1.68]setexpuid 32[exprId]
[a.0.2]bounce 1, 0
[a.0.3]push _rx
[a.0.4]return
==============End Node==============
}
==============Start Node==============
[a.0.219]push True
[a.0.220]pop x
[a.0.221]dep 1[ExprUID] False[SSA]
[a.0.222]jdep
==============End Node==============
==============Start Node==============
[a.0.223]pushlevel @0
[a.0.224]popguides 0
[a.0.225]push 1
[a.0.226]pushlevel @0
[a.0.227]popguides 0
[a.0.228]push 10
[a.0.229]push 0[depth]
[a.0.230]callr foo
[a.0.231]push _rx
[a.0.232]pop %t_0_bfef6a96488243a381505b36007a5c28
[a.0.233]dep 2[ExprUID] True[SSA]
[a.0.234]jdep
==============End Node==============
==============Start Node==============
[a.0.235]push %t_0_bfef6a96488243a381505b36007a5c28
[a.0.236]pop l1
[a.0.237]dep 2[ExprUID] False[SSA]
[a.0.238]jdep
==============End Node==============
==============Start Node==============
[a.0.239]pushlevel @0
[a.0.240]popguides 0
[a.0.241]push 10
[a.0.242]pushlevel @0
[a.0.243]popguides 0
[a.0.244]push 20
[a.0.245]push 0[depth]
[a.0.246]callr foo
[a.0.247]push _rx
[a.0.248]pop %t_1_bfef6a96488243a381505b36007a5c28
[a.0.249]dep 3[ExprUID] True[SSA]
[a.0.250]jdep
==============End Node==============
==============Start Node==============
[a.0.251]push %t_1_bfef6a96488243a381505b36007a5c28
[a.0.252]pop l2
[a.0.253]dep 3[ExprUID] False[SSA]
[a.0.254]jdep
==============End Node==============
==============Start Node==============
[i.4.0]push x
[i.4.1]push True
[i.4.2]eq
[i.4.3]cjmp L(-1)
[i.4.4]setexpuid 33[exprId]
[i.5.5]pushb 5
[i.5.6]push l1
[i.5.7]retb
[i.5.8]setexpuid 34[exprId]
[i.5.9]retcn
[i.4.10]jmp L1(-1)
[i.6.11]pushb 6
[i.6.12]push l2
[i.6.13]retb
[i.6.14]setexpuid 35[exprId]
[i.6.15]retcn
[i.4.16]push null
[i.4.17]retb
[a.0.255]bounce 4, 0
[a.0.256]push _rx
[a.0.257]pop %tempLangBlock9
[a.0.258]dep -1[ExprUID] False[SSA]
[a.0.259]jdep
==============End Node==============
[a.0.260]push null
[a.0.261]retb
@aparajit-pratap @mjkkirschner I added the least amount of refactor changes to address the possible performance issues of iterating over CompleteCodeBlock lists. |
src/Engine/ProtoCore/Core.cs
Outdated
@@ -214,7 +214,7 @@ public struct ErrorEntry | |||
public List<CodeBlock> CodeBlockList { get; set; } | |||
// The Complete Code Block list contains all the code blocks | |||
// unlike the codeblocklist which only stores the outer most code blocks | |||
public List<CodeBlock> CompleteCodeBlockList { get; set; } | |||
public SortedDictionary<int, CodeBlock> CompleteCodeBlockList { get; set; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pinzart I don't think we can do this. Unfortunately, lot's of code in here appears to be unnecessarily public. Even though we are in the "engine" we do not know who is using these public members, and we have stated publicly we will follow semantic versioning.
It's entirely possible an extension could access these data structures.
What about obsoleting this member, having it return the values inside the new dictionary, and adding a new one - and lastly consider if the new member should actually be public.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm..I had hoped the VM was not avavilablr to api users.
But I se you are correct, I updated the code. will add obsolete stmts
@@ -46,7 +46,7 @@ public class Executable | |||
public TypeSystem TypeSystem { get; set; } | |||
|
|||
public List<CodeBlock> CodeBlocks { get; set; } | |||
public List<CodeBlock> CompleteCodeBlocks { get; set; } | |||
public SortedDictionary<int, CodeBlock> CompleteCodeBlocks { get; set; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same API issue.
core.CompleteCodeBlockList.RemoveAll(x => x.guid == bnode.guid);// && x.AstID == bnode.OriginalAstID); | ||
foreach (var key in core.CompleteCodeBlockDict.Where(x => x.Value.guid == bnode.guid).Select(x => x.Key).ToArray()) | ||
{ | ||
core.CompleteCodeBlockDict.Remove(key); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about doing this:
core.CompleteCodeBlockDict = core.CompleteCodeBlockDict
.Where(kv => !(kv.Value.guid == bnode.guid))
.ToDictionary(kv => kv.Key, kv => kv.Value);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm..your suggestions does seem to have less operations
I'll give it a try. THanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm...actually completeCodeBlockDict is a SortedDictionary.
Your code would have to do something like
core.CompleteCodeBlockDict = new SortedDictionary<int, CodeBlock>(core.CompleteCodeBlockDict.Where(kv => !(kv.Value.guid == bnode.guid)).ToDictionary(kv => kv.Key, kv => kv.Value));
It might not be worth it (creates a standard dictionary and allocates another SortedDict) also I find it a bit less obvious than the original.
To make it even more obvious/ easy to understand maybe this would work better:
var keysToRemove = core.CompleteCodeBlockDict.Where(x => x.Value.guid == bnode.guid).Select(x => x.Key).ToArray(); keysToRemove.ForEach(key => core.CompleteCodeBlockDict.Remove(key));
I don't think performance should be an issue here...
LGTM! |
@@ -154,15 +154,14 @@ public static int GetSymbolIndex(ClassNode classNode, string name, int classScop | |||
return Constants.kInvalidIndex; | |||
} | |||
|
|||
public static List<int> GetAncestorBlockIdsOfBlock(int blockId, List<CodeBlock> codeblockList) | |||
public static List<int> GetAncestorBlockIdsOfBlock(int blockId, SortedDictionary<int, CodeBlock> codeblocks) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pinzart I'm sorry I missed this - you can't change this public signature - please follow the same pattern of obsoleting this and adding an overload.
@@ -69,7 +69,7 @@ public static int GetUpcastCountTo(ClassNode from, ClassNode to, RuntimeCore run | |||
// | |||
// 2.3 Otherwise, classScope == kInvalidIndex && functionScope == kInvalidIndex | |||
// Return public member in derived class, or public member in base classes | |||
public static int GetSymbolIndex(ClassNode classNode, string name, int classScope, int functionScope, int blockId, List<CodeBlock> codeblockList, out bool hasThisSymbol, out ProtoCore.DSASM.AddressType addressType) | |||
public static int GetSymbolIndex(ClassNode classNode, string name, int classScope, int functionScope, int blockId, SortedDictionary<int, CodeBlock> codeblocks, out bool hasThisSymbol, out ProtoCore.DSASM.AddressType addressType) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pinzart I'm sorry I missed this - you can't change this public signature - please follow the same pattern of obsoleting this and adding an overload.
Purpose
Fix crash when deleting/modifying codeblocks
What is causing the crash:
CodeBLocks
should have their indexes correspond to theCodeBlock.codeBlockId
.Basically code that does the following:
CodeBlockList[someCodeBlockId]
This assumptions is wrong because CodeBlockLists can be altered (items removed at any index in the list). For example when you delete a code block that contains a function declaration (see
Dynamo/src/Engine/ProtoCore/Core.cs
Line 350 in 81df8a9
More than the crash:
There is code that is error prone (multiple arrays that have to kept in sync, array indexes passed/stored in different objects).
I propose to create a new task to refactor the code - maybe add a Dictionary to hold all data that needs to be in sync. Also maybe change the data type of codeblockIds (currently integer - CodeBlock.codeBlockId) to a GUID or a dedicated class so that there will be no confusion that codeblockIds can be used as indexes in arrays or keys in Dictionaries.
My proposed solution:
Minimal changes that resole the crash.
CoreUtil.GetCodeBlock()
.CoreUtil.GetCodeBlock()
basically iterates the codeblock arrays and tries to match each one with the code block id.Potential issues with proposal:
CoreUtil.GetCodeBlock(codeBlockList, codeBlockId)
may cause performance degradation - although, from my experiments/measurements, I only noticed insignificant increases. Only tested on hundreds of codeblocks.--Update-- performance should not be an issue, because CompleteCodeBlocks contains a flat list of all code blocks. When searching for a code block through
CoreUtil.GetCodeBlock
no recursive calls should be made if the blockId is valid.Declarations
Check these if you believe they are true
*.resx
filesReviewers
(FILL ME IN) Reviewer 1 (If possible, assign the Reviewer for the PR)
(FILL ME IN, optional) Any additional notes to reviewers or testers.
FYIs
(FILL ME IN, Optional) Names of anyone else you wish to be notified of