crash when undoing deletion of function definition code blocks #11220

aparajit-pratap · 2020-10-29T22:55:57Z

Purpose

Fix crash when deleting/modifying codeblocks

What is causing the crash:

There is code that assumes that Lists of CodeBLocks should have their indexes correspond to the CodeBlock.codeBlockId.
Basically code that does the following: CodeBlockList[someCodeBlockId]
This assumptions is wrong because CodeBlockLists can be altered (items removed at any index in the list). For example when you delete a code block that contains a function declaration (see

Dynamo/src/Engine/ProtoCore/Core.cs

Line 350 in 81df8a9

CompleteCodeBlockList.RemoveAll(x => x.codeBlockId == cbID);

)
Not having null checks on arrays of symbolTables or ProcedureTables (or instruction streams) that can contain null items.

More than the crash:
There is code that is error prone (multiple arrays that have to kept in sync, array indexes passed/stored in different objects).
I propose to create a new task to refactor the code - maybe add a Dictionary to hold all data that needs to be in sync. Also maybe change the data type of codeblockIds (currently integer - CodeBlock.codeBlockId) to a GUID or a dedicated class so that there will be no confusion that codeblockIds can be used as indexes in arrays or keys in Dictionaries.

My proposed solution:
Minimal changes that resole the crash.

Null checks when iterating on arrays of SymbolTables, ProcedureTables and InstructionStreams.
Swap out all code that assumed CodeBlockArrays can use codeblockIds as their indexes with the CoreUtil.GetCodeBlock() .
CoreUtil.GetCodeBlock() basically iterates the codeblock arrays and tries to match each one with the code block id.

Potential issues with proposal:
CoreUtil.GetCodeBlock(codeBlockList, codeBlockId) may cause performance degradation - although, from my experiments/measurements, I only noticed insignificant increases. Only tested on hundreds of codeblocks.
--Update-- performance should not be an issue, because CompleteCodeBlocks contains a flat list of all code blocks. When searching for a code block through CoreUtil.GetCodeBlock no recursive calls should be made if the blockId is valid.

Declarations

Check these if you believe they are true

The codebase is in a better state after this PR
Is documented according to the standards
The level of testing this PR includes is appropriate
User facing strings, if any, are extracted into *.resx files
All tests pass using the self-service CI.
Snapshot of UI changes, if any.
Changes to the API follow Semantic Versioning and are documented in the API Changes document.
This PR modifies some build requirements and the readme is updated

Reviewers

(FILL ME IN) Reviewer 1 (If possible, assign the Reviewer for the PR)

(FILL ME IN, optional) Any additional notes to reviewers or testers.

FYIs

(FILL ME IN, Optional) Names of anyone else you wish to be notified of

aparajit-pratap · 2020-10-29T22:56:41Z

@pinzart I think this is the PR you intended to make. If so, I made this on your behalf so that we can start reviewing it. Or have you made another one I'm unaware of?

pinzart90 · 2020-10-30T14:14:44Z

@aparajit-pratap this is the PR I intended to create, so thanks.
I cannot seem to find you in the reviewer list though,,,,
Oh...duh, you are the creator of the PR :)

src/Engine/ProtoCore/DSASM/Executive.cs

sm6srw

LGTM! And I do support a follow up refactoring task.

mjkkirschner · 2020-11-02T17:45:04Z

src/Engine/ProtoCore/DSASM/Executive.cs

+                                continue;
+                            }
+
+                            var cb = CoreUtils.GetCodeBlock(exe.CompleteCodeBlocks, currentLangBlock);


Doesn't this still have the potential to be slow if the input size is large? Programs can be of arbitrarily large size can't they?

What about a version of Coreutils.GetCodeBlock that takes an initial index to check?

I guess eventually we'd want to stop using arrays and start using a sorted or balanced data structure with fast lookup?

The whole issue is that codeBlockIds are not equivalent to the indexes in CompleteCodeBlocks.
A version of Coreutils.GetCodeBlock that takes an initial index to check could return wrong data (even though we could guard against invalid indexes)

As I mentioned in the description, I want to refactor these data structs and replace with something more suitable to the access patterns (ex. simple Dictionionary<codeBlockId, data> )

For the first question, yes there is potential to degrade if we have graphs with a large number of codeblocks (hundreds of thousands).
There is an opportunity to optimize the existing changes I made (like moving CoreUtils.GetCodeBlock outside the for in for loops). I would still leave it as it is even if it gets a bit slower.
Otherwise we would have to change a lot of code to refactor in the current PR.

thanks for the explanation - for a bit further clarification - when you say have graphs with a large number of codeblocks (hundreds of thousands). you are referring to codeblocks as codeblocks in the VM/AST right, not codeblock nodes?

IE - a single node might generate many codeblocks?

IE - a single node might generate many codeblocks?
Yes

I am hoping to do the refactor as soon as possible. Do you think that because of possibility of perf degradation due to this PR would we should include the refactor here as well ?

@mjkkirschner yes, codeblocks here refer to a code scope. For example the body of a for, while, if, else statements are all codeblocks by themselves. Each one of these codeblocks has a symbol table, procedure table, instruction stream, etc.
For example, the following DS code snippet:

def foo(a, b) { return [Imperative] { list = a..b; x = []; count = 0; for(i in list) { x[count] = i; count = count+1; } return x; } }; x = true; l1 = foo(1, 10); l2 = foo(10, 20); [Imperative] { if(x==true) { return l1; } else { return l2; } };

generates the following instruction set. Each instruction is prefixed with a number like [i.1.0], which is of the format [associative: a/imperative: i.codeblock index.instruction index]:

def foo:var[]..[](a:var, b:var) { ==============Start Node============== [i.1.0]push a [i.1.1]push b [i.1.2]push null [i.1.3]push 0 [i.1.4]push False [i.1.5]push False [i.1.6]push 0[depth] [i.1.7]callr %generate_range [i.1.8]push _rx [i.1.9]setexpuid 19[exprId] [i.1.10]pop list [i.1.11]setexpuid 20[exprId] [i.1.12]newarr 0 [i.1.13]pop x [i.1.14]setexpuid 21[exprId] [i.1.15]push 0 [i.1.16]pop count [i.1.17]setexpuid 22[exprId] [i.2.18]pushb 2 [i.2.19]push null [i.2.20]pop i [i.2.21]setexpuid 23[exprId] [i.2.22]push list [i.2.23]pop %forloop_expr_0 [i.2.24]setexpuid 24[exprId] [i.2.25]push 0 [i.2.26]pop %forloop_count_0 [i.2.27]setexpuid 25[exprId] [i.2.28]pushvarsize %forloop_expr_0 [i.2.29]pop %forloop_key_0 [i.2.30]push %forloop_key_0 [i.2.31]push null [i.2.32]nq [i.2.33]cjmp L(-1) [i.2.34]setexpuid 26[exprId] [i.3.35]pushb 3 [i.3.36]push %forloop_expr_0 [i.3.37]push %forloop_count_0 [i.3.38]push 0[depth] [i.3.39]callr ValueAtIndexInForLoop [i.3.40]push _rx [i.3.41]pop i [i.3.42]setexpuid 27[exprId] [i.3.43]push %forloop_key_0 [i.3.44]push 1 [i.3.45]add [i.3.46]pop %forloop_key_0 [i.3.47]setexpuid 28[exprId] [i.3.48]push %forloop_count_0 [i.3.49]push 1 [i.3.50]add [i.3.51]pop %forloop_count_0 [i.3.52]setexpuid 29[exprId] [i.3.53]push i [i.3.54]push count [i.3.55]push 1[dim] [i.3.56]setelement x [i.3.57]setexpuid 30[exprId] [i.3.58]push count [i.3.59]push 1 [i.3.60]add [i.3.61]pop count [i.3.62]setexpuid 31[exprId] [i.3.63]retcn [i.2.64]jmp L1(30) [i.2.65]retcn [i.1.66]push x [i.1.67]retb [i.1.68]setexpuid 32[exprId] [a.0.2]bounce 1, 0 [a.0.3]push _rx [a.0.4]return ==============End Node============== } ==============Start Node============== [a.0.219]push True [a.0.220]pop x [a.0.221]dep 1[ExprUID] False[SSA] [a.0.222]jdep ==============End Node============== ==============Start Node============== [a.0.223]pushlevel @0 [a.0.224]popguides 0 [a.0.225]push 1 [a.0.226]pushlevel @0 [a.0.227]popguides 0 [a.0.228]push 10 [a.0.229]push 0[depth] [a.0.230]callr foo [a.0.231]push _rx [a.0.232]pop %t_0_bfef6a96488243a381505b36007a5c28 [a.0.233]dep 2[ExprUID] True[SSA] [a.0.234]jdep ==============End Node============== ==============Start Node============== [a.0.235]push %t_0_bfef6a96488243a381505b36007a5c28 [a.0.236]pop l1 [a.0.237]dep 2[ExprUID] False[SSA] [a.0.238]jdep ==============End Node============== ==============Start Node============== [a.0.239]pushlevel @0 [a.0.240]popguides 0 [a.0.241]push 10 [a.0.242]pushlevel @0 [a.0.243]popguides 0 [a.0.244]push 20 [a.0.245]push 0[depth] [a.0.246]callr foo [a.0.247]push _rx [a.0.248]pop %t_1_bfef6a96488243a381505b36007a5c28 [a.0.249]dep 3[ExprUID] True[SSA] [a.0.250]jdep ==============End Node============== ==============Start Node============== [a.0.251]push %t_1_bfef6a96488243a381505b36007a5c28 [a.0.252]pop l2 [a.0.253]dep 3[ExprUID] False[SSA] [a.0.254]jdep ==============End Node============== ==============Start Node============== [i.4.0]push x [i.4.1]push True [i.4.2]eq [i.4.3]cjmp L(-1) [i.4.4]setexpuid 33[exprId] [i.5.5]pushb 5 [i.5.6]push l1 [i.5.7]retb [i.5.8]setexpuid 34[exprId] [i.5.9]retcn [i.4.10]jmp L1(-1) [i.6.11]pushb 6 [i.6.12]push l2 [i.6.13]retb [i.6.14]setexpuid 35[exprId] [i.6.15]retcn [i.4.16]push null [i.4.17]retb [a.0.255]bounce 4, 0 [a.0.256]push _rx [a.0.257]pop %tempLangBlock9 [a.0.258]dep -1[ExprUID] False[SSA] [a.0.259]jdep ==============End Node============== [a.0.260]push null [a.0.261]retb

src/Engine/ProtoCore/DSASM/Executive.cs

test/Engine/ProtoTest/LiveRunnerTests/MicroFeatureTests.cs

pinzart90 · 2020-11-03T02:44:51Z

@aparajit-pratap @mjkkirschner I added the least amount of refactor changes to address the possible performance issues of iterating over CompleteCodeBlock lists.
Will continue with refactor in different PR.

mjkkirschner · 2020-11-03T16:25:37Z

src/Engine/ProtoCore/Core.cs

@@ -214,7 +214,7 @@ public struct ErrorEntry
        public List<CodeBlock> CodeBlockList { get; set; }
        // The Complete Code Block list contains all the code blocks
        // unlike the codeblocklist which only stores the outer most code blocks
-        public List<CodeBlock> CompleteCodeBlockList { get; set; }
+        public SortedDictionary<int, CodeBlock> CompleteCodeBlockList { get; set; }


@pinzart I don't think we can do this. Unfortunately, lot's of code in here appears to be unnecessarily public. Even though we are in the "engine" we do not know who is using these public members, and we have stated publicly we will follow semantic versioning.

It's entirely possible an extension could access these data structures.

What about obsoleting this member, having it return the values inside the new dictionary, and adding a new one - and lastly consider if the new member should actually be public.

hmm..I had hoped the VM was not avavilablr to api users.
But I se you are correct, I updated the code. will add obsolete stmts

mjkkirschner · 2020-11-03T16:26:13Z

src/Engine/ProtoCore/DSASM/Executable.cs

@@ -46,7 +46,7 @@ public class Executable
        public TypeSystem TypeSystem { get; set; }

        public List<CodeBlock> CodeBlocks { get; set; }
-        public List<CodeBlock> CompleteCodeBlocks { get; set; }
+        public SortedDictionary<int, CodeBlock> CompleteCodeBlocks { get; set; }


same API issue.

src/Engine/ProtoCore/Core.cs

aparajit-pratap · 2020-11-03T22:55:43Z

src/Engine/ProtoScript/Runners/LiveRunner.cs

-                    core.CompleteCodeBlockList.RemoveAll(x => x.guid == bnode.guid);// && x.AstID == bnode.OriginalAstID);
+                    foreach (var key in core.CompleteCodeBlockDict.Where(x => x.Value.guid == bnode.guid).Select(x => x.Key).ToArray())
+                    {
+                        core.CompleteCodeBlockDict.Remove(key);


How about doing this:

core.CompleteCodeBlockDict = core.CompleteCodeBlockDict .Where(kv => !(kv.Value.guid == bnode.guid)) .ToDictionary(kv => kv.Key, kv => kv.Value);

Hmm..your suggestions does seem to have less operations
I'll give it a try. THanks

Hmm...actually completeCodeBlockDict is a SortedDictionary.
Your code would have to do something like
core.CompleteCodeBlockDict = new SortedDictionary<int, CodeBlock>(core.CompleteCodeBlockDict.Where(kv => !(kv.Value.guid == bnode.guid)).ToDictionary(kv => kv.Key, kv => kv.Value));
It might not be worth it (creates a standard dictionary and allocates another SortedDict) also I find it a bit less obvious than the original.

To make it even more obvious/ easy to understand maybe this would work better:
var keysToRemove = core.CompleteCodeBlockDict.Where(x => x.Value.guid == bnode.guid).Select(x => x.Key).ToArray(); keysToRemove.ForEach(key => core.CompleteCodeBlockDict.Remove(key));

I don't think performance should be an issue here...

aparajit-pratap · 2020-11-04T02:06:15Z

LGTM!

mjkkirschner · 2020-11-04T15:51:58Z

src/Engine/ProtoCore/Utils/ClassUtils.cs

@@ -154,15 +154,14 @@ public static int GetSymbolIndex(ClassNode classNode, string name, int classScop
            return Constants.kInvalidIndex;
        }

-        public static List<int> GetAncestorBlockIdsOfBlock(int blockId, List<CodeBlock> codeblockList)
+        public static List<int> GetAncestorBlockIdsOfBlock(int blockId, SortedDictionary<int, CodeBlock> codeblocks)


@pinzart I'm sorry I missed this - you can't change this public signature - please follow the same pattern of obsoleting this and adding an overload.

mjkkirschner · 2020-11-04T15:52:09Z

src/Engine/ProtoCore/Utils/ClassUtils.cs

@@ -69,7 +69,7 @@ public static int GetUpcastCountTo(ClassNode from, ClassNode to, RuntimeCore run
        // 
        //     2.3 Otherwise, classScope == kInvalidIndex && functionScope == kInvalidIndex
        //         Return public member in derived class, or public member in base classes 
-        public static int GetSymbolIndex(ClassNode classNode, string name, int classScope, int functionScope, int blockId, List<CodeBlock> codeblockList, out bool hasThisSymbol, out ProtoCore.DSASM.AddressType addressType)
+        public static int GetSymbolIndex(ClassNode classNode, string name, int classScope, int functionScope, int blockId, SortedDictionary<int, CodeBlock> codeblocks, out bool hasThisSymbol, out ProtoCore.DSASM.AddressType addressType)


@pinzart I'm sorry I missed this - you can't change this public signature - please follow the same pattern of obsoleting this and adding an overload.

account for deletion of nodes

664ac8c

remove debugging code

53f645b

pinzart90 requested a review from mjkkirschner October 30, 2020 14:12

pinzart90 reviewed Oct 30, 2020

View reviewed changes

src/Engine/ProtoCore/DSASM/Executive.cs Outdated Show resolved Hide resolved

pinzart90 requested a review from sm6srw October 30, 2020 15:10

pinzart90 marked this pull request as ready for review October 30, 2020 18:45

pinzart90 added WIP DNM Do not merge. For PRs. labels Oct 30, 2020

pinzart added 3 commits October 31, 2020 18:11

update with assert

5ca9fac

more null checks

3d3bfe1

add test

9a5f9dd

pinzart90 added PTAL Please Take A Look 👀 and removed DNM Do not merge. For PRs. WIP labels Nov 2, 2020

pinzart90 changed the title ~~account for deletion of nodes~~ crash when undoing deletion of function definition code blocks Nov 2, 2020

sm6srw approved these changes Nov 2, 2020

View reviewed changes

mjkkirschner reviewed Nov 2, 2020

View reviewed changes

aparajit-pratap commented Nov 2, 2020

View reviewed changes

src/Engine/ProtoCore/DSASM/Executive.cs Show resolved Hide resolved

aparajit-pratap commented Nov 2, 2020

View reviewed changes

test/Engine/ProtoTest/LiveRunnerTests/MicroFeatureTests.cs Outdated Show resolved Hide resolved

aparajit-pratap commented Nov 2, 2020

View reviewed changes

test/Engine/ProtoTest/LiveRunnerTests/MicroFeatureTests.cs Outdated Show resolved Hide resolved

pinzart added 3 commits November 2, 2020 16:41

Update Executive.cs

cdd71c1

Update MicroFeatureTests.cs

8b221ce

minimal refactor to address possible performance issue

7b988ec

mjkkirschner reviewed Nov 3, 2020

View reviewed changes

pinzart added 2 commits November 3, 2020 12:35

update

2bf5e1e

obsolete

ac474d8

mjkkirschner reviewed Nov 3, 2020

View reviewed changes

src/Engine/ProtoCore/Core.cs Outdated Show resolved Hide resolved

update obsolete version

78041d4

mjkkirschner approved these changes Nov 3, 2020

View reviewed changes

aparajit-pratap commented Nov 3, 2020

View reviewed changes

Update LiveRunner.cs

d428ea4

pinzart90 merged commit 7a09d0c into master Nov 4, 2020

pinzart90 deleted the codeblock_list_crash branch November 4, 2020 15:07

mjkkirschner reviewed Nov 4, 2020

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

crash when undoing deletion of function definition code blocks #11220

crash when undoing deletion of function definition code blocks #11220

aparajit-pratap commented Oct 29, 2020 •

edited by pinzart90

Loading

aparajit-pratap commented Oct 29, 2020 •

edited

Loading

pinzart90 commented Oct 30, 2020 •

edited

Loading

sm6srw left a comment

mjkkirschner Nov 2, 2020

pinzart90 Nov 2, 2020

mjkkirschner Nov 2, 2020

pinzart90 Nov 2, 2020

pinzart90 Nov 2, 2020

aparajit-pratap Nov 3, 2020 •

edited

Loading

pinzart90 commented Nov 3, 2020

mjkkirschner Nov 3, 2020

pinzart90 Nov 3, 2020

mjkkirschner Nov 3, 2020

aparajit-pratap Nov 3, 2020

pinzart90 Nov 4, 2020

pinzart90 Nov 4, 2020 •

edited

Loading

aparajit-pratap commented Nov 4, 2020

mjkkirschner Nov 4, 2020

mjkkirschner Nov 4, 2020

crash when undoing deletion of function definition code blocks #11220

crash when undoing deletion of function definition code blocks #11220

Conversation

aparajit-pratap commented Oct 29, 2020 • edited by pinzart90 Loading

Purpose

Declarations

Reviewers

FYIs

aparajit-pratap commented Oct 29, 2020 • edited Loading

pinzart90 commented Oct 30, 2020 • edited Loading

sm6srw left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

aparajit-pratap Nov 3, 2020 • edited Loading

Choose a reason for hiding this comment

pinzart90 commented Nov 3, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pinzart90 Nov 4, 2020 • edited Loading

Choose a reason for hiding this comment

aparajit-pratap commented Nov 4, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

aparajit-pratap commented Oct 29, 2020 •

edited by pinzart90

Loading

aparajit-pratap commented Oct 29, 2020 •

edited

Loading

pinzart90 commented Oct 30, 2020 •

edited

Loading

aparajit-pratap Nov 3, 2020 •

edited

Loading

pinzart90 Nov 4, 2020 •

edited

Loading