Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Update stack diffing algorithm in describeNativeComponentFrame (#27132)
## Summary There's a bug with the existing stack comparison algorithm in `describeNativeComponentFrame` — specifically how it attempts to find a common root frame between the control and sample stacks. This PR attempts to fix that bug by injecting a frame that can have a guaranteed string in it for us to search for in both stacks to find a common root. ## Brief Background/How it works now Right now `describeNativeComponentFrame` does the following to leverage native browser/VM stack frames to get details (e.g. script path, row and col #s) for a single component: 1. Throwing and catching a control error in the function 2. Calling the component which should eventually throw an error (most of the time), that we'll catch as our sample error. 3. Diffing the stacks in the control and sample errors to find the line which should represent our component call. ## What's broken To account for potential stack trace truncation, the stack diffing algorithm first attempts to find a common "root" frame by inspecting the earliest frame of the sample stack and searching for an identical frame in the control stack starting from the bottom. However, there are a couple of scenarios which I've hit that cause the above approach to not work correctly. First, it's possible that for render passes of extremely large component trees to have a lot of repeating internal react function calls, which can result in an incorrect common or "root" frame found. Here's a small example from a stack trace using React Fizz for SSR. Our control frame can look like this: ``` Error: at Fake (...) at construct (native) at describeNativeComponentFrame (...) at describeClassComponentFrame (...) at getStackByComponentStackNode (...) at getCurrentStackInDEV (...) at renderNodeDestructive (...) at renderElement (...) at renderNodeDestructiveImpl (...) // <-- Actual common root frame with the sample stack at renderNodeDestructive (...) at renderElement (...) at renderNodeDestructiveImpl (...) // <-- Incorrectly chosen common root frame at renderNodeDestructive (...) ``` And our sample stack can look like this: ``` Error: at set (...) at PureComponent (...) at call (native) at apply (native) at ErrorBoundary (...) at construct (native) at describeNativeComponentFrame (...) at describeClassComponentFrame (...) at getStackByComponentStackNode (...) at getCurrentStackInDEV (...) at renderNodeDestructive (...) at renderElement (...) at renderNodeDestructiveImpl (...) // <-- Root frame that's common in the control stack ``` Here you can see that the earliest trace in the sample stack, the `renderNodeDestructiveImpl` call, can exactly match with multiple `renderNodeDestructiveImpl` calls in the control stack (including file path and line + col #s). Currently the algorithm will chose the earliest/last frame with the `renderNodeDestructiveImpl` call (which is the second last frame in our control stack), which is incorrect. The actual matching frame in the control stack is the latest or first frame (when traversing from the top) with the `renderNodeDestructiveImpl` call. This leads to the rest of the stack diffing associating an incorrect frame (`at getStackByComponentStackNode (...)`) for the component. Another issue with this approach is that it assumes all VMs will truncate stack traces at the *bottom*, [which isn't the case for the Hermes VM](https://github.com/facebook/hermes/blob/df07cf713a84a4434c83c08cede38ba438dc6aca/lib/VM/JSError.cpp#L688-L699) which **truncates stack traces in the middle**, placing a ``` at renderNodeDestructiveImpl (...) ... skipping {n} frames at renderNodeDestructive (...) ``` line in the middle of the stack trace for all stacks that contain more than 100 traces. This causes stack traces for React Native apps using the Hermes VM to potentially break for large component trees. Although for this specific case with Hermes, it's possible to account for this by either manually grepping and removing the `... skipping` line and everything below it (see draft PR: #26999), or by implementing the non-standard `prepareStackTrace` API which Hermes also supports to manually generate a stack trace that truncates from the bottom ([example implementation](main...KarimP:react:component-stack-hermes-fix)). ## The Fix I found different ways to go about fixing this. The first was to search for a common stack frame starting from the top/latest frame. It's a relatively small change ([see implementation](main...KarimP:react:component-stack-fix-2)), although it is less performant by being n^2 (albeit with `n` realistically being <= 5 here). It's also a bit more buggy for class components given that different VMs insert a different amount of additional lines for new/construct calls... Another fix would be to actually implement a [longest common substring](https://en.wikipedia.org/wiki/Longest_common_substring) algorithm, which can also be roughly n^2 time (assuming the longest common substring between control and sample will be most of the sample frame). The fix I ended up going with was have the lines that throw the control error and the lines that call/instantiate the component be inside a distinct method under an object property (`"DescribeNativeComponentFrameRoot"`). All major VMs (Safari's JavaScriptCore, Firefox's SpiderMonkey, V8, Hermes, and Bun) should display the object property name their stack trace. I've also set the `name` and `displayName` properties for method as well to account for minification, any advanced optimizations (e.g. key crushing), and VM inconsistencies (both Bun and Safari seem to exclusively use the value under `displayName` and not `name` in traces for methods defined under an object's own property...). We can then find this "common" frame by simply finding the line that has our special method name (`"DescribeNativeComponentFrameRoot"`), and the rest of the code to determine the actual component line works as expected. If by any chance we don't find a frame with our special method name in either control or sample stack traces, we then revert back to the existing approach mentioned above by searching for the last line of the sample frame in the control frame. ## How did you test this change? 1. There are bunch of existing tests that ensure a properly formatted component trace is logged for certain scenarios, so I ensured the existing full test suite passed 2. I threw an error in a component that's deep in the component hierarchy of a large React app (facebook) to ensure there's stack trace truncation, and ensured the correct component stack trace was logged for Chrome, Safari, and Firefox, and with and without minification. 3. Ran a large React app (facebook) on the Hermes VM, threw an error in a component that's deep in the component hierarchy, and ensured that component frames are generated despite stack traces being truncated in the middle. DiffTrain build for [88b00de](88b00de)
- Loading branch information