[Merged by Bors] - [Merged by Bors] - Decouple bytecompiler from CodeBlock #2669

Closed

@HalidOdat (Member) commented Mar 15, 2023

Hopefully this is one PR in a series of PRs to implement a bytecode optimizer. Before that can happen, there needs to be a lot of refactoring in the way we store and compile bytecode.

This also gives us some memory benefits: it reduces `CodeBlock` size from 264 => 208 bytes (removes 56 bytes).

Additionally, when calling `into_boxed_slice`, if the vector has excess capacity, its items are moved into a newly allocated buffer with exactly the right capacity, removing the wasted space.
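
To illustrate the two memory effects described above, here is a minimal sketch using only the standard library. It assumes (the description does not spell this out) that the size reduction comes from storing `Box<[T]>` fields instead of `Vec<T>` fields. `finalize` and `bytes_saved_per_field` are hypothetical names, not functions from this PR.

```rust
use std::mem::size_of;

/// Hypothetical helper (not from the PR): freeze a growable buffer into a
/// fixed-size boxed slice once compilation is finished.
pub fn finalize(bytecode: Vec<u8>) -> Box<[u8]> {
    // Drops any excess capacity; this may move the items into a new,
    // exactly-sized allocation.
    bytecode.into_boxed_slice()
}

/// Bytes saved per field when a `Vec<u8>` is replaced by a `Box<[u8]>`.
pub fn bytes_saved_per_field() -> usize {
    // `Vec<T>` holds (pointer, capacity, length); `Box<[T]>` holds only
    // (pointer, length), so the difference is one `usize`.
    size_of::<Vec<u8>>() - size_of::<Box<[u8]>>()
}
```

On a 64-bit target one `usize` is 8 bytes, so seven such fields would account for 56 bytes. Whether that is exactly how the saving breaks down is not stated in the description.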

@HalidOdat HalidOdat force-pushed the refactor/CodeBlock branch 2 times, most recently from d68cc24 to 8bec300 Compare March 15, 2023 18:16
github-actions bot commented Mar 15, 2023

Test262 conformance changes

| Test result | main count | PR count | difference |
| :---------- | ---------: | -------: | ---------: |
| Total | 94,277 | 94,277 | 0 |
| Passed | 71,992 | 71,990 | -2 |
| Ignored | 17,324 | 17,324 | 0 |
| Failed | 4,961 | 4,963 | +2 |
| Panics | 12 | 14 | +2 |
| Conformance | 76.36% | 76.36% | -0.00% |

New panics (2):
test/built-ins/WeakSet/prototype/has/returns-false-when-object-value-not-present.js [strict mode] (previously Passed)
test/built-ins/WeakSet/prototype/delete/delete-entry-initial-iterable.js [strict mode] (previously Passed)
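
As a quick check of the figures above, the following sketch assumes that conformance is simply passed tests divided by total tests. That formula is an assumption, though it reproduces the reported values. The helper name `conformance` is made up for illustration.

```rust
/// Conformance as a percentage of all tests (assumed: passed / total * 100).
pub fn conformance(passed: u32, total: u32) -> f64 {
    f64::from(passed) / f64::from(total) * 100.0
}
```

With the counts from the report, `conformance(71_992, 94_277)` and `conformance(71_990, 94_277)` both round to 76.36, and the PR value is smaller by roughly 0.002 percentage points. That is consistent with the difference being shown as `-0.00%`.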

@HalidOdat HalidOdat marked this pull request as draft March 15, 2023 18:22
@HalidOdat HalidOdat force-pushed the refactor/CodeBlock branch 2 times, most recently from c0e79bb to f03a3e7 Compare March 15, 2023 18:37
@HalidOdat HalidOdat added the run-benchmark Label used to run banchmarks on PRs label Mar 15, 2023
codecov bot commented Mar 15, 2023

Codecov Report

Merging #2669 (36cc7bd) into main (0a843d2) will increase coverage by 0.62%.
The diff coverage is 54.00%.

@@            Coverage Diff             @@
##             main    #2669      +/-   ##
==========================================
+ Coverage   49.70%   50.32%   +0.62%     
==========================================
  Files         385      391       +6     
  Lines       39256    38788     -468     
==========================================
+ Hits        19512    19521       +9     
+ Misses      19744    19267     -477     
| Impacted Files | Coverage Δ |
| :------------- | :--------- |
| boa_engine/src/bytecompiler/class.rs | 0.00% <0.00%> (ø) |
| boa_engine/src/vm/code_block.rs | 29.56% <0.00%> (-1.44%) ⬇️ |
| boa_engine/src/vm/flowgraph/mod.rs | 0.00% <0.00%> (ø) |
| boa_engine/src/bytecompiler/function.rs | 64.91% <70.00%> (-3.16%) ⬇️ |
| boa_engine/src/bytecompiler/mod.rs | 59.17% <92.15%> (+1.35%) ⬆️ |

... and 52 files with indirect coverage changes

@HalidOdat HalidOdat force-pushed the refactor/CodeBlock branch from f03a3e7 to 75edd12 Compare March 15, 2023 18:59
@Razican (Member) left a comment

All looks good :) just a couple of comments. Thanks!

self.code_block.bytecode[index + 2] = bytes[1];
self.code_block.bytecode[index + 3] = bytes[2];
self.code_block.bytecode[index + 4] = bytes[3];
self.bytecode[index + 1] = bytes[0];
Member

At some point we should check whether this adds bounds checking, and whether we can do the bounds checking only once. That might improve performance slightly.

@HalidOdat (Member, Author) Mar 16, 2023

From my checking, it seems that it does generate a bounds check per array access. My guess is that this is so it can give accurate panic locations when an access is out of bounds.

See generated assembly instructions: https://godbolt.org/z/q8azssz5K

@HalidOdat (Member, Author) Mar 16, 2023

By explicitly checking beforehand and using a little `unsafe`, we can remove the bounds checks: https://godbolt.org/z/bKT9ozr8s

I can create a PR after this one to address this :)
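
A rough sketch of what "check once, then use `unsafe`" could look like is given here. It is not the code behind the godbolt link or the follow-up PR. The function name `patch_u32_unchecked`, the native-endian encoding, and the free-function shape (taking a slice instead of `self`) are all assumptions made for illustration.

```rust
/// Hypothetical helper (not the PR's code): write the four bytes of `value`
/// into `bytecode[index + 1..=index + 4]` with a single up-front check.
pub fn patch_u32_unchecked(bytecode: &mut [u8], index: usize, value: u32) {
    // Native-endian encoding is an assumption made for this sketch.
    let bytes = value.to_ne_bytes();

    // The only runtime bounds check: the last written index must be in range.
    let last = index.checked_add(4).expect("index overflow");
    assert!(last < bytecode.len(), "patch location out of bounds");

    // SAFETY: every index in `index + 1 ..= index + 4` is <= `last`, which
    // was just checked to be less than `bytecode.len()`.
    unsafe {
        *bytecode.get_unchecked_mut(index + 1) = bytes[0];
        *bytecode.get_unchecked_mut(index + 2) = bytes[1];
        *bytecode.get_unchecked_mut(index + 3) = bytes[2];
        *bytecode.get_unchecked_mut(index + 4) = bytes[3];
    }
}
```

The `assert!` keeps the function sound for any input: an out-of-range `index` still panics, but only one comparison is paid instead of four.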

Member

Sometimes in Rust, if you check the full range with an assert before all the array accesses, the compiler will remove all the bounds checks without needing `unsafe`. We would need to test that :)
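
The pattern described in this comment might look roughly like the sketch below. The name `patch_u32_asserted` and the native-endian encoding are assumptions for illustration, and whether the optimizer really drops the four per-access checks is not guaranteed. It has to be confirmed in the generated assembly.

```rust
/// Hypothetical helper: same writes as the indexed version, but the full
/// range is asserted once before any access, using only safe code.
pub fn patch_u32_asserted(bytecode: &mut [u8], index: usize, value: u32) {
    // Native-endian encoding is an assumption made for this sketch.
    let bytes = value.to_ne_bytes();

    // State the whole range up front. The optimizer may use this to prove
    // that the indexed writes below are in bounds, but it is free not to.
    let last = index.checked_add(4).expect("index overflow");
    assert!(last < bytecode.len(), "patch location out of bounds");

    bytecode[index + 1] = bytes[0];
    bytecode[index + 2] = bytes[1];
    bytecode[index + 3] = bytes[2];
    bytecode[index + 4] = bytes[3];
}
```

Because everything here is safe code, a missed optimization costs only speed, never correctness.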

Member Author

Yeah, but from testing, it doesn't seem to do so in this case: https://godbolt.org/z/q15WGh9n8
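
For comparison, here is a different safe formulation that was not proposed in this thread: take one sub-slice and copy into it. Slicing performs a single range check by construction, so it does not rely on the optimizer merging checks. The name `patch_u32_slice` and the native-endian encoding are assumptions for this sketch.

```rust
/// Hypothetical helper: write the four bytes of `value` into
/// `bytecode[index + 1..index + 5]` through a single sub-slice.
pub fn patch_u32_slice(bytecode: &mut [u8], index: usize, value: u32) {
    // Native-endian encoding is an assumption made for this sketch.
    let bytes = value.to_ne_bytes();

    // Creating the sub-slice checks the range once and panics if it is out
    // of bounds; `copy_from_slice` then requires equal lengths (4 and 4).
    bytecode[index + 1..index + 5].copy_from_slice(&bytes);
}
```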

boa_engine/src/vm/code_block.rs (outdated, resolved)
Benchmark for 4d5e85d

Test Base PR %
Arithmetic operations (Compiler) 615.2±1.26ns 687.9±1.09ns +11.82%
Arithmetic operations (Execution) 604.5±0.59ns 620.5±0.69ns +2.65%
Arithmetic operations (Parser) 7.7±0.07µs 7.7±0.03µs 0.00%
Array access (Compiler) 1641.0±3.43ns 1952.8±5.69ns +19.00%
Array access (Execution) 8.6±0.01µs 8.6±0.02µs 0.00%
Array access (Parser) 15.9±0.03µs 16.1±0.04µs +1.26%
Array creation (Compiler) 2.5±0.01µs 2.9±0.02µs +16.00%
Array creation (Execution) 1184.0±2.46µs 1191.5±2.29µs +0.63%
Array creation (Parser) 19.4±0.04µs 19.4±0.05µs 0.00%
Array pop (Compiler) 5.1±0.01µs 5.5±0.03µs +7.84%
Array pop (Execution) 714.0±1.40µs 707.6±1.28µs -0.90%
Array pop (Parser) 171.5±0.19µs 172.1±0.33µs +0.35%
Boolean Object Access (Compiler) 1185.0±2.89ns 1359.8±4.01ns +14.75%
Boolean Object Access (Execution) 4.9±0.02µs 4.9±0.01µs 0.00%
Boolean Object Access (Parser) 20.1±0.13µs 20.2±0.02µs +0.50%
Clean js (Compiler) 5.1±0.01µs 5.8±0.01µs +13.73%
Clean js (Execution) 704.0±2.71µs 704.8±3.25µs +0.11%
Clean js (Parser) 40.6±0.11µs 40.3±0.07µs -0.74%
Create Realm 518.1±1.69µs 517.2±9.79µs -0.17%
Dynamic Object Property Access (Compiler) 1911.9±3.61ns 2.4±0.01µs +25.53%
Dynamic Object Property Access (Execution) 5.5±0.01µs 5.6±0.01µs +1.82%
Dynamic Object Property Access (Parser) 14.2±0.02µs 14.2±0.03µs 0.00%
Fibonacci (Compiler) 2.9±0.00µs 3.5±0.01µs +20.69%
Fibonacci (Execution) 1190.3±3.03µs 1177.5±5.76µs -1.08%
Fibonacci (Parser) 23.0±0.04µs 22.9±0.22µs -0.43%
For loop (Compiler) 2.9±0.02µs 3.3±0.01µs +13.79%
For loop (Execution) 18.2±0.03µs 18.4±0.04µs +1.10%
For loop (Parser) 19.7±0.06µs 19.9±0.04µs +1.02%
Mini js (Compiler) 4.6±0.01µs 5.2±0.02µs +13.04%
Mini js (Execution) 647.9±2.83µs 651.3±3.24µs +0.52%
Mini js (Parser) 35.5±0.04µs 35.3±0.06µs -0.56%
Number Object Access (Compiler) 1129.2±2.76ns 1288.4±3.43ns +14.10%
Number Object Access (Execution) 3.7±0.01µs 3.8±0.01µs +2.70%
Number Object Access (Parser) 15.4±0.04µs 15.4±0.04µs 0.00%
Object Creation (Compiler) 1669.4±4.59ns 2.1±0.01µs +25.79%
Object Creation (Execution) 5.2±0.01µs 5.2±0.02µs 0.00%
Object Creation (Parser) 12.5±0.02µs 12.6±0.03µs +0.80%
RegExp (Compiler) 1905.5±4.99ns 2.3±0.00µs +20.70%
RegExp (Execution) 13.9±0.06µs 13.8±0.04µs -0.72%
RegExp (Parser) 13.7±0.06µs 13.9±0.03µs +1.46%
RegExp Creation (Compiler) 1735.1±8.37ns 2.1±0.01µs +21.03%
RegExp Creation (Execution) 9.8±0.03µs 9.9±0.02µs +1.02%
RegExp Creation (Parser) 11.5±0.02µs 11.6±0.02µs +0.87%
RegExp Literal (Compiler) 1926.1±6.09ns 2.3±0.00µs +19.41%
RegExp Literal (Execution) 13.9±0.05µs 13.8±0.04µs -0.72%
RegExp Literal (Parser) 15.2±0.04µs 15.2±0.03µs 0.00%
RegExp Literal Creation (Compiler) 1736.2±9.28ns 2.1±0.01µs +20.95%
RegExp Literal Creation (Execution) 9.8±0.03µs 9.9±0.03µs +1.02%
RegExp Literal Creation (Parser) 12.9±0.04µs 13.0±0.06µs +0.78%
Static Object Property Access (Compiler) 1683.8±6.04ns 2.1±0.01µs +24.72%
Static Object Property Access (Execution) 5.3±0.01µs 5.4±0.01µs +1.89%
Static Object Property Access (Parser) 13.6±0.04µs 13.6±0.03µs 0.00%
String Object Access (Compiler) 1501.5±2.80ns 1744.6±5.71ns +16.19%
String Object Access (Execution) 6.8±0.02µs 6.9±0.02µs +1.47%
String Object Access (Parser) 19.4±0.03µs 19.5±0.04µs +0.52%
String comparison (Compiler) 2.6±0.01µs 2.9±0.01µs +11.54%
String comparison (Execution) 4.6±0.01µs 4.6±0.01µs 0.00%
String comparison (Parser) 15.5±0.02µs 15.7±0.02µs +1.29%
String concatenation (Compiler) 1977.7±6.98ns 2.4±0.01µs +21.35%
String concatenation (Execution) 4.3±0.01µs 4.4±0.01µs +2.33%
String concatenation (Parser) 10.6±0.02µs 10.6±0.03µs 0.00%
String copy (Compiler) 1612.7±8.74ns 1947.6±7.70ns +20.77%
String copy (Execution) 4.0±0.01µs 4.0±0.01µs 0.00%
String copy (Parser) 7.8±0.02µs 7.9±0.02µs +1.28%
Symbols (Compiler) 1162.9±2.15ns 1430.2±2.42ns +22.99%
Symbols (Execution) 4.1±0.01µs 4.2±0.01µs +2.44%
Symbols (Parser) 6.0±0.02µs 6.1±0.03µs +1.67%
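
The `%` column appears to be the relative change of the PR timing over the base timing. That reading is an assumption, but it reproduces the rows above when both values are in the same unit. The helper name `percent_change` is made up for this sketch.

```rust
/// Relative change of `pr` over `base`, in percent (assumed formula).
/// Both arguments must be in the same unit, e.g. nanoseconds.
pub fn percent_change(base: f64, pr: f64) -> f64 {
    (pr - base) / base * 100.0
}
```

For example, `percent_change(615.2, 687.9)` rounds to 11.82, matching "Arithmetic operations (Compiler)". With 2.4µs converted to 2400ns, `percent_change(1911.9, 2400.0)` rounds to 25.53, matching "Dynamic Object Property Access (Compiler)".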

This also gives us some memory benefits: it reduces `CodeBlock` size from
264 => 208 bytes (removes 56 bytes).

Additionally, when calling `into_boxed_slice`, if the vector has excess capacity,
its items are moved into a newly allocated buffer with exactly the
right capacity.
@HalidOdat HalidOdat force-pushed the refactor/CodeBlock branch from 75edd12 to 36cc7bd Compare March 16, 2023 10:24
@HalidOdat HalidOdat marked this pull request as ready for review March 16, 2023 10:25
Benchmark for 5a9e612

Test Base PR %
Arithmetic operations (Compiler) 520.4±34.98ns 558.2±32.79ns +7.26%
Arithmetic operations (Execution) 464.6±35.14ns 446.3±20.36ns -3.94%
Arithmetic operations (Parser) 6.3±0.33µs 6.3±0.40µs 0.00%
Array access (Compiler) 1369.6±89.86ns 1677.9±142.55ns +22.51%
Array access (Execution) 7.5±0.44µs 7.2±0.32µs -4.00%
Array access (Parser) 13.2±0.71µs 14.0±0.65µs +6.06%
Array creation (Compiler) 2.1±0.16µs 2.3±0.09µs +9.52%
Array creation (Execution) 1057.5±61.47µs 1040.2±63.86µs -1.64%
Array creation (Parser) 16.2±1.08µs 16.1±1.16µs -0.62%
Array pop (Compiler) 4.1±0.18µs 4.5±0.29µs +9.76%
Array pop (Execution) 595.7±33.74µs 601.8±29.25µs +1.02%
Array pop (Parser) 140.9±7.65µs 138.5±6.91µs -1.70%
Boolean Object Access (Compiler) 1004.0±63.16ns 1082.0±56.34ns +7.77%
Boolean Object Access (Execution) 4.2±0.24µs 4.3±0.32µs +2.38%
Boolean Object Access (Parser) 17.4±1.48µs 16.7±1.00µs -4.02%
Clean js (Compiler) 4.4±0.26µs 4.9±0.31µs +11.36%
Clean js (Execution) 580.2±29.78µs 572.7±26.95µs -1.29%
Clean js (Parser) 34.2±2.13µs 33.5±1.61µs -2.05%
Create Realm 442.3±21.22µs 501.3±41.63µs +13.34%
Dynamic Object Property Access (Compiler) 1572.7±88.28ns 1889.5±101.50ns +20.14%
Dynamic Object Property Access (Execution) 4.4±0.20µs 4.8±0.22µs +9.09%
Dynamic Object Property Access (Parser) 12.1±0.73µs 11.8±0.65µs -2.48%
Fibonacci (Compiler) 2.5±0.19µs 2.9±0.16µs +16.00%
Fibonacci (Execution) 985.7±44.28µs 1044.5±73.27µs +5.97%
Fibonacci (Parser) 18.9±1.05µs 18.9±1.01µs 0.00%
For loop (Compiler) 2.5±0.13µs 2.7±0.12µs +8.00%
For loop (Execution) 15.4±0.92µs 15.5±0.86µs +0.65%
For loop (Parser) 16.7±1.19µs 17.0±0.88µs +1.80%
Mini js (Compiler) 3.8±0.21µs 4.2±0.21µs +10.53%
Mini js (Execution) 540.4±30.92µs 538.2±25.03µs -0.41%
Mini js (Parser) 29.4±1.71µs 28.8±1.91µs -2.04%
Number Object Access (Compiler) 944.6±56.79ns 1127.5±80.81ns +19.36%
Number Object Access (Execution) 3.1±0.16µs 3.3±0.27µs +6.45%
Number Object Access (Parser) 13.1±0.87µs 13.5±1.02µs +3.05%
Object Creation (Compiler) 1354.5±70.94ns 1755.0±87.86ns +29.57%
Object Creation (Execution) 4.4±0.37µs 4.5±0.28µs +2.27%
Object Creation (Parser) 10.2±0.45µs 10.6±0.58µs +3.92%
RegExp (Compiler) 1575.4±97.32ns 1960.4±128.89ns +24.44%
RegExp (Execution) 11.4±0.59µs 11.9±0.62µs +4.39%
RegExp (Parser) 11.3±0.60µs 11.6±0.72µs +2.65%
RegExp Creation (Compiler) 1473.6±98.21ns 1735.4±89.34ns +17.77%
RegExp Creation (Execution) 8.4±0.56µs 8.7±0.53µs +3.57%
RegExp Creation (Parser) 9.5±0.48µs 9.3±0.42µs -2.11%
RegExp Literal (Compiler) 1603.5±98.10ns 1889.9±100.46ns +17.86%
RegExp Literal (Execution) 11.4±0.73µs 11.8±0.61µs +3.51%
RegExp Literal (Parser) 12.6±0.71µs 12.9±0.77µs +2.38%
RegExp Literal Creation (Compiler) 1434.2±77.51ns 1812.4±120.29ns +26.37%
RegExp Literal Creation (Execution) 7.9±0.40µs 9.0±0.50µs +13.92%
RegExp Literal Creation (Parser) 10.7±0.50µs 10.8±0.51µs +0.93%
Static Object Property Access (Compiler) 1391.9±113.96ns 1779.6±97.73ns +27.85%
Static Object Property Access (Execution) 4.3±0.22µs 4.6±0.24µs +6.98%
Static Object Property Access (Parser) 11.4±0.69µs 10.9±0.57µs -4.39%
String Object Access (Compiler) 1248.9±59.83ns 1420.9±77.75ns +13.77%
String Object Access (Execution) 6.0±0.53µs 5.6±0.46µs -6.67%
String Object Access (Parser) 16.6±0.91µs 16.6±1.31µs 0.00%
String comparison (Compiler) 2.1±0.14µs 2.5±0.16µs +19.05%
String comparison (Execution) 3.8±0.16µs 4.1±0.32µs +7.89%
String comparison (Parser) 13.0±0.80µs 12.8±0.66µs -1.54%
String concatenation (Compiler) 1635.0±107.22ns 1997.7±133.14ns +22.18%
String concatenation (Execution) 3.6±0.24µs 3.7±0.24µs +2.78%
String concatenation (Parser) 8.8±0.48µs 8.7±0.50µs -1.14%
String copy (Compiler) 1290.6±63.08ns 1675.2±112.81ns +29.80%
String copy (Execution) 3.3±0.18µs 4.0±0.27µs +21.21%
String copy (Parser) 6.5±0.37µs 6.4±0.36µs -1.54%
Symbols (Compiler) 960.6±56.16ns 1162.5±70.80ns +21.02%
Symbols (Execution) 3.5±0.21µs 3.5±0.20µs 0.00%
Symbols (Parser) 4.9±0.19µs 4.8±0.25µs -2.04%

@HalidOdat (Member, Author)

bors r+

bors bot pushed a commit that referenced this pull request Mar 16, 2023
Hopefully this is one PR in a series of PRs to implement a bytecode optimizer. Before that can happen, there needs to be a lot of refactoring in the way we store and compile bytecode.

This also gives us some memory benefits: it reduces `CodeBlock` size from `264` **=>** `208` bytes (removes `56` bytes).

Additionally, when calling `into_boxed_slice`, if the vector has excess capacity, its items are moved into a newly allocated buffer with exactly the right capacity, removing the wasted space.
bors bot commented Mar 16, 2023

Pull request successfully merged into main.

Build succeeded:

@bors bors bot changed the title Decouple bytecompiler from CodeBlock [Merged by Bors] - Decouple bytecompiler from CodeBlock Mar 16, 2023
@bors bors bot closed this Mar 16, 2023
@bors bors bot deleted the refactor/CodeBlock branch March 16, 2023 11:21
bors bot commented Mar 16, 2023

Pull request successfully merged into main.

Build succeeded:

@bors bors bot changed the title [Merged by Bors] - Decouple bytecompiler from CodeBlock [Merged by Bors] - [Merged by Bors] - Decouple bytecompiler from CodeBlock Mar 16, 2023
bors bot pushed a commit that referenced this pull request Mar 17, 2023
As discussed in this comment #2669 (comment),
`rustc` doesn't seem to optimize out the bounds checks.