Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Merged by Bors] - Avoid unneeded bounds checks in bytecode address patching #2680

Closed
wants to merge 2 commits into from

Conversation

HalidOdat
Copy link
Member

As discussed in this comment #2669 (comment),
rustc doesn't seem to optimize out the bounds checks.

@HalidOdat HalidOdat added the performance Performance related changes and issues label Mar 16, 2023
@github-actions
Copy link

Test262 conformance changes

Test result main count PR count difference
Total 94,277 94,277 0
Passed 71,990 71,990 0
Ignored 17,324 17,324 0
Failed 4,963 4,963 0
Panics 14 14 0
Conformance 76.36% 76.36% 0.00%

@HalidOdat HalidOdat added the run-benchmark Label used to run banchmarks on PRs label Mar 16, 2023
@codecov
Copy link

codecov bot commented Mar 16, 2023

Codecov Report

Merging #2680 (bba7043) into main (da866ca) will increase coverage by 0.22%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main    #2680      +/-   ##
==========================================
+ Coverage   49.19%   49.41%   +0.22%     
==========================================
  Files         397      395       -2     
  Lines       39739    39531     -208     
==========================================
- Hits        19548    19533      -15     
+ Misses      20191    19998     -193     
Impacted Files Coverage Δ
boa_engine/src/bytecompiler/mod.rs 59.06% <100.00%> (-0.11%) ⬇️

... and 11 files with indirect coverage changes

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@jedel1043
Copy link
Member

jedel1043 commented Mar 16, 2023

Got it to optimize the bounds check without using unsafe: https://godbolt.org/z/9eEs13TjW

EDIT: Another alternative that clarifies why the second condition is needed:

const U32_SIZE: usize = std::mem::size_of::<u32>();
assert!(code.len() > index + U32_SIZE && usize::MAX - U32_SIZE >= index);

let values = value.to_ne_bytes();
code[index..index + U32_SIZE].copy_from_slice(values.as_slice())

@jedel1043 jedel1043 added this to the v0.17.0 milestone Mar 16, 2023
@HalidOdat HalidOdat requested a review from jedel1043 March 16, 2023 19:13
@HalidOdat HalidOdat marked this pull request as ready for review March 16, 2023 19:13
@HalidOdat HalidOdat force-pushed the optimize/bytecompiler-address-patching branch 2 times, most recently from bcff86c to 4f867db Compare March 16, 2023 20:00
@HalidOdat HalidOdat force-pushed the optimize/bytecompiler-address-patching branch from 4f867db to bba7043 Compare March 16, 2023 20:10
@github-actions
Copy link

Benchmark for 4fc8be4

Click to view benchmark
Test Base PR %
Arithmetic operations (Compiler) 546.9±0.60ns 553.4±0.86ns +1.19%
Arithmetic operations (Execution) 417.4±0.29ns 410.2±0.17ns -1.72%
Arithmetic operations (Parser) 7.4±0.02µs 7.2±0.08µs -2.70%
Array access (Compiler) 1803.0±45.35ns 1661.8±3.67ns -7.83%
Array access (Execution) 7.6±0.04µs 7.5±0.01µs -1.32%
Array access (Parser) 16.4±0.06µs 16.1±0.06µs -1.83%
Array creation (Compiler) 2.5±0.00µs 2.5±0.01µs 0.00%
Array creation (Execution) 1067.3±1.84µs 1063.6±1.29µs -0.35%
Array creation (Parser) 19.1±0.07µs 19.0±0.07µs -0.52%
Array pop (Compiler) 4.3±0.01µs 4.5±0.01µs +4.65%
Array pop (Execution) 638.0±1.32µs 634.9±2.40µs -0.49%
Array pop (Parser) 160.3±0.78µs 163.6±0.44µs +2.06%
Boolean Object Access (Compiler) 1220.2±2.96ns 1192.3±2.89ns -2.29%
Boolean Object Access (Execution) 4.3±0.00µs 4.3±0.01µs 0.00%
Boolean Object Access (Parser) 19.4±0.06µs 19.3±0.20µs -0.52%
Clean js (Compiler) 5.3±0.02µs 5.2±0.01µs -1.89%
Clean js (Execution) 614.1±2.75µs 616.5±3.81µs +0.39%
Clean js (Parser) 39.6±0.16µs 39.6±0.15µs 0.00%
Create Realm 437.1±3.01µs 436.0±19.15µs -0.25%
Dynamic Object Property Access (Compiler) 2.0±0.00µs 1951.7±5.41ns -2.42%
Dynamic Object Property Access (Execution) 4.9±0.01µs 4.8±0.01µs -2.04%
Dynamic Object Property Access (Parser) 14.6±0.13µs 14.5±0.05µs -0.68%
Fibonacci (Compiler) 3.1±0.01µs 3.1±0.01µs 0.00%
Fibonacci (Execution) 1050.6±2.40µs 1098.7±1.81µs +4.58%
Fibonacci (Parser) 22.7±0.05µs 22.4±0.04µs -1.32%
For loop (Compiler) 2.8±0.00µs 2.8±0.02µs 0.00%
For loop (Execution) 15.5±0.02µs 15.7±0.08µs +1.29%
For loop (Parser) 19.4±0.09µs 19.4±0.08µs 0.00%
Mini js (Compiler) 4.7±0.04µs 4.6±0.01µs -2.13%
Mini js (Execution) 568.1±2.95µs 569.3±3.71µs +0.21%
Mini js (Parser) 34.7±0.07µs 34.5±0.09µs -0.58%
Number Object Access (Compiler) 1110.1±3.27ns 1107.8±2.53ns -0.21%
Number Object Access (Execution) 3.2±0.01µs 3.2±0.01µs 0.00%
Number Object Access (Parser) 14.9±0.08µs 14.9±0.03µs 0.00%
Object Creation (Compiler) 1803.0±5.39ns 1752.0±9.13ns -2.83%
Object Creation (Execution) 4.6±0.01µs 4.5±0.01µs -2.17%
Object Creation (Parser) 12.8±0.08µs 12.8±0.08µs 0.00%
RegExp (Compiler) 2.1±0.04µs 1987.7±6.23ns -5.35%
RegExp (Execution) 12.4±0.10µs 12.4±0.02µs 0.00%
RegExp (Parser) 14.0±0.17µs 13.8±0.05µs -1.43%
RegExp Creation (Compiler) 1906.5±46.09ns 1746.0±1.88ns -8.42%
RegExp Creation (Execution) 8.9±0.02µs 8.9±0.02µs 0.00%
RegExp Creation (Parser) 11.7±0.06µs 11.5±0.06µs -1.71%
RegExp Literal (Compiler) 2.1±0.05µs 1990.6±4.88ns -5.21%
RegExp Literal (Execution) 12.5±0.17µs 12.4±0.02µs -0.80%
RegExp Literal (Parser) 15.0±0.08µs 14.9±0.05µs -0.67%
RegExp Literal Creation (Compiler) 1909.4±41.87ns 1745.2±1.94ns -8.60%
RegExp Literal Creation (Execution) 8.9±0.02µs 8.9±0.02µs 0.00%
RegExp Literal Creation (Parser) 12.7±0.06µs 12.6±0.04µs -0.79%
Static Object Property Access (Compiler) 1831.3±3.67ns 1770.7±2.83ns -3.31%
Static Object Property Access (Execution) 4.7±0.02µs 4.7±0.01µs 0.00%
Static Object Property Access (Parser) 13.8±0.05µs 13.7±0.04µs -0.72%
String Object Access (Compiler) 1573.3±2.81ns 1540.3±4.68ns -2.10%
String Object Access (Execution) 5.8±0.01µs 5.9±0.01µs +1.72%
String Object Access (Parser) 18.9±0.09µs 18.9±0.07µs 0.00%
String comparison (Compiler) 2.6±0.01µs 2.6±0.00µs 0.00%
String comparison (Execution) 4.0±0.01µs 4.0±0.01µs 0.00%
String comparison (Parser) 15.9±0.13µs 15.5±0.15µs -2.52%
String concatenation (Compiler) 2.0±0.00µs 2.0±0.00µs 0.00%
String concatenation (Execution) 3.8±0.01µs 3.8±0.01µs 0.00%
String concatenation (Parser) 10.8±0.08µs 10.6±0.04µs -1.85%
String copy (Compiler) 1674.2±5.14ns 1641.5±3.39ns -1.95%
String copy (Execution) 3.6±0.02µs 3.7±0.01µs +2.78%
String copy (Parser) 8.3±0.07µs 8.1±0.03µs -2.41%
Symbols (Compiler) 1305.5±38.05ns 1211.6±4.21ns -7.19%
Symbols (Execution) 3.7±0.01µs 3.7±0.01µs 0.00%
Symbols (Parser) 6.4±0.04µs 6.4±0.04µs 0.00%

@github-actions
Copy link

Benchmark for 8d1a5b9

Click to view benchmark
Test Base PR %
Arithmetic operations (Compiler) 690.4±31.66ns 744.8±42.09ns +7.88%
Arithmetic operations (Execution) 553.1±57.38ns 567.2±37.35ns +2.55%
Arithmetic operations (Parser) 8.4±0.42µs 8.6±0.54µs +2.38%
Array access (Compiler) 2.2±0.23µs 2.2±0.14µs 0.00%
Array access (Execution) 9.2±0.46µs 9.5±0.46µs +3.26%
Array access (Parser) 18.6±1.48µs 18.6±0.85µs 0.00%
Array creation (Compiler) 3.2±0.19µs 3.5±0.70µs +9.37%
Array creation (Execution) 1221.5±60.14µs 1302.6±270.00µs +6.64%
Array creation (Parser) 22.6±0.80µs 22.5±0.95µs -0.44%
Array pop (Compiler) 6.1±0.93µs 6.0±0.54µs -1.64%
Array pop (Execution) 702.3±37.99µs 739.8±35.38µs +5.34%
Array pop (Parser) 201.2±18.31µs 206.2±17.37µs +2.49%
Boolean Object Access (Compiler) 1493.8±90.73ns 1550.2±79.65ns +3.78%
Boolean Object Access (Execution) 5.3±0.25µs 5.4±0.38µs +1.89%
Boolean Object Access (Parser) 22.7±0.98µs 22.1±0.91µs -2.64%
Clean js (Compiler) 6.3±0.27µs 6.6±0.36µs +4.76%
Clean js (Execution) 748.3±43.36µs 769.7±26.26µs +2.86%
Clean js (Parser) 47.8±2.76µs 48.8±4.96µs +2.09%
Create Realm 564.6±28.68µs 555.2±34.41µs -1.66%
Dynamic Object Property Access (Compiler) 2.5±0.11µs 2.7±0.23µs +8.00%
Dynamic Object Property Access (Execution) 6.1±0.63µs 6.2±0.25µs +1.64%
Dynamic Object Property Access (Parser) 17.1±1.34µs 17.0±1.54µs -0.58%
Fibonacci (Compiler) 3.9±0.18µs 4.0±0.22µs +2.56%
Fibonacci (Execution) 1399.6±83.19µs 1369.2±59.46µs -2.17%
Fibonacci (Parser) 27.3±1.62µs 26.5±1.57µs -2.93%
For loop (Compiler) 3.8±0.28µs 3.6±0.20µs -5.26%
For loop (Execution) 21.0±1.02µs 20.4±0.86µs -2.86%
For loop (Parser) 23.7±1.69µs 22.0±1.43µs -7.17%
Mini js (Compiler) 5.5±0.35µs 5.8±0.34µs +5.45%
Mini js (Execution) 658.7±25.77µs 694.9±40.48µs +5.50%
Mini js (Parser) 41.9±3.83µs 43.2±4.98µs +3.10%
Number Object Access (Compiler) 1426.2±104.76ns 1420.1±72.89ns -0.43%
Number Object Access (Execution) 4.1±0.60µs 4.2±0.28µs +2.44%
Number Object Access (Parser) 17.5±0.88µs 17.0±0.95µs -2.86%
Object Creation (Compiler) 2.3±0.15µs 2.4±0.15µs +4.35%
Object Creation (Execution) 5.8±0.21µs 5.8±0.37µs 0.00%
Object Creation (Parser) 15.1±1.14µs 14.0±0.84µs -7.28%
RegExp (Compiler) 2.6±0.26µs 2.6±0.20µs 0.00%
RegExp (Execution) 17.6±0.95µs 17.7±0.84µs +0.57%
RegExp (Parser) 16.7±1.26µs 16.6±4.92µs -0.60%
RegExp Creation (Compiler) 2.4±0.17µs 2.3±0.24µs -4.17%
RegExp Creation (Execution) 11.6±0.91µs 11.6±0.92µs 0.00%
RegExp Creation (Parser) 14.1±0.91µs 14.0±2.31µs -0.71%
RegExp Literal (Compiler) 2.6±0.15µs 2.6±0.17µs 0.00%
RegExp Literal (Execution) 17.7±1.46µs 18.1±2.12µs +2.26%
RegExp Literal (Parser) 18.4±2.26µs 18.1±0.98µs -1.63%
RegExp Literal Creation (Compiler) 2.3±0.39µs 2.3±0.13µs 0.00%
RegExp Literal Creation (Execution) 11.4±0.94µs 11.3±0.69µs -0.88%
RegExp Literal Creation (Parser) 15.1±0.68µs 15.3±1.07µs +1.32%
Static Object Property Access (Compiler) 2.3±0.11µs 2.4±0.11µs +4.35%
Static Object Property Access (Execution) 6.0±0.33µs 6.1±0.50µs +1.67%
Static Object Property Access (Parser) 16.3±1.62µs 15.5±0.95µs -4.91%
String Object Access (Compiler) 1922.5±105.49ns 1910.9±82.88ns -0.60%
String Object Access (Execution) 7.2±0.52µs 7.5±0.41µs +4.17%
String Object Access (Parser) 21.8±0.92µs 22.3±1.73µs +2.29%
String comparison (Compiler) 3.2±0.17µs 3.3±0.19µs +3.12%
String comparison (Execution) 5.3±0.56µs 5.3±0.25µs 0.00%
String comparison (Parser) 18.3±1.22µs 18.2±1.11µs -0.55%
String concatenation (Compiler) 2.7±0.30µs 2.7±0.42µs 0.00%
String concatenation (Execution) 4.9±0.35µs 5.1±0.31µs +4.08%
String concatenation (Parser) 12.5±0.72µs 11.9±1.04µs -4.80%
String copy (Compiler) 2.2±0.24µs 2.1±0.16µs -4.55%
String copy (Execution) 4.8±0.39µs 4.7±0.26µs -2.08%
String copy (Parser) 9.3±0.68µs 9.3±1.14µs 0.00%
Symbols (Compiler) 1631.5±139.21ns 1653.0±117.97ns +1.32%
Symbols (Execution) 4.6±0.30µs 4.8±0.49µs +4.35%
Symbols (Parser) 7.2±0.37µs 7.2±0.87µs 0.00%

Copy link
Member

@Razican Razican left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

Copy link
Member

@jedel1043 jedel1043 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@jedel1043
Copy link
Member

bors r+

bors bot pushed a commit that referenced this pull request Mar 17, 2023
As discussed in this comment #2669 (comment),
`rustc` doesn't seem to optimize out the bounds checks.
@bors
Copy link

bors bot commented Mar 17, 2023

Pull request successfully merged into main.

Build succeeded:

@bors bors bot changed the title Avoid unneeded bounds checks in bytecode address patching [Merged by Bors] - Avoid unneeded bounds checks in bytecode address patching Mar 17, 2023
@bors bors bot closed this Mar 17, 2023
@bors bors bot deleted the optimize/bytecompiler-address-patching branch March 17, 2023 01:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Performance related changes and issues run-benchmark Label used to run banchmarks on PRs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants