Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Merged by Bors] - Rework RegExp struct to include bitflags field #1837

Closed

Conversation

aaronmunsters
Copy link
Contributor

@aaronmunsters aaronmunsters commented Feb 13, 2022

This Pull Request fixes/closes #1819.

It changes the following:

  • Move the bitflags from boa/src/syntax/lexer/regex.rs to boa/src/builtins/regexp/mod.rs
  • Replace the booleans in the RegExp struct to include the bitflags struct
  • Update match expressions to make use of the bitflags struct

@codecov
Copy link

codecov bot commented Feb 13, 2022

Codecov Report

Merging #1837 (325cb85) into main (be26b10) will decrease coverage by 0.00%.
The diff coverage is 63.88%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1837      +/-   ##
==========================================
- Coverage   55.67%   55.67%   -0.01%     
==========================================
  Files         201      201              
  Lines       17375    17368       -7     
==========================================
- Hits         9674     9670       -4     
+ Misses       7701     7698       -3     
Impacted Files Coverage Δ
boa/src/syntax/lexer/mod.rs 64.95% <ø> (ø)
boa/src/builtins/regexp/mod.rs 64.92% <40.90%> (-0.27%) ⬇️
boa/src/syntax/lexer/regex.rs 57.50% <100.00%> (+2.83%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update be26b10...325cb85. Read the comment docs.

@raskad raskad added this to the v0.14.0 milestone Feb 13, 2022
@raskad raskad added builtins PRs and Issues related to builtins/intrinsics performance Performance related changes and issues labels Feb 13, 2022
@raskad
Copy link
Member

raskad commented Feb 13, 2022

VM implementation

Test result main count PR count difference
Total 87,912 87,912 0
Passed 40,892 40,892 0
Ignored 19,929 19,929 0
Failed 27,091 27,091 0
Panics 0 0 0
Conformance 46.51% 46.51% 0.00%

Copy link
Member

@Razican Razican left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking very good! I added some suggestions :)

let mut dot_all = false;
let mut unicode = false;
let mut sticky = false;
let mut flags = RegExpFlags::default();
for c in f.chars() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would implement FromStr for RegexFlags, and use the version of parse_regex_flags() in the lexer, which iterates through bytes instead of characters, and should therefore be more efficient. Would also reduce duplicate code.

'i' => regexp.ignore_case,
'u' => regexp.unicode,
'y' => regexp.sticky,
'g' => regexp.flags.contains(RegExpFlags::GLOBAL),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we match bytes instead of characters here? Should be more efficient

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my last commit the code matches on bytes, but this could be replaced with the RegExpFlags which would allow for the following reduction in matching code:

fn regexp_has_flag(this: &JsValue, flag: RegExpFlags, context: &mut Context) -> JsResult<JsValue> {
    if let Some(object) = this.as_object() {
        if let Some(regexp) = object.borrow().as_regexp() {
            return Ok(JsValue::new(regexp.flags.contains(flag)));
        }
        // ...
    }
    // ...
} 

Which would be more readable?

const STICKY = 0b0010_0000;
}
}

fn parse_regex_flags(s: &str, start: Position, interner: &mut Interner) -> Result<Sym, Error> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would move this function together with the structure, and change it so that this implements the FromStr trait for the structure.

@RageKnify
Copy link
Member

Don't agree with moving the struct and accompanying functions from its original place.
It works against the idea of having the interpreter/environment depend on other parts of boa, ideally we'll be able to make the lexer and/or parser be separate crates to reduce iterative compile times. As is this PR would introduce dependencies going the other way.
Other than this the PR lgtm

@Razican
Copy link
Member

Razican commented Feb 14, 2022

Don't agree with moving the struct and accompanying functions from its original place.

It works against the idea of having the interpreter/environment depend on other parts of boa, ideally we'll be able to make the lexer and/or parser be separate crates to reduce iterative compile times. As is this PR would introduce dependencies going the other way.

Other than this the PR lgtm

Well, ideally this structure would ideally be part of regress. But I see what you mean. Maybe it should stay in the lexer.

@aaronmunsters
Copy link
Contributor Author

I reworked the PR with the provided feedback. Feel free to leave new suggestions :)

Copy link
Member

@RageKnify RageKnify left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Razican
Copy link
Member

Razican commented Feb 15, 2022

This just needs running cargo fit

Copy link
Member

@RageKnify RageKnify left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bors r+

bors bot pushed a commit that referenced this pull request Feb 16, 2022
This Pull Request fixes/closes #1819.

It changes the following:

- Move the bitflags from `boa/src/syntax/lexer/regex.rs` to `boa/src/builtins/regexp/mod.rs`
- Replace the booleans in the RegExp struct to include the bitflags struct
- Update match expressions to make use of the bitflags struct


Co-authored-by: Aäron Munsters <[email protected]>
@bors
Copy link

bors bot commented Feb 16, 2022

Pull request successfully merged into main.

Build succeeded:

@bors bors bot changed the title Rework RegExp struct to include bitflags field [Merged by Bors] - Rework RegExp struct to include bitflags field Feb 16, 2022
@bors bors bot closed this Feb 16, 2022
@aaronmunsters aaronmunsters deleted the rework_regexp_struct branch February 16, 2022 20:37
@RageKnify RageKnify added the Internal Category for changelog label Mar 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
builtins PRs and Issues related to builtins/intrinsics Internal Category for changelog performance Performance related changes and issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

The RegExp structure has too many booleans
4 participants