fix: added a call limit setting to prevent crashes or timeouts #684
#675 looks like it shouldn't be closed by this, since it's a regression. Ideally, the optimizer should be able to detect all exponential cases and have replacement rules for them. Maybe someone could take a look at egg and implement a simplified version of the grammar there, so that peephole optimizations can be written more easily?
@dragostis thanks. Yes, I agree that the timeout issue should get a fix for the root cause. The edge case found by the fuzzer seems to be due to the block comment feature introduced in this PR: #332. Given that the investigation and the corresponding proper backwards-compatible fix for timeouts may take time, there are potentially two ways (with regard to this API: #684 (comment)) to keep
I'm leaning towards the first one because, in the meantime, it's practical: it would allow people to use pest on potentially problematic inputs (without worrying that it would eat up all CPU time) if their grammars suffer from this issue.
Packrat parsing was not used because caching data is slow. While it removes the exponential cases, it adds substantial overhead for all other parsing. My solution to this was to identify such patterns and replace them in the optimizer. This has worked for some cases, but I haven't had the time to identify all patterns of interest. Also, I have no proof that all such patterns are actually replaceable with equivalent but non-exponential patterns.

My vision for pest when I started it was to use something like ruler (which didn't exist at the time) to find these rules and use egg to optimize the grammars. Currently, I think this is non-trivial since pest's grammar is in need of a cleanup. I started working on a new meta grammar in pest3 a few years ago, but didn't have a chance to make it work. It has a few interesting ideas, including much better code generation, if you have the time to take over that work.

For the rewrite rules inference, you'd need a simplified version of the grammar, something like the one here, where all grammar would be reduced to something along the lines of this.
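To make the trade-off above concrete, here is a minimal sketch of an exponential PEG pattern and a memoized variant of it. It is not pest code: the toy rule `S <- "a" S "b" / "a" S "c" / ""`, the function names, and the `HashMap` cache are all assumptions chosen for illustration, and the cache is only a simplified stand-in for packrat-style memoization.

```rust
use std::collections::HashMap;

/// Toy rule (hypothetical, not from pest): S <- "a" S "b" / "a" S "c" / ""
/// Returns the end position of the match. The rule always succeeds because
/// of the empty alternative. `evals` counts how often the rule body runs.
pub fn parse_naive(input: &[u8], pos: usize, evals: &mut u64) -> usize {
    *evals += 1;
    if input.get(pos) == Some(&b'a') {
        // First alternative: "a" S "b"
        let end = parse_naive(input, pos + 1, evals);
        if input.get(end) == Some(&b'b') {
            return end + 1;
        }
        // Second alternative: "a" S "c" re-parses S at the same position,
        // which doubles the work at every nesting level.
        let end = parse_naive(input, pos + 1, evals);
        if input.get(end) == Some(&b'c') {
            return end + 1;
        }
    }
    pos // Third alternative: empty match
}

/// Same rule, but the result for each position is cached, so the rule body
/// runs at most once per position (at the cost of a hash lookup per call).
pub fn parse_memo(
    input: &[u8],
    pos: usize,
    cache: &mut HashMap<usize, usize>,
    evals: &mut u64,
) -> usize {
    if let Some(&end) = cache.get(&pos) {
        return end;
    }
    *evals += 1;
    let mut result = pos;
    if input.get(pos) == Some(&b'a') {
        let end = parse_memo(input, pos + 1, cache, evals);
        if input.get(end) == Some(&b'b') {
            result = end + 1;
        } else {
            // This second attempt is answered from the cache.
            let end = parse_memo(input, pos + 1, cache, evals);
            if input.get(end) == Some(&b'c') {
                result = end + 1;
            }
        }
    }
    cache.insert(pos, result);
    result
}

/// Runs both parsers on `n` repetitions of "a" and returns
/// (naive rule evaluations, memoized rule evaluations).
pub fn count_evals(n: usize) -> (u64, u64) {
    let input = vec![b'a'; n];
    let (mut naive, mut memo) = (0u64, 0u64);
    parse_naive(&input, 0, &mut naive);
    parse_memo(&input, 0, &mut HashMap::new(), &mut memo);
    (naive, memo)
}
```

On an input of `n` repeated `a` characters, the naive version evaluates the rule body `2^(n+1) - 1` times, while the memoized version evaluates it `n + 1` times. The sketch does not measure the per-call overhead of the cache, which is the cost the comment above is concerned with.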
@glyn The API was simplified to only track calls as per the discussion: #684 (comment)
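As a rough illustration of what tracking only calls can look like, here is a minimal, self-contained sketch. It is not the PR's implementation: `CallTracker`, `limit_reached`, and `simulate` are hypothetical names, and only the `Option<(current, limit)>` shape of `current_call_limit` is taken from the snippet reviewed further down.

```rust
/// Hypothetical stand-in for the parser state discussed in this PR.
/// `current_call_limit` is `Some((calls_so_far, limit))` when a limit
/// is set and `None` when calls are unlimited.
pub struct CallTracker {
    current_call_limit: Option<(usize, usize)>,
}

impl CallTracker {
    pub fn new(limit: Option<usize>) -> Self {
        Self {
            current_call_limit: limit.map(|l| (0, l)),
        }
    }

    /// Records one rule call; a no-op when no limit is set.
    pub fn increment(&mut self) {
        if let Some((current, _)) = &mut self.current_call_limit {
            *current += 1;
        }
    }

    /// True once the number of recorded calls has reached the limit.
    pub fn limit_reached(&self) -> bool {
        self.current_call_limit
            .map_or(false, |(current, limit)| current >= limit)
    }
}

/// Simulates `work` rule calls. Returns `Ok(work)` if all of them ran,
/// or `Err(done)` with the number of calls completed before the limit hit.
pub fn simulate(limit: Option<usize>, work: usize) -> Result<usize, usize> {
    let mut tracker = CallTracker::new(limit);
    for done in 0..work {
        if tracker.limit_reached() {
            return Err(done);
        }
        tracker.increment();
    }
    Ok(work)
}
```

With a limit of 3 and 10 units of work, `simulate` stops early with `Err(3)`. With no limit, or a limit larger than the work, it returns `Ok(10)`. The point is that a pathological input fails fast instead of consuming unbounded CPU time.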
Please update the docs of set_call_limit.
@glyn fixed, thanks for catching that!
Improvements suggested, but at the author's discretion.
Remove blocking review. Please note I have only reviewed a small part of this PR.
LGTM, thanks.
Left a suggestion, but take it or leave it; this is a good change.
if let Some((current, _)) = &mut self.current_call_limit {
    *current += 1;
}
@tomtau @NoahTheDuke
Just for interest, there's yet another way to code this:
if let Some((ref mut current, _)) = self.current_call_limit {
    *current += 1;
}
This cropped up here in @jonhoo's video Crust of Rust: Lifetime Annotations.
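For anyone who wants to compare the two spellings side by side, here is a small self-contained sketch. The `State` struct and the method names are hypothetical; only the field name and the two `if let` forms come from the snippets above. Both forms borrow the field in place and mutate it, rather than working on a copy.

```rust
/// Hypothetical struct holding the same field shape as in the PR snippet.
pub struct State {
    pub current_call_limit: Option<(usize, usize)>,
}

impl State {
    /// Form used in the PR: match on `&mut field`, so `current` is
    /// bound as `&mut usize` through default binding modes.
    pub fn bump_borrowed(&mut self) {
        if let Some((current, _)) = &mut self.current_call_limit {
            *current += 1;
        }
    }

    /// Alternative form: match on the field itself and ask for a mutable
    /// reference explicitly with `ref mut` inside the pattern.
    pub fn bump_ref_mut(&mut self) {
        if let Some((ref mut current, _)) = self.current_call_limit {
            *current += 1;
        }
    }
}

/// Applies both forms once and returns the resulting field value.
pub fn bump_twice(start: Option<(usize, usize)>) -> Option<(usize, usize)> {
    let mut state = State {
        current_call_limit: start,
    };
    state.bump_borrowed();
    state.bump_ref_mut();
    state.current_call_limit
}
```

Starting from `Some((0, 5))`, applying both methods leaves the field at `Some((2, 5))`, and `None` stays `None` in either form.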
Oh that's cool. I didn't know you could embed those.
closes #674, closes #675