Heuristic #1 is very restrictive and results in false negatives #49

raghavgarg1257 · 2021-07-19T18:40:23Z

Context: Heuristic #1 (star height > 1) dictates that there should be no repetition inside a repetition.

Issue: The regex in question is /abcd(-[0-9a-z]{10,20}){2}/. It has a repetition inside a repetition, but it is not a vulnerable pattern because both quantifiers are bounded.
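To illustrate why Heuristic #1 flags this regex, here is a minimal sketch of a star-height count. It is not the project's implementation: `star_height` is a hypothetical helper, it handles only a simplified syntax (literals, escapes, character classes, groups, and the quantifiers `*`, `+`, `?`, `{n}`, `{n,}`, `{n,m}`), it expects a well-formed pattern, and it assumes that every quantifier, bounded or not, counts toward the height.

```python
import re

# Quantifiers recognised by this sketch, with an optional lazy '?' suffix.
QUANTIFIER = re.compile(r"(?:[*+?]|\{\d+(?:,\d*)?\})\??")


def star_height(pattern: str) -> int:
    """Return the deepest nesting of quantifiers in a well-formed pattern."""
    group_max = [0]  # highest star height seen inside each open group
    atom = 0         # star height of the atom parsed most recently
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        quant = QUANTIFIER.match(pattern, i)
        if ch == "\\":                      # escaped character: skip both chars
            atom, i = 0, i + 2
        elif ch == "[":                     # character class: skip to closing ]
            i += 1
            while i < len(pattern) and pattern[i] != "]":
                i += 2 if pattern[i] == "\\" else 1
            atom, i = 0, i + 1
        elif ch == "(":                     # open a group
            group_max.append(0)
            i += 1
            if pattern.startswith(("?:", "?=", "?!"), i):
                i += 2                      # skip the group prefix
        elif ch == ")":                     # closed group becomes the current atom
            atom = group_max.pop()
            group_max[-1] = max(group_max[-1], atom)
            i += 1
        elif quant:                         # quantifier wraps the current atom
            atom += 1
            group_max[-1] = max(group_max[-1], atom)
            i = quant.end()
        else:                               # literal, '.', '|', anchors
            atom, i = 0, i + 1
    return group_max[0]


print(star_height("abcd(-[0-9a-z]{10,20}){2}"))  # 2
print(star_height("(a+)+"))                      # 2
```

Both patterns have star height 2, so a check that looks only at nesting depth cannot tell the bounded case apart from the classic `(a+)+` case.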

Probable Improvements:

Make Heuristic #1 configurable like Heuristic #2, which takes options from the user, and compare the star height against the configured limit.
Take range quantifiers into account as a factor when deciding whether a pattern is vulnerable.

Please share your thoughts on the above improvements and their feasibility. I will be happy to raise a PR for it. :)


raghavgarg1257 · 2021-07-30T06:21:41Z

@davisjam Can you please share your views on this?

davisjam · 2021-07-30T20:31:05Z

Hi @raghavgarg1257. Thanks for your interest!

For a sound approach, see #17. I have some starter code I can share for this.

For a simpler improvement to the heuristic, I would be happy to give feedback on an approach that:

Preserves the existing behavior.
Can be configured to treat as safe any nested bounded repetition for which the total amount of ambiguity is "not too big", measured using the product of the size of the ranges involved. For example, ((a{5,10}){5,50}){0,12} would have a total amount of ambiguity of (10-5)(50-5)(12-0) = 2700.
Can be configured to treat as safe any nested bounded repetition for which each part has an upper bound -- as in your example. This would be a special case of the more general bounding strategy described in the previous bullet point.
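The three points above can be sketched as follows. This is only an illustration under stated assumptions: the names `ambiguity`, `is_safe`, `max_ambiguity`, and `allow_bounded` are hypothetical and not part of the project's API, and the nested quantifiers are passed in as a plain list of `(min, max)` pairs (outermost first) rather than being parsed from a regex. An unbounded quantifier such as `*` or `{3,}` is written with `max=None`.

```python
import math
from typing import Optional, Sequence, Tuple

# (min, max) of one quantifier; max=None means "no upper bound".
Bound = Tuple[int, Optional[int]]


def ambiguity(bounds: Sequence[Bound]) -> float:
    """Product of the range sizes (max - min) of the nested quantifiers."""
    total = 1
    for lo, hi in bounds:
        if hi is None:
            return math.inf  # any unbounded level makes the product unbounded
        total *= hi - lo
    return total


def is_safe(
    bounds: Sequence[Bound],
    max_ambiguity: Optional[float] = None,
    allow_bounded: bool = False,
) -> bool:
    """Decide whether one chain of nested quantifiers is acceptable."""
    if len(bounds) <= 1:
        return True  # star height <= 1 passes the existing heuristic
    if allow_bounded and all(hi is not None for _, hi in bounds):
        return True  # every level has an upper bound
    if max_ambiguity is not None:
        return ambiguity(bounds) <= max_ambiguity
    return False     # default: existing behaviour, nesting is rejected


# ((a{5,10}){5,50}){0,12}, outermost quantifier first
nested = [(0, 12), (5, 50), (5, 10)]
print(ambiguity(nested))                    # 2700
print(is_safe(nested))                      # False
print(is_safe(nested, max_ambiguity=5000))  # True

# (-[0-9a-z]{10,20}){2} from the original report
reported = [(2, 2), (10, 20)]
print(ambiguity(reported))                  # 0
print(is_safe(reported, allow_bounded=True))  # True
```

With both options left at their defaults the sketch rejects any nesting, which keeps the existing behaviour. Note that under the product-of-ranges formula a fixed count such as `{2}` contributes a factor of 0.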

Please @ me in any replies or PRs.

raghavgarg1257 · 2021-08-02T19:11:11Z

Hello @davisjam, Thanks for sharing your thoughts. :)

I will share an approach that keeps the above points in mind and will try to create a PoC for it. Before that, I have a couple of questions.

How can we define "not too big ambiguity", in your opinion? I mean coming up with a static upper limit might not be the best thing to do, right?
#17 (Readme) is a PR that updates the README. I guess you meant to link something else.

davisjam · 2021-08-04T14:11:22Z

Not too big ambiguity

Well, you can put an upper bound on the ambiguity (possibly a very loose upper bound) as I described above. Then the user can provide their desired bound. A little desktop benchmarking can be used to find a "recommended" value, e.g. 50 or 100.

#17

Oops, I meant #27.
