Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parsing ‘a***a***a***a***…’ takes quadratic time #63

Closed
andersk opened this issue Mar 10, 2019 · 4 comments
Closed

Parsing ‘a***a***a***a***…’ takes quadratic time #63

andersk opened this issue Mar 10, 2019 · 4 comments
Labels

Comments

@andersk
Copy link

andersk commented Mar 10, 2019

$ python -c 'print("a***" * 20000)' | time md2html > /dev/null
0.92user 0.00system 0:00.94elapsed 97%CPU (0avgtext+0avgdata 3080maxresident)k
0inputs+0outputs (0major+463minor)pagefaults 0swaps
$ python -c 'print("a***" * 40000)' | time md2html > /dev/null
3.83user 0.00system 0:03.87elapsed 98%CPU (0avgtext+0avgdata 4240maxresident)k
0inputs+0outputs (0major+775minor)pagefaults 0swaps
$ python -c 'print("a***" * 80000)' | time md2html > /dev/null
22.21user 0.02system 0:22.46elapsed 98%CPU (0avgtext+0avgdata 7028maxresident)k
0inputs+0outputs (0major+1441minor)pagefaults 0swaps
@mity mity added the bug label Mar 11, 2019
@mity
Copy link
Owner

mity commented Mar 11, 2019

@jgm Maybe you can thank me for commonmark/cmark#284 by explaining how is it possible that Cmark is not vulnerable to this issue? ;-)

MD4C walks the dangling openers for each *** but all of them fail because of the rule of three. So after it all fails, it is added as just another dangling opener, and then we do it all again for next ***.

@jgm
Copy link

jgm commented Mar 11, 2019

@mity in cmark we keep a list of "lower bounds" for openers that might match each possible type of closer (e.g., asterisk span with length mod 3 == 0). So once we've rejected one opening span as a possible opener for a closer consisting of 3 asterisks, we won't need to look at it again. Code is around here.

@mity
Copy link
Owner

mity commented Mar 11, 2019

@jgm I will have to do it a bit differently due to some internal limitations. But it helped a lot, thanks.

@mity mity closed this as completed in 2dd96ab Mar 12, 2019
@mity
Copy link
Owner

mity commented Mar 12, 2019

@andersk Fixed it all, finally. Thank you for the bug hunt. I will eventually release 0.3.1 with all those fixes after some afl-fuzzing session. That is, unless you have something more queued for me? ;-)

ec1oud pushed a commit to ec1oud/md4c that referenced this issue Apr 17, 2019
We had to break the list of potential '*' openers into multiple ones so
we do not have to walk it when looking for matching length due to the
"rule of three" for intraword delimiter runs.

Fixes mity#63.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants