## Version 0.5.1

Changes:

 * LaTeX math extension (`MD_FLAG_LATEXMATHSPANS`) now requires that the opener mark is not immediately preceded by an alphanumeric character and, similarly, that the closer mark is not immediately followed by an alphanumeric character.

   So for example `foo$ x + y = z $` is no longer recognized as a LaTeX equation because there is no space between `foo` and the opening `$`.

 * Table extension (`MD_FLAG_TABLES`) now recognizes only tables with no more than 128 columns. This limit has been imposed to prevent a pathological case of quadratic output size explosion which could be used as a DoS attack vector.

 * We are now stricter with the `MD_FLAG_PERMISSIVExxxAUTOLINKS` family of extensions with respect to non-alphanumeric characters, with the aim of mitigating false positive detections.

   Only a few selected non-alphanumeric characters are now allowed in permissive e-mail auto-links (`MD_FLAG_PERMISSIVEEMAILAUTOLINKS`):
    - `.`, `-`, `_`, `+` in the user name part of the e-mail address; and
    - `.`, `-`, `_` in the host part of the e-mail address.

   Similarly for URL and WWW auto-links (`MD_FLAG_PERMISSIVEURLAUTOLINKS` and `MD_FLAG_PERMISSIVEWWWAUTOLINKS`):
    - `.`, `-`, `_` in the host part of the URL;
    - `/`, `.`, `-`, `_` in the path part of the URL;
    - `&`, `.`, `-`, `+`, `_`, `=`, `(`, `)` in the query part of the URL (additionally, if present, `(` and `)` must form balanced pairs); and
    - `.`, `-`, `+`, `_` in the fragment part of the URL.

   Furthermore, these characters (with some exceptions where they serve as delimiter characters, e.g. `/` for paths) are generally accepted only when an alphanumeric character both precedes and follows them (i.e. they cannot be "stacked" together).
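
The character rules for permissive e-mail auto-links described above can be illustrated with a small stand-alone sketch. This is not MD4C's actual implementation: the helper names `is_strict_component()`, `is_email_user()` and `is_email_host()` are hypothetical, and the sketch assumes the candidate user name and host have already been split out of the surrounding text. It only checks that every allowed non-alphanumeric character sits between two alphanumeric ones.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the MD4C API): returns true if `s` is
 * a non-empty string made of alphanumeric characters and characters from
 * `allowed`, where every allowed mark has an alphanumeric character on
 * both sides. Marks therefore cannot lead, trail, or be "stacked".
 * Note: isalnum() follows the C locale; ASCII input is assumed here. */
static bool
is_strict_component(const char* s, const char* allowed)
{
    size_t i;
    size_t n;

    assert(s != NULL && allowed != NULL);

    n = strlen(s);
    if(n == 0)
        return false;

    for(i = 0; i < n; i++) {
        unsigned char ch = (unsigned char) s[i];

        if(isalnum(ch))
            continue;
        if(strchr(allowed, ch) == NULL)
            return false;               /* character not allowed at all */
        if(i == 0 || i + 1 == n)
            return false;               /* leading or trailing mark */
        if(!isalnum((unsigned char) s[i-1]) || !isalnum((unsigned char) s[i+1]))
            return false;               /* stacked marks */
    }
    return true;
}

/* User name part of an e-mail address: `.`, `-`, `_`, `+` allowed. */
static bool
is_email_user(const char* s)
{
    return is_strict_component(s, ".-_+");
}

/* Host part of an e-mail address: `.`, `-`, `_` allowed. */
static bool
is_email_host(const char* s)
{
    return is_strict_component(s, ".-_");
}
```

Under these assumptions, `john.doe+tag` passes as a user name, while `john..doe` is rejected because the two dots are stacked, and `exa+mple.com` is rejected as a host because `+` is not allowed there.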

Fixes:

 * Fix several bugs where we did not properly respect already resolved spans of a higher precedence level when handling the permissive auto-link extensions (family of `MD_FLAG_PERMISSIVExxxAUTOLINKS` flags), the LaTeX math extension (`MD_FLAG_LATEXMATHSPANS`) and the wiki-links extension (`MD_FLAG_WIKILINKS`) of the form `[[label|text]]` (with pipe `|`). In some complex cases this could lead to invalid internal parser state and memory corruption.

   Identified with [OSS-Fuzz](https://github.com/google/oss-fuzz).

 * [#222](mity/md4c#222): Fix strike-through extension (`MD_FLAG_STRIKETHROUGH`) which did not respect the same rules for pairing opener and closer marks as other emphasis spans.

 * [#223](mity/md4c#223): Fix incorrect handling of a new-line character at the very beginning and/or end of a code span, where we were not following the CommonMark specification requirements correctly.


## Version 0.5.0

Changes:

 * Changes mandated by CommonMark specification 0.30.

   Actually there are only very minor changes to recognition of HTML blocks:

    - The tag `<textarea>` now triggers an HTML block (of type 1 as per the specification).

    - HTML declaration (HTML block type 4) is no longer required to begin with an upper-case ASCII letter after the `<!`; any ASCII letter is now allowed. It also no longer requires whitespace before the closing `>`.

   Other than that, the newest specification mainly improves test coverage and clarifies its wording in some cases, without affecting the implementation.

   Refer to [CommonMark 0.30 notes](https://github.com/commonmark/commonmark-spec/releases/tag/0.30) for more info.

 * Make Unicode-specific code compliant with Unicode 15.1.

 * Update the list of entities known to the HTML renderer from https://html.spec.whatwg.org/entities.json.

New Features:

 * Add an extension that treats all soft breaks as hard ones. It has to be explicitly enabled with `MD_FLAG_HARD_SOFT_BREAKS`.

   Contributed by [l-m](https://github.com/l1mey112).

 * Structure `MD_SPAN_A_DETAIL` now has a new member `is_autolink`.

   Contributed by [Jens Alfke](https://github.com/snej).

 * `md2html` utility now supports command line options `--html-title` and `--html-css`.

   Contributed by [Andreas Baumann](https://github.com/andreasbaumann).

Fixes:

 * [#163](mity/md4c#163): Make the HTML renderer emit `'\n'` after the root tag when in the XHTML mode.

 * [#165](mity/md4c#165): Make the HTML renderer not percent-encode `'~'` in URLs. Although the encoded form does work, it is not needed, and it can actually be confusing with URLs such as `http://www.example.com/~johndoe/`.

 * [#167](mity/md4c#167), [#168](mity/md4c#168): Fix multiple instances of various buffer overflow bugs, found mostly through fuzz testing.

   Contributed by [dtldarek](https://github.com/dtldarek) and [Thierry Coppey](https://github.com/TCKnet).

 * [#169](mity/md4c#169): Table underline no longer requires 3 characters per table column. One dash (optionally with a leading and/or trailing `:`) is now sufficient. This improves compatibility with GFM.

 * [#172](mity/md4c#172): Fix quadratic time behavior caused by an unnecessary lookup for a link reference definition even if the potential label contains nested brackets.

 * [#173](mity/md4c#173), [#174](mity/md4c#174), [#212](mity/md4c#212), [#213](mity/md4c#213): Multiple bugs identified with [OSS-Fuzz](https://github.com/google/oss-fuzz) were fixed.

 * [#190](mity/md4c#190), [#200](mity/md4c#200), [#201](mity/md4c#201): Multiple fixes of incorrect interactions of an indented code block with a preceding block.

 * [#202](mity/md4c#202): We were not correctly calling the `enter_block()` and `leave_block()` callbacks if multiple HTML blocks followed one after another; previously such blocks were merged into one.

   (This likely impacts only applications interested in Markdown's AST, and not those just converting Markdown to other formats like HTML.)

 * [#210](mity/md4c#210): The `md2html` utility now handles nested images with optional titles correctly.

 * [#214](mity/md4c#214): Tags `<h2>` ... `<h6>` incorrectly did not trigger HTML block.

 * [#215](mity/md4c#215): The parser incorrectly did not accept optional tabs after setext header underline.

 * [#217](mity/md4c#217): The parser incorrectly resolved emphasis in some situations, if the emphasis marks were enclosed by punctuation characters.
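
The relaxed table underline rule from #169 above (one dash per column, optionally with a leading and/or trailing `:`) can be sketched as a small stand-alone check. This is only an illustration and not the parser's real code: the function name `is_underline_cell()` is hypothetical, and it assumes the underline row has already been split on `|` and each cell trimmed of surrounding whitespace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of the MD4C API): checks a single,
 * already trimmed cell of a table underline row. A valid cell is one or
 * more dashes, optionally preceded and/or followed by a single colon. */
static bool
is_underline_cell(const char* cell, size_t len)
{
    size_t beg = 0;
    size_t end = len;
    size_t i;

    assert(cell != NULL);

    /* Strip the optional alignment colons. */
    if(beg < end && cell[beg] == ':')
        beg++;
    if(beg < end && cell[end-1] == ':')
        end--;

    /* At least one dash must remain. */
    if(beg >= end)
        return false;

    for(i = beg; i < end; i++) {
        if(cell[i] != '-')
            return false;
    }
    return true;
}
```

With this sketch, cells such as `-`, `:-`, `-:` and `:-:` are all accepted, whereas a lone `:` or an empty cell is not.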