fix: punctuation regex syntax compatibility #3229

Closed
musicq wants to merge 2 commits

@musicq musicq commented Mar 15, 2024

Marked version: 12.0.1

Markdown flavor: Markdown.pl|CommonMark|GitHub Flavored Markdown|n/a

Description

  • Fix punctuation regex syntax compatibility

In some runtime environments, the Unicode property escape syntax (\p{P}, \p{S}) used by the punctuation regex is not supported. Here's an example from the Lark app.

image

Expectation

Marked should work fine in most cases.

Result

Host environment reports error due to unsupported regex syntax.

What was attempted

Nothing can be executed correctly.

Contributor

  • Test(s) exist to ensure functionality and minimize regression (if no tests added, list tests covering this PR); or,
  • no tests required for this PR.
  • If submitting new feature, it has been documented in the appropriate places.

Committer

In most cases, this should be a different person than the contributor.

vercel bot commented Mar 15, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
marked-website ✅ Ready (Inspect) Visit Preview 💬 Add feedback Mar 18, 2024 1:16am

src/rules.ts Outdated
@@ -170,7 +170,8 @@ const br = /^( {2,}|\\)\n(?!\s*$)/;
const inlineText = /^(`+|[^`])(?:(?= {2,}\n)|[\s\S]*?(?:(?=[\\<!\[`*_]|\b_|$)|[^ ](?= {2,}\n)))/;

// list of unicode punctuation marks, plus any missing characters from CommonMark spec
- const _punctuation = '\\p{P}\\p{S}';
+ // fallback to explicit punctuation list if runtime environment doesn't support punctuation regex syntax
+ const _punctuation = isSupportPunctuationRegex ? '\\p{P}\\p{S}' : '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~';
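
The diff relies on a flag named isSupportPunctuationRegex, but this excerpt does not show how it is computed. A minimal sketch of one possible feature check follows; the helper name supportsUnicodePropertyEscapes is hypothetical and not taken from the PR. It assumes that an unsupported pattern makes the RegExp constructor throw, which is how runtimes without Unicode property escapes typically behave under the u flag.

```typescript
// Hypothetical helper (not from the PR): reports whether this runtime
// accepts Unicode property escapes such as \p{P} and \p{S}.
function supportsUnicodePropertyEscapes(): boolean {
  try {
    // Use the RegExp constructor rather than a literal so that an
    // unsupported pattern throws at run time instead of breaking the
    // parse of the whole file.
    new RegExp('[\\p{P}\\p{S}]', 'u');
    return true;
  } catch (_err) {
    return false;
  }
}

// Assumed derivation of the flag referenced in the diff.
const isSupportPunctuationRegex = supportsUnicodePropertyEscapes();

// Same selection as in the diff: Unicode classes when available,
// otherwise the explicit ASCII punctuation list.
const _punctuation = isSupportPunctuationRegex
  ? '\\p{P}\\p{S}'
  : '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~';
```

Keeping the pattern in a string matters here: a regex literal containing \p{...} would be rejected when the file is parsed, before any check could run.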
Member

These two regular expressions are very different, which means markdown will render differently in different browsers.

You should be able to just change the string to a regular expression. Then Babel can transpile it correctly.

- const _punctuation = '\\p{P}\\p{S}';
+ const _punctuation = /\p{P}\p{S}/u;
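
To make the rendering concern concrete, here is a small sketch that compares the two punctuation sources from the diff as character classes. The variable names and sample characters are chosen for illustration only, and the snippet assumes a runtime that supports Unicode property escapes.

```typescript
// The two candidate sources from the diff.
const unicodeSource = '\\p{P}\\p{S}';
const asciiSource = '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~';

// Wrap each source in a character class that must match a whole string.
const unicodeClass = new RegExp('^[' + unicodeSource + ']$', 'u');
const asciiClass = new RegExp('^[' + asciiSource + ']$', 'u');

// Sample characters: ASCII '!', ideographic full stop (U+3002),
// and the euro sign (U+20AC).
const samples = ['!', '\u3002', '\u20AC'];

// Collect the characters on which the two classes disagree.
const disagreements = samples.filter(
  (ch) => unicodeClass.test(ch) !== asciiClass.test(ch)
);

console.log(disagreements); // the two non-ASCII samples
```

Only '!' is treated as punctuation by both classes. The ideographic full stop and the euro sign count as punctuation only with the Unicode version, so delimiter handling next to such characters would depend on which branch the runtime takes.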

"@arethetypeswrong/cli": "^0.15.1",
"@babel/plugin-transform-unicode-regex": "^7.23.3",
"@markedjs/testutils": "12.0.0-0",
"@rollup/plugin-babel": "^6.0.4",
Member

We don't want to use Babel for the output. This will make marked a lot slower and a lot bigger. If users need to support older browsers, they can use Babel themselves.

Author

Would it be better to inline the transformed regex and use it as a fallback when the user's browser is too old?

Author

Actually, I think Babel is only used here to transform the Unicode regex; it won't do anything else 🤔

.getRegex();
const emStrongLDelim = /^(?:\*+(?:((?!\*)[\p{P}\p{S}])|[^\s*]))|^_+(?:((?!_)[\p{P}\p{S}])|([^\s_]))/u;

const emStrongRDelimAst = /^[^_*]*?__[^_*]*?\*[^_*]*?(?=__)|[^*]+(?=[^*])|(?!\*)[\p{P}\p{S}](\*+)(?=[\s]|$)|[^\p{P}\p{S}\s](\*+)(?!\*)(?=[\p{P}\p{S}\s]|$)|(?!\*)[\p{P}\p{S}\s](\*+)(?=[^\p{P}\p{S}\s])|[\s](\*+)(?!\*)(?=[\p{P}\p{S}])|(?!\*)[\p{P}\p{S}](\*+)(?!\*)(?=[\p{P}\p{S}])|[^\p{P}\p{S}\s](\*+)(?=[^\p{P}\p{S}\s])/gu;
Member

I don't think you need to change this regex, since Babel will replace the regex in _punctuation with a longer regex, and then replace will use that regex for punct.
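
The comment above refers to a placeholder named punct being filled in from _punctuation. A minimal sketch of that kind of substitution follows; fillPunct is a hypothetical stand-in written for illustration, not marked's actual helper, and the template is a toy pattern rather than one of the real rules.

```typescript
// Hypothetical stand-in for a template helper: replaces every `punct`
// placeholder in a pattern string with the source of another regex.
function fillPunct(template: string, punct: RegExp | string, flags: string): RegExp {
  const source = typeof punct === 'string' ? punct : punct.source;
  // A replacer function avoids special handling of `$` sequences
  // that a plain replacement string would trigger.
  return new RegExp(template.replace(/punct/g, () => source), flags);
}

// The punctuation regex as suggested in the review, built through the
// constructor so this sketch does not depend on literal syntax support.
const punctuation = new RegExp('\\p{P}\\p{S}', 'u');

// Toy template: a run of `*` followed by one punctuation character.
const delim = fillPunct('^\\*+[punct]', punctuation, 'u');

console.log(delim.source);
```

Because the placeholder is filled from the source of the punctuation regex, whatever replaces that one definition at build time is what the templated patterns end up using.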

Member

Hmm. Then it seems to me that it is not worth the extra work to support browsers that are more than 50 versions behind.

Author

Then I'll just close this PR. Thanks
