fix(utils): make Markdown link replacement much more rigorous #8927
LGTM if tests are passing 👍
Left some minor review comments
My thoughts recently: we should process the MD AST with a remark plugin to swap links instead of using regexes to preprocess the string content of the doc. It would be cleaner, simpler to handle only valid markdown links, and faster to run.
That's a much larger refactoring so I'm fine merging this for now.
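To make the remark-plugin idea above concrete, here is a minimal sketch of what an AST-based link swap could look like. It is not the project's implementation: the `MdNode` shape, the `remarkLinkify` name, and the `permalinks` lookup table are assumptions made for illustration, and the tree walk is written by hand so the snippet has no dependencies. Real path resolution (relative paths, versioned docs) is left out.

```typescript
// Minimal mdast-like node shape, declared inline so the sketch is self-contained.
interface MdNode {
  type: string;
  url?: string;
  value?: string;
  children?: MdNode[];
}

// Hypothetical lookup table: source file path -> published permalink.
type PermalinkMap = Record<string, string>;

// Returns a transformer in the `(tree) => void` shape that remark plugins use.
function remarkLinkify(permalinks: PermalinkMap) {
  return (tree: MdNode): void => {
    const visit = (node: MdNode): void => {
      // Only real link nodes carry a `url`; code blocks and inline code do not.
      if (
        (node.type === 'link' || node.type === 'definition') &&
        node.url !== undefined
      ) {
        // Split off any ?query or #hash so only the file path is looked up.
        const match = /^([^?#]*)(.*)$/.exec(node.url);
        const filePath = match?.[1] ?? '';
        const suffix = match?.[2] ?? '';
        const target = permalinks[filePath];
        if (/\.mdx?$/.test(filePath) && target !== undefined) {
          node.url = target + suffix;
        }
      }
      (node.children ?? []).forEach(visit);
    };
    visit(tree);
  };
}

// Usage: the link node is rewritten, the code node is left alone.
const exampleTree: MdNode = {
  type: 'root',
  children: [
    {type: 'link', url: './intro.md#setup', children: [{type: 'text', value: 'intro'}]},
    {type: 'code', value: '[intro](./intro.md)'},
  ],
};
remarkLinkify({'./intro.md': '/docs/intro'})(exampleTree);
```

Because the parser has already decided what is and is not a link, the plugin never has to guess whether a bracket sequence inside a code block or HTML comment should be rewritten.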
const linkTitlePattern = '(?:\\s+(?:\'.*?\'|".*?"|\\(.*?\\)))?';
const linkSuffixPattern = '(?:\\?[^#>\\s]+)?(?:#[^>\\s]+)?';
const linkCapture = (forbidden: string) =>
  `((?!https?://|@site/)[^${forbidden}#?]+)`;
const linkURLPattern = `(?:${linkCapture(
  '()\\s',
)}${linkSuffixPattern}|<${linkCapture('>')}${linkSuffixPattern}>)`;
const linkPattern = new RegExp(
  `\\[(?:(?!\\]\\().)*\\]\\(\\s*${linkURLPattern}${linkTitlePattern}\\s*\\)|^\\s*\\[[^[\\]]*[^[\\]\\s][^[\\]]*\\]:\\s*${linkURLPattern}${linkTitlePattern}$`,
  'dgm',
);
let mdMatch = linkPattern.exec(modifiedLine);
while (mdMatch !== null) {
  // Replace it to correct html link.
- const mdLink = mdMatch.groups!.filename!;
+ const mdLink = mdMatch.slice(1, 5).find(Boolean)!;
+ const mdLinkRange = mdMatch.indices!.slice(1, 5).find(Boolean)!;
  if (!/\.mdx?$/.test(mdLink)) {
    mdMatch = linkPattern.exec(modifiedLine);
    continue;
  }
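The `slice(1, 5).find(Boolean)` calls in the diff above pick whichever of the four alternative capture groups actually matched, and the `d` flag exposes that group's start and end offsets so only the URL is replaced. The following is a reduced sketch of that technique, not the PR's code: `simpleLinkPattern` is a much simpler stand-in pattern with two capture groups (bare URL and `<angle-bracketed>` URL), and `replaceMdLinks` and `permalinks` are hypothetical names.

```typescript
// Simplified stand-in for the PR's pattern: group 1 is a bare URL,
// group 2 is a URL wrapped in <angle brackets> (which may contain spaces).
const simpleLinkPattern = new RegExp(
  '\\[[^\\]]*\\]\\((?:([^()\\s<>#?]+)(?:#[^\\s)]*)?|<([^>#?]+)(?:#[^>]*)?>)\\)',
  'dg', // d: record start/end offsets of each group; g: find every link
);

// `permalinks` is a hypothetical lookup from source path to published URL.
function replaceMdLinks(
  line: string,
  permalinks: Record<string, string>,
): string {
  simpleLinkPattern.lastIndex = 0;
  let out = '';
  let cursor = 0;
  let match: RegExpExecArray | null;
  while ((match = simpleLinkPattern.exec(line)) !== null) {
    // Read `indices` through a cast so the sketch does not depend on
    // ES2022 lib typings being enabled in the compiler configuration.
    const {indices} = match as unknown as {
      indices: Array<[number, number] | undefined>;
    };
    // Exactly one of the alternative groups is defined for a given match.
    const url = match.slice(1, 3).find(Boolean);
    const range = indices.slice(1, 3).find(Boolean);
    if (!url || !range || !/\.mdx?$/.test(url)) {
      continue;
    }
    const target = permalinks[url];
    if (target === undefined) {
      continue;
    }
    // Copy everything up to the URL, then the replacement; the link text,
    // hash, and brackets around the URL are preserved untouched.
    out += line.slice(cursor, range[0]) + target;
    cursor = range[1];
  }
  return out + line.slice(cursor);
}

const rewritten = replaceMdLinks(
  'See [intro](./intro.md#setup) and [spaced](<./my doc.md>).',
  {'./intro.md': '/docs/intro', './my doc.md': '/docs/my-doc'},
);
console.log(rewritten);
// prints "See [intro](/docs/intro#setup) and [spaced](</docs/my-doc>)."
```

This sketch builds a new string from the unchanged input instead of mutating the line being scanned, which avoids having to adjust `lastIndex` when the replacement is shorter or longer than the original URL.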
That would be solved by #6370
Maybe, will have to study this more in-depth 😄

What I was thinking is more something like replacing:

    export default function markdownLoader(
      this: LoaderContext<DocsMarkdownOption>,
      source: string,
    ): void {
      const fileString = source;
      const callback = this.async();
      const options = this.getOptions();
      return callback(null, linkify(fileString, this.resourcePath, options));
    }

By:

    export default function markdownLoader(
      this: LoaderContext<DocsMarkdownOption>,
      source: string,
    ): void {
      const fileString = source;
      const callback = this.async();
      const options = this.getOptions();
      const newOptions = {
        ...options,
        beforeRemarkPlugins: [...options.beforeRemarkPlugins, remarkLinkify],
      };
      return callback(null, fileString, this.resourcePath, newOptions);
    }

(Not even sure it's a good idea, because mdx-loader has caching, and creating a new options object might lead to weird things. Maybe better to handle it in the plugin index.ts.)
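One way to read the "handle it in the plugin index.ts" suggestion is to build the extended options once at setup time and reuse that same object, rather than spreading a new one on every loader call. The sketch below only illustrates that idea: `MarkdownOptions`, `withRemarkPlugin`, and the placeholder `remarkLinkifyPlugin` are hypothetical names, and whether mdx-loader's caching actually depends on options identity is an assumption, not something stated in this thread.

```typescript
// Hypothetical minimal shapes; the real option types are much larger.
type RemarkPlugin = (...args: unknown[]) => unknown;
interface MarkdownOptions {
  beforeRemarkPlugins: RemarkPlugin[];
}

// Returns options that include `plugin`, without mutating the input.
// If the plugin is already registered, the same object is returned.
function withRemarkPlugin(
  options: MarkdownOptions,
  plugin: RemarkPlugin,
): MarkdownOptions {
  if (options.beforeRemarkPlugins.indexOf(plugin) !== -1) {
    return options;
  }
  return {
    ...options,
    beforeRemarkPlugins: [...options.beforeRemarkPlugins, plugin],
  };
}

// Placeholder standing in for the link-swapping remark plugin.
const remarkLinkifyPlugin: RemarkPlugin = () => undefined;

// Computed once during plugin setup, then passed to every loader rule.
const baseOptions: MarkdownOptions = {beforeRemarkPlugins: []};
const loaderOptions = withRemarkPlugin(baseOptions, remarkLinkifyPlugin);
```

With this shape, every file processed by the loader sees the same `loaderOptions` reference, so anything that memoizes on the options object is not invalidated per file.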
Pre-flight checklist
Motivation
This makes our Markdown parsing much more in line with how it's actually handled at runtime.
Test Plan
Test links
Deploy preview: https://deploy-preview-_____--docusaurus-2.netlify.app/
Related issues/PRs