Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added ability for mdxJSX elements to not be stripped out, and also ad… #32

Closed
wants to merge 3 commits into from

Conversation

cobbvanth
Copy link

I have added support for mdxJSX TEXT & FLOW elements to not be stripped out if they are added to the schema whitelist, as well as attributes on these elements

Initial checklist

  • [x ] I read the support docs
  • [x ] I read the contributing guide
  • [ x] I agree to follow the code of conduct
  • [ x] I searched issues and discussions and couldn’t find anything or linked relevant results below
  • [ x] I made sure the docs are up to date
  • [ x] I included tests (or that’s not needed)

Description of changes

I have added support for mdxJSX TEXT & FLOW elements to not be stripped out if they are added to the schema whitelist, as well as attributes on these elements.

This was achieved by putting giving the element's the expected 'tagName' property, which was not generated by the creation of JSX MDX elements, therefore as the sanitizer could not detect a tag name, it was automatically omitted and stripped.

The same was done for attributes on these JSX MDX elements by moving the 'attributes' array, to the expected 'properties' object, so that it could be properly matched against the schema whitelist.

In future this could be moved to a separate 'JSXElement' function, but for now I have extended the functionality of the current 'text' and 'element' functions that already existed, as can be seen by returning element switch statement.

…ded the ability for attributes whitelisted on those elements to not be stripped
@github-actions github-actions bot added 👋 phase/new Post is being triaged automatically 🤞 phase/open Post is being triaged manually and removed 👋 phase/new Post is being triaged automatically labels Jan 4, 2025
@cobbvanth
Copy link
Author

Woops, will fix up these type errors ASAP

@wooorm
Copy link
Member

wooorm commented Jan 4, 2025

Hi! MDX is never safe. This project will not do anything to make MDX safe. I cannot see why you want to use this project with MDX.

I have trouble with the premise of this issue first. And then there’s a bunch of stuff right now failing in this PR, with the linter and the types and then the coverage? Perhaps first open a discussion to talk about what you want to do? Thanks

@wooorm wooorm closed this Jan 4, 2025

This comment has been minimized.

@wooorm wooorm added the 🙅 no/wontfix This is not (enough of) an issue for this project label Jan 4, 2025
@github-actions github-actions bot added 👎 phase/no Post cannot or will not be acted on and removed 🤞 phase/open Post is being triaged manually labels Jan 4, 2025
@cobbvanth
Copy link
Author

This plugin is often used to sanitize input from users and its used by the rehype-sanitize plugin for mdx. So it would have been nice to have JSX components supporter, with the whitelist functionality that this plugin provides. But maybe that project is a better place for the contribution, now that I look at it

@wooorm
Copy link
Member

wooorm commented Jan 4, 2025

It is not possible to make MDX safe with this plugin, as I said above. Please, take some more time to research what you want to do and how these things work, and post a discussion

@cobbvanth
Copy link
Author

cobbvanth commented Jan 5, 2025

It was working great! .mdx just gets converted to a HTML ast anyway via rehype. At current, this strips out any AST items with the type of 'mdxJSXElement', regardless. What this pull request added was the ability for these elements to be recognised as normal elements, and be treated by the schema in the same way. I just hadn't gotten around to adding the coverage tests for the new code.

@wooorm
Copy link
Member

wooorm commented Jan 5, 2025

I think you ignore what I say: it cannot work great, this project is about making things safe, MDX is never safe. MDX does not just get converted to HTML by rehype. Do not use this project and MDX works. Please, take the time to understand what I say, and take time to write down what you want. See https://github.com/syntax-tree/.github/blob/main/support.md.

@ChristianMurphy
Copy link
Member

Adding on.
One of the worst security vulnerabilities is Remote Code Execution. RCE means an attacker can run arbitrary code on your system, which can lead to complete compromise. JSX allows JavaScript expressions, which opens the door to RCE if user input is not fully controlled. This is why simply whitelisting MDX elements does not make them safe.

@remcohaszing
Copy link
Member

I imagine maybe you’re using MDX to turn markdown into React components. That’s a valid use case and in that case you may indeed want to sanitize the HTML tags inside markdown, but you shouldn’t treat the content as MDX. If so, you should make to either pass the MDX using a vfile with a .md extension (preferred), or forcefully set the format option to md. Official integrations already do the former.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🙅 no/wontfix This is not (enough of) an issue for this project 👎 phase/no Post cannot or will not be acted on
Development

Successfully merging this pull request may close these issues.

4 participants