Allow wildcards in whitelist attributes #499
Comments
Also wanted to add that I'm willing to contribute code to support this. Before doing so, I just want to make sure this change is acceptable and determine the best way to support it (options 1 or 2 above, or something entirely different). Thanks!
Hi @foo4u. Sorry for the late reply. I like option 2 (because I can't think of another case for which it would be helpful). It would be great if you could write that.
I guess it would be more flexible if you implemented option 1 with regex patterns instead of only wildcards. E.g.: Whitelist.relaxed().addAttributes("a", "data-.*")
OK, handling examples like that makes sense. I'd be OK with either a prefix or a regex matcher. The prefix match seems simple and unlikely to let anyone shoot themselves in the foot.
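To make the prefix-versus-regex trade-off concrete, here is a minimal sketch in plain Java with no jsoup dependency. The class and method names (AttributeMatchSketch, matchesPrefix, matchesRegex) are hypothetical and not part of jsoup's API. It assumes that a trailing * is the only wildcard form supported and that attribute names have already been lower-cased.

```java
import java.util.regex.Pattern;

// Hypothetical helper, NOT part of jsoup: compares prefix matching and regex
// matching of an attribute name against a single whitelist entry.
class AttributeMatchSketch {

    // Treats a trailing '*' as "any suffix"; any other entry must match exactly.
    // A bare "*" (empty prefix) is rejected so it cannot allow every attribute.
    static boolean matchesPrefix(String entry, String attributeName) {
        if (entry.endsWith("*")) {
            String prefix = entry.substring(0, entry.length() - 1);
            return !prefix.isEmpty() && attributeName.startsWith(prefix);
        }
        return entry.equals(attributeName);
    }

    // Treats the entry as a regular expression that must match the whole name.
    static boolean matchesRegex(String entry, String attributeName) {
        return Pattern.matches(entry, attributeName);
    }

    public static void main(String[] args) {
        System.out.println(matchesPrefix("data-*", "data-foo"));      // true
        System.out.println(matchesPrefix("data-*", "onclick"));       // false
        System.out.println(matchesPrefix("*", "onclick"));            // false
        System.out.println(matchesRegex("data-.*", "data-foo-bar"));  // true
        // An overly broad pattern lets event handlers through:
        System.out.println(matchesRegex(".*", "onclick"));            // true
    }
}
```

The regex form is more flexible, but a careless pattern such as .* would allow every attribute, which is the foot-gun the prefix form avoids in this sketch.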
Ok, will try to get a PR for prefix matching sent in a few weeks.
(Closing out old, dormant bugs. If you are still impacted by this, please reopen & vote.)
I am trying out jsoup to validate HTML pages. Works great so far.
Hi @foo4u, have you ever prepared a PR?
I have prepared a PR: #1871
(Reopening as mentioned in earlier close, there is renewed interest here.)
Is there any update on this? It looks like the changes requested in the PR were made back in February. I have been watching these for some months now because we need to keep aria-* attributes from being stripped out.
HTML5 allows the use of data-foo, data-foo-bar, etc. to specify information on elements. These are relatively harmless and should only contain text.
Currently, each data- attribute needs to be specified explicitly on a whitelist so that it's not removed by Jsoup.clean(). Can we add support for either:
Whitelist.relaxed().addAttributes("a", "data-*")
or
Whitelist.relaxed().allowDataAttributes("a")
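For illustration, the allowDataAttributes form could behave roughly like the sketch below. This is a self-contained stand-in written in plain Java, not jsoup's Whitelist class. The class name DataAttributeWhitelistSketch and the isAllowed method are assumptions made for the example, and how jsoup would actually implement this is not specified here.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a whitelist; NOT jsoup's Whitelist class.
class DataAttributeWhitelistSketch {
    private static final String DATA_PREFIX = "data-";

    // Explicitly allowed attribute names, keyed by tag name.
    private final Map<String, Set<String>> attributes = new HashMap<>();
    // Tags on which every data-* attribute is allowed.
    private final Set<String> dataAttributeTags = new HashSet<>();

    DataAttributeWhitelistSketch addAttributes(String tag, String... names) {
        Set<String> allowed = attributes.computeIfAbsent(tag, k -> new HashSet<>());
        for (String name : names) {
            allowed.add(name);
        }
        return this;
    }

    DataAttributeWhitelistSketch allowDataAttributes(String tag) {
        dataAttributeTags.add(tag);
        return this;
    }

    boolean isAllowed(String tag, String attributeName) {
        // A data-* name needs at least one character after the "data-" prefix.
        boolean isDataAttribute = attributeName.startsWith(DATA_PREFIX)
                && attributeName.length() > DATA_PREFIX.length();
        if (isDataAttribute && dataAttributeTags.contains(tag)) {
            return true;
        }
        Set<String> allowed = attributes.get(tag);
        return allowed != null && allowed.contains(attributeName);
    }

    public static void main(String[] args) {
        DataAttributeWhitelistSketch whitelist = new DataAttributeWhitelistSketch()
                .addAttributes("a", "href")
                .allowDataAttributes("a");
        System.out.println(whitelist.isAllowed("a", "data-foo-bar")); // true
        System.out.println(whitelist.isAllowed("a", "onclick"));      // false
        System.out.println(whitelist.isAllowed("div", "data-foo"));   // false
    }
}
```

With this shape, callers never write a pattern themselves, so there is no way to accidentally allow unrelated attributes such as onclick.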