Either I am misunderstanding what clean_content_tags does or it is not working correctly. I cannot get the clean_content_tags argument to work on anything other than the two tags <script> and <style>. Using nh3 version 0.2.14, Python 3.11.0.
So I figured out the problem: you MUST have at least one tag in tags for clean_content_tags to work if you're trying to clear anything other than <style> or <script> tags, meaning this works:
But this doesn't work because the <div> tag isn't specified in the tags attribute and I'm trying to clear tags that aren't EXCLUSIVELY script or style:
import nh3

item = "<div><b>hi</b><script><style></div>"
print("Output: ", nh3.clean(html=item, clean_content_tags={'b', 'script'}))
# A traceback appears with the assertion error stated in the original post
This is very confusing and I doubt it's intended to work this way. Why would clean_content_tags only work on its own with script and style but not the others? Why does at least one tag need to be whitelisted in tags before clean_content_tags accepts anything other than style and script?
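One likely explanation, assuming nh3 keeps the rule of the underlying ammonia library that a tag may not be listed in both tags and clean_content_tags: the default allow-list already contains b, br, div, and img, while script and style are not in it, so only those two avoid the overlap. Passing any explicit tags replaces the default list, which would explain why whitelisting a single tag makes the error go away. Below is a minimal sketch of building two non-overlapping sets. The helper name split_tag_sets and the small DEFAULT_ALLOWED stand-in are hypothetical; with nh3 installed, nh3.ALLOWED_TAGS could be passed instead.

```python
def split_tag_sets(allowed_tags, clean_content_tags):
    """Return (tags, clean_content_tags) with every overlapping tag
    removed from the allow-list, so the two sets are disjoint."""
    strip = set(clean_content_tags)
    tags = set(allowed_tags) - strip
    return tags, strip


# Hypothetical stand-in for the library's default allow-list (assumption).
DEFAULT_ALLOWED = {"b", "br", "div", "img", "p"}

tags, strip = split_tag_sets(DEFAULT_ALLOWED, {"b", "script"})
print(sorted(tags))   # allow-list without "b"
print(sorted(strip))  # tags whose content should be dropped

# With nh3 installed, the call would then look like:
#   nh3.clean(item, tags=tags, clean_content_tags=strip)
```

The point of the helper is only that the overlap is removed before the call; which tags stay allowed is still up to the caller.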
If you're sanitizing user input though and you don't want to allow any HTML tags at all, I'm still not sure how I would remove them because I don't know what they will be ahead of time. It would be helpful to have an option that strips any and all HTML tags.
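As far as I understand the API, passing an empty allow-list, nh3.clean(text, tags=set()), should strip every tag while keeping the text. To show the idea without depending on nh3, here is a rough sketch using the standard library's html.parser instead. This is a swapped-in technique, not nh3's implementation, and the names _TextOnly and strip_all_tags are made up for illustration.

```python
import html
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collect text nodes, skipping the content of <script> and <style>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip = 0  # depth inside script/style elements

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def strip_all_tags(markup):
    """Drop every tag, then re-escape the remaining text for HTML output."""
    parser = _TextOnly()
    parser.feed(markup)
    parser.close()  # flush any buffered text
    return html.escape("".join(parser.parts))


print(strip_all_tags("<div><b>hi</b><script>alert(1)</script></div>"))
```

The text is escaped again at the end because the parser decodes entities such as &lt; while collecting it. For real user input a dedicated sanitizer is still the safer choice.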
Passing any other tag in clean_content_tags gives me an assertion error.
I have been able to reproduce this error with b, br, div, and img tags. I haven't tried any others. script and style tags work as expected.