Add HTML.unescape [Closes #3107] #3374
219487c to 2370970
❤️ @chaniks thank you
2370970 to 3877c73
end
when /\A#x([0-9a-f]+)\z/i
  n = $1.to_i(16)
  if n < charlimit
I think this should be `n <= charlimit`, since `0x10ffff` is a valid Unicode value.
Done, thanks!
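The boundary discussed above can be sketched in isolation. This is a minimal Ruby illustration (Ruby being the language of the `CGI::unescapeHTML` reference), not the PR's Crystal code; the helper name `decode_hex_reference` and the constant `MAX_CODEPOINT` are hypothetical.

```ruby
# Hypothetical helper, not the PR's code: takes the text between "&" and ";"
# (for example "#x41") and returns the decoded character, or nil when the
# text is not a hex reference or lies beyond the Unicode range.
MAX_CODEPOINT = 0x10ffff # highest valid Unicode code point

def decode_hex_reference(body)
  return nil unless body =~ /\A#x([0-9a-f]+)\z/i

  n = $1.to_i(16)
  # Inclusive comparison: U+10FFFF itself must still decode.
  return nil unless n <= MAX_CODEPOINT

  [n].pack("U") # UTF-8 string holding the single code point n
end

decode_hex_reference("#x41")     # → "A"
decode_hex_reference("#x110000") # → nil
```

With `<` instead of `<=`, the reference `&#x10FFFF;` would be rejected even though it names a valid code point.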
charlimit = 0x10ffff

string.gsub(/&(apos|amp|quot|gt|lt|\#[0-9]+|\#[xX][0-9A-Fa-f]+);/) do |string, _match|
  match = _match[1].dup
This `dup` doesn't seem to be needed.
Oops, done!
I guess I have too many thoughts on this PR, so I'd better skip most of them. But two things:
- Integer overflow (e.g. �, �)
- (can be generated with `HTML.escape`; reversibility)
And maybe specifying behaviors like � => � and � => � in specs might be a good idea, if it is intended.
And maybe reusing `Char::MAX_CODEPOINT`.
I guess I should stop here.
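One way to pin down the behavior for oversized references is to leave any numeric reference that cannot become a character exactly as written. The sketch below is Ruby, not the PR's Crystal code, and the pass-through policy is an assumption rather than something the thread settles. Ruby integers are arbitrary precision, so a very long digit string cannot overflow here; in Crystal the integer parse itself would presumably also need guarding, which is not shown. The names `unescape_numeric` and `UNICODE_MAX` are hypothetical, and the surrogate check is an extra precaution not raised in the thread.

```ruby
UNICODE_MAX = 0x10ffff # stands in for a constant such as Char::MAX_CODEPOINT

# Hypothetical sketch: decodes decimal and hex character references and
# leaves every undecodable reference untouched.
def unescape_numeric(string)
  string.gsub(/&\#(?:([0-9]+)|[xX]([0-9A-Fa-f]+));/) do |entity|
    n = $1 ? $1.to_i(10) : $2.to_i(16)
    # Assumption: surrogates (U+D800..U+DFFF) are not characters either.
    surrogate = (0xD800..0xDFFF).cover?(n)
    if n <= UNICODE_MAX && !surrogate
      [n].pack("U")
    else
      entity # pass through unchanged
    end
  end
end

unescape_numeric("&#65;&#x42;")              # → "AB"
unescape_numeric("&#x110000;")               # → "&#x110000;"
unescape_numeric("&#99999999999999999999;")  # → "&#99999999999999999999;"
```

Each of the three calls could become one spec case, which would make the intended behavior explicit.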
3877c73 to 6d02e69
@dukex Thank you for this! ❤️
Hi @chaniks, feel free to send all your comments here. I want to write the best code for Crystal and not just copy the Ruby code, so I appreciate your points and I want to read more. Thanks for your comments. I'm using the […] I updated the code to unescape spaces ([…] I think the condition […] You can help me with behaviors like […]
@dukex Please never worry about my other comments. I wanted to edit it, but it was a review so I couldn't. 😢
P.S. If you want to dig more, this link might help: https://www.w3.org/TR/html401/sgml/entities.html
Here's a list of entities and their Unicode code points.
Given the Ruby implementation `CGI::unescapeHTML` and the comments on #3226, I made this implementation of `HTML.unescape`.
I tried to translate the unescape for hexadecimal codes without success. Can someone help me?
Thanks @chaniks
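For the named-entity half of the approach, a table lookup inside a single `gsub` pass is one possible shape. The sketch is Ruby rather than Crystal, covers only the five names that appear in the diff's regex, and deliberately leaves numeric references out; `NAMED_ENTITIES` and `unescape_named` are hypothetical names.

```ruby
# Names taken from the regex in the diff; the table itself is an illustration.
NAMED_ENTITIES = {
  "apos" => "'",
  "amp"  => "&",
  "quot" => "\"",
  "gt"   => ">",
  "lt"   => "<",
}.freeze

# Replaces each known named entity once, scanning left to right.
def unescape_named(string)
  string.gsub(/&(apos|amp|quot|gt|lt);/) { NAMED_ENTITIES.fetch($1) }
end

unescape_named("&lt;b&gt; Tom &amp; Jerry") # → "<b> Tom & Jerry"
unescape_named("&amp;lt;")                  # → "&lt;"
```

Doing the replacement in one pass matters: `&amp;lt;` decodes to the literal text `&lt;`, whereas replacing `&amp;` first and `&lt;` in a second pass would wrongly produce `<`.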