Add URL encoding: https://en.wikipedia.org/wiki/URL_encoding #51

Open
Albretch opened this issue May 17, 2023 · 3 comments

Comments

@Albretch

_AZ="激光, 這兩個字是甚麼意思"

_AZ=$(echo "${_AZ}" | recode html..utf-8)
echo "// __ \$_AZ: |${_AZ}|"
// __ $_AZ: |激光, 這兩個字是甚麼意思|

_AZ="t%C3%AAte-%C3%A0-t%C3%AAte"
...
// __ $_AZ: |t%C3%AAte-%C3%A0-t%C3%AAte|
it should be: "tête-à-tête"

How do you make recode give you UTF-8 regardless of the input string (which encoding should be easy to figure out based on the patterns of the input string)?

@rrthomas
Owner

> How do you make recode give you UTF-8 regardless of the input string (which encoding should be easy to figure out based on the patterns of the input string)?

I don't see any HTML character entities. Your examples look like URL escaping, not HTML character entities.

> which encoding should be easy to figure out based on the patterns of the input string

Recode does not attempt to guess what encoding its input uses; it uses the encoding you tell it. You'd need another tool to guess encodings.

@Albretch
Author

OK, is there a way to make recode get as input: "t%C3%AAte-%C3%A0-t%C3%AAte"
and give as output: "tête-à-tête" ?

@rrthomas
Owner

No, I don't think recode supports URL encoding. That would be a good thing to add.
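
Until something like that exists in recode, percent-escapes can be decoded in the shell itself. The following is only a minimal sketch, not a recode feature: `urldecode` is a hypothetical helper name, it relies on bash-specific behaviour (`${var//pat/repl}` substitution and `\xHH` support in `printf '%b'`), and it assumes every `%` in the input is followed by two hex digits.

```shell
#!/usr/bin/env bash

# urldecode: hypothetical helper, not part of recode.
# Assumes well-formed input, i.e. every '%' is followed by two hex digits.
urldecode() {
  # Double any literal backslashes so printf '%b' does not treat them as escapes.
  local s=${1//\\/\\\\}
  # Turn each %HH into \xHH and let printf '%b' emit the corresponding byte.
  printf '%b' "${s//%/\\x}"
}

_AZ="t%C3%AAte-%C3%A0-t%C3%AAte"
_AZ=$(urldecode "${_AZ}")
echo "// __ \$_AZ: |${_AZ}|"
```

The helper only turns `%HH` into raw bytes. The result is UTF-8 here because the escapes in the example encode UTF-8 bytes; escapes that encode another charset would still need a recode pass with that charset named explicitly.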

@rrthomas changed the title from "recode swallows fine unicode, but not plain HTML char entities for French characters?" to "Add URL encoding: https://en.wikipedia.org/wiki/URL_encoding" on May 18, 2023