Add support for returning aliases of a given character #43

Open
eugenesvk opened this issue Oct 21, 2024 · 3 comments

Labels
enhancement New feature or request

Comments

@eugenesvk

println!("\u{7f} is called {:?}", unicode_names2::name('\u{7f}')) prints None instead of Delete; the same happens for other control characters.

@progval
Owner

progval commented Oct 21, 2024

Relevant line from UnicodeData.txt:

007F;<control>;Cc;0;BN;;;;;N;DELETE;;;;

According to https://www.unicode.org/Public/3.1-Update/UnicodeData-3.1.0.html , <control> is the character name field, while DELETE is the "old name as published in Unicode 1.0". So the current behavior is correct, though I don't understand the rationale. It's also documented here: https://docs.rs/unicode_names2/latest/unicode_names2/fn.name.html
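
To make the field layout concrete, here is a minimal Python sketch that splits the line quoted above. The function name `parse_unicode_data_line` and the dictionary keys are made up for illustration; only the field positions (1 for the name, 10 for the Unicode 1.0 name) come from the UnicodeData.txt format description.

```python
# The UnicodeData.txt record for U+007F, copied from the comment above.
LINE = "007F;<control>;Cc;0;BN;;;;;N;DELETE;;;;"


def parse_unicode_data_line(line):
    """Split one UnicodeData.txt record into the fields discussed here.

    Hypothetical helper: field 0 is the code point, field 1 the name,
    field 2 the general category, field 10 the Unicode 1.0 name.
    """
    fields = line.split(";")
    return {
        "code_point": int(fields[0], 16),
        "name": fields[1],
        "general_category": fields[2],
        "unicode_1_name": fields[10],
    }


record = parse_unicode_data_line(LINE)
print(record)
```

The name field holds the placeholder `<control>` rather than a real name, which would explain why a lookup based on that field returns nothing for this character.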

However, this character does have two aliases: DELETE and DEL. I would accept a PR adding support for returning a list of aliases of a given character, gated behind a feature flag.
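
A rough sketch of what an alias lookup could look like, written in Python for brevity. This is not the unicode_names2 API: `build_alias_table` and `aliases` are hypothetical names, and the embedded excerpt is hand-copied from NameAliases.txt (`code;alias;type` per line) and covers only U+007F.

```python
from collections import defaultdict

# Hand-copied excerpt in the NameAliases.txt format: code;alias;type
NAME_ALIASES_EXCERPT = """\
007F;DELETE;control
007F;DEL;abbreviation
"""


def build_alias_table(text):
    """Map each character to its (alias, type) pairs, in file order."""
    table = defaultdict(list)
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        code, alias, kind = line.split(";")
        table[chr(int(code, 16))].append((alias, kind))
    return dict(table)


ALIASES = build_alias_table(NAME_ALIASES_EXCERPT)


def aliases(ch):
    """Return the aliases of a character, or an empty list if none are known."""
    return [alias for alias, _kind in ALIASES.get(ch, [])]


print(aliases("\x7f"))
```

Returning an empty list rather than an error for characters without aliases keeps the call usable alongside a name lookup that already returns an optional value.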

progval added the enhancement (New feature or request) label Oct 21, 2024
progval changed the title from "Control chars not supported" to "Add support for returning aliases of a given codepoint" Oct 21, 2024
progval changed the title from "Add support for returning aliases of a given codepoint" to "Add support for returning aliases of a given character" Oct 21, 2024
@eugenesvk
Author

I also don't understand how the current behavior is correct when the character does have a name (either the old one or the aliased one). What's the point of NOT using it and complicating usage by pushing this distinction onto unsuspecting users?

FYI https://github.com/JuanPotato/charname returns the names for control chars
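Python is weird: lookup by name finds a control char, but the reverse lookup by character fails.

import unicodedata

x1 = unicodedata.lookup("BACKSPACE")  # works, can find a control char by name
print(repr(x1))
# x1nm = unicodedata.name(x1)  # fails: raises ValueError, no name for this character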
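
For anyone who wants to reproduce the asymmetry without commenting lines in and out, here is a small self-contained sketch using only the standard `unicodedata` module. It assumes Python 3.3 or later, where `lookup` accepts name aliases.

```python
import unicodedata

# Forward direction: the alias resolves to the control character.
ch = unicodedata.lookup("BACKSPACE")
print(hex(ord(ch)))

# Reverse direction: name() raises ValueError for a character with no name...
try:
    unicodedata.name(ch)
except ValueError as err:
    print("name() failed:", err)

# ...unless a default is passed, in which case the default is returned.
print(unicodedata.name(ch, None))
```

Passing a default to `unicodedata.name` gives roughly the same shape as the `None` result reported at the top of this issue.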

@progval
Owner

progval commented Oct 21, 2024

🤷 You would have to ask the Unicode Consortium.
