Unicode reference version #33965
Comments
Unicode is handled by this dependency: https://github.com/JuliaStrings/utf8proc, but maybe not only by it, and we have partial 9.0 support (and thus only that, really)? Should the version number in that file simply be updated? There is a http://www.unicode.org/Public/12.1.0/ucd/UnicodeData.txt file, and it's 2% longer, with a change to LATIN SMALL LETTER S WITH HOOK and many additions such as HEBREW YOD TRIANGLE. See: https://github.com/JuliaLang/julia/blob/master/deps/utf8proc.mk PCRE also handles Unicode, and maybe that's the only other dependency (then there are packages such as ICU.jl and possibly others, e.g. Scott's). Possibly that file is only for "tab completion of LaTeX-like abbreviations in the Julia REPL", see here (I didn't check carefully): |
Hi @PallHaraldsson - thanks for the explanation and links...! So, if I understand correctly, the REPL completions files (emoji_symbols.jl and latex_symbols.jl) determine which Unicode symbols are looked up in the v9 UnicodeData.txt file, so I suppose that must currently restrict the Julia 1.3 REPL's emoji support to a few versions behind the current one? So there are quite a few 'current' emojis missing from Julia 1.3 because they were introduced after v9, such as: 🦝🦙🦛🦘🦡🦢🦚🦜🦟🦠🥭🥬🥯🧂🥮🦞🧁🧭🧱🛹🧳🧨🧧🥎🥏🥍🧿🧩🧸♟🧵🧶🥽🥼🥾🥿🧮🧾🧰🧲🧪🧫🧬🧴🧷🧹🧺🧻🧼🧽🧯♾🏴☠️🧘🏾♀️-🧘🏿♀️🦓🦒🦔🦕🦖🦗🥥🥦🥨🥩🥪🥣🥫🥟🥠🥡🥧🥤🥢🛸🛷🥌🧣🧤🧥🧦🧢🏴🏴🏴🦖 ... not to mention all the diversity-oriented emojis recently released with v12.0.0 ... (😱) The v12 file is 2000 lines longer than v9 (many of the additions are new or archaic languages). I don't know whether it's the official Julia policy to continuously support all the emojis in the current standard in the REPL, or whether there's any selection process... 🏚🚴♀️🖌 It's fun to name your Julia variables 🦖 or 🤔 though... :) |
I happened to be using |
There's a package to support the additional emoji symbols for the REPL. |
@wookay Nice package. Can you use any of your code there to update Julia 1.3 to the latest version? |
@cormullion well, that package used the same code from |
I can confirm the package does work, adding emojis to the REPL, but I wouldn't say that lacking them, or the latest Unicode, in the REPL means lacking Unicode 12.1 support (C has no REPL by default, and Perl, with "good" UTF-8 support, has a bad REPL). You can still copy and paste these in (or use the package). I would be most worried about runtime support, e.g. lowercase and uppercase conversion (and I don't think those apply to emojis). |
The UnicodeData.txt file in the Makefile is only there to look up the names of the characters produced by LaTeX-like tab-completions in the REPL, in order to generate this section of the documentation. All of the current tab-completion characters are present in Unicode 9, so no one bothered to update this data file to a newer version. This has nothing to do with the version of Unicode supported by Julia (e.g. for parsing or text processing), which is determined by utf8proc. The emoji tab completions were added on April Fool's day in #10709, and I don't know if they have been updated in a while. (Realize that the (Most Unicode characters will never have tab completions in the REPL. They are still supported.) I would suggest closing this issue, as it's really not about Unicode support in Julia. If you want to open another issue to add more emoji tab completions, please go ahead. |
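To make the "name lookup only" role of UnicodeData.txt concrete, here is a minimal sketch in Python (not Julia's actual documentation-generation code). It assumes only the standard UnicodeData.txt layout: semicolon-separated fields, with the hexadecimal code point first and the character name second. The function names `parse_unicode_names` and `describe` and the three-line `SAMPLE` excerpt are hypothetical, chosen for illustration.

```python
def parse_unicode_names(lines):
    """Map code point (int) -> character name from UnicodeData.txt-style lines."""
    names = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        fields = line.split(";")
        if len(fields) < 2:
            continue  # skip anything that is not a data record
        # Field 0 is the code point in hex, field 1 is the character name.
        names[int(fields[0], 16)] = fields[1]
    return names


def describe(ch, names):
    """Return the name of a single character, or a placeholder if it is absent."""
    return names.get(ord(ch), "<not in this data file>")


# Hypothetical excerpt in the UnicodeData.txt field layout (not a full file).
SAMPLE = """\
0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;
03B1;GREEK SMALL LETTER ALPHA;Ll;0;L;;;;;N;;;0391;;0391
1F914;THINKING FACE;So;0;ON;;;;;N;;;;;
"""

names = parse_unicode_names(SAMPLE.splitlines())
print(describe("α", names))
print(describe("🤔", names))
print(describe("🦖", names))  # U+1F996 is not in the sample
```

A character that is missing from the data file, like 🦖 here, simply has no name available to this lookup. That says nothing about whether the runtime can parse or process the character, which is the distinction made above. |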
This might be a silly question :) but ... I noticed that the 1.3 release notes mention Unicode version 12.1.0:
here
but the only Unicode reference I can find in this repo is version 9.0.0:
from here
Are there two numbering systems? I'm happy to be educated in these arcane Unicode matters, such as what determines which Unicode symbols are in and which aren't...