Emojis split up unexpectedly (e.g. https://emojipedia.org/ninja-cat/) #29
Comments
See #28
This seems to explain it, yes.
I'm also getting unexpected splitting: 👨🏿‍🦰 splits into [ "👨🏿‍", "🦰" ], and this emoji is officially part of Unicode, check it out. Might this have something to do with zero-width joiners not being recognized correctly? Edit: I just did some more testing, and 👨🏿‍🦰 is split after the ZWJ (\u200d), so the split 👨🏿‍🦰 looks like this:
Hi! Same here: the emoji 👨‍🦳 gets split into [ '👨‍', '🦳' ], which are these code points:
I think you're right. They were introduced in Unicode 11.0, so this depends on #24.
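To make the ZWJ explanation concrete, here is a minimal sketch that lists the code points of the reported sequence. It is an illustration only, not output from the library or from the commenters; `codePoints` is a hypothetical helper name, and the sequence is written with escapes so that it survives copy-paste.

```javascript
// Hypothetical helper: list the code points of a string as U+XXXX labels.
// Array.from iterates by code point, so surrogate pairs stay together.
function codePoints(str) {
  return Array.from(str, (ch) =>
    'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0')
  );
}

// Man + dark skin tone modifier + ZWJ + red hair component (added in Unicode 11.0).
const manRedHair = '\u{1F468}\u{1F3FF}\u200D\u{1F9B0}';

console.log(codePoints(manRedHair));
// [ 'U+1F468', 'U+1F3FF', 'U+200D', 'U+1F9B0' ]
```

The U+200D in third position is the joiner. A splitter whose break rules do not treat the code point after it as joinable will cut right there, which matches the splits reported above.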
Hi there,
first of all, thanks a lot for this library and the efforts you put in!
I've got a scenario where some emojis seem to be split up the wrong way.
When splitting up the following emoji-sequence:
π±βπ»π±βππ±βπ€
I get the following string-tokens (notice the first two matching and the ninja-cat being split into two):
Is there an easy explanation for the behavior or is there a general guideline on which emojis are supported and which aren't?
I'm on Windows 10.
Thanks!
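Until the splitter itself handles these sequences, one possible workaround is to post-process its output. The sketch below assumes the split falls directly after the ZWJ, as reported in the comments on this issue. `rejoinZwj` is a hypothetical helper, not part of any library's API, and the token array is made-up sample input.

```javascript
const ZWJ = '\u200D';

// Hypothetical helper: merge any token that ends in a ZWJ with the token
// that follows it, so a sequence cut after the joiner is glued back together.
function rejoinZwj(tokens) {
  const out = [];
  for (const token of tokens) {
    const last = out.length - 1;
    if (last >= 0 && out[last].endsWith(ZWJ)) {
      out[last] += token; // previous token was cut after a joiner
    } else {
      out.push(token);
    }
  }
  return out;
}

// Sample input (assumed): cat + ZWJ, then bust in silhouette, then a plain letter.
const fixed = rejoinZwj(['\u{1F431}\u200D', '\u{1F464}', 'a']);
console.log(fixed.length); // 2
```

This only repairs splits that occur immediately after a joiner. It does not make the splitter aware of newer Unicode data in general.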