Incorrect special character sorting #2573

Open
vonox7 opened this issue Apr 14, 2017 · 6 comments
@vonox7

vonox7 commented Apr 14, 2017

The JVM, Swift and all other programming languages (see also the Unicode specification mentioned in src/realm/unicode.cpp) define the following character sequence as "sorted":
!"#$%&'()*+,-./123:;<=>?ABCXYZ_abcxyz

In contrast, Realm 3.1.2 considers the following as "sorted":
'- !"#$%&()*,./:;?_+<=>123aAbBcCxXyYzZ
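To make the comparison concrete, here is a minimal Python sketch (Python is used only for illustration and is not part of Realm). It assumes that "sorted" in the first sequence means ascending Unicode code point order, which is what Python's built-in `sorted` does for single characters. The names `expected` and `realm_reported` are just labels for the two sequences quoted above.

```python
# Sequence that the JVM/Swift comparison produces, as quoted above.
expected = "!\"#$%&'()*+,-./123:;<=>?ABCXYZ_abcxyz"
# Sequence reported for Realm 3.1.2, as quoted above (it also contains a space).
realm_reported = "'- !\"#$%&()*,./:;?_+<=>123aAbBcCxXyYzZ"

# sorted() on single characters compares their code points.
by_code_point = "".join(sorted(realm_reported.replace(" ", "")))

# Check whether re-sorting Realm's sequence by code point gives the first one.
print(by_code_point == expected)
```

Apart from the extra space, both sequences contain the same characters, so the difference between them is purely one of ordering.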

Code for reproduction

The realm sorting algorithm has the following bugs:

  • The characters ' and - are before the characters !"#
  • The numbers (0-9) are after all other special characters like : and =
  • The characters ?, + and _ are in the wrong position

The Realm sorting documentation is missing the following:

  • Realm sort is case-insensitive (in contrast to the Java/Swift greater-than/less-than operators, < and >)
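The difference between the two letter orderings can be sketched in Python. This is only an illustration of the observed behaviour, not Realm's actual implementation; `case_insensitive_key` is a hypothetical helper that groups letters regardless of case and puts the lowercase form first, which reproduces the letter portion of the Realm sequence above.

```python
letters = "ABCXYZabcxyz"

# Code point order: every uppercase ASCII letter precedes every lowercase one.
code_point_order = "".join(sorted(letters))

# Hypothetical key: compare case-insensitively, lowercase before uppercase.
def case_insensitive_key(ch):
    return (ch.lower(), ch.isupper())

grouped_order = "".join(sorted(letters, key=case_insensitive_key))

print(code_point_order)
print(grouped_order)
```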

It seems like Realm doesn't comply with any standard (Unicode, ASCII...) and just sorts all special characters according to an arbitrary specification and order.

@kneth
Contributor

kneth commented Apr 20, 2017

@vonox7 The sorting behaviour you mention is the expected one. I have created realm/realm-java#4527 to clarify. Moreover, I will update the documentation so it is clear that this is the intended behaviour.

@vonox7
Author

vonox7 commented Apr 20, 2017

@kneth Thanks for the update. The documentation update explains the intended lowercase/uppercase behaviour. But characters like '-!"#:=?+_ ARE in the utf-8 range (0-591). See also https://en.wikipedia.org/wiki/Basic_Latin_(Unicode_block)#Table_of_characters . I think you missed my point here.

@ironage
Contributor

ironage commented May 18, 2017

#777 is related

@nalexn

nalexn commented Jan 22, 2021

I'm shocked that such a divergence from all the existing specs has gone unresolved for years...

How am I supposed to sort contacts by name to form a (#) group after a (z), if (z) is always the last, no matter which character I use for grouping? Have to appeal to yet another hack to make Realm work.
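One possible way around this, sketched in Python, is to stop relying on character order for the grouping and sort on an explicit group rank first, then on the name. This is only a sketch under assumptions: `section_rank` and `contact_sort_key` are hypothetical helpers, and using it with Realm would presumably mean storing the rank as an extra field on each object and sorting on that field before the name.

```python
def section_rank(name):
    """Return 0 for names starting with a letter, 1 for everything else."""
    first = name[:1]
    return 0 if first.isalpha() else 1

def contact_sort_key(name):
    # Sort by group first, then case-insensitively by name within the group.
    return (section_rank(name), name.casefold())

contacts = ["zoe", "#hashtag", "Adam", "42nd Street", "bob"]
ordered = sorted(contacts, key=contact_sort_key)

print(ordered)
```

With this key, names that start with a letter come first, and everything else lands in a trailing group after (z), independent of how the individual special characters compare.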

@bmunkholm
Contributor

Hey @nalexn - happy to see someone still cares!
We fully understand the frustration when something like this is needed!

We surely would love to get this fixed, and while we hoped for a simple solution by adding full ICU support from the platforms, none of the mobile platforms really support that in an official way, unfortunately (despite earlier promises...). Just including all of ICU is also out of the question for size reasons. So it looks like we have to get more inventive and offer some kind of opt-in solution or configuration of sorts so that we don’t blow up the library size significantly for those that don’t need this.

We can’t really promise anything about when we can take that on, but it’s not on our immediate shortlist for now. We have hundreds of good ideas in the backlog and we do try to prioritize features based on user feedback so that we give most people the highest value all the time. So definitely appreciate your "upvote" on this!

In the meantime, did you actually find a workaround? Otherwise happy to help out!

@jaltin

jaltin commented Dec 15, 2021

Hi,

We also really would like to be able to control the sorting. I have outlined a suggestion in realm/realm-js#3770, perhaps you could do something along those lines?
