Localized sorting #39873

Merged: 2 commits, Apr 25, 2020

jbytheway
Contributor

Summary

SUMMARY: I18N "Add helper for localized sorting"

Purpose of change

The issue of sorting (collating) strings in the game has come up a few times. Currently it's always done by sorting byte-wise, which for UTF-8 is equivalent to sorting lexicographically by codepoint. That's acceptable for English, but a poor fit for many other languages, where accented letters end up after all the unaccented ones.
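To make the status quo concrete, here is a minimal sketch of byte-wise sorting. The function names are hypothetical and not taken from the game's code; the three names are the ones discussed under "Additional context". The umlaut is written as explicit UTF-8 bytes so the example does not depend on the source file encoding.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: sorts with std::string's operator<, which compares
// the strings byte by byte (as unsigned values). For UTF-8 input this gives
// the same order as sorting by Unicode codepoint.
std::vector<std::string> sort_bytewise( std::vector<std::string> names )
{
    std::sort( names.begin(), names.end() );
    return names;
}

// Usage: "K\xc3\xa4" "fer" is "Käfer" in UTF-8. The literal is split so the
// hex escape does not swallow the following 'f'.
void demo_bytewise_order()
{
    const std::vector<std::string> sorted =
        sort_bytewise( { "Kreuzspaltung", "K\xc3\xa4" "fer", "Kajak" } );
    assert( sorted[0] == "Kajak" );
    assert( sorted[1] == "Kreuzspaltung" );
    // The first byte of 'ä' (0xC3) is larger than any ASCII letter,
    // so Käfer lands after every name starting with "K" + ASCII letter.
    assert( sorted[2] == "K\xc3\xa4" "fer" );
}
```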

I thought there was nothing we could really do about this without adding a heavyweight dependency like ICU, but I recently discovered that the standard library does have some support for localized sorting. It may not implement the Unicode Collation Algorithm, but it should still be better than the status quo.

Describe the solution

Add a helper struct to be used as a comparison functor when performing localized sorting.

Use it to sort the translated names in the debug vehicle spawn menu.
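As a rough illustration of what such a functor can look like, here is a sketch built on `std::locale::operator()`, which compares two strings using the locale's `std::collate` facet. This is not the code from this PR: the names `localized_compare_sketch` and `sort_localized` are made up for the example, and the real helper may be structured differently.

```cpp
#include <algorithm>
#include <cassert>
#include <locale>
#include <string>
#include <vector>

// Hypothetical comparison functor (name and layout are assumptions).
// By default it copies the current global C++ locale.
struct localized_compare_sketch {
    std::locale loc;

    explicit localized_compare_sketch( const std::locale &l = std::locale() )
        : loc( l ) {}

    // std::locale::operator() returns true if lhs collates before rhs
    // according to the locale's collate facet.
    bool operator()( const std::string &lhs, const std::string &rhs ) const {
        return loc( lhs, rhs );
    }
};

// Hypothetical convenience wrapper around std::sort.
void sort_localized( std::vector<std::string> &names, const std::locale &loc )
{
    std::sort( names.begin(), names.end(), localized_compare_sketch( loc ) );
}

// Usage with the classic "C" locale, which is always available.
// A named locale such as std::locale( "de_DE.UTF-8" ) could be passed
// instead, but constructing it throws std::runtime_error if that locale
// is not installed on the system.
void demo_classic_locale()
{
    std::vector<std::string> names = { "Kreuzspaltung", "Kajak" };
    sort_localized( names, std::locale::classic() );
    assert( names[0] == "Kajak" );
    assert( names[1] == "Kreuzspaltung" );
}
```

How non-ASCII names are ordered depends on the locale data shipped with the platform, so results for a given language are worth checking on each target system.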

I picked this list as a first test case because it's long, and thus ought to demonstrate most ways in which sorting can go wrong, but also a low-impact change that's safe for an experiment like this.

Note that this is implemented as translate-up-front, then sort, so it performs no more translations than the previous implementation did and there is no performance impact from extra translations. The comparisons themselves may be slower, but I have not observed any impact from that.
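The translate-up-front approach can be sketched roughly as follows. Everything here is illustrative: `translate_fn`, `sorted_menu_entries`, and the pair layout are assumptions made for the example, not the game's actual menu code. The point is only that each name is translated once, before sorting, so the comparator never triggers a translation.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <locale>
#include <string>
#include <utility>
#include <vector>

// Hypothetical translation callback: maps an id to its translated name.
using translate_fn = std::function<std::string( const std::string & )>;

// Returns (translated name, id) pairs sorted by translated name.
// translate is called exactly once per id.
std::vector<std::pair<std::string, std::string>> sorted_menu_entries(
    const std::vector<std::string> &ids,
    const translate_fn &translate,
    const std::locale &loc )
{
    assert( translate && "a translation callback is required" );

    std::vector<std::pair<std::string, std::string>> entries;
    entries.reserve( ids.size() );
    for( const std::string &id : ids ) {
        entries.emplace_back( translate( id ), id );
    }

    // Compare only the already-translated names, using the locale's
    // collation rules via std::locale::operator().
    std::sort( entries.begin(), entries.end(),
    [&loc]( const auto & lhs, const auto & rhs ) {
        return loc( lhs.first, rhs.first );
    } );
    return entries;
}
```

Sorting costs O(n log n) comparisons, so translating inside the comparator would repeat work for every comparison; translating first keeps it at n translations.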

Describe alternatives you've considered

Using ICU for full-blown Unicode collation, keeping the status quo, or rolling our own collation.

Testing

Tested for German on Linux (see screenshot below).

I'm not sure how this will fare for other languages / platforms. I'm making this a small change to facilitate testing before rolling out more widely.

Additional context

Screenshot of sorted German names:
(image: vehicle-name-sorting)
Observe in particular that Käfer correctly sorts before Kajak, whereas in codepoint order it would appear after Kreuzspaltung.

This is a helper to assist with sorting strings in a locale-aware
manner.
@kevingranade kevingranade merged commit 4a304e3 into CleverRaven:master Apr 25, 2020
@jbytheway jbytheway deleted the localized_sorting branch April 25, 2020 10:51