Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added Locale.ROOT to toUpper/toLower Methods #2584

Merged
merged 4 commits into from
Mar 6, 2017
Merged

Added Locale.ROOT to toUpper/toLower Methods #2584

merged 4 commits into from
Mar 6, 2017

Conversation

LinusDietz
Copy link
Member

This can improve the robustness JabRef for users with `exotic' Locale settings. See https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#toUpperCase-- .

@@ -65,10 +66,10 @@ public int compare(BibEntry e1, BibEntry e2) {
// sorted according to last name.
if (InternalBibtexFields.getFieldProperties(sortField).contains(FieldProperty.PERSON_NAMES)) {
if (f1 != null) {
f1 = AuthorList.fixAuthorForAlphabetization((String) f1).toLowerCase();
f1 = AuthorList.fixAuthorForAlphabetization((String) f1).toLowerCase(Locale.ROOT);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can only use ROOT for language insensitive Strings. We think not all missing places are insensitive. Also, sometimes Locale.ENGLISH or so might be better in such cases?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm... Generally, I think ROOT is better than ENGLISH for Strings where we use them internally, e.g. to match files names etc..

However you have a point. If the String is language sensitive it might be better to use the user-specific Locale (i.e., the Locale.getDefault();). However in this case we must be sure that it is not used for any comparisons etc., because this would break stuff if compared to Strings that were manipulated with another Locale.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general we should use ROOT for internal strings only. And especially in reading/writing we have to be backwards compatible.
For GUI stuff we should use the English locale for english terms, e.g. the bibtex fieldnames or entry types.
As this effects many cases. I think we should discuss this in a devcall @JabRef/developers WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any issue solved with this?

Is it worth that we spend two hours on this? Why?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in favor of merging this PR at is right now. I can't see any disadvantage of using ROOT instead of the default one. Maybe for some GUI related stuff you can argue about ENGLISH but well...The worst possible thing (from what I get) is that now the keys of some entries with "strange" values change and maybe the order in the main table is slightly different...not very system-critical.

@LinusDietz LinusDietz changed the title Added Locale.ROOT to toUpper/toLower Methods [WIP] Added Locale.ROOT to toUpper/toLower Methods Feb 22, 2017
@LinusDietz
Copy link
Member Author

I have thought about this matter for a while and would argue to merge it as it is.

  • For users with a locale that does not have specific rules on toLower/toUpper nothing changes.
  • For users with a `special' locale, e.g., tr_TR, it is advisable to use Locale.ROOT anyways for the following reasons:
    • Internal Strings must be Locale.ROOT anyways to ensure correct behavior of JabRef.
    • Field Strings in a BibEntry should not be manipulated in the user's Locale, as science is mostly international, thus we are most likely to make an mistake if we apply language-specific toLower rules on English/German/Russian/Chinese names.
    • BibTex should be ASCII. If we apply toLowerCase using an esoteric locale we might produce an non-ASCII character.

@LinusDietz LinusDietz changed the title [WIP] Added Locale.ROOT to toUpper/toLower Methods Added Locale.ROOT to toUpper/toLower Methods Feb 24, 2017
@LinusDietz LinusDietz added the status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers label Feb 24, 2017
@koppor
Copy link
Member

koppor commented Feb 24, 2017

Please do not underestimate the value of UTF-8 and the power of biblatex to handle UTF-8.

My personal bibliography is handled in UTF-8. Now LaTeX encoding any more.

When I use it at Springer papers, I use our converter to convert to plain latex. See http://help.jabref.org/en/CleanupEntries.

See also #160 :)

@Siedlerchr
Copy link
Member

Yes, biber + biblatex all handle stuff in UTF-8 and I think the default for a new db is already UTF-8. My dbs are also all in UTF-8. No more Latex encoding. And here we should be aware of a sort order depending on the locale.

@LinusDietz
Copy link
Member Author

Ok, then let's forget about the encoding point, even if it unclear how many people actually use a bibliography compiler that supports UTF-8...
What about my other points?

Copy link
Member

@tobiasdiez tobiasdiez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I said above, I'm also in favor of merge!

@koppor
Copy link
Member

koppor commented Mar 2, 2017

After resolving the conflicts, it is OK for me to merge. We will fix wrongly uppercased things when issues will come up.

@koppor koppor removed the status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers label Mar 2, 2017
@LinusDietz LinusDietz merged commit 7a1b15a into master Mar 6, 2017
@LinusDietz LinusDietz deleted the locales branch March 6, 2017 12:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants