Edit language list, remove dialect specificity #7134

Open
wants to merge 1 commit into
base: master
89 changes: 44 additions & 45 deletions app/constants/languages.js
@@ -1,102 +1,101 @@
const languages = [
{ value: 'af', label: 'Afrikaans' },
{ value: 'sq', label: 'Albanian' },
{ value: 'gsw-fr', label: 'Alsatian (France)' },
{ value: 'am-et', label: 'Amharic (Ethiopia)' },
{ value: 'gsw', label: 'Alsatian' },
{ value: 'am', label: 'Amharic' },
{ value: 'ar', label: 'Arabic' },
{ value: 'hy', label: 'Armenian' },
{ value: 'as-in', label: 'Assamese (India)' },
{ value: 'as', label: 'Assamese' },
{ value: 'az', label: 'Azeri' },
{ value: 'ba-ru', label: 'Bashkir (Russia)' },
{ value: 'ba', label: 'Bashkir' },
{ value: 'eu', label: 'Basque' },
{ value: 'be', label: 'Belarusian' },
{ value: 'bn', label: 'Bengali' },
{ value: 'bn-bd', label: 'Bengali (Bangladesh)' },
{ value: 'bn-in', label: 'Bengali (India)' },
{ value: 'bs', label: 'Bosnian' },
{ value: 'bs-cyrl-ba', label: 'Bosnian (Cyrillic, Bosnia and Herzegovina)' },
{ value: 'bs-latn-ba', label: 'Bosnian (Latin, Bosnia and Herzegovina)' },
{ value: 'br-fr', label: 'Breton (France)' },
{ value: 'br', label: 'Breton' },
{ value: 'bg', label: 'Bulgarian' },
{ value: 'ca', label: 'Catalan' },
{ value: 'zh-cn', label: 'Chinese (Simplified)' },
{ value: 'zh-tw', label: 'Chinese (Traditional)' },
{ value: 'hr', label: 'Croatian' },
{ value: 'cs', label: 'Czech' },
{ value: 'da', label: 'Danish' },
{ value: 'prs-af', label: 'Dari (Afghanistan)' },
{ value: 'prs', label: 'Dari' },
{ value: 'div', label: 'Divehi' },
{ value: 'nl', label: 'Dutch' },
{ value: 'en', label: 'English' },
{ value: 'en-gb', label: 'English (United Kingdom)' },
{ value: 'en-us', label: 'English (United States)' },
Comment on lines 29 to 30
Member

❓ Question: in Pandora PR 848, both 'en-gb' and 'en-us' are removed from the Zooniverse Translations website's list of "languages you can set translations for".

Is this an oversight, or are we maintaining 'en-gb' and 'en-us' on PFE for backwards compatibility? (See recent Slack message: 68 projects have en-gb and 217 projects have en-us)

If we ARE making a conscious decision to only have en-gb and en-us in PFE's languages list but not Pandora's languages list, we should add a comment here saying something like /* English dialects should exist only in PFE's language list. Do not delete these entries from PFE nor add them to Pandora when syncing the language lists of both repos. */

Member Author

It is most important to limit the language list in Pandora, so as to limit the language translations that can be created. For PFE, the list can be more inclusive, because to the best of my knowledge, it is only used to transform a language code into a language name.
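
For illustration, a minimal sketch of that code-to-name lookup. The `languageLabel` helper and its fallback behaviour are hypothetical (not taken from the PFE codebase), and the array is only a small subset of the list in this file:

```javascript
// Hypothetical subset of the list in app/constants/languages.js.
const languages = [
  { value: 'am', label: 'Amharic' },
  { value: 'en', label: 'English' },
  { value: 'en-gb', label: 'English (United Kingdom)' },
  { value: 'en-us', label: 'English (United States)' }
];

// Hypothetical helper: returns the label for a language code.
// Assumption: when no entry matches, the raw code is returned unchanged.
function languageLabel(code) {
  const normalized = String(code).toLowerCase();
  const match = languages.find((language) => language.value === normalized);
  return match ? match.label : code;
}
```

Under these assumptions, a project whose primary_language is 'en-gb' still resolves to a readable name, while a code that is no longer in the list (for example 'am-et') would fall through to the raw code.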

In a feat of cowardice, I opt to keep en-gb and en-us in the list because there are multiple projects and workflows that have these language codes as their primary_language value. While these additional entries shouldn't be needed, there is little penalty for keeping them.

Note: whether en-us and en-gb primary_language values should be allowed or cleaned up is a whole issue unto itself; out of scope for now.

I'm happy to add the comment if you feel strongly we should, but I feel it is not necessary.

Member

Thanks for the clarification, Cliff. 👍 I'm on board with the reasons you have.

I do very, very strongly suggest we add the comment about the PFE code's en-gb and en-us entries, because I know that in the future, a dev (e.g. me) is gonna go "I'll just copy-paste the languages.js file wholesale to/from pandora to sync the changes, it'll be fast & easy! What could go wrong? A-hyuck!", and an inline comment in the code will make it very obvious to a reviewer that they shouldn't be doing that.

Member Author

Thanks for the example -- yes, I see the utility of the comment now. I'll get that in (here and on Pandora just for completeness) before merging early next week.

Contributor

@eatyourgreens eatyourgreens Jul 12, 2024

From an accessibility perspective, en-US and en-GB can be read and pronounced differently by screenreaders, so specifying a dialect can be useful for screenreader users in cases where spelling or pronunciation differs. I assume that's true for other dialects too, e.g. Mexican Spanish is pronounced very differently from European Spanish. One fairly common example, in English, is that VoiceOver/Safari with a UK voice will consistently mispronounce 'favorite' on Zooniverse project pages.
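
As a rough sketch of how a dialect code could be passed on to assistive technology: this assumes the stored code ends up in an HTML `lang` attribute, which is not confirmed anywhere in this PR, and `langAttribute` is a hypothetical helper name. `Intl.getCanonicalLocales` is the standard ECMAScript API for normalising the casing of a language tag.

```javascript
// Hypothetical helper: converts a lowercase code from the languages list
// (e.g. 'en-gb') into a canonical BCP 47 tag (e.g. 'en-GB') that could be
// used as the value of a lang attribute.
// Returns undefined when the code is not a structurally valid language tag.
function langAttribute(code) {
  try {
    return Intl.getCanonicalLocales(code)[0];
  } catch (error) {
    // Intl.getCanonicalLocales throws a RangeError for malformed tags.
    return undefined;
  }
}
```

With a helper like this, 'en-gb' and 'en-us' would yield distinct tags, whereas a bare 'en' carries no dialect information for a screenreader voice to act on.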

{ value: 'et', label: 'Estonian' },
{ value: 'fo', label: 'Faroese' },
{ value: 'fil-ph', label: 'Filipino (Philippines)' },
{ value: 'fil', label: 'Filipino' },
{ value: 'fi', label: 'Finnish' },
{ value: 'fr', label: 'French' },
{ value: 'gl', label: 'Galician' },
{ value: 'ka', label: 'Georgian' },
{ value: 'de', label: 'German' },
{ value: 'el', label: 'Greek' },
{ value: 'kl-gl', label: 'Greenlandic (Greenland)' },
{ value: 'kl', label: 'Greenlandic' },
{ value: 'gu', label: 'Gujarati' },
{ value: 'ha-latn-ng', label: 'Hausa (Latin, Nigeria)' },
{ value: 'ha', label: 'Hausa' },
{ value: 'he', label: 'Hebrew' },
{ value: 'hi', label: 'Hindi' },
{ value: 'hu', label: 'Hungarian' },
{ value: 'is', label: 'Icelandic' },
{ value: 'ig-ng', label: 'Igbo (Nigeria)' },
{ value: 'ig', label: 'Igbo' },
{ value: 'id', label: 'Indonesian' },
{ value: 'iu', label: 'Inuktitut' },
{ value: 'iu-latn-ca', label: 'Inuktitut (Latin, Canada)' },
{ value: 'iu-cans-ca', label: 'Inuktitut (Syllabics, Canada)' },
{ value: 'ga-ie', label: 'Irish (Ireland)' },
{ value: 'xh-za', label: 'isiXhosa (South Africa)' },
{ value: 'zu-za', label: 'isiZulu (South Africa)' },
{ value: 'ga', label: 'Irish' },
{ value: 'xh', label: 'isiXhosa' },
{ value: 'zu', label: 'isiZulu' },
{ value: 'it', label: 'Italian' },
{ value: 'ja', label: 'Japanese' },
{ value: 'kn', label: 'Kannada' },
{ value: 'kk', label: 'Kazakh' },
{ value: 'km-kh', label: 'Khmer (Cambodia)' },
{ value: 'qut-gt', label: 'K\'iche (Guatemala)' },
{ value: 'rw-rw', label: 'Kinyarwanda (Rwanda)' },
{ value: 'km', label: 'Khmer' },
{ value: 'qut', label: 'K\'iche' },
{ value: 'rw', label: 'Kinyarwanda' },
{ value: 'sw', label: 'Kiswahili' },
{ value: 'kok', label: 'Konkani' },
{ value: 'ko', label: 'Korean' },
{ value: 'ky', label: 'Kyrgyz' },
{ value: 'lo-la', label: 'Lao (Lao P.D.R.)' },
{ value: 'lo', label: 'Lao' },
{ value: 'lv', label: 'Latvian' },
{ value: 'lt', label: 'Lithuanian' },
{ value: 'wee-de', label: 'Lower Sorbian (Germany)' },
{ value: 'lb-lu', label: 'Luxembourgish (Luxembourg)' },
{ value: 'wee', label: 'Lower Sorbian' },
{ value: 'lb', label: 'Luxembourgish' },
{ value: 'mk', label: 'Macedonian' },
{ value: 'ms', label: 'Malay' },
{ value: 'ml-in', label: 'Malayalam (India)' },
{ value: 'mt-mt', label: 'Maltese (Malta)' },
{ value: 'mi-nz', label: 'Maori (New Zealand)' },
{ value: 'arn-cl', label: 'Mapudungun (Chile)' },
{ value: 'ml', label: 'Malayalam' },
{ value: 'mt', label: 'Maltese' },
{ value: 'mi', label: 'Maori' },
{ value: 'arn', label: 'Mapudungun' },
{ value: 'mr', label: 'Marathi' },
{ value: 'moh-ca', label: 'Mohawk (Mohawk)' },
{ value: 'moh', label: 'Mohawk' },
{ value: 'mn', label: 'Mongolian' },
{ value: 'ne-np', label: 'Nepali (Nepal)' },
{ value: 'ne', label: 'Nepali' },
{ value: 'no', label: 'Norwegian' },
{ value: 'oc-fr', label: 'Occitan (France)' },
{ value: 'or-in', label: 'Oriya (India)' },
{ value: 'ps-af', label: 'Pashto (Afghanistan)' },
{ value: 'oc', label: 'Occitan' },
{ value: 'or', label: 'Oriya' },
{ value: 'ps', label: 'Pashto' },
{ value: 'fa', label: 'Persian' },
{ value: 'pl', label: 'Polish' },
{ value: 'pt', label: 'Portuguese' },
{ value: 'pa', label: 'Punjabi' },
{ value: 'quz-bo', label: 'Quechua (Bolivia)' },
{ value: 'quz-ec', label: 'Quechua (Ecuador)' },
{ value: 'quz-pe', label: 'Quechua (Peru)' },
{ value: 'qu', label: 'Quechua' },
{ value: 'ro', label: 'Romanian' },
{ value: 'rm-ch', label: 'Romansh (Switzerland)' },
{ value: 'rm', label: 'Romansh' },
{ value: 'ru', label: 'Russian' },
{ value: 'sa', label: 'Sanskrit' },
{ value: 'sr', label: 'Serbian' },
{ value: 'nso-za', label: 'Sesotho sa Leboa (South Africa)' },
{ value: 'tn-za', label: 'Setswana (South Africa)' },
{ value: 'si-lk', label: 'Sinhala (Sri Lanka)' },
{ value: 'st', label: 'Sesotho' },
{ value: 'nso', label: 'Sesotho sa Leboa' },
{ value: 'tn', label: 'Setswana' },
{ value: 'si', label: 'Sinhala' },
{ value: 'sk', label: 'Slovak' },
{ value: 'sl', label: 'Slovenian' },
{ value: 'es', label: 'Spanish' },
@@ -108,22 +107,22 @@ const languages = [
{ value: 'tt', label: 'Tatar' },
{ value: 'te', label: 'Telugu' },
{ value: 'th', label: 'Thai' },
{ value: 'bo-cn', label: 'Tibetan (PRC)' },
{ value: 'bo', label: 'Tibetan' },
{ value: 've', label: 'Tshivenḓa' },
{ value: 'tr', label: 'Turkish' },
{ value: 'tk-tm', label: 'Turkmen (Turkmenistan)' },
{ value: 'ug-cn', label: 'Uighur (PRC)' },
{ value: 'tk', label: 'Turkmen' },
{ value: 'ug', label: 'Uighur' },
{ value: 'uk', label: 'Ukrainian' },
{ value: 'wen-de', label: 'Upper Sorbian (Germany)' },
{ value: 'wen', label: 'Upper Sorbian' },
{ value: 'ur', label: 'Urdu' },
{ value: 'uz', label: 'Uzbek' },
{ value: 'vi', label: 'Vietnamese' },
{ value: 'cy', label: 'Welsh' },
{ value: 'wo-sn', label: 'Wolof (Senegal)' },
{ value: 'wo', label: 'Wolof' },
{ value: 'ts', label: 'Xitsonga'},
{ value: 'sah-ru', label: 'Yakut (Russia)' },
{ value: 'ii-cn', label: 'Yi (PRC)' },
{ value: 'yo-ng', label: 'Yoruba (Nigeria)' }
{ value: 'sah', label: 'Yakut' },
{ value: 'ii', label: 'Yi' },
{ value: 'yo', label: 'Yoruba' }
];

export default languages;