Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add comprehensive internationalization and localization validation checks #4562

Open
6 tasks
BenHenning opened this issue Sep 7, 2022 · 0 comments
Open
6 tasks
Labels
bug End user-perceivable behaviors which are not desirable. Impact: High High perceived user impact (breaks a critical feature or blocks a release). Work: Medium The means to find the solution is clear, but it isn't at good-first-issue level yet. Z-ibt Temporary label for Ben to keep track of issues he's triaged.

Comments

@BenHenning
Copy link
Member

BenHenning commented Sep 7, 2022

TODO: fill in this description from my original WIP TDD.

(From #4992) We've had past strings break due to having incorrect format specifiers and some HTML formatting problems (though Translatewiki mostly covers those problems). We still need much more robust checks for strings--these haven't been implemented yet.


Notes from @seanlip: We should implement at least the checks that we do on Web. We have found breaking changes due to incorrectly-formatted translatewiki strings that result in last-minute release delays (see e.g. #5034) so this issue is bumped to high priority and is now being treated as a bug. Here are the checks that we currently do on Web (in some case, an analogous check is listed below since the translation file formats are different):

  • There is at least one key in the English resource file (app/src/main/res/values/strings.xml).
  • All resources are correctly formatted (in terms of the XML tag structure).
  • The keys in the qq file (see Add descriptions to strings #3660) exactly match those in the English file (nothing is missing)
  • For all other resource files in app/src/main/res/values-* (besides the qq file):
    • [Most important -- tackle this first] HTML tags in translations are correctly preserved.
      • Rough algorithm: look for CDATA tags; remove anything apart from <...>; compare both sets of tags. See https://github.com/oppia/oppia/blob/develop/core/controllers/base_test.py#L1045 for reference. Note that we could make this stronger by requiring that e.g. ol/li tags appear in the same order as the original, though we might not want to specify that the full set of tags has the same order due to differences in word order between different languages.
    • Any % strings in the English resource file (e.g. %1$s, %s) are correctly preserved.
    • The keys are a subset of the keys in the English resource file.
    • All keys in each file are sorted alphabetically (not strictly needed, but helps with eyeballing and manual investigation of issues) -- EDIT: this might actually not be desired, see https://github.com/oppia/oppia-android/pull/4823/files. Removing the checkbox associated with it for now.
@BenHenning BenHenning self-assigned this Sep 7, 2022
BenHenning added a commit that referenced this issue Sep 8, 2022
## Explanation
Fixes part of #4562.

This PR introduces two new scripts:
- One for computing how many strings are not yet translated for Arabic, Brazilian Portuguese, and Swahili (as compared to the base English strings)
- One for verifying newline consistency between the English & non-English strings (based on number of lines)

The latter was also added as one of the jobs to be run during the static checks CI workflow, and is specifically useful to ensure newlines weren't inadvertently added by translators (see the updated translation strings for an idea on how often this has happened). Note that, as part of fixing these strings, the source string for lets_get_started was updated to no longer include a newline. This seems reasonable given that we should never use newlines for spacing/styling unless it's for logical splits (like paragraphs). However, I can't verify whether actual style changes are needed since this string is used as part of the unreleased app walkthrough feature. I expect that we'll re-audit the UI of that feature when we revisit it.

This PR also includes a major performance fix for RepositoryFile that will benefit all checks that use it. Kotlin's built-in file tree walker is built to implicitly follow symlinks on the filesystem. For Bazel builds, this expands the codebase to >1M files (since directory filtering happens *after* the files are known). Since scripts don't actually need to follow symlinks, the utility was updated to specifically not follow them (using Java NIO's file tree walk routine) which led to an observed ~10x performance improvement. Further optimizations could be done by building our own tree-walker and filtering directories during search (which will probably be necessary if we ever _do_ need to follow symlinks, but it probably won't be needed for a long time since the current performance is quite solid).

The former script is useful when manually auditing the translation progress using the codebase as the source of truth rather than Translatewiki. For now, it's been helpful in beta MR1 planning but longer-term it may also be useful as a badge on the repo's README.

Finally, the Kotlin stdlib dependency was updated to point to jdk8 instead of jdk7 (since a specific feature was needed from the former, and the codebase can rely on higher Java SDK versions than 7 given that the minimum Java version is 1.8).

## Essential Checklist
- [x] The PR title and explanation each start with "Fix #bugnum: " (If this PR fixes part of an issue, prefix the title with "Fix part of #bugnum: ...".)
- [x] Any changes to [scripts/assets](https://github.com/oppia/oppia-android/tree/develop/scripts/assets) files have their rationale included in the PR explanation.
- [x] The PR follows the [style guide](https://github.com/oppia/oppia-android/wiki/Coding-style-guide).
- [x] The PR does not contain any unnecessary code changes from Android Studio ([reference](https://github.com/oppia/oppia-android/wiki/Guidance-on-submitting-a-PR#undo-unnecessary-changes)).
- [x] The PR is made from a branch that's **not** called "develop" and is up-to-date with "develop".
- [x] The PR is **assigned** to the appropriate reviewers ([reference](https://github.com/oppia/oppia-android/wiki/Guidance-on-submitting-a-PR#clarification-regarding-assignees-and-reviewers-section)).

## For UI-specific PRs only
N/A -- this is a developer infrastructure-only PR, except for all of the string newline fixes (which just result in a more correct UI for non-English), and the changed English string (which is inaccessible to demonstrate). Screenshots could be uploaded, but the newlines need to be removed regardless so it doesn't seem specifically useful here (plus those languages may not be obvious to verify for reviewers).

Commits:

* Create dedicated alpha application component.

This simplifies application component management significantly and
allows individual build flavors to have their own unique module lists.

* Add beta & GA update notices.

This also introduces dedicated beta & GA build flavors which is a
necessary prerequisite.

It also introduces an extra beta, alpha, and dev mode labels for the splash
screen (the latter 2 were extra) with 2 second minimum wait timers for
beta and alpha to ensure they are seen. A 5-second safety timer was
added to ensure the splash screen can always be passed even if something
goes wrong at the domain level (since there are now quite a few moving
pieces to determine the user's current onboarding state).

* Add build tests for the new beta & GA flavors.

* Fix broken test per earlier changes.

* Fix general broken tests & builds.

Tests broken due to changes to the app startup experience haven't yet
been fixed.

* Lint fixes.

* First part of adding tests for GA notices.

There's a bunch left to do here, this is mainly needed so that I can
transfer changes to a different machine.

* Update TransformAndroidManifestTest.kt

Correct typos.

* Fix tests & static checks.

This also removes temporary debug code and TODOs, and finishes the tests
for SplashActivity.

* Post-merge fixes.

* Test fixes.

* Fix Gradle test.

* Add some string resource checks/tools.

Also, fixes major performance issue with all file-based CI checks.

* Ensure newline consistency in translated strings.

Also, fix reporting in new validation check script.

* Add tests & fix static checks.

* Follow-up adjustments after self-review.
@BenHenning BenHenning added issue_type_infrastructure Impact: Low Low perceived user impact (e.g. edge cases). Issue: Needs Clarification Indicates that an issue needs more detail in order to be able to be acted upon. Z-ibt Temporary label for Ben to keep track of issues he's triaged. labels Sep 14, 2022
@seanlip seanlip added enhancement End user-perceivable enhancements. and removed issue_type_infrastructure labels Mar 28, 2023
@seanlip seanlip added bug End user-perceivable behaviors which are not desirable. Impact: High High perceived user impact (breaks a critical feature or blocks a release). Work: Medium The means to find the solution is clear, but it isn't at good-first-issue level yet. and removed Impact: Low Low perceived user impact (e.g. edge cases). enhancement End user-perceivable enhancements. labels Jun 9, 2023
@seanlip seanlip removed the Issue: Needs Clarification Indicates that an issue needs more detail in order to be able to be acted upon. label Jun 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug End user-perceivable behaviors which are not desirable. Impact: High High perceived user impact (breaks a critical feature or blocks a release). Work: Medium The means to find the solution is clear, but it isn't at good-first-issue level yet. Z-ibt Temporary label for Ben to keep track of issues he's triaged.
Development

No branches or pull requests

2 participants