Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VPN-6649: fix translations for cities with special characters #10029

Merged
merged 2 commits into from
Nov 15, 2024

Conversation

mcleinman
Copy link
Collaborator

@mcleinman mcleinman commented Nov 12, 2024

Description

Malmö is the only city with special characters in the name. Our build scripts should handle this fine - and we use Servers.Malm for the string ID. It builds perfectly on my machine, but was failing on TaskCluster. This failure was due to taskcluster not handling special characters in string IDs in the same way my machine does, apparently. While special characters should be removed from string IDs, they weren't being removed.

This PR adds "cleaning of special characters" on the front end - before we pass the string IDs to the translation repo. Thus, we'll now keep the ID as servers.Malm for the round trip journey from our client repo to the translation repo and back.

VPN-6649 will be fixed as 4 separate PRs, merged in order:

  1. A PR on the translation repo to copy all existing translations of Malmö from servers.Malmö to servers.Malm (Copy servers.Malmö translations to servers.Malm mozilla-l10n/mozilla-vpn-client-l10n#488)
  2. A fixit PR on the translation repo: https://github.com/mozilla-l10n/mozilla-vpn-client-l10n/pull/490/files
  3. This PR, which will update the translation script.
  4. Another PR on the translation repo to remove all the legacy servers.Malmö translations.

More details from my research:

  1. In Localizer::getTranslatedCityName, we have the following values on my machine on main branch: cityName Malmö / parsedCityName Malm / i18nCityId ServersMalm / value Μάλμε. On TaskCluster (on main), we get the same ones with one exception - value is Malmö.
  2. Digging deeper into how value is created here, this seems to imply it's an issue with the translation that Qt is pulling out.
  3. Looking at how we build our translation files, we have this line being build on TaskCluster: <message id="servers.Malmö"> When building locally, I get the same thing. This is in build/translations/generated/mozillavpn_el.ts. This makes sense, as servers.Malmö is the string ID used in the translation repo.
  4. After the .ts file is created in our build process, we use lconvert to convert it to a binary qm file. qm files are binary, and are somewhat of a black box to us.
  5. It seems like somewhere in lconvert or loading the files when running the client, special characters should get stripped out from the translation ID. And they are on my machine, but not on TaskCluster. In both cases, what happens at this level is somewhat unknown, as they get deep Qt's handling of translations - it's deeper than our translation scripts (for creating translation files or loading them into the client).

(A different approach to solving this bug was taken in #9964 - that PR is being closed in favor of this one.)

Reference

VPN-6649

Checklist

  • My code follows the style guidelines for this project
  • I have not added any packages that contain high risk or unknown licenses (GPL, LGPL, MPL, etc. consult with DevOps if in question)
  • I have performed a self review of my own code
  • I have commented my code PARTICULARLY in hard to understand areas
  • I have added thorough tests where needed

@mcleinman
Copy link
Collaborator Author

This is ready for review, but I'm leaving it on draft to ensure we do not merge it before mozilla-l10n/mozilla-vpn-client-l10n#488 is merged.

@mcleinman mcleinman mentioned this pull request Nov 12, 2024
5 tasks
@@ -61,8 +61,15 @@ def fetch_server_list():
string_map = countries
for city in cities:
# Remove state suffix, capitalize each work, remove spaces.
id = city.split(",")[0].strip().title().replace(" ", "")
string_map[id] = city
baseId = city.split(",")[0].strip().title().replace(" ", "")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be able to obtain the same with this code

        id = city.split(",")[0].strip().title().replace(" ", "")
        id = re.sub(r'[^A-Za-z\s]', '', id)
        string_map[id] = city

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call, thanks! Updated this.

I didn't include the white space check (\s), as I don't think we need it. We remove all spaces in the first line (.replace(" ", "")). While no current city names have any other whitespace characters, we'd want to remove them if they did.

@mcleinman mcleinman marked this pull request as ready for review November 13, 2024 17:08
@mcleinman
Copy link
Collaborator Author

Off draft, as the first l10n PR is merged.

@mcleinman mcleinman force-pushed the vpn-6649-remove-special-characters-before-translation branch from 106f33d to b6c928c Compare November 13, 2024 19:08
Copy link
Collaborator

@oskirby oskirby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

@mcleinman mcleinman merged commit c9dcb20 into main Nov 15, 2024
113 of 114 checks passed
@mcleinman mcleinman deleted the vpn-6649-remove-special-characters-before-translation branch November 15, 2024 00:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants