Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VPN-6649 - Fix Malmö translation #9964

Closed
wants to merge 5 commits into from

Conversation

mcleinman
Copy link
Collaborator

@mcleinman mcleinman commented Oct 15, 2024

Description

Malmö is the only city with special characters in the name. Our build scripts should handle this fine - and we use Servers.Malm for the string ID. It builds perfectly on my machine, but was failing on TaskCluster.

I've spent perhaps a bit too long trying to get the to the root of the problem, before I put up this hacky fix. Here is what I know:

  1. In Localizer::getTranslatedCityName, we have the following values on my machine on main branch: cityName Malmö / parsedCityName Malm / i18nCityId ServersMalm / value Μάλμε. On TaskCluster (on main), we get the same ones with one exception - value is Malmö.
  2. Digging deeper into how value is created here, this seems to imply it's an issue with the translation that Qt is pulling out.
  3. Looking at how we build our translation files, we have this line being build on TaskCluster: <message id="servers.Malmö"> When building locally, I get the same thing. This is in build/translations/generated/mozillavpn_el.ts. This makes sense, as servers.Malmö is the string ID used in the translation repo.
  4. After the .ts file is created in our build process, we use lconvert to convert it to a binary qm file. qm files are binary, and are somewhat of a black box to us.
  5. It seems like somewhere in lconvert or loading the files when running the client, special characters should get stripped out from the translation ID. And they are on my machine, but not on TaskCluster. In both cases, what happens at this level is somewhat unknown, as they get deep Qt's handling of translations - it's deeper than our translation scripts (for creating translation files or loading them into the client).

The attached fix is hacky. I hate it. It's not generalizable. While we currently only have one server with a special character, if we ever got another one we'd need to replicate this fix for it. It's also deep in a build script, and is not immediately obvious where/how this is handled.

However, I chose this option for two reasons:
A. We rarely add new servers, and thus the liklihood of adding a new server with a special character in the name is minimal. It may happen once every many years, but isn't something we'll deal with regularly.
B. We could hackily change the string ID when pulling strings out into the translation repo. That would at least keep the ID as servers.Malm for most of the round trip journey from our client repo to the translation repo and back. However, this would mean that we'd lose the existing translation for Malmö, and translators would need to re-translate it under the new string ID. I wanted to avoid that.

Feel free to reject this PR, if you don't like the hacky solution here. I'm open to other methods, especially changing the string ID before it's sent to the translation repo (which comes with the downside of B above).

Reference

VPN-6649

Checklist

  • My code follows the style guidelines for this project
  • I have not added any packages that contain high risk or unknown licenses (GPL, LGPL, MPL, etc. consult with DevOps if in question)
  • I have performed a self review of my own code
  • I have commented my code PARTICULARLY in hard to understand areas
  • I have added thorough tests where needed

@mcleinman mcleinman requested a review from a team as a code owner October 15, 2024 23:40
@mcleinman mcleinman requested review from jcristau and removed request for a team October 15, 2024 23:40
@mcleinman mcleinman changed the title DRAFT: taskcluster check VPN-6649 - Fix Malmö translation Oct 16, 2024
@mcleinman mcleinman requested review from lesleyjanenorton and removed request for jcristau October 16, 2024 21:37
@mcleinman mcleinman requested review from oskirby and removed request for lesleyjanenorton October 24, 2024 17:38
Copy link
Collaborator

@oskirby oskirby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like this fix, it feels very brittle and it fills me with concern over what exactly is allowed to be input as a valid string ID. I would prefer that we change the servers.Malmö ID to just servers.Malmo and rely on the translations to turn that into Malmö as appropriate.

If it's just one city translation to deal with, it shouldn't be that hard to copy the files over in Pontoon as necessary. Or there more to it than just changing the string ID in the VPN app (eg: does the ID come from guardian or something?).

@mcleinman
Copy link
Collaborator Author

I don't like this fix, it feels very brittle and it fills me with concern over what exactly is allowed to be input as a valid string ID. I would prefer that we change the servers.Malmö ID to just servers.Malmo and rely on the translations to turn that into Malmö as appropriate.

If it's just one city translation to deal with, it shouldn't be that hard to copy the files over in Pontoon as necessary. Or there more to it than just changing the string ID in the VPN app (eg: does the ID come from guardian or something?).

I agree that it's waaaay too brittle.

The ID comes from the English name of the city. So we'd have to change the name of the city in the app to use servers.Malmo.

Alternatively, we could strip special characters before sending it out to be translated, so the translation string is servers.Malm. (Currently it's servers.Malmö and we strip out that special o when ingesting it back into the app, after it's been translated.) In this case, we'd still need to work w/ l10n team to copy over the translations from servers.Malmö to servers.Malm.

How does that sound? In my mind, it's a bit more generalizable (it should work for any future cities with a special character), though it would take work w/ the translations team. If this sounds like an okay approach to you, let me know and I'll close this PR in favor of a new one.

@oskirby
Copy link
Collaborator

oskirby commented Oct 28, 2024

I agree that it's waaaay too brittle.

The ID comes from the English name of the city. So we'd have to change the name of the city in the app to use servers.Malmo.

Alternatively, we could strip special characters before sending it out to be translated, so the translation string is servers.Malm. (Currently it's servers.Malmö and we strip out that special o when ingesting it back into the app, after it's been translated.) In this case, we'd still need to work w/ l10n team to copy over the translations from servers.Malmö to servers.Malm.

How does that sound? In my mind, it's a bit more generalizable (it should work for any future cities with a special character), though it would take work w/ the translations team. If this sounds like an okay approach to you, let me know and I'll close this PR in favor of a new one.

I am in favor of changing the city ID to servers.Malmo and we will just have to work with the translation team to copy the names over. If we simply drop the non-ASCII chars (eg: servers.Malm) then it might lead to even more confusing IDs for other cities in the future.

We probably also want to enforce some kind of lint to ensure that the IDs only contain ASCII characters or something to prevent this from happening again in the future.

@mcleinman
Copy link
Collaborator Author

I am in favor of changing the city ID to servers.Malmo and we will just have to work with the translation team to copy the names over. If we simply drop the non-ASCII chars (eg: servers.Malm) then it might lead to even more confusing IDs for other cities in the future.

We probably also want to enforce some kind of lint to ensure that the IDs only contain ASCII characters or something to prevent this from happening again in the future.

The city ID is a derivative of the English city name, which in turn comes directly from Mullvad. So we can't change the city ID to servers.Malmo without some gross custom hacky code (either when coming back from Mullvad API or when creating the server translation strings). (I'm reluctant to do a general solution like "convert special vowels to regular vowels" - this current bug is because TaskCluster apparently handles some special characters differently than my machine, and I want to remove as much ambiguity as possible as to what a "special character" is, and use a fairly broad definition.)

We already use a parsed city string ID of servers.Malm internally when running the app - we strip out the ö when processing the strings to bring into the actual app. (However, we don't do this before sending the string out for translation.)

Potential steps forward, since we're both agreed this is hacky:

  1. Write hacky code to change the string ID before sending it out for translation. This will be similarly hacky, but live elsewhere - update_server_names.py (for preparing strings to send to the server) and in Localizer::getTranslatedCityName (so when looking for Malmö we actually search for Malmo).
  2. Write hacky code to change the actual string that the Mullvad API returns to us, to not use the ö - which will allow for a servers.Malmo string ID. This has a major downside of changing what we actually show users in the app for this scity's name.
  3. Keep digging in to the inner workings of lconvert. (In my initial research, once I got to "something is happening in lconvert that we can't see/understand, I stopped investigating.) I looked at the lconvert documentation and searched the internet for a bit, and couldn't account for anything that would cause the different behavior between my machine and TaskCluster. But I'm not personally satisfied that I can't confidently say exactly why this is happening. I stopped my research because it didn't feel worthwhile to spend another half day or day on research for a bug of this magnitude, but could very happily dive back in.
  4. Other options I'm not considering?
  5. Merge this PR

My inclination is #1 or #4 (if you have other ideas), but I'm genuinely open to any of these paths.

@mcleinman
Copy link
Collaborator Author

Closing in favor of #10029

@mcleinman mcleinman closed this Nov 12, 2024
@mcleinman mcleinman deleted the vpn-6649-malmo-taskcluster-issue branch November 21, 2024 18:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants