Skip to content

Commit

Permalink
docs: rejected proposal the fuzzy strings
Browse files Browse the repository at this point in the history
  • Loading branch information
OmarIthawi committed Nov 4, 2023
1 parent 4541357 commit 8599611
Show file tree
Hide file tree
Showing 6 changed files with 150 additions and 26 deletions.
21 changes: 5 additions & 16 deletions .github/workflows/automerge-transifex-app-prs.yml
Original file line number Diff line number Diff line change
Expand Up @@ -15,26 +15,15 @@ jobs:
uses: actions/checkout@v3
with:
fetch-depth: 0
- name: merge pull request
uses: nick-fields/retry@v2
id: mergePR

- name: Auto-merge pull request
env:
# secrets can't be used in job conditionals, so we set them to env here
TRANSIFEX_APP_ACTOR_NAME: "${{ secrets.TRANSIFEX_APP_ACTOR_NAME }}"
TRANSIFEX_APP_ACTOR_ID: "${{ secrets.TRANSIFEX_APP_ACTOR_ID }}"
# This token requires Write access to the openedx-translations repo
GITHUB_TOKEN: ${{ secrets.EDX_TRANSIFEX_BOT_GITHUB_TOKEN }}
PR_NUMBER: ${{ github.event.number }}
if: "${{ github.actor == env.TRANSIFEX_APP_ACTOR_NAME && github.actor_id == env.TRANSIFEX_APP_ACTOR_ID }}"
with:
retry_wait_seconds: 60
max_attempts: 5
timeout_minutes: 15
retry_on: error
command: |
# Attempt to merge the PR
gh pr merge ${{ github.head_ref }} --rebase --auto
# The `fromdate | todate` are used merge to validate that `mergedAt` isn't null
# therefore verifying that the pull request was merged successfully.
gh pr view "$PR_NUMBER" --json mergedAt --jq '.mergedAt | fromdate | todate'
run: |
# Add the pull request to the merge queue with --rebase commit strategy
gh pr merge ${{ github.head_ref }} --rebase --auto
2 changes: 1 addition & 1 deletion .github/workflows/validate-translation-files.yml
Original file line number Diff line number Diff line change
Expand Up @@ -50,7 +50,7 @@ jobs:
:warning: There are errors in the translation files:
```
${{ steps.validate_translation_files.outputs.VALIDATION_ERRORS }}
${{ steps.validate_translation_files.outputs.VALIDATION_ERRORS || 'No errors were reported.' }}
```
This comment has been posted by the `validate-translation-files.yml` GitHub Actions workflow.
2 changes: 1 addition & 1 deletion Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,7 @@ test_requirements: ## Installs test.txt requirements
test: ## Run scripts tests
pytest -v -s --cov=. --cov-report=term-missing --cov-report=html scripts/tests

validate_translation_files: ## Run basic validation to ensure files are compilable
validate_translation_files: ## Run basic validation to ensure translation files are valid
python scripts/validate_translation_files.py

sync_translations: ## Syncs from the old projects to the new openedx-translations project
Expand Down
116 changes: 116 additions & 0 deletions docs/decisions/0002-mark-fuzzy-entries.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,116 @@
Mark invalid entries in po files as fuzzy
#########################################

Status
******

Rejected.

Context
*******
As of the `OEP-58`_, the Transifex GitHub App is used to sync Translations.
These translations are validated by `LexiQA`_, built-in Transifex quality
checks, and human reviewers. During this synchronization process, the
`Transifex GitHub App`_ creates pull requests to this repository. The
translations in the pull requests are then validated by ``msgfmt`` in CI.

Problem
*******
There are times when the translations being synchronized fail ``msgfmt``
validation. This prevents the pull requests from being merged.


Default Workflow: Keep pull requests open
********************************************

By default pull requests will be kept open and the translators will
need to fix the invalid entries and the `Transifex GitHub App`_ will
update the pull request with the new changes and merge it automatically
if there are no more errors.

To improve the Translators experience, a notification will be sent to
for invalid entries. An issue has been created to track this work:
`Notify translators about invalid entries`_.

Rejected Alternative: Mark invalid entries as fuzzy
***************************************************

Rejection Reason
================

This design decision was rejected because there were multiple issues that
had no feasible solution within the original scope of the `OEP-58`_
proposal. These issues are listed below:

- The workflow script edits the translations and effectively hide the
errors, which can be confusing for the translators. Therefore,
a good notification should be
sent to translators to avoid silencing errors.
- Writing to the pull request will not trigger the GitHub Actions workflow,
therefore the pull request cannot be merged unless the commit status
is overridden manually via `GitHub "create a commit status" API`_.
- The solution is ``.po`` file specific. And there's no similar solution for
``.json``, ``.yaml``, iOS, Android or other translations files.

Please refer to the original pull request which contains the rejected
implementation: `mark invalid entries as fuzzy | FC-0012`_.


Proposed Design
===============

A GitHub Actions workflow will be implemented to mark invalid entries in
synchronized ``.po`` files as ``fuzzy``. This will update pull requests
created by the `Transifex GitHub App`_.

To ensure a safe and reliable workflow, the following workflows will be
combined into one single workflow with multiple jobs:

#. Run ``python-tests.yml`` to validate the python code
#. Then, run ``validate-translation-files.yml`` which performs the following:

#. Validate the po-files using ``msgfmt``
#. Notify the translators about the invalid entries via the preferred
communication channel (Slack, Transifex, GitHub)
#. Edit the po files to mark invalid entries as ``fuzzy``, so it's
excluded from ``.mo`` files
#. Revalidate the files

#. Commit the updated entries and push to the PR branch
#. Automatically merge the PR


Informing the Translators
=========================
Translators can be notified about invalid translations via Slack, Transifex
or GitHub issues depending on the community's preference.

Pros and Cons
=============

**Pros**

* New valid strings would make it into the ``.mo`` files
* There's no need for manual intervention, therefore it's fast and won't
create a backlog of pull requests.
* Rejected strings are easily identifiable by looking in the code, so it's
easy to fix them.
* Translators can be notified about invalid translations via Slack, Transifex,
GitHub depending on the community's preference.


**Cons**

* The workflow script runs and edits the pull request, which can be
confusing for the reviewers.
* Previously valid entries are going to be overwritten with new ``fuzzy``
strings which will make those entries to be shown in source language
(English).


.. _OEP-58: https://open-edx-proposals.readthedocs.io/en/latest/architectural-decisions/oep-0058-arch-translations-management.html
.. _LexiQA: https://help.transifex.com/en/articles/6219179-lexiqa
.. _Transifex GitHub App: https://github.com/apps/transifex-integration
.. _GitHub "create a commit status" API: https://docs.github.com/en/rest/commits/statuses?apiVersion=2022-11-28#create-a-commit-status
.. _mark invalid entries as fuzzy | FC-0012: https://github.com/openedx/openedx-translations/pull/1944
.. _Notify translators about invalid entries: https://github.com/openedx/openedx-translations/issues/2130
10 changes: 5 additions & 5 deletions scripts/tests/test_validate_translation_files.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,10 +3,11 @@
"""

import os.path
import re

from ..validate_translation_files import (
get_translation_files,
main,
validate_translation_files,
)

SCRIPT_DIR = os.path.dirname(__file__)
Expand Down Expand Up @@ -35,7 +36,7 @@ def test_main_on_invalid_files(capsys):
Integration test for the `main` function on some invalid files.
"""
mock_translations_dir = os.path.join(SCRIPT_DIR, 'mock_translations_dir')
exit_code = main(mock_translations_dir)
exit_code = validate_translation_files(mock_translations_dir)
out, err = capsys.readouterr()

assert 'VALID:' in out, 'Valid files should be printed in stdout'
Expand All @@ -44,8 +45,7 @@ def test_main_on_invalid_files(capsys):
assert 'hi/LC_MESSAGES/django.po' not in out, 'Invalid file should be printed in stderr'
assert 'en/LC_MESSAGES/django.po' not in out, 'Source file should not be validated'

assert 'INVALID:' in err
assert 'hi/LC_MESSAGES/django.po' in err
assert re.match(r'INVALID: .*hi/LC_MESSAGES/django.po', err)
assert '\'msgstr\' is not a valid Python brace format string, unlike \'msgid\'' in err
assert 'FAILURE: Some translations are invalid.' in err

Expand All @@ -57,7 +57,7 @@ def test_main_on_valid_files(capsys):
Integration test for the `main` function but only for the Arabic translations which is valid.
"""
mock_translations_dir = os.path.join(SCRIPT_DIR, 'mock_translations_dir/demo-xblock/conf/locale/ar')
exit_code = main(mock_translations_dir)
exit_code = validate_translation_files(mock_translations_dir)
out, err = capsys.readouterr()

assert 'VALID:' in out, 'Valid files should be printed in stdout'
Expand Down
25 changes: 22 additions & 3 deletions scripts/validate_translation_files.py
Original file line number Diff line number Diff line change
@@ -1,7 +1,15 @@
import sys
"""
Validate translation files using GNU gettext `msgfmt` command.
This script is used to validate translation files in the Open edX platform and mark
invalid entries as fuzzy.
"""

import argparse
import os
import os.path
import subprocess
import sys


def get_translation_files(translation_directory):
Expand Down Expand Up @@ -39,7 +47,9 @@ def validate_translation_file(po_file):
}


def main(translations_dir='translations'):
def validate_translation_files(
translations_dir='translations',
):
"""
Run GNU gettext `msgfmt` and print errors to stderr.
Expand Down Expand Up @@ -81,5 +91,14 @@ def main(translations_dir='translations'):
return exit_code


def main(): # pragma: no cover
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument('--dir', action='store', type=str, default='translations')
args = parser.parse_args()
sys.exit(validate_translation_files(
translations_dir=args.dir,
))


if __name__ == '__main__':
sys.exit(main()) # pragma: no cover
main() # pragma: no cover

0 comments on commit 8599611

Please sign in to comment.