Evaluation fails on a pre-trained backward model #628

eu9ene · 2024-05-23T23:36:03Z

https://firefox-ci-tc.services.mozilla.com/tasks/CEUR_rZ1Qty22JNz3JC-mw

We shouldn't run evals for pre-trained models at all. This didn't happen before, so something broke in training continuation.

This is not critical as it does not block other tasks.

gabrielBusta · 2024-07-10T19:01:34Z

Hmm, maybe it's because continuation used to be done at graph generation time rather than at run-time? Perhaps we can prune these eval tasks from the graph using its parameters. Alternatively, we could have the eval tasks exit successfully without doing anything if they detect that the model was pretrained.

eu9ene · 2024-07-10T20:52:32Z

We should remove any redundant tasks from the graph. We can assume the pre-trained model has already been evaluated.
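For illustration only, the pruning idea could look roughly like the sketch below. It is not the project's actual fix: the label prefixes, the `EVAL_PREFIX_BY_STAGE` mapping, the `prune_pretrained_evals` helper, and the shape of the `pretrained_models` mapping are all assumptions made for this example, and the real task graph code may identify evaluation tasks differently.

```python
from typing import Iterable, List, Mapping

# Hypothetical mapping from a training stage to the label prefix of the
# evaluation tasks that depend on it. These names are assumptions for the
# sketch, not taken from the real graph.
EVAL_PREFIX_BY_STAGE = {
    "train-backwards": "evaluate-backward-",
    "train-teacher": "evaluate-teacher-",
}


def prune_pretrained_evals(
    task_labels: Iterable[str],
    pretrained_models: Mapping[str, object],
) -> List[str]:
    """Drop evaluation tasks whose model comes from a pre-trained source.

    `pretrained_models` is assumed to be keyed by training stage, with one
    entry per stage that reuses an existing model instead of training one.
    """
    # Collect the prefixes of evaluation tasks that are now redundant.
    skipped = tuple(
        prefix
        for stage, prefix in EVAL_PREFIX_BY_STAGE.items()
        if stage in pretrained_models
    )
    # str.startswith accepts a tuple; an empty tuple matches nothing,
    # so the graph is returned unchanged when no model is pre-trained.
    return [label for label in task_labels if not label.startswith(skipped)]


# Example with made-up task labels and a pre-trained backward model.
labels = [
    "train-backwards-ru-en",
    "evaluate-backward-sacrebleu-wmt09-ru-en",
    "train-teacher-ru-en-1",
    "evaluate-teacher-flores-devtest-ru-en-1",
]
remaining = prune_pretrained_evals(labels, {"train-backwards": {"mode": "use"}})
```

In this example only the backward evaluation is dropped, on the assumption stated above that the pre-trained model was already evaluated when it was first trained.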

bhearsum · 2024-07-29T11:30:42Z

> Hmm, maybe it's because continuation used to be done at graph generation time rather than at run-time? Perhaps we can prune these eval tasks from the graph using its parameters. Alternatively, we could have the eval tasks exit successfully without doing anything if they detect that the model was pretrained.

As far as I can tell, the run that was linked to is not using runtime continuation. I suspect this regressed with one of the recent-ish changes to train.py: https://github.com/mozilla/firefox-translations-training/commits/main/taskcluster/translations_taskgraph/actions/train.py

This is prep work for mozilla#628, where I'd like to add some tests to avoid regressing that again in the future. The fixtures here are based on similar tests from Gecko: https://searchfox.org/mozilla-central/source/taskcluster/test. There's a bit of a terrible hack to make optimized task graphs testable, described more in the comments.

eu9ene added the bug and taskcluster labels May 23, 2024

eu9ene mentioned this issue Jun 25, 2024

[meta] Make the pipeline reliable enough to train many languages #311

Open

bhearsum mentioned this issue Jul 30, 2024

feat: add scaffolding and basic tests for taskgraph generation #776

Merged

bhearsum self-assigned this Jul 31, 2024

bhearsum mentioned this issue Jul 31, 2024

fix: don't run evaluate tasks on pretrained models #781

Merged

bhearsum closed this as completed in #781 Sep 4, 2024
