[tracking ERROR] Publication failed: Invalid config section: while scanning a simple key #554

Closed
Tracked by #164
bhearsum opened this issue May 2, 2024 · 2 comments · Fixed by #583
Labels
bug (Something is broken or not correct), weights and biases (Integration with Weights and Biases)

Comments

Collaborator

bhearsum commented May 2, 2024

From https://firefox-ci-tc.services.mozilla.com/tasks/CcBRvVfDT229U-iLBThC5w/runs/2:

[task 2024-05-02T00:40:40.350Z] [tracking ERROR] Publication failed: Invalid config section: while scanning a simple key
[task 2024-05-02T00:40:40.350Z]   in "<unicode string>", line 493, column 1:
[task 2024-05-02T00:40:40.350Z]     Loaded model has been created wi ... 
[task 2024-05-02T00:40:40.350Z]     ^
[task 2024-05-02T00:40:40.350Z] could not find expected ':'
[task 2024-05-02T00:40:40.350Z]   in "<unicode string>", line 495, column 1:
[task 2024-05-02T00:40:40.350Z]     
[task 2024-05-02T00:40:40.350Z]     ^

Possibly of note, this is a run where I am testing autocontinuation after a spot termination. Run 1 published fine, run 2 downloaded the artifacts from run 1, and then started to train. I wonder if this is a general issue with publication when continuing an existing training?

eu9ene added the bug and weights and biases labels on May 2, 2024
Collaborator

eu9ene commented May 9, 2024

@La0 @vrigal OK, we see this consistently. It seems the parser just can't handle this line:

[task 2024-05-09T20:18:17.301Z] [2024-05-09 20:18:17] [config] Loaded model has been created with Marian v1.12.14 2d067af 2024-02-16 11:44:13 -0500

https://firefox-ci-tc.services.mozilla.com/tasks/MuXYgqjUQ4G-HVTnSyzvgw/runs/1/logs/live/public/logs/live.log

We should make the parser more robust so that it tolerates Marian log lines it doesn't understand. This particular line does not look like part of the config, but it still shouldn't make publication fail.
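To make the idea concrete, here is a minimal sketch of a tolerant pre-filter, not the actual parser code from this repo. It assumes the config section is built by collecting the payload of every `[config]` log line and handing the joined text to a YAML parser. The function name `extract_config_text`, both regexes, and the sample keys are hypothetical and only for illustration. Lines that do not look like a YAML mapping entry or list item are set aside rather than passed on.

```python
import re

# Hypothetical helper: finds the payload after the "[config]" tag,
# whatever Taskcluster/Marian timestamp prefixes come before it.
CONFIG_LINE = re.compile(r"\[config\] ?(?P<payload>.*)$")

# A payload that plausibly belongs to a YAML mapping:
# "key: value", "key:" or an (indented) "- item".
YAML_LIKE = re.compile(r"^\s*(-\s+\S.*|[^\s:#][^:]*:(\s.*)?)$")


def extract_config_text(log_lines):
    """Return (yaml_text, skipped) for the [config] lines of a Marian log.

    Payloads that do not look like YAML are collected in `skipped`
    so they can be logged as warnings instead of breaking the parse.
    """
    kept, skipped = [], []
    for line in log_lines:
        match = CONFIG_LINE.search(line.rstrip("\n"))
        if match is None:
            continue  # not a [config] line at all
        payload = match.group("payload")
        if YAML_LIKE.match(payload):
            kept.append(payload)
        else:
            skipped.append(payload)
    return "\n".join(kept), skipped


if __name__ == "__main__":
    sample = [
        "[2024-05-09 20:18:17] [config] after: 0e",
        "[2024-05-09 20:18:17] [config] Loaded model has been created with "
        "Marian v1.12.14 2d067af 2024-02-16 11:44:13 -0500",
    ]
    text, skipped = extract_config_text(sample)
    print(text)
    print("skipped:", skipped)
```

With a filter like this, the "Loaded model has been created with ..." line would end up in `skipped` and the remaining text would still be valid YAML. An alternative would be to catch the YAML error and retry without the offending line, but filtering up front keeps the failure mode simpler.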

Collaborator

eu9ene commented May 9, 2024

Also, this blocks landing https://github.com/mozilla/firefox-translations-training/pull/580/files as we want to make sure W&B continues tracking correctly after the training is restarted.
