Validate alignments - Segmentation fault with addAlignmentsToBatch #450
Comments
The crash appears to be in inlined code from: https://github.com/marian-nmt/marian/blob/65bf82ffce52f4854295d8b98482534f176d494e/src/data/corpus_base.cpp#L468-L487
Just kidding, I just didn't understand the format.
I've validated the correctness of the generated alignments. https://firefox-ci-tc.services.mozilla.com/tasks/b-7CDsKNQ_Cn7wf3RcxDXw#artifacts
This is still happening with OpusTrainer, even with no augmentation. I worked around it by removing OpusTrainer and using Marian directly in my training, but we should fix this to support augmentation in teachers.
Was this fixed by #491?
Yes, the students are training now and we don't see this error after the Marian update.
* Enables model ensembles
  Adds the ability to use ensembles of models. This supports ensembles of binary- or npz-format models, as well as mixtures of both. When all models in the ensembles are of binary format, the load from memory path is used. Otherwise, they are loaded via the file system. Enable log-level debug for output related to this.
* Fix formatting
* Fix WASM bindings for MemoryBundle
  For now, this does not support ensembles.
* Remove shared_ptr wrapping the AlignedMemory of models.
* Fix formatting
In the en-ca training (#384), I'm getting a crash in Marian from the alignments.
Taskcluster Log
In this message from Jorg: https://groups.google.com/g/marian-nmt/c/PjA-rQJ3Oio
So I suspect there is an alignment that is broken somehow in our code. We should validate the alignments. I'll investigate.
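A rough sketch of what such a validation could look like, assuming the alignments are in Pharaoh format (`0-0 1-2 ...`, zero-based `source-target` pairs) and that token positions correspond to whitespace-separated tokens on each line. Both are assumptions, and the function names are hypothetical; if the corpus is tokenized differently before Marian sees it, the token counts would need to come from that tokenization instead.

```python
from typing import List, Tuple


def parse_alignment(line: str) -> List[Tuple[int, int]]:
    """Parse one Pharaoh-style alignment line into (src, trg) index pairs."""
    pairs = []
    for item in line.split():
        src, sep, trg = item.partition("-")
        if not sep:
            raise ValueError(f"malformed alignment pair: {item!r}")
        pairs.append((int(src), int(trg)))
    return pairs


def find_invalid_pairs(src_line: str, trg_line: str, aln_line: str) -> List[Tuple[int, int]]:
    """Return alignment pairs whose indices fall outside the sentence lengths.

    Assumption: one token per whitespace-separated word on each side.
    """
    n_src = len(src_line.split())
    n_trg = len(trg_line.split())
    return [
        (s, t)
        for s, t in parse_alignment(aln_line)
        if not (0 <= s < n_src and 0 <= t < n_trg)
    ]
```

Running `find_invalid_pairs` over each (source, target, alignment) triple in the corpus and reporting the line numbers with a non-empty result would point at any sentence pair whose alignment indexes past the end of a sentence.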