T2T 1.4.1 transformer beam search result different with 1.3.2 #525
I have also noticed a huge BLEU drop between T2T versions 1.2.9 and 1.4.2.
I can confirm this. I used v1.1.7 and got a BLEU of 47.66 on my ASPEC Chinese-Japanese task, whereas with v1.4 I get 36.87. And as @martinpopel says, the training diverges after a few thousand iterations. It's as if it only looks at a fraction of the data shards and overfits on them. AFAIK in the new version the default number of shards is 100, and I suspect that the current code only reads 10 of those shards and overfits on them. Has anyone else observed this problem?
I found out the bug was introduced in T2T 1.3.0. See the graph below, where the upper curve is v1.2.9 and the lower is v1.3.0; all hyperparams are exactly the same.
@martinpopel GG |
I realize the bug we are discussing now is a different one from the one in the title of this issue and the first post, which is about a v1.3.2 vs v1.4.1 discrepancy.
Yeah, not good that the beam search deteriorated. Not sure what the issue might be, though. Did you use the exact same checkpoint? If you retrained, then the issue that @martinpopel found may be the culprit. If not, then that's a bit mysterious; it would probably mean that some logic in the decode path changed.
I have trained a transformer translation model with t2t 1.3.2.
Now I want to return every beam search result and its score, so I updated my t2t version to 1.4.1. I used the same model, but got different results in some cases, and the overall BLEU decreases.
Can someone help me?
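For context on the "return every beam search result and score" part: recent T2T versions expose decode hyperparameters for this (flags along the lines of `return_beams` and `write_beam_scores`; check `decoding.py` in your installed version, since names may differ between releases). Independently of T2T's implementation, the idea of keeping all beams with their cumulative log-probability scores can be sketched like this. All names here are illustrative, and the toy model is made up for the example:

```python
# Minimal beam-search sketch (NOT the T2T implementation): keep every
# hypothesis together with its cumulative log-probability, and return
# all of them sorted by score instead of only the best one.
import math

def beam_search(next_log_probs, start, eos, beam_size, max_len):
    """next_log_probs(seq) -> {token: log_prob} for the next step."""
    beams = [([start], 0.0)]          # (sequence, cumulative log-prob)
    finished = []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for tok, lp in next_log_probs(seq).items():
                candidates.append((seq + [tok], score + lp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_size]:
            # Hypotheses that produced EOS are done; others keep expanding.
            (finished if seq[-1] == eos else beams).append((seq, score))
        if not beams:
            break
    # Return ALL hypotheses with scores, best first.
    return sorted(finished + beams, key=lambda c: c[1], reverse=True)

# Toy model: after <s>, emit "a" (p=0.7) or "b" (p=0.3), then always </s>.
def toy(seq):
    if len(seq) == 1:
        return {"a": math.log(0.7), "b": math.log(0.3)}
    return {"</s>": 0.0}

results = beam_search(toy, "<s>", "</s>", beam_size=2, max_len=3)
for seq, score in results:
    print(seq, round(score, 3))
# prints:
# ['<s>', 'a', '</s>'] -0.357
# ['<s>', 'b', '</s>'] -1.204
```

If the regression you see is only in decoding (same checkpoint, different T2T version), comparing the per-beam scores between versions like this can show whether the search itself changed or only the final selection did.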