Pythia regression in transformers==4.36.2 vs transformers==4.30.1 #28316
Comments
Sorry, but the source of the regression could be pretty much anything. If the model supports SDPA, it can come from SDPA; if the tokenizer had a bug before, it might be the tokenizer; and so on.
Could you try using
Also, the numbers you have don't really seem alarming, no?
I ran it with
and it did not seem to make a difference.
Yeah, but I guess this is why it's tricky: the numbers do not look that different, but the difference causes a significant regression in reward model training. Maybe the hidden-state indices are being messed up somehow? It's using
Oh sorry, if using output_hidden_states,
Hi @vwxyzjn!
I think it's 253f9a3. I'll fix the NaNs!
System Info
Happy New Year all!
transformers version: 4.36.2
accelerate
Who can help?
Maybe @younesbelkada @ArthurZucker?
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)
Reproduction
Here is a minimal reproduction https://gist.github.com/vwxyzjn/e67e0bb28363e6fbb309bd0b78922a93. I ran the same repro.py with transformers==4.36.2 and transformers==4.30.1, which resulted in slightly different losses, even though the data and other dependencies are precisely the same.
Regression in end-to-end reward model training performance
This difference causes a regression in training reward models. With the code and data held exactly the same, the average reward model accuracy across four random seeds is as follows:
The SFT losses are relatively similar (except perhaps for 6.9B, where there was a minor loss explosion with transformers==4.36.2).
Here is the report: https://wandb.ai/costa-huang/tldr_summarize/reports/pythia-transformers-regression--Vmlldzo2Mzk3OTQ1
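To put a number on "slightly different" or "relatively similar" losses, the per-step losses from the two runs could be compared along the following lines. This is only a sketch: it assumes each run's losses have already been collected into a plain Python list, the helper name `compare_losses` and the tolerance are hypothetical, and the values in the usage example are made-up placeholders, not results from this issue.

```python
import math


def compare_losses(losses_a, losses_b, rel_tol=1e-3):
    """Compare two equal-length loss sequences step by step.

    Returns the largest absolute difference over finite steps, the first
    step at which the two runs disagree beyond rel_tol (or contain a NaN),
    and whether any NaN was seen at all.
    """
    if len(losses_a) != len(losses_b):
        raise ValueError("loss sequences must have the same length")

    max_abs_diff = 0.0
    first_divergent_step = None
    has_nan = False

    for step, (a, b) in enumerate(zip(losses_a, losses_b)):
        if math.isnan(a) or math.isnan(b):
            # A NaN in either run counts as a divergence at this step.
            has_nan = True
            if first_divergent_step is None:
                first_divergent_step = step
            continue
        max_abs_diff = max(max_abs_diff, abs(a - b))
        if first_divergent_step is None and not math.isclose(a, b, rel_tol=rel_tol):
            first_divergent_step = step

    return {
        "max_abs_diff": max_abs_diff,
        "first_divergent_step": first_divergent_step,
        "has_nan": has_nan,
    }


# Placeholder values for illustration only (not taken from the real runs).
run_old = [2.50, 2.40, 2.30]
run_new = [2.50, 2.41, 2.30]
report = compare_losses(run_old, run_new)
print(report)
```

Reporting the first divergent step, rather than only a final average, helps separate a difference that is present from the very first forward pass from one that only builds up during training.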
Here is the code comparison: the code is identical and only the dependencies differ.
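One way to double-check that only the intended dependency changed is to diff the pinned package lists of the two environments. The sketch below assumes each environment's pins are available as text in `name==version` form (for example, saved `pip freeze` output); the helper names `parse_pins` and `diff_pins` are hypothetical, and `example-pkg` is a placeholder package rather than a real dependency of the reproduction.

```python
def parse_pins(freeze_text):
    """Parse 'name==version' lines into a dict, skipping anything else."""
    pins = {}
    for line in freeze_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def diff_pins(old, new):
    """Return {package: (old_version, new_version)} for every pin that differs."""
    changed = {}
    for name in sorted(set(old) | set(new)):
        if old.get(name) != new.get(name):
            changed[name] = (old.get(name), new.get(name))
    return changed


# Placeholder environment listings for illustration only.
env_old = "transformers==4.30.1\nexample-pkg==1.0.0\n"
env_new = "transformers==4.36.2\nexample-pkg==1.0.0\n"
changed = diff_pins(parse_pins(env_old), parse_pins(env_new))
print(changed)
```

A package that appears in only one environment shows up with `None` on the other side, which makes transitive dependency changes easy to spot.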
Expected behavior
There shouldn't be a regression in the performance.