Need help: the loss curve is strange. #1

Closed
ifshine opened this issue Sep 29, 2023 · 3 comments

ifshine commented Sep 29, 2023

Thanks for your excellent work! The question is in the title (it happens when I train the final cherry model).

[image: loss curve]

MingLiiii (Member) commented

Hi, thank you very much for your interest in this work!

First, I would like to clarify that this problem does not come from our selected data; it probably comes from the Stanford Alpaca codebase. You can find the training losses of our different models in our Hugging Face repo: https://huggingface.co/MingLiiii/cherry-alpaca-5-percent-7B/blob/main/trainer_state.json
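
If you want to compare those reference losses with your own run, here is a minimal sketch. It assumes the file follows the usual Hugging Face Trainer layout (a top-level `log_history` list whose training entries carry `step` and `loss`); the helper name `load_loss_curve` is hypothetical and not part of this repo.

```python
import json
from pathlib import Path


def load_loss_curve(path):
    """Return (step, loss) pairs from a Trainer-style trainer_state.json.

    Assumption: the file has a top-level "log_history" list of dicts.
    Entries without both "step" and "loss" (for example evaluation or
    final summary entries) are skipped.
    """
    state = json.loads(Path(path).read_text(encoding="utf-8"))
    return [
        (entry["step"], entry["loss"])
        for entry in state.get("log_history", [])
        if "step" in entry and "loss" in entry
    ]
```

Calling `load_loss_curve("trainer_state.json")` on a downloaded copy and on the file from your own output directory gives two lists you can plot or diff side by side.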

As for the fix, I think downgrading transformers to 4.28.1 should solve it:
pip install transformers==4.28.1
You will probably also need to reinstall wandb:
pip install wandb
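
To confirm that the environment you train in actually picked up the pinned version, something like the following may help. It is only a sketch using the standard-library `importlib.metadata`; the helper name `check_pin` is made up for illustration.

```python
from importlib import metadata


def check_pin(package, expected):
    """Compare the installed version of `package` with `expected`.

    Returns "ok", "mismatch: <installed version>", or "not installed".
    """
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return "not installed"
    return "ok" if installed == expected else f"mismatch: {installed}"


# 4.28.1 is the version suggested above for this issue.
print("transformers:", check_pin("transformers", "4.28.1"))
```

Run it with the same Python interpreter that launches training, since a different virtual environment can carry a different transformers version.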

You can find similar problems here:
tatsu-lab/stanford_alpaca#298
tloen/alpaca-lora#418
tloen/alpaca-lora#170

Hope it works for you.
Please let me know if you have any other questions~

ifshine commented Sep 29, 2023

As a beginner, I felt lost when I saw the loss curve become so strange.

Thank you very much for your quick and detailed response. The loss curve is normal now.

(I also ran into a strongly oscillating loss curve while running code for some other projects. Would it be convenient for you to offer some debugging suggestions?)

ifshine closed this as completed Sep 29, 2023

MingLiiii (Member) commented

Hi, thank you for asking, but I doubt I can fix your problem: loss curves are really task-specific, and I am not an expert.

Anyway, feel free to send me an email if you would like to discuss it further.
