You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Hello. I've tried to apply SPIN on llama-2 with the ultrachat200k datasets. It seems that the tokenizer for llama2 and zephyr-7b are different (as shown in this document). As a result, the code in spin/run_spin.py likely needs to be modified (which is primarily focused on the apply_chat_template function). By the way, for alpaca-like datasets, the structure of the dataset is different from ultrachat200k (the key is "instruction" and "output" rather than "prompt" and "messages"). I believe just changing the code in spin/reformat.py to ensure the structure of the dataset is the same should be ok.
I want to apply SPIN method on llama2 with alpaca-like finetuning datasets. What changes should I make to apply the SPIN method?
Thanks a lot!
The text was updated successfully, but these errors were encountered: