Hello @elichen3051, the task is the same whether or not one uses packing (i.e. next-token prediction). DataCollatorForCompletionOnlyLM is for the special case where you want to mask the inputs / prompts so that the loss is computed only on the completions, which in some cases gives a small performance boost.
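To make the difference concrete, here is a toy sketch in plain Python of the two labelling schemes. It is not the trl implementation: the token ids, the helper names `plain_labels` and `completion_only_labels`, and the single-token response template are all made up for illustration. The only outside fact it relies on is that -100 is the default `ignore_index` of PyTorch's cross-entropy loss.

```python
IGNORE_INDEX = -100  # default ignore_index of PyTorch's cross-entropy loss


def plain_labels(input_ids):
    """Plain next-token prediction: every token contributes to the loss."""
    return list(input_ids)


def completion_only_labels(input_ids, response_template):
    """Mask the prompt and the response template; keep only the completion.

    Hypothetical helper: it looks for the first occurrence of the template
    and ignores everything up to and including it.
    """
    n = len(response_template)
    for start in range(len(input_ids) - n + 1):
        if input_ids[start:start + n] == response_template:
            cut = start + n
            return [IGNORE_INDEX] * cut + list(input_ids[cut:])
    # Template not found: in this sketch the whole example is ignored.
    return [IGNORE_INDEX] * len(input_ids)


# Made-up ids: prompt [11, 12], response template [99], completion [21, 22, 2]
example = [11, 12, 99, 21, 22, 2]

print(plain_labels(example))                  # [11, 12, 99, 21, 22, 2]
print(completion_only_labels(example, [99]))  # [-100, -100, -100, 21, 22, 2]
```

In both cases the objective is still next-token prediction; the masked variant only changes which positions are counted in the loss.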
Dear HuggingFace,
I've noted that in run_cpt.py and run_sft.py, we introduce packing=True. However, we didn't provide DataCollatorForCompletionOnlyLM to SFTTrainer; would it introduce cross-contamination in training?
Reference article: Improving Hugging Face Training Efficiency Through Packing with Flash Attention
trl issue on github: huggingface/trl#805