Use SFTConfig instead of SFTTrainer keyword args #2150

Merged: 5 commits from update_trl into main on Oct 15, 2024

Conversation

qgallouedec (Member)

SFTTrainer's keyword args like packing, dataset_kwargs, dataset_text_field, and max_seq_length have been deprecated and will soon be removed. Instead, we use SFTConfig (a subclass of TrainingArguments).

This PR updates the code related to SFTTrainer accordingly.
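For illustration, a minimal before/after sketch of the migration (the model and dataset names are placeholders, not taken from this PR):

from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

train_dataset = load_dataset("stanfordnlp/imdb", split="train")  # placeholder dataset

# Deprecated: training options passed directly to SFTTrainer
# trainer = SFTTrainer(
#     model="facebook/opt-350m",
#     train_dataset=train_dataset,
#     packing=True,
#     dataset_text_field="text",
#     max_seq_length=512,
# )

# Now: the same options live on SFTConfig, which SFTTrainer receives via `args`
training_args = SFTConfig(
    output_dir="tmp",
    packing=True,
    dataset_text_field="text",
    max_seq_length=512,
)
trainer = SFTTrainer(
    model="facebook/opt-350m",
    args=training_args,
    train_dataset=train_dataset,
)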

@BenjaminBossan (Member)

Thanks for addressing these trl deprecations. For my understanding, some arguments have been dropped without adding them to SFTConfig. Is that because for those, the default values were used?

@qgallouedec (Member, Author)

Which ones?

Comment on lines -122 to -128
packing=data_args.packing,
dataset_kwargs={
"append_concat_token": data_args.append_concat_token,
"add_special_tokens": data_args.add_special_tokens,
},
dataset_text_field=data_args.dataset_text_field,
max_seq_length=data_args.max_seq_length,
@BenjaminBossan (Member)

Here would be an example where arguments are removed from SFTTrainer but no equivalent arguments were added to training_args.

@qgallouedec (Member, Author)

Ok, I see what you mean.

When you run the script with, for example, --max_seq_length 123, the value is no longer parsed into data_args but into training_args. Everything happens behind the scenes when the arguments are parsed.

python example.py --output_dir tmp --max_seq_length 123

Before:

from dataclasses import dataclass
from typing import Optional
from transformers import HfArgumentParser, TrainingArguments

@dataclass
class DataTrainingArguments:
    dataset_name: Optional[str] = None
    max_seq_length: int = 512

if __name__ == "__main__":
    parser = HfArgumentParser((DataTrainingArguments, TrainingArguments))
    data_args, training_args = parser.parse_args_into_dataclasses()
    print(data_args.max_seq_length)  # 123

After:

from dataclasses import dataclass
from typing import Optional
from transformers import HfArgumentParser
from trl import SFTConfig

@dataclass
class DataTrainingArguments:
    dataset_name: Optional[str] = None

if __name__ == "__main__":
    parser = HfArgumentParser((DataTrainingArguments, SFTConfig))
    data_args, training_args = parser.parse_args_into_dataclasses()
    print(training_args.max_seq_length)  # 123
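The other removed keyword args follow the same path. As a sketch, assuming the SFTConfig fields keep the same names as the old keyword args (the values below are illustrative, not the script's defaults):

from trl import SFTConfig

training_args = SFTConfig(
    output_dir="tmp",
    packing=True,
    dataset_text_field="text",
    max_seq_length=123,
    dataset_kwargs={
        "append_concat_token": False,  # illustrative value
        "add_special_tokens": False,   # illustrative value
    },
)

The trainer then only needs args=training_args and no longer takes these keyword args directly.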

@BenjaminBossan (Member) left a comment

Thanks for the PR and explaining the change. I tried the updated script and it worked. LGTM.

BenjaminBossan merged commit 93ddb10 into main on Oct 15, 2024; 15 checks passed.
qgallouedec deleted the update_trl branch on October 15, 2024 at 09:28.
yaswanth19 pushed a commit to yaswanth19/peft that referenced this pull request Oct 20, 2024
…#2150)

Update training script using trl to fix deprecations in argument usage.
BenjaminBossan pushed a commit to BenjaminBossan/peft that referenced this pull request Oct 22, 2024
…#2150)

Update training script using trl to fix deprecations in argument usage.