Allow "auto" for gradient clipping in YAML #2649

Merged: 3 commits into huggingface:main on Apr 12, 2024

Conversation

@regisss (Contributor) commented on Apr 10, 2024

What does this PR do?

If we have gradient_clipping: auto in the DeepSpeed config inside an Accelerate config, the code currently fails with:

Traceback (most recent call last):
  File "/root/workspace/lora/tmp.py", line 3, in <module>
    args = TrainingArguments(
  File "<string>", line 124, in __init__
  File "/usr/local/lib/python3.10/dist-packages/transformers/training_args.py", line 1828, in __post_init__
    self.deepspeed_plugin = DeepSpeedPlugin()
  File "<string>", line 14, in __init__
  File "/usr/local/lib/python3.10/dist-packages/accelerate/utils/dataclasses.py", line 727, in __post_init__
    self.gradient_clipping = float(gradient_clipping)
ValueError: could not convert string to float: 'auto'

This can be reproduced by running accelerate launch --config_file my_config.yaml script.py, where script.py is

from transformers import TrainingArguments, Trainer, AutoModel

args = TrainingArguments(
    output_dir="/tmp/test",
    max_grad_norm=0.5,
)

trainer = Trainer(
    model=AutoModel.from_pretrained("bert-base-uncased"),
    args=args,
)

and my_config.yaml is

compute_environment: LOCAL_MACHINE
debug: false
deepspeed_config:
  gradient_accumulation_steps: 1
  gradient_clipping: auto
  offload_optimizer_device: none
  offload_param_device: none
  zero3_init_flag: true
  zero3_save_16bit_model: true
  zero_stage: 3
distributed_type: DEEPSPEED
downcast_bf16: 'no'
machine_rank: 0
main_training_function: main
mixed_precision: bf16
num_machines: 1
num_processes: 8
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false

This PR fixes the issue by checking whether the given value can actually be cast to float before converting it, and otherwise keeping the literal string "auto".
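
A minimal sketch of the idea (the helper name below is hypothetical, not the exact code merged in 5056d32): preserve the literal "auto" and only cast other values to float, so that "auto" can be resolved later from TrainingArguments (e.g. max_grad_norm):

def _cast_gradient_clipping(value):
    # Hypothetical helper illustrating the approach: keep the literal
    # "auto" so it can be filled in later from TrainingArguments, and
    # cast anything else to float as before.
    if isinstance(value, str) and value.strip().lower() == "auto":
        return "auto"
    return float(value)

# Both of these now succeed instead of raising ValueError on "auto":
assert _cast_gradient_clipping("auto") == "auto"
assert _cast_gradient_clipping("0.5") == 0.5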

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@muellerzr (Collaborator) commented:

cc @pacman100

@HuggingFaceDocBuilderDev commented:

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@pacman100 (Contributor) left a comment:
Thank you @regisss for supporting auto for gradient_clipping, in line with gradient_accumulation_steps, giving users more flexibility! Left a suggestion.

src/accelerate/utils/dataclasses.py (suggestion marked outdated and resolved)
@regisss requested a review from @pacman100 on April 12, 2024 at 07:46
@pacman100 (Contributor) left a comment:

Thank you @regisss!

@pacman100 merged commit 5056d32 into huggingface:main on Apr 12, 2024
23 checks passed
@regisss deleted the fix_gradient_clipping_yaml_auto branch on April 12, 2024 at 08:48