
Hide parameters of Adversary's Perturber from DDP #166

Merged — 3 commits merged into main from hide_adv_params_from_ddp on Jun 12, 2023

Conversation

@mzweilin (Contributor) commented on Jun 12, 2023

What does this PR do?

  • Hide uninitialized parameters from DDP so that we can run adversarial training with DDP (see the sketch after this list).
  • Add test examples of adversarial training.
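
For context (this is not code from the PR): PyTorch's DDP has a private hook, `_ddp_params_and_buffers_to_ignore`, that excludes named parameters from bucketing and gradient synchronization at wrap time. A minimal sketch of hiding a lazily-initialized `Perturber` this way — the module names and toy classifier below are hypothetical, not MART's actual API:

```python
# Hypothetical sketch of hiding lazily-initialized adversary parameters
# from DDP; module names here are illustrative, not MART's actual code.
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP


class Perturber(nn.Module):
    def __init__(self):
        super().__init__()
        # The perturbation is materialized on the first batch, so DDP
        # must not try to bucket or broadcast it when the model is wrapped.
        self.perturbation = nn.UninitializedParameter()


class AdversarialModel(nn.Module):
    def __init__(self, classifier: nn.Module):
        super().__init__()
        self.classifier = classifier
        self.perturber = Perturber()


model = AdversarialModel(classifier=nn.Linear(32, 10))

# Register the perturber's parameters as ignored before wrapping, so DDP
# only synchronizes the classifier's gradients across ranks.
ignored = [f"perturber.{n}" for n, _ in model.perturber.named_parameters()]
DDP._set_params_and_buffers_to_ignore_for_model(model, ignored)
# model = DDP(model)  # requires an initialized process group
```

With the perturber's parameters ignored, wrapping in `DDP(model)` no longer fails on the uninitialized parameter, and the perturbation stays local to each rank.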

Type of change

Please check all relevant options.

  • Improvement (non-breaking)
  • Bug fix (non-breaking)
  • New feature (non-breaking)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Testing

Please describe the tests that you ran to verify your changes. Consider listing any relevant details of your test configuration.

  • `pytest`
  • `CUDA_VISIBLE_DEVICES=0 python -m mart experiment=CIFAR10_CNN_Adv trainer=gpu trainer.precision=16` reports 70%.
  • `CUDA_VISIBLE_DEVICES=0,1 python -m mart experiment=CIFAR10_CNN_Adv trainer=ddp trainer.precision=16 trainer.devices=2 model.optimizer.lr=0.2 trainer.max_steps=2925 datamodule.ims_per_batch=256 datamodule.world_size=2` reports 70%.

Before submitting

  • The title is self-explanatory and the description concisely explains the PR
  • My PR does only one thing, instead of bundling different changes together
  • I list all the breaking changes introduced by this pull request
  • I have commented my code
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have run pre-commit hooks with the `pre-commit run -a` command without errors

Did you have fun?

Make sure you had fun coding 🙃

@mzweilin requested a review from dxoigmn on June 12, 2023 at 21:15
@dxoigmn (Contributor) previously approved these changes on Jun 12, 2023 and left a comment:

LGTM!

- [ ] Test B
- [ ] `pytest`
- [ ] `CUDA_VISIBLE_DEVICES=0 python -m mart experiment=CIFAR10_CNN_Adv trainer=gpu trainer.precision=16` reports 70%.
- [ ] `CUDA_VISIBLE_DEVICES=0,1 python -m mart experiment=CIFAR10_CNN_Adv trainer=ddp trainer.precision=16 trainer.devices=2 model.optimizer.lr=0.2 trainer.max_steps=2925 datamodule.ims_per_batch=256 datamodule.world_size=2` reports 70%.
@dxoigmn (Contributor) commented:

Why not just `CUDA_VISIBLE_DEVICES=0,1 python -m mart experiment=CIFAR10_CNN_Adv trainer=ddp trainer.precision=16 trainer.devices=2 datamodule.world_size=2`?

@mzweilin (Author) replied:

We don't get a speed-up from two GPUs if we keep the original learning rate and effective batch size.
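
A hedged illustration of that reasoning (the single-GPU baseline values below are assumptions for illustration, not taken from the PR): under the linear scaling rule, doubling the effective batch size across two GPUs doubles the learning rate and halves the number of optimizer steps, which is how settings like `model.optimizer.lr=0.2` and `trainer.max_steps=2925` can arise.

```python
# Linear scaling rule sketch. The baseline numbers are assumed for
# illustration; only the scaled values match the 2-GPU command above.
base_lr = 0.1        # assumed single-GPU learning rate
base_steps = 5850    # assumed single-GPU max_steps
world_size = 2       # GPUs contributing to each effective batch

scaled_lr = base_lr * world_size         # 0.2  -> model.optimizer.lr
scaled_steps = base_steps // world_size  # 2925 -> trainer.max_steps
print(scaled_lr, scaled_steps)
```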

@mzweilin (Author) commented:

I have added the estimated time per epoch for the two experiments.

@mzweilin requested a review from dxoigmn on June 12, 2023 at 22:41
@mzweilin merged commit 2a744eb into main on June 12, 2023
@mzweilin deleted the hide_adv_params_from_ddp branch on June 12, 2023 at 23:02