
fix iteration timing used in autotuning when gradient_accumulation_steps > 1 #2888

Merged — 12 commits merged Sep 6, 2023

Conversation

cli99
Contributor

@cli99 cli99 commented Feb 23, 2023

The number of FLOPs reported by the flops profiler is captured at the global-step boundary. When gradient_accumulation_steps > 1, the iteration time used by the autotuner (taken from the fwd, bwd, and step timers, which measure micro-steps) must therefore be multiplied by gradient_accumulation_steps.
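The mismatch and its fix can be sketched as below. This is an illustrative example with hypothetical function names, not DeepSpeed's actual autotuner code: the timers cover one micro-step, while the profiler's FLOP count covers a full global step, so the timer sum is scaled by gradient_accumulation_steps before computing throughput.

```python
def iteration_time_s(fwd_s, bwd_s, step_s, gradient_accumulation_steps):
    """Per-global-step time reconstructed from per-micro-step timer readings.

    fwd_s, bwd_s, step_s: seconds measured by the fwd/bwd/step timers,
    each covering a single micro-step.
    """
    micro_step_s = fwd_s + bwd_s + step_s
    # The flops profiler counts FLOPs per global step, so the micro-step
    # time must be scaled by the number of accumulation steps to match.
    return micro_step_s * gradient_accumulation_steps


def achieved_flops_per_s(global_step_flops, fwd_s, bwd_s, step_s, gas):
    """Throughput with numerator and denominator on the same (global-step) basis."""
    return global_step_flops / iteration_time_s(fwd_s, bwd_s, step_s, gas)


# With gradient_accumulation_steps=4, a 0.05 s micro-step is a 0.2 s global step.
print(iteration_time_s(0.02, 0.025, 0.005, 4))  # 0.2
```

Without the scaling, the denominator would correspond to one micro-step while the numerator covers all of them, inflating the reported throughput by a factor of gradient_accumulation_steps.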

@cli99 cli99 enabled auto-merge (squash) February 24, 2023 00:16

@21Ovi 21Ovi left a comment


will work

@cli99 cli99 requested a review from tjruwase June 23, 2023 21:40
auto-merge was automatically disabled July 6, 2023 17:12

Merge queue setting changed

@loadams loadams self-assigned this Aug 24, 2023
@loadams loadams enabled auto-merge September 6, 2023 21:46
@loadams loadams added this pull request to the merge queue Sep 6, 2023
Merged via the queue into master with commit bce6ed1 Sep 6, 2023
@loadams loadams deleted the flops-profiler-iter-timing branch February 28, 2024 18:15