Avoid graph breaks by disabling sourceless calls in instrument_w_nvtx #7081

deepcharm · 2025-02-26T14:40:29Z

This PR continues the effort to improve DeepSpeed performance when using PyTorch compile.

The `instrument_w_nvtx` decorator is used to instrument code with NVIDIA Tools Extension (NVTX) markers for profiling and visualizing code execution on GPUs.

Along with executing the function itself, `instrument_w_nvtx` calls `nvtx.range_push` and `nvtx.range_pop`, which Dynamo cannot trace.

That's why this decorator causes a graph break.
The impact on performance can be significant due to numerous uses of the decorator throughout the code.

We propose a simple solution: Don't invoke the sourceless functions when torch is compiling.
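A minimal sketch of the idea described above, not the code merged in this PR. The `range_push` and `range_pop` functions here are hypothetical stand-ins that only record events (the real decorator calls the NVTX functions), and the local `is_compiling` helper is an assumption that falls back to `False` when torch is not installed.

```python
import functools

# Stand-ins for the NVTX calls (assumption for illustration only):
# they record events instead of emitting real profiler ranges.
events = []


def range_push(name):
    events.append(("push", name))


def range_pop():
    events.append(("pop", None))


def is_compiling():
    """Return True while torch.compile is tracing, else False.

    Falls back to False if torch (or torch.compiler.is_compiling)
    is not available in this environment.
    """
    try:
        import torch
        return torch.compiler.is_compiling()
    except (ImportError, AttributeError):
        return False


def instrument_w_nvtx(func):
    """Wrap func in a push/pop range, skipping both calls while compiling."""

    @functools.wraps(func)
    def wrapped_fn(*args, **kwargs):
        if is_compiling():
            # No untraceable calls on this path, so no graph break.
            return func(*args, **kwargs)
        range_push(func.__name__)
        try:
            return func(*args, **kwargs)
        finally:
            range_pop()

    return wrapped_fn


@instrument_w_nvtx
def add(a, b):
    return a + b
```

In eager mode a call to `add` is bracketed by one push and one pop; under compilation the wrapper reduces to a plain call of the wrapped function, so the profiling markers are simply absent from compiled regions.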

Signed-off-by: Max Kovalenko <[email protected]>

deepspeed/utils/nvtx.py

Requested usage of DeepSpeed utility to address CI failures.

deepspeed/utils/nvtx.py

…strument_w_nvtx

Signed-off-by: Max Kovalenko <[email protected]>

…deepspeedai#7081) This PR is a continuation of the efforts to improve Deepspeed performance when using PyTorch compile. The `instrument_w_nvtx` decorator is used to instrument code with NVIDIA Tools Extension (NVTX) markers for profiling and visualizing code execution on GPUs. Along with executing the function itself, `instrument_w_nvtx` makes calls to `nvtx.range_push` and `nvtx.range_pop` which can't be traced by Dynamo. That's why this decorator causes a graph break. The impact on performance can be significant due to numerous uses of the decorator throughout the code. We propose a simple solution: Don't invoke the sourceless functions when torch is compiling. --------- Signed-off-by: Max Kovalenko <[email protected]> Co-authored-by: Logan Adams <[email protected]> Signed-off-by: yisheng <[email protected]>

deepcharm requested review from tjruwase and tohtana as code owners February 26, 2025 14:40

Merge branch 'master' into disable-sourceless-calls-in-instrument_w_nvtx

41f9bd4

tjruwase previously approved these changes Feb 26, 2025

tjruwase reviewed Feb 26, 2025

deepspeed/utils/nvtx.py (outdated)

tjruwase reviewed Feb 26, 2025

deepspeed/utils/nvtx.py (outdated)

deepcharm and others added 2 commits February 27, 2025 12:18

Merge branch 'deepspeedai:master' into disable-sourceless-calls-in-in…

1b2d02b

…strument_w_nvtx

Using already existing function is_compiling()

357fc99

Signed-off-by: Max Kovalenko <[email protected]>

tjruwase approved these changes Feb 27, 2025

deepcharm and others added 3 commits February 27, 2025 18:18

Merge branch 'deepspeedai:master' into disable-sourceless-calls-in-in…

4a0b785

…strument_w_nvtx

Removed unused import

80d4283

Signed-off-by: Max Kovalenko <[email protected]>

Merge branch 'master' into disable-sourceless-calls-in-instrument_w_nvtx

51c2a9e

loadams enabled auto-merge March 3, 2025 19:28

Merge branch 'master' into disable-sourceless-calls-in-instrument_w_nvtx

a2ec561

loadams added this pull request to the merge queue Mar 3, 2025

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Mar 3, 2025

loadams added this pull request to the merge queue Mar 3, 2025

Merged via the queue into deepspeedai:master with commit a88f56a Mar 3, 2025
10 checks passed