ds_report output
$ ds_report
/home/dsemiat/anaconda3/envs/py_venv_3.8_deepspeed4/lib/python3.8/site-packages/pandas/core/computation/expressions.py:20: UserWarning: Pandas requires version '2.7.3' or newer of 'numexpr' (version '2.7.2' currently installed).
from pandas.core.computation.check import NUMEXPR_INSTALLED
--------------------------------------------------
DeepSpeed C++/CUDA extension op report
--------------------------------------------------
NOTE: Ops not installed will be just-in-time (JIT) compiled at
runtime if needed. Op compatibility means that your system
meet the required dependencies to JIT install the op.
--------------------------------------------------
JIT compiled ops requires ninja
ninja .................. [OKAY]
--------------------------------------------------
op name ................ installed .. compatible
--------------------------------------------------
cpu_adam ............... [NO] ....... [OKAY]
cpu_adagrad ............ [NO] ....... [OKAY]
fused_adam ............. [NO] ....... [OKAY]
fused_lamb ............. [NO] ....... [OKAY]
[WARNING] sparse_attn cuda is not available from torch
[WARNING] sparse_attn requires a torch version >= 1.5 but detected 2.0
sparse_attn ............ [NO] ....... [NO]
transformer ............ [NO] ....... [OKAY]
stochastic_transformer . [NO] ....... [OKAY]
[WARNING] async_io requires the dev libaio .so object and headers but these were not found.
[WARNING] async_io: please install the libaio-dev package with apt
[WARNING] If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
async_io ............... [NO] ....... [NO]
utils .................. [NO] ....... [OKAY]
quantizer .............. [NO] ....... [OKAY]
transformer_inference .. [NO] ....... [OKAY]
spatial_inference ...... [NO] ....... [OKAY]
--------------------------------------------------
DeepSpeed general environment info:
torch install path ............... ['/home/dsemiat/anaconda3/envs/py_venv_3.8_deepspeed4/lib/python3.8/site-packages/torch']
torch version .................... 2.0.0a0+gitcb066cd
torch cuda version ............... None
torch hip version ................ None
nvcc version ..................... [FAIL] cannot find CUDA_HOME via torch.utils.cpp_extension.CUDA_HOME=None
deepspeed install path ........... ['/home/dsemiat/qnpu/deepspeed4/src/deepspeed-fork/deepspeed']
deepspeed info ................... 0.7.7+37b837fa, 37b837fa, HEAD
deepspeed wheel compiled w. ...... torch 1.13, cuda 0.0
Bug description
In class InferenceEngine under deepspeed/inference/engine.py:
When using cuda_events, the measured model time is stored in ms.
Code:
When not using cuda_events, the measured model time is stored in seconds.
Code:
Both of the values are stored under:
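To illustrate the mismatch without relying on DeepSpeed internals, here is a minimal sketch of recording both kinds of measurement in a single unit. `torch.cuda.Event.elapsed_time` reports milliseconds, while a difference of `time.time()` readings is in seconds, so the wall-clock path needs a factor of 1000 before both can share one list. The names `model_times`, `record_model_time`, and `to_ms` are hypothetical and are not taken from engine.py; the numbers are made up.

```python
import time

MS_PER_SECOND = 1000.0

# Hypothetical stand-in for the list that holds measured model times.
model_times = []


def to_ms(seconds):
    """Convert a duration in seconds to milliseconds."""
    return seconds * MS_PER_SECOND


def record_model_time(elapsed, unit):
    """Store a duration in milliseconds, whatever unit it was measured in."""
    if unit == "s":
        elapsed = to_ms(elapsed)
    elif unit != "ms":
        raise ValueError(f"unsupported unit: {unit!r}")
    model_times.append(elapsed)


def wall_clock_elapsed_s(fn):
    """Time fn() with time.time(); the result is in seconds."""
    start = time.time()
    fn()
    return time.time() - start


# Event-style measurement: already in milliseconds (illustrative value).
record_model_time(250.0, "ms")
# Wall-clock measurement: in seconds, converted before it is stored.
record_model_time(0.25, "s")
# A real wall-clock measurement goes through the same conversion.
record_model_time(wall_clock_elapsed_s(lambda: sum(range(1000))), "s")
```

With the conversion in place, every entry in `model_times` is in milliseconds regardless of which timing path produced it.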
Reproduction
Change into the DeepSpeed tests directory:
Run the following unit test:
Expected behavior
The test passes, but printing the results exposes the problem. For example:
The unit mismatch is visible here: model_t is on the order of 10^-3 of e2e_t.
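A check along the following lines could flag this kind of mismatch. It is only a sketch: `looks_like_unit_mismatch` is a hypothetical helper, the `min_ratio` threshold of 0.01 is an assumption rather than anything taken from the unit test, and the numbers are invented for illustration.

```python
def looks_like_unit_mismatch(model_t, e2e_t, min_ratio=0.01):
    """Return True when model_t is implausibly small relative to e2e_t.

    A ratio near 1e-3 suggests one value is in seconds and the other in
    milliseconds. min_ratio is an assumed threshold, chosen for illustration.
    """
    if e2e_t <= 0:
        raise ValueError("e2e_t must be positive")
    return (model_t / e2e_t) < min_ratio


# model_t recorded in seconds, e2e_t in milliseconds (invented values).
mismatched = looks_like_unit_mismatch(0.0119, 12.4)
# Both values in milliseconds (invented values).
consistent = looks_like_unit_mismatch(11.9, 12.4)
print(mismatched, consistent)
```

The heuristic assumes the model forward pass accounts for most of the end-to-end time; a workload with heavy pre- or post-processing would need a different threshold.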
System info:
Additional context
Opened a PR: #3501