[Performance] Getting slower after 10 runs on an Intel CPU when I loop-execute onnxruntime inference #13651
Can you give me an email address? I will email the model to you; I cannot upload it here because of the GitHub size limit.
I don't think it would be a good assumption that email would accept files that are over the GitHub limit. People usually put them in the cloud and send a link. You specify onnxruntime version 1.6, and yet in your example above you use some C++ API that only appeared recently. It is unlikely that we are going to issue any patches for 1.6. We employ memory patterns and pre-allocations, so you need to run at least 1-2 warm-up iterations before you can measure the performance; after that we do not allocate much more memory (provided you do not use dynamic shapes). Then compute the mean and variance/percentiles.
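For what it's worth, a minimal timing sketch along those lines (hedged: run_once is a placeholder for your own full session Run() call with real tensors; the warm-up and the statistics are the point):
#include <algorithm>
#include <chrono>
#include <cstdio>
#include <functional>
#include <vector>

// Time repeated inference after a short warm-up, then report mean and
// percentiles. run_once stands in for one complete session->Run(...) call.
void benchmark(const std::function<void()>& run_once) {
  for (int i = 0; i < 2; ++i) run_once();  // warm-up: let pre-allocations settle

  std::vector<double> ms;
  for (int i = 0; i < 100; ++i) {
    auto t0 = std::chrono::steady_clock::now();
    run_once();
    auto t1 = std::chrono::steady_clock::now();
    ms.push_back(std::chrono::duration<double, std::milli>(t1 - t0).count());
  }

  std::sort(ms.begin(), ms.end());
  double mean = 0;
  for (double v : ms) mean += v;
  mean /= ms.size();
  std::printf("mean %.2f ms  p50 %.2f ms  p95 %.2f ms\n",
              mean, ms[ms.size() / 2], ms[ms.size() * 95 / 100]);
}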
Thank you very much for your reply. Link: https://pan.baidu.com/s/1r8i8YOAyz9yU3kxCk7gfZQ?pwd=j5d4
Baidu seems to require a client installation to download the file, and I am not willing to do that on my work computer. Would it be possible to put this on MS OneDrive or Google Drive, or anything else that does not require a proprietary client installation?
One thing to suggest: since this is the CPU execution provider, you do not need the memory arena.
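For example (a sketch, assuming the C++ API; Ort::SessionOptions exposes DisableCpuMemArena(), and "model.onnx" is a placeholder path):
#include <onnxruntime_cxx_api.h>

// Create a session with the memory arena disabled for the default CPU provider.
Ort::Session make_session(Ort::Env& env) {
  Ort::SessionOptions options;
  options.SetGraphOptimizationLevel(GraphOptimizationLevel::ORT_ENABLE_ALL);
  options.DisableCpuMemArena();  // no arena needed on CPU
  return Ort::Session(env, "model.onnx", options);  // placeholder model path
}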
Describe the issue
I have one onnxruntime session running on an Intel CPU:
(1) at first, the total inference time is about 200 ms;
(2) many runs later, it takes more than 10 s.
Inference keeps getting slower after some runs when I loop-execute onnxruntime inference. Maybe the thread pool blocks execution? What should I do?
When the process is blocked, I attach gdb:
0x00007efd417a01a9 in onnxruntime::concurrency::ThreadPool::RunInParallel(std::function<void (unsigned int)>, unsigned int) () from /usr/local/lib64/libonnxruntime.so.1.6.0
Missing separate debuginfos, use: debuginfo-install libgcc-4.8.5-44.el7.x86_64 libstdc++-4.8.5-44.el7.x86_64 libuuid-2.23.2-65.el7.x86_64
(gdb) bt
#0 0x00007efd417a01a9 in onnxruntime::concurrency::ThreadPool::RunInParallel(std::function<void (unsigned int)>, unsigned int) () from /usr/local/lib64/libonnxruntime.so.1.6.0
#1 0x00007efd417a05ce in onnxruntime::concurrency::ThreadPool::ParallelForFixedBlockSizeScheduling(long, long, std::function<void (long, long)> const&) () from /usr/local/lib64/libonnxruntime.so.1.6.0
#2 0x00007efd417a06a5 in onnxruntime::concurrency::ThreadPool::SimpleParallelFor(long, std::function<void (long)> const&) () from /usr/local/lib64/libonnxruntime.so.1.6.0
#3 0x00007efd417ef558 in MlasExecuteThreaded(void (*)(void*, int), void*, int, onnxruntime::concurrency::ThreadPool*) () from /usr/local/lib64/libonnxruntime.so.1.6.0
#4 0x00007efd417b98fc in MlasNchwcConv(long const*, long const*, long const*, long const*, long const*, long const*, unsigned long, float const*, float const*, float const*, float*, MLAS_ACTIVATION const*, bool, onnxruntime::concurrency::ThreadPool*) () from /usr/local/lib64/libonnxruntime.so.1.6.0
To reproduce
Loop-execute onnxruntime inference on an Intel CPU with one onnxruntime session:
(1) at first, the total time is about 200 ms;
(2) many test runs later, one inference takes more than 10 s.
session options:
Ort::SessionOptions options;
options.SetGraphOptimizationLevel(GraphOptimizationLevel::ORT_ENABLE_ALL);
m_session = new Ort::Experimental::Session(m_env, model_path, options);
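For reference, a hedged sketch of the kind of loop that exposes the slowdown (input_name, output_name, and the input tensor are placeholders for the model's real I/O; tensor creation is left to the caller):
#include <onnxruntime_cxx_api.h>
#include <chrono>
#include <cstdio>

// Run the same input repeatedly and print each iteration's latency,
// so the slowdown after the first runs becomes visible.
void run_loop(Ort::Session& session, Ort::Value& input_tensor,
              const char* input_name, const char* output_name) {
  for (int i = 0; i < 50; ++i) {
    auto t0 = std::chrono::steady_clock::now();
    auto outputs = session.Run(Ort::RunOptions{nullptr},
                               &input_name, &input_tensor, 1,
                               &output_name, 1);
    auto t1 = std::chrono::steady_clock::now();
    std::printf("run %d: %.1f ms\n", i,
                std::chrono::duration<double, std::milli>(t1 - t0).count());
  }
}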
Urgency
This issue has blocked my project for two months; please give some help. Thanks.
Platform
Linux
OS Version
CentOS Linux release 7.8.2003 (Core)
ONNX Runtime Installation
build from source
ONNX Runtime Version or Commit ID
1.6.0
ONNX Runtime API
C++
Architecture
X64
Execution Provider
Default CPU
Execution Provider Library Version
cuda 10.2
Model File
No response
Is this a quantized model?
No