Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Disable thread pool creation when enabled OpenMP #2485
Disable thread pool creation when enabled OpenMP #2485
Changes from 3 commits
8f53449
f362218
43f5176
3d9d892
cba7e2b
1686db3
c713bf3
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems this would disable parallel executor when building OpenMP? Say, if running on a 8-HT machine, with OMP_NUM_THREADS=4 and -y 2 in onnxruntime_perf_test, is that a valid setting? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good point. Do you think it's better to keep the previous behavior for inter threadpool (-y) ? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Parallel executor is off by default so it should not cause any perf difference in your case. By keeping this logic, we can try to enable it together with OpenMP in case some models might benefit from the combination.
In reply to: 350977021 [](ancestors = 350977021)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If USE_OPENMP is not defined, please initialize the var to nullptr here. Because the next few lines are referencing this variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we change -x to call omp_set_num_threads() when building with OpenMP?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
currently this is controlled by env var
OMP_NUM_THREADS
. if -x has different value toOMP_NUM_THREADS
, what is the expected behavior? I think this may cause confusingThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe set_omp_num_threads would override the env then, which is what user would expect if OpenMP is mainly used for intra node or data parallelism. Otherwise, we need to ask user to use different ways to specify data parallelism #threads.
In reply to: 350977758 [](ancestors = 350977758)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is one tricky part of the current behavior:
if OpenMP is not enabled, if you set
-x 8
, it means you will have 9 threads (main + 8 workers)if OpenMP is enabled, if you set
OMP_NUM_THREADS=8
, it means you will have 8 threads in total (ie. main + 7 workers)this cause the problem: if we want to extend
-x
, the behavior will be inconsistent.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless we unify the definition of "numThreads" ( number of worker threads? or number of threads in total? ), I would prefer to keep the current status
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even if -x means different things when using OpenMP vs. not using OpenMP, I'd prefer to have -x to set OMP thread count, because it's easier to control than setting env and prevents accidentally picking up previously set env, especially in Windows. In this case, we are not changing any existing behavior, and only made it easy to control omp threads.
In reply to: 351000042 [](ancestors = 351000042)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You cannot expect that everyone fully understand this tricky behavior, and it's not necessarily have to understand. It can't be easier to make mistakes -- create threadpool with the wrong number of threads without being aware of it.
I would even prefer to add a new flag rather than
-x
, for example,--omp-num-threads
or anything you name it, but not combining it with-x
to make it multi-behavior.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good to me, so we just disable -x in OpenMP build.
In reply to: 351019314 [](ancestors = 351019314)