Commit

Try to fix the links
Signed-off-by: Chen, Peter <[email protected]>
peterchen-intel committed Oct 8, 2024
1 parent 18affca commit 29d90ac
Showing 2 changed files with 6 additions and 6 deletions.
@@ -304,16 +304,16 @@ mentioned above.
Execution on CPU device
##########################

-As mentioned on :ref:`Inference threads wait actively <_Inference_threads_wait_actively>`, OpenVINO default threading library
+As mentioned on :ref:`Inference threads wait actively <Inference_threads_wait_actively>`, OpenVINO default threading library
oneTBB keeps CPU cores active for 1 ms after inference is done. When using the Optimum Intel Python API,
it calls Torch (via HF transformers) for postprocessing (for example beam search or greedy search).
Torch uses OpenMP for threading, so OpenMP has to wait for CPU cores that are being kept active by
-oneTBB. OpenMP by default has the `busy-wait <https://gcc.gnu.org/onlinedocs/libgomp/GOMP_005fSPINCOUNT.html>__` which can delay the next OpenVINO inference as well.
+oneTBB. OpenMP by default has the `busy-wait <https://gcc.gnu.org/onlinedocs/libgomp/GOMP_005fSPINCOUNT.html>`__ which can delay the next OpenVINO inference as well.

The recommendations are:

-* Limit the CPU core number used by Torch. `torch.set_num_threads <https://pytorch.org/docs/stable/generated/torch.set_num_threads.html>__`
-* Set environment variable `OMP_WAIT_POLICY <https://gcc.gnu.org/onlinedocs/libgomp/OMP_005fWAIT_005fPOLICY.html>__` to PASSIVE which will disable OpenMP `busy-wait <https://gcc.gnu.org/onlinedocs/libgomp/GOMP_005fSPINCOUNT.html>__`
+* Limit the CPU core number used by Torch. `torch.set_num_threads <https://pytorch.org/docs/stable/generated/torch.set_num_threads.html>`__
+* Set environment variable `OMP_WAIT_POLICY <https://gcc.gnu.org/onlinedocs/libgomp/OMP_005fWAIT_005fPOLICY.html>`__ to PASSIVE which will disable OpenMP `busy-wait <https://gcc.gnu.org/onlinedocs/libgomp/GOMP_005fSPINCOUNT.html>`__
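
The two recommendations above could be applied from Python roughly as follows. This is a minimal sketch, not code from the commit: the helper name `configure_postprocessing_threads` and the thread count of 4 are hypothetical, and it assumes the OpenMP runtime reads `OMP_WAIT_POLICY` when it initializes, so the variable is set before `torch` is imported. If Torch is not installed, the sketch only sets the environment variable.

```python
import os


def configure_postprocessing_threads(num_torch_threads: int) -> dict:
    """Apply the two recommendations and report what was set.

    Hypothetical helper for illustration only.
    """
    # Assumption: the OpenMP runtime reads this variable at start-up,
    # so it has to be in the environment before Torch is imported.
    os.environ["OMP_WAIT_POLICY"] = "PASSIVE"
    applied = {
        "OMP_WAIT_POLICY": os.environ["OMP_WAIT_POLICY"],
        "torch_threads": None,
    }

    try:
        import torch  # imported late on purpose, see the comment above
    except ImportError:
        # Torch is not available; only the environment variable is set.
        return applied

    # Limit the number of CPU threads Torch uses for intra-op parallelism.
    torch.set_num_threads(num_torch_threads)
    applied["torch_threads"] = torch.get_num_threads()
    return applied


# The value 4 is an arbitrary example; a suitable number depends on the machine.
settings = configure_postprocessing_threads(num_torch_threads=4)
print(settings)
```

Calling the helper at the very start of the script, before any other module imports Torch, keeps the ordering assumption intact. The same effect can be had by exporting `OMP_WAIT_POLICY=PASSIVE` in the shell before launching Python.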

Additional Resources
#####################
@@ -193,8 +193,8 @@ For details on multi-stream execution check the
Inference threads wait actively
###############################

-OpenVINO is by default built with `oneTBB <https://github.com/oneapi-src/oneTBB/>__` threading library,
-oneTBB has a feature worker_wait like `OpenMP <https://www.openmp.org/>` `busy-wait <https://gcc.gnu.org/onlinedocs/libgomp/GOMP_005fSPINCOUNT.html>__` which makes OpenVINO inference
+OpenVINO is by default built with `oneTBB <https://github.com/oneapi-src/oneTBB/>`__ threading library,
+oneTBB has a feature worker_wait like `OpenMP <https://www.openmp.org/>`__ `busy-wait <https://gcc.gnu.org/onlinedocs/libgomp/GOMP_005fSPINCOUNT.html>`__ which makes OpenVINO inference
threads wait actively for 1 ms after a task is done. The intention is to avoid CPU inactivity in the
transition time between inference tasks. If the postprocessing uses another threading library,
for example OpenMP, OpenVINO inference will occupy CPU cores for an additional 1 ms after inference is done.
