I ran some tests with the TBB back end. For many cases and architectures it is as fast as the OpenMP2 blocks back end, which makes sense. For at least one test on one architecture, however, it is much slower. I checked the generated and executed code and it is exactly the same for TBB and OpenMP, but TBB was nevertheless two orders of magnitude slower.
My theory is that this is caused by the missing thread pinning with TBB. Unfortunately, TBB offers no easy way to steer thread placement as OpenMP does, but we should evaluate how this goal can still be achieved. This link is probably a good start for investigating the topic.
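To make the pinning idea concrete, here is a minimal, Linux-only sketch of the per-thread step. The helper names `pinCurrentThreadToCore` and `nextCore` are hypothetical and not part of TBB or of this project. The assumption is that such a helper would be called from a hook that runs once per worker thread, for example `on_scheduler_entry` of a class derived from `tbb::task_scheduler_observer`. Whether this actually closes the performance gap is untested.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE // needed for pthread_setaffinity_np and the CPU_* macros on glibc
#endif
#include <pthread.h>
#include <sched.h>

#include <atomic>
#include <thread>

// Hypothetical helper: restrict the calling thread to a single logical core.
// Returns true on success. Linux/glibc only.
inline bool pinCurrentThreadToCore(unsigned core)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set) == 0;
}

// Hypothetical round-robin core choice: each call returns the next core index.
// Assumption: cores 0..N-1 are all usable, which may not hold inside a
// container or under a restricted cpuset.
inline unsigned nextCore()
{
    static std::atomic<unsigned> counter{0};
    unsigned n = std::thread::hardware_concurrency();
    if (n == 0)
    {
        n = 1; // hardware_concurrency() may return 0 if the count is unknown
    }
    return counter.fetch_add(1) % n;
}
```

A worker thread would then call `pinCurrentThreadToCore(nextCore())` once when it enters the scheduler. Round-robin placement is only one possible policy; a NUMA-aware mapping may be needed on multi-socket machines.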