In the current implementation, the library node expansion pattern does not support loop blocking in the vertical direction; see the generated C++ code below.
This is fine when the vertical loop is small. However, when we run at high resolution with a larger vertical dimension, the vertical loop is no longer cache friendly. We need to add a new library node expansion pattern that supports loop blocking in the vertical direction, so that more cache-friendly code can be generated, as follows:
The code in bold is the new code added by the new library node expansion pattern. The corresponding PR will be added and linked here for further review.
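For readers unfamiliar with the technique, here is a minimal, generic sketch of loop blocking along the vertical (`k`) axis. It is not the code generated by the DaCe expansion. The memory layout, the function names (`idx`, `apply_unblocked`, `apply_k_blocked`), the pointwise computation, and the `k_block` parameter are all assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical layout for illustration: i is contiguous, k is the slowest index.
inline std::size_t idx(std::size_t i, std::size_t j, std::size_t k,
                       std::size_t ni, std::size_t nj) {
    return (k * nj + j) * ni + i;
}

// Unblocked: every (i, j) column walks the full vertical extent at once.
void apply_unblocked(const std::vector<double>& in, std::vector<double>& out,
                     std::size_t ni, std::size_t nj, std::size_t nk) {
    for (std::size_t j = 0; j < nj; ++j)
        for (std::size_t i = 0; i < ni; ++i)
            for (std::size_t k = 0; k < nk; ++k)
                out[idx(i, j, k, ni, nj)] = 2.0 * in[idx(i, j, k, ni, nj)] + 1.0;
}

// Blocked: the vertical loop is split into an outer loop over blocks of
// k_block levels and an inner loop over the levels inside one block, so the
// horizontal sweep only touches k_block levels at a time.
// k_block must be greater than zero.
void apply_k_blocked(const std::vector<double>& in, std::vector<double>& out,
                     std::size_t ni, std::size_t nj, std::size_t nk,
                     std::size_t k_block) {
    for (std::size_t kb = 0; kb < nk; kb += k_block) {
        // Clamp the last block when nk is not a multiple of k_block.
        const std::size_t k_end = std::min(kb + k_block, nk);
        for (std::size_t j = 0; j < nj; ++j)
            for (std::size_t i = 0; i < ni; ++i)
                for (std::size_t k = kb; k < k_end; ++k)
                    out[idx(i, j, k, ni, nj)] = 2.0 * in[idx(i, j, k, ni, nj)] + 1.0;
    }
}
```

Both functions produce the same result; only the traversal order differs. Whether blocking pays off, and which block size works best, depends on the actual field layout, the stencil, and the cache sizes of the target CPU.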
FlorianDeconinck changed the title from "loop blocking in vertical direction" to "[cartesian] Loop blocking in vertical direction" on Aug 20, 2024
This is a limitation of the dace:X backends, caused by how we set up the sections in the DaCe extensions. It will be fixed as part of a more holistic review of CPU optimization, as vertical loop blocking is only one of the many shortcomings of the CPU strategy with dace:X backends.