Within the learn code, there can be repeated references to dd_remapXX[iprePos]. Whenever it is used more than once, it would probably be more efficient to fetch it into a local register once and reuse that copy.
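To make the suggestion concrete, here is a minimal host-side C++ sketch of the pattern, not the actual generated kernel code. The function `learnStep`, the arrays `g` and `trace`, and the clamped update rule are all hypothetical; only the idea of reading the remapped index once comes from the issue. `remap` stands in for dd_remapXX.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one step of a sparse learning update.
// `remap` plays the role of dd_remapXX and `iprePos` indexes into it;
// every other name here is illustrative only.
inline void learnStep(const std::vector<unsigned int> &remap,
                      std::vector<float> &g,
                      const std::vector<float> &trace,
                      std::size_t iprePos, float rate)
{
    // Fetch the remapped index once; a local like this can live in a register.
    const unsigned int syn = remap[iprePos];
    assert(syn < g.size() && syn < trace.size());

    // Every later use goes through the local copy instead of remap[iprePos].
    g[syn] += rate * trace[syn];
    if (g[syn] > 1.0f) {
        g[syn] = 1.0f;  // clamp, as an example of a second use of the index
    }
}
```

A compiler may already hoist the repeated load in simple cases, but in device code it often cannot prove that the intervening stores leave the index array untouched, so the explicit local copy makes the intent unambiguous.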
But more generally, introduce an annotation "readonly XXX YYY" for variables, e.g. readonly scalar x;. Such variables would not be written back to global memory, and the register copy could be declared const.
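As a rough illustration of what such an annotation could change in the generated code, the following C++ sketch contrasts a read-only variable with an ordinary one. The function `updateElement`, the variables `x` and `v`, and the update rule are assumptions made for this example and are not taken from the code base.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-element update. `x` stands for a variable annotated
// read-only, `v` for an ordinary read-write variable.
inline void updateElement(const std::vector<float> &x,
                          std::vector<float> &v,
                          std::size_t id)
{
    assert(id < x.size() && id < v.size());

    const float lx = x[id];  // read-only: const local copy, never written back
    float lv = v[id];        // read-write: local working copy

    lv += 0.5f * (lx - lv);  // illustrative update that only reads lx

    v[id] = lv;              // only the read-write variable is stored again
}
```

Because `lx` is const, an accidental assignment to it in user code would fail to compile, and the store back to global memory for `x` is simply never emitted.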
tnowotny changed the title from "In learning kernel there seem to be inefficiencies in sparse implementation" to "Refine global - register -global transfers" on Dec 4, 2018
Well, the read-only part has been implemented by #247, and the equivalent of dd_remap[iprepos] is now loaded into a register here. The remaining issue is that the load is done whether or not it is required, as discussed in #248.