CpuMath Enhancement: Preamble for hardware intrinsics implementation #830
Labels:
enhancement: New feature or request
P2: Priority of the issue for triage purpose: Needs to be fixed at some point.
up-for-grabs: A good issue to fix if you are trying to contribute to the project
Style changes needed to solve part of #823
Details
src\Microsoft.ML.CpuMath\SseIntrinsics.cs and src\Microsoft.ML.CpuMath\AvxIntrinsics.cs: Preamble:
For large arrays, especially those that cross cache-line or page boundaries, adding a preamble that handles the leading unaligned elements, so that the main loop can use aligned loads, should save a measurable amount of time.
Reference: https://github.com/dotnet/machinelearning/pull/562/files/f0f81a5019a3c8cbd795a970e40d633e9e1770c1#r204061074
#1143
Currently these functions use only unaligned loads; we can make them faster by aligning the data and doing aligned loads.
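To make the idea concrete, here is a rough sketch of the preamble pattern, written in C with SSE intrinsics rather than the C# used in the files above. It is not taken from Microsoft.ML.CpuMath: `sum_aligned` is a hypothetical helper, the 16-byte boundary assumes 128-bit SSE vectors, and the code assumes an x86 target. The preamble consumes elements one at a time until the pointer reaches a 16-byte boundary, the main loop then uses aligned loads, and a scalar tail handles the remainder.

```c
#include <stddef.h>
#include <stdint.h>
#include <xmmintrin.h> /* SSE: _mm_load_ps, _mm_add_ps, _mm_storeu_ps */

/* Hypothetical helper for illustration only: sums n floats starting at p. */
float sum_aligned(const float *p, size_t n)
{
    float total = 0.0f;
    size_t i = 0;

    /* Preamble: scalar steps until p + i sits on a 16-byte boundary.
       If p is not even 4-byte aligned, this loop simply consumes the
       whole array, which is slow but still correct. */
    while (i < n && ((uintptr_t)(p + i) & 15u) != 0) {
        total += p[i];
        i++;
    }

    /* Main loop: p + i is 16-byte aligned here, so aligned loads are legal. */
    __m128 acc = _mm_setzero_ps();
    for (; i + 4 <= n; i += 4) {
        acc = _mm_add_ps(acc, _mm_load_ps(p + i));
    }

    /* Reduce the four lanes of the vector accumulator to a scalar. */
    float lanes[4];
    _mm_storeu_ps(lanes, acc);
    total += lanes[0] + lanes[1] + lanes[2] + lanes[3];

    /* Tail: fewer than four elements may remain. */
    for (; i < n; i++) {
        total += p[i];
    }
    return total;
}
```

An AVX version would follow the same shape with a 32-byte boundary and eight floats per iteration. Whether the aligned path is measurably faster depends on the hardware and array sizes, so it is worth benchmarking before and after.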