Improves 2d tiled matmulnbits by repeating A loads N times for each B load #12217

Onnxruntime-SCA-training-CUDA

succeeded Dec 11, 2024 in 1h 17m 56s