Replace FDIV with FMUL in hot loop.
Shaves off 25% runtime on Ampere Altra running OCR using
the tessdata_orig Russian language model with --oem 2.
heshpdx committed Sep 23, 2024
1 parent ff0a38d commit 7852c79
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion src/lstm/functions.h
@@ -200,8 +200,9 @@ inline void SoftmaxInPlace(int n, T *inout) {
     inout[i] = prob;
   }
   if (prob_total > 0) {
+    T inv_prob_total = 1.0/prob_total;
     for (int i = 0; i < n; i++) {
-      inout[i] /= prob_total;
+      inout[i] *= inv_prob_total;
     }
   }
 }
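
The change above hoists a single division out of the loop and multiplies by the reciprocal inside it. A minimal standalone sketch of the same idea follows, for readers who want to compare the two variants side by side. The helper names `NormalizeByDivision`, `NormalizeByReciprocal`, and `MaxAbsDifference` are hypothetical and are not part of this commit or of `src/lstm/functions.h`; the total is passed in as a parameter here instead of being accumulated as in `SoftmaxInPlace`.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the commit): one divide per element.
template <typename T>
void NormalizeByDivision(int n, T *inout, T total) {
  assert(n >= 0);
  if (total > 0) {
    for (int i = 0; i < n; i++) {
      inout[i] /= total;
    }
  }
}

// Hypothetical helper (not from the commit): one divide up front,
// then one multiply per element, mirroring the pattern in the diff.
template <typename T>
void NormalizeByReciprocal(int n, T *inout, T total) {
  assert(n >= 0);
  if (total > 0) {
    const T inv_total = T(1) / total;
    for (int i = 0; i < n; i++) {
      inout[i] *= inv_total;
    }
  }
}

// Returns the largest absolute difference between the two variants
// when both are applied to copies of the same input.
template <typename T>
T MaxAbsDifference(const std::vector<T> &values, T total) {
  std::vector<T> a = values;
  std::vector<T> b = values;
  NormalizeByDivision(static_cast<int>(a.size()), a.data(), total);
  NormalizeByReciprocal(static_cast<int>(b.size()), b.data(), total);
  T max_diff = 0;
  for (std::size_t i = 0; i < a.size(); i++) {
    max_diff = std::max(max_diff, static_cast<T>(std::fabs(a[i] - b[i])));
  }
  return max_diff;
}
```

Multiplying by a rounded reciprocal is not guaranteed to be bit-identical to dividing, so the two variants can differ in the last bits of each result. That is likely why compilers do not usually apply this transformation on their own unless relaxed floating-point options are enabled. For normalized probabilities the difference should be negligible, but `MaxAbsDifference` can be used to check it on representative inputs.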