Skip to content

Commit

Permalink
[SECP256K1] Optimization: special-case zero modulus limbs in modinv64
Browse files Browse the repository at this point in the history
Summary:
```
Both the field and scalar modulus can be written in signed{30,62}
notation with one or more zero limbs. Make use of this in the update_de
function to avoid a few wide multiplications when that is the case.

This doesn't appear to be a win in the 32-bit implementation, so only
do it for the 64-bit one.
```

Partial backport of [[bitcoin-core/secp256k1#831 | secp256k1#831]]:
bitcoin-core/secp256k1@9164a1b

Depends on D9408.

Test Plan:
  ninja check-secp256k1

Reviewers: #bitcoin_abc, majcosta

Reviewed By: #bitcoin_abc, majcosta

Differential Revision: https://reviews.bitcoinabc.org/D9409
  • Loading branch information
sipa authored and Fabcien committed Apr 14, 2021
1 parent 9f5afc1 commit df7744c
Showing 1 changed file with 12 additions and 6 deletions.
18 changes: 12 additions & 6 deletions src/modinv64_impl.h
Original file line number Diff line number Diff line change
Expand Up @@ -338,22 +338,28 @@ static void secp256k1_modinv64_update_de_62(secp256k1_modinv64_signed62 *d, secp
/* Compute limb 1 of t*[d,e]+modulus*[md,me], and store it as output limb 0 (= down shift). */
cd += (int128_t)u * d1 + (int128_t)v * e1;
ce += (int128_t)q * d1 + (int128_t)r * e1;
cd += (int128_t)modinfo->modulus.v[1] * md;
ce += (int128_t)modinfo->modulus.v[1] * me;
if (modinfo->modulus.v[1]) { /* Optimize for the case where limb of modulus is zero. */
cd += (int128_t)modinfo->modulus.v[1] * md;
ce += (int128_t)modinfo->modulus.v[1] * me;
}
d->v[0] = (int64_t)cd & M62; cd >>= 62;
e->v[0] = (int64_t)ce & M62; ce >>= 62;
/* Compute limb 2 of t*[d,e]+modulus*[md,me], and store it as output limb 1. */
cd += (int128_t)u * d2 + (int128_t)v * e2;
ce += (int128_t)q * d2 + (int128_t)r * e2;
cd += (int128_t)modinfo->modulus.v[2] * md;
ce += (int128_t)modinfo->modulus.v[2] * me;
if (modinfo->modulus.v[2]) { /* Optimize for the case where limb of modulus is zero. */
cd += (int128_t)modinfo->modulus.v[2] * md;
ce += (int128_t)modinfo->modulus.v[2] * me;
}
d->v[1] = (int64_t)cd & M62; cd >>= 62;
e->v[1] = (int64_t)ce & M62; ce >>= 62;
/* Compute limb 3 of t*[d,e]+modulus*[md,me], and store it as output limb 2. */
cd += (int128_t)u * d3 + (int128_t)v * e3;
ce += (int128_t)q * d3 + (int128_t)r * e3;
cd += (int128_t)modinfo->modulus.v[3] * md;
ce += (int128_t)modinfo->modulus.v[3] * me;
if (modinfo->modulus.v[3]) { /* Optimize for the case where limb of modulus is zero. */
cd += (int128_t)modinfo->modulus.v[3] * md;
ce += (int128_t)modinfo->modulus.v[3] * me;
}
d->v[2] = (int64_t)cd & M62; cd >>= 62;
e->v[2] = (int64_t)ce & M62; ce >>= 62;
/* Compute limb 4 of t*[d,e]+modulus*[md,me], and store it as output limb 3. */
Expand Down

0 comments on commit df7744c

Please sign in to comment.