RFC: Array.usize #4654

fgdorais · 2024-07-05T01:23:49Z

Proposal

I propose to add Array.usize to get the size of an array as a USize, without going through Nat.

Currently, to get the size of an array as a USize, one needs to use Array.size and then Nat.toUSize. The latter takes the return value of Array.size and computes its remainder modulo USize.size. This is not optimal: arrays are stored in memory, so their size is always less than USize.size and the remainder step never changes the value. This proposal would bypass the remainder step and return the correct array size as a USize scalar (which is trivial to access in the current array implementation).
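
For concreteness, the current idiom described above can be sketched as follows. The name `sizeViaNat` is hypothetical and only serves the illustration; `Array.size`, `Nat.toUSize` and `USize.size` are the existing functions mentioned in the proposal.

```lean
/-- Current idiom (hypothetical helper name): read the size as a `Nat`,
then reduce it modulo `USize.size` to obtain a `USize`. -/
def sizeViaNat {α : Type} (a : Array α) : USize :=
  a.size.toUSize

#eval sizeViaNat #[10, 20, 30]  -- 3

-- The modulus used by `Nat.toUSize`; its value depends on the platform.
#eval USize.size
```

For any array that actually fits in memory the reduction is the identity, which is why the proposal treats it as wasted work.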

Other array operations have optimizations that bypass this trivial size check for the sake of nearly matching C-language performance. Some of my code at UnicodeBasic and elsewhere could benefit a lot from this nearly trivial improvement to the array API.

The drawback is that this adds one more unsafe implementation. So this request needs some careful consideration.
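
A minimal sketch of what such a declaration could look like, assuming the runtime header exposes a C function `lean_array_size` that returns the stored size as a `size_t`. The name `Array.usizeSketch` is hypothetical, and this is not necessarily how the final implementation is written.

```lean
/-- Hypothetical sketch. The Lean body is the reference definition that
proofs see; compiled code calls the C function named in `extern` instead.
Nothing checks that the two agree, which is the "unsafe" part. -/
@[extern "lean_array_size"]  -- assumption: this symbol exists in the runtime
def Array.usizeSketch {α : Type} (a : @& Array α) : USize :=
  a.size.toUSize
```

The two versions agree only because the runtime never allocates an array whose size reaches `USize.size`. That fact lives outside the logic, so it has to be trusted rather than proved.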

Impact

Add 👍 to issues you consider important. If others benefit from the changes in this proposal being added, please ask them to add 👍 to it.

Kha · 2024-07-05T07:23:56Z

> This proposal would bypass the remainder step

Surely this step is optimized out by LLVM anyway?

fgdorais · 2024-07-05T21:13:07Z

Of course, no actual division is done, but it's not quite optimized out. The issue is that the m_size component of a lean_array_object cannot be larger than LEAN_MAX_SMALL_NAT, but LLVM doesn't know that, which leads to overly complex code. On the small branch, the only one actually used, LLVM does optimize the box/unbox to an (unnecessary!) `and` instruction to clear the top bit. Anyway, the end result is much more complex than what should simply return the m_size component of the array.

Add efficient `usize` functions for `Array`, `ByteArray`, `FloatArray`. This is part 1 of 2 since there is a need to update stage0 between the two parts. (See discussion below.) Closes #4654

This is part 2 of 2 of #4801 (which closes #4654). That PR was split in two to allow a stage0 update between declaring the `usize` functions and using them where they are needed.

fgdorais added the RFC (Request for comments) label Jul 5, 2024

Kha added the RFC accepted (RFC is waiting for a corresponding PR, external or internal) and P-low (We are not planning to work on this issue) labels Jul 19, 2024

fgdorais mentioned this issue Jul 21, 2024

feat: usize for array types #4801

Merged

fgdorais added a commit to fgdorais/lean4 that referenced this issue Jul 21, 2024

feat: add usize for array types

1b2ead2

Add efficient usize functions for Array, ByteArray, FloatArray. Closes leanprover#4654

nomeata closed this as completed in #4801 Jul 21, 2024

fgdorais mentioned this issue Jul 21, 2024

feat: use usize for array types #4802

Merged
