More efficient implementation of integers #147
Replies: 6 comments 14 replies
-
@pitrou had a link to portable safe overflow checks: portable-snippets |
Beta Was this translation helpful? Give feedback.
-
I’ve been trying to get @lpereira interested. |
Beta Was this translation helpful? Give feedback.
-
Any progress on this? |
Beta Was this translation helpful? Give feedback.
-
@lpereira did you push a branch to https://github.com/faster-cpython/cpython? |
Beta Was this translation helpful? Give feedback.
-
just as a comment, this is essentially how pypy does it. two representations of int, one storing a machine word with the value, one being a pointer to a list of digits. we have it a bit simpler in that we can hide the tag bit to distinguish the two cases in the header. there's a secondary benefit for algorithms that really operate on big integers. even in such programs, operations that mix a huge int and a machine-word sized one are common, so we have specialized implementations for long+int, long*int, etc. |
Beta Was this translation helpful? Give feedback.
-
We need a better plan for this. I don't think it realistic to expect to merge very large changes, and long lived branches are a pain to maintain. Once python/cpython#30496 is merged, we should consider a way to break this into manageable chunks. One possible plan is this:
typedef struct _ {
uintptr_t size_plus;
digit ob_digits[1];
} PyLongValue;
typedef struct _ {
HEADER;
PyLongValue value;
} PyLongObject;
|
Beta Was this translation helpful? Give feedback.
-
This is a reboot of #42. Any comments mentioning
range
will be deleted!This might be a useful precursor to #138 as we will need to implement distinct paths for small integers vs larger integers for #138 as well.
Currently,
int
is implemented as an array of digits, where a digit is (approx) half a machine word.I would suggest changing this to:
Values that, if represented as
PyLongObject
, would have-2 <= size <= 2
would be represented as aPyIntObject
withobj->tagged_value_or_size_and_sign = value << 2
.Other values would be represented as a
PyLongObject
as they are now, except that the size would be stored as(size<<1)+1
.Intrinsic functions for add-with-overflow will help with performance. GCC has those. Windows has them for unsigned values only (I don't know why).
Even without intrinsics, the overflow checks can be implemented in only a few instructions, so should still be faster than what we have now.
Beta Was this translation helpful? Give feedback.
All reactions