Tagged values on the evaluation stack. #418
Doesn't it work on comprehensions? With that PR, I get this:
It works for most list and set comprehensions, yes, but not for dict comprehensions or generator expressions, which are a kind of comprehension. The point is that the specializations of […]. The ideal is that we need fewer specializations and get the benefits of the complex specializations that handle objects with […]. This isn't to say that we shouldn't continue to add specializations, just that we want to think about alternative and complementary approaches.
CPython does a lot of boxing. Tagged values (#138) are the (very) long-term goal for fixing this, but that is probably a long way off.
Allowing tagged values on the stack might be a useful intermediate step.
Tagged values can provide a way to:
`int`s produced when iterating over `range` and `enumerate`
Consider iterating over `range`. python/cpython#91713 avoids boxing by forming a superinstruction of the specialized `FOR_ITER` and the following `STORE_FAST`. This only works in a `for` loop, not a comprehension. Allowing tagged values on the stack would allow efficient specialization of any iterator or indexing operation that produces a number (potentially including `numpy` iterators), without needing a superinstruction for each specialization.

The compiler can track which operations consume values from the stack, so we can mark at compile time which operations can produce a tagged value. As a result, only some instructions would need to handle tagged values:
And only some instructions would be able to produce tagged values:
Specialization of `LOAD_FAST` for integers, or floats, is a possibility as well.

This won't work with #393, but as it potentially saves boxing as well as reference counting, it might be the more profitable approach.
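For concreteness, here is a minimal sketch of what a tagged stack word could look like. The discussion does not fix an encoding; the low-bit tag, the 64-bit word size, and every function name below are assumptions made for illustration only. The model relies on object pointers being at least 2-byte aligned, so a word with the low bit set can never be a pointer.

```python
WORD_BITS = 64                    # assumed machine word size
WORD_MASK = (1 << WORD_BITS) - 1
INT_TAG = 1                       # assumed: low bit set means "unboxed int"


def fits_in_tagged_int(value: int) -> bool:
    """One bit holds the tag, so only WORD_BITS - 1 bits are left for the value."""
    limit = 1 << (WORD_BITS - 2)
    return -limit <= value < limit


def tag_int(value: int) -> int:
    """Encode a small int as a tagged word (no heap object is created)."""
    if not fits_in_tagged_int(value):
        raise OverflowError("value needs a boxed int object")
    return ((value << 1) | INT_TAG) & WORD_MASK


def is_tagged_int(word: int) -> bool:
    """Aligned pointers have a clear low bit; tagged ints have it set."""
    return (word & 1) == INT_TAG


def untag_int(word: int) -> int:
    """Decode a tagged word back to the int it carries."""
    if not is_tagged_int(word):
        raise TypeError("word is a pointer, not a tagged int")
    if word >= 1 << (WORD_BITS - 1):   # reinterpret the word as signed
        word -= 1 << WORD_BITS
    return word >> 1                   # arithmetic shift drops the tag bit
```

In this model, an instruction that produces a tagged value would push `tag_int(n)` instead of allocating an `int` object, and an instruction that handles tagged values would check `is_tagged_int` before treating the word as a pointer.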
Avoiding N * M specializations
We already specialize moving values to/from the stack depending on where the value is.
If we add specializations for what a value is, then the combination can blow up.
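The multiplication can be made concrete with a short sketch. The variant names are hypothetical placeholders, not real opcode names; the counts are chosen to match the `LOAD_ATTR` example that follows.

```python
from itertools import product

# N: hypothetical variants keyed on *where* the value lives.
location_variants = [f"LOAD_ATTR_VARIANT_{i}" for i in range(10)]

# M: kinds keyed on *what* the value is.
value_kinds = ["boxed", "unboxed_int", "unboxed_float"]

# Specializing on both axes at once needs one instruction per pair.
combined = [f"{variant}__{kind}" for variant, kind in product(location_variants, value_kinds)]

# Keeping the axes independent only needs one instruction per entry on each axis.
independent = len(location_variants) + len(value_kinds)

print(len(combined), independent)
```

The combined scheme grows as N * M, while handling "where" and "what" separately grows as N + M, which is the motivation for letting tagged values flow through the stack rather than fusing them into every specialization.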
For example, we already have 10 specializations of `LOAD_ATTR`. If we were to have the same for unboxed `int`s and `float`s, we would need 30 specializations, which is excessive.

Implementation
Interpreter
By changing the (C) type of values on the stack from `PyObject *` to a `struct`, we can prevent casts, so that all conversions need to go through a function/macro.

It is then a relatively simple, if laborious, process of converting all uses of stack values to one of:
For example, the implementation of `BINARY_SUBSCR` includes:

which would need to become:
Compiler
The compiler will need to do some backward flow analysis from use to determine which stack values can be unboxed.
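A toy version of this analysis, for straight-line code only, might look like the sketch below. The stack effects are simplified, and the sets of instructions that may produce or accept tagged values are assumptions for illustration (in particular, treating `BINARY_OP` as accepting tagged operands); a real compiler would also have to handle branches and merge points.

```python
# Simplified (pops, pushes) stack effects; the CALL layout is not the real one.
STACK_EFFECT = {
    "LOAD_GLOBAL": (0, 1),
    "LOAD_FAST": (0, 1),
    "BINARY_OP": (2, 1),
    "CALL": (2, 1),        # callable plus one argument, for illustration
}

MAY_PRODUCE_TAGGED = {"BINARY_OP"}   # assumed
ACCEPTS_TAGGED = {"BINARY_OP"}       # assumed


def unboxable_producers(code):
    """Return indices of instructions whose result may stay tagged.

    Each stack value is linked to the instruction that consumes it, and the
    consumer's requirement is then propagated back to the producer.
    """
    stack = []          # index of the instruction that pushed each live value
    consumer_of = {}    # producer index -> consumer index
    for index, op in enumerate(code):
        pops, pushes = STACK_EFFECT[op]
        for _ in range(pops):
            consumer_of[stack.pop()] = index
        stack.extend([index] * pushes)
    return {
        producer
        for producer, consumer in consumer_of.items()
        if code[producer] in MAY_PRODUCE_TAGGED and code[consumer] in ACCEPTS_TAGGED
    }


# Roughly f((a + b) * c)
example = [
    "LOAD_GLOBAL",  # 0: f
    "LOAD_FAST",    # 1: a
    "LOAD_FAST",    # 2: b
    "BINARY_OP",    # 3: a + b, consumed by the BINARY_OP at index 5
    "LOAD_FAST",    # 4: c
    "BINARY_OP",    # 5: (a + b) * c, consumed by CALL
    "CALL",         # 6
]
print(unboxable_producers(example))
```

Under these assumptions only the inner `BINARY_OP` (index 3) may leave a tagged value on the stack; the outer one feeds `CALL`, so its result must be boxed.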
One possibility is to have two versions of instructions, one that can produce tagged values, one that cannot.
Another option is to have an explicit `BOX` instruction that would box tagged values.

E.g.
Since `CALL` cannot handle tagged values, and `BINARY_OP` can produce them, we need to fix this.

Option 1

Option 2
Neither option is ideal. Option 1 needs a lot more specializations of `BINARY_OP`. Option 2 adds a fair bit of dispatching overhead.
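The trade-off can be seen in a toy rewrite pass over straight-line code. This is only a sketch: the `_BOXED` suffix for the non-tagging variant is a hypothetical name, the stack effects are simplified, and the capability sets are assumptions rather than anything specified in the proposal.

```python
# Simplified (pops, pushes) stack effects; the CALL layout is not the real one.
STACK_EFFECT = {
    "LOAD_GLOBAL": (0, 1),
    "LOAD_FAST": (0, 1),
    "BINARY_OP": (2, 1),
    "CALL": (2, 1),
}
MAY_PRODUCE_TAGGED = {"BINARY_OP"}   # assumed
ACCEPTS_TAGGED = {"BINARY_OP"}       # assumed


def producers_needing_boxed_result(code):
    """Indices of tagging producers whose consumer cannot handle tagged values."""
    stack, needs_box = [], set()
    for index, op in enumerate(code):
        pops, pushes = STACK_EFFECT[op]
        for _ in range(pops):
            producer = stack.pop()
            if code[producer] in MAY_PRODUCE_TAGGED and op not in ACCEPTS_TAGGED:
                needs_box.add(producer)
        stack.extend([index] * pushes)
    return needs_box


def option_1(code):
    """Swap in a variant that never produces tagged values (more instruction variants)."""
    needs_box = producers_needing_boxed_result(code)
    return [f"{op}_BOXED" if i in needs_box else op for i, op in enumerate(code)]


def option_2(code):
    """Insert an explicit BOX right after the producer (one extra dispatch each)."""
    needs_box = producers_needing_boxed_result(code)
    rewritten = []
    for i, op in enumerate(code):
        rewritten.append(op)
        if i in needs_box:
            rewritten.append("BOX")
    return rewritten


# Roughly f(a + b)
call_with_sum = ["LOAD_GLOBAL", "LOAD_FAST", "LOAD_FAST", "BINARY_OP", "CALL"]
print(option_1(call_with_sum))
print(option_2(call_with_sum))
```

Option 1 keeps the instruction count unchanged but doubles the variants of every tagging producer, while Option 2 keeps a single variant and pays for one additional instruction at each boundary where a tagged value meets a consumer that cannot handle it.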