-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Port standard instructions to Rust. #13486
base: main
Are you sure you want to change the base?
Conversation
This way, we can also use this enum to tag the Python classes.
I'm concerned that the I feel like the bitshift and masking operations could be extended over the whole u64, and that'll all automatically work on BE and 32-bit systems, especially with the size of The |
I guess my main point is this: the 64-bit
Introducing If you want a shade more encapsulation than my loose struct PayloadLocation<T: SomeBytemuckOrCustomTrait> {
mask: u64,
shift: u32,
datatype: PhantomData<T>,
}
impl PayloadLocation {
// Note we _mustn't_ accept `PackedOperation` until we've got a valid one,
// because holding a `PackedOperation` implies ownership over its stored
// pointer, and it mustn't `Drop` or attempt to interpret partial data.
#[inline]
fn store(&self, src: T, target: u64) -> u64 {
let src: u64 = bytemuck::cast(src);
target | ((src << self.shift) & self.mask)
}
#[inline]
fn load(&self, target: u64) -> T {
bytemuck::cast((target & self.mask) >> self.shift)
}
}
const PACKED_OPERATION_DISCRIMINANT = PayloadLocation { mask: 0b111, shift: 0 };
const STANDARD_GATE_DISCRIMINANT = PayloadLocation { mask: 0bff << 3, shift: 3 };
const POINTER_PAYLOAD = PayloadLocation { mask: usize::MAX as u64 & 0b000, shift: 0 };
// and so on (Bear in mind I just typed that raw without trying it, and I didn't think all that hard about what the trait bound should be.) If everything's inlined and constant, the compiler absolutely should be able to compile out 32 bit shifts and all-ones masks into partial register reads/loads itself, so there oughtn't to be any overhead. |
This reverts commit b6d8f92.
…hs." This reverts commit 6049498.
Trying out a neat crate for Rust bitfields. The caveats are: * Custom enums used in the bitfield must specify const funcs for bit conversion, and bytemuck's cast functions aren't const. * The bitfield crate implements Clone for you, but does not seem to have a way to disable this. We can't rely on their clone, since for pointer types we need to allocate a new Box. To get around this, PackedOperation is a wrapper around an internal bitfield struct (rather than being a bitfield struct itself). (Note: I'm not yet happy with this. Specifically, I think the abstraction may be cleaner if OpBitField is defined entirely in terms of raw native types and any reinterpretation / transmutation is done from PackedOperation. Consider this a first pass.)
I can't easily test this on my own machine, but I think it'll work.
One or more of the following people are relevant to this code:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't thought hugely hugely about all the implications of this all the way through, but here's comments on the implementation.
The rough summary is:
- this does look like a successful way of doing it
- I agree that you can kind of view these as bitfield structs
- I'm very nervous that
bitfield-struct
is causing us to invoke undefined behaviour, introduce moreunsafe
, and I'm not certain it's making the code a huge amount clearer.
My biggest worry about bitfield-struct
as written here is about accessing the PackedOperation
discriminant - getting it from the StandardGateBits
representation is, I believe, UB. We can swap that to the pointer form, but the fact that it's so easy to produce UB like that makes me nervous.
My other hesitation is that generally these aren't entirely bitfields, because of the overlap. All the extra code needed to support the masking out of the low bits of the pointer feels more complex to me than it was before, and I'm not convinced that bitfield_struct
is offering us enough in return. If anything, it feels like we might have spread the dangerous assumptions further around across more types and more (type-level) indirection than there was before.
That said, I'm not really strongly against it. I worry that it's using a pneumatic drill to solve a hammer-and-nail problem here since there really are not many fields at all in our structs, and these structs should not need much manipulation other than construction. My perception of the complexity of manual bitshift-and-mask operations compared to this might be different to other people's.
@@ -695,6 +701,37 @@ impl<'py> FromPyObject<'py> for OperationFromPython { | |||
extra_attrs: extract_extra()?, | |||
}); | |||
} | |||
'standard_instr: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're adding this label as a separate one, you probably want to rename the 'standard
label above to 'standard_gate
.
// Our Python standard instructions have a `_standard_instruction_type` field at the | ||
// class level so we can quickly identify them here without an `isinstance` check. | ||
// Once we know the type, we query the object for any type-specific fields we need to | ||
// read (e.g. a Barrier's number of qubits) to build the Rust representation. | ||
let Some(standard_type) = ob_type | ||
.getattr(intern!(py, "_standard_instruction_type")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why _standard_instruction_type
and not _standard_instruction
, which is consistent with _standard_gate
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can change this to _standard_instruction
, if you prefer. I did this because there's also a Rust-side StandardInstruction
enum, and even though it isn't exposed to Python, my thinking was that this would be more consistent to readers of the Qiskit codebase if the name here aligned to the StandardInstructionType
Rust enum that backs it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can only speak to my interpretation, but to me, the mismatched attributed names jumped out as odd. I get that it's not super pleasant with needing the StandardInstruction
/StandardInstructionType
split anyway, and I don't feel strongly about which way you want to keep it.
#[derive(Clone, Copy, Debug, PartialEq, Eq)] | ||
#[pyclass(module = "qiskit._accelerate.circuit", eq, eq_int)] | ||
#[repr(u8)] | ||
pub(crate) enum StandardInstructionType { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, I don't think I've seen any compelling reasons to use pub(crate)
yet. Either it should be entirely private to this module, or it's useful outside the module, in which case it's likely to be useful outside the crate too. Feels to me that almost every time we've introduced a pub(crate)
, we've had a PR shortly after making it a pub
, and in this case, it's logically exposed publicly to Python space anyway.
Should this type live here at all, or is it just an implementation detail of PackedOperation
, and could move there?
(Though see other questions about the new PackedOperation
implementation - maybe other stuff should move here.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is pub(crate)
here only so that we can expose it to Python from lib.rs
, IIRC. Perhaps that is not necessary and I was mistaken.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's needed to export for Python, it's fine (though if it's public to Python, is there no chance that it's useful elsewhere in Rust too?). I mostly just knee-jerk against pub(crate)
, I think, because it's caused us a fair amount of churn in the last few months.
if label.is_some() || unit.is_some() || duration.is_some() || condition.is_some() { | ||
let mut mutable = false; | ||
if let Some(condition) = condition { | ||
if !mutable { | ||
out = out.call_method0("to_mutable")?; | ||
mutable = true; | ||
} | ||
out.setattr("condition", condition)?; | ||
} | ||
if let Some(duration) = duration { | ||
if !mutable { | ||
out = out.call_method0("to_mutable")?; | ||
mutable = true; | ||
} | ||
out.setattr("_duration", duration)?; | ||
} | ||
if let Some(unit) = unit { | ||
if !mutable { | ||
out = out.call_method0("to_mutable")?; | ||
} | ||
out.setattr("_unit", unit)?; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does the unit
/duration
stuff here interact with Delay
, which expects those fields already?
// Remember to update StandardGate::is_valid_bit_pattern below | ||
// if you add or remove this enum's variants! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Weird that you added a comment here, but not on the StandardInstructionType
introduced in this PR with the same consideration, but thanks for catching it!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is more an artifact of my workflow, which tends to be fairly iterative. I noticed these were out of sync, so I left a comment and assumed I'd fix up the rest of the docs / comments for the PR in accordance once everything else was fleshed out.
// SAFETY: we read (just!) the discriminant from any of the union's members, | ||
// since we guarantee it is found in the same place for all bitfields. | ||
unsafe { self.gate.discriminant() } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm really nervous about this, because I think this is already invoking undefined behaviour, which wasn't happening before the bitfield structs were introduced.
Not all 8-bit patterns are valid StandardGate
, and doing self.gate
interprets bits 3 to 11 directly as a StandardGate
in order to produce the StandardGateBits
struct. If this object is actually a pointer, those 8 bits may well not be a StandardGate
, and that's UB, even if we don't explicitly access them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The StandardGateBits
struct is actually just a transparent tuple struct around a u64
. And that u64
is zero-initialized, so we should be safe here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can buy that it isn't UB, though I don't think it's because of 0 initialisation (the u64 of self
here isn't 0 - there's no new
to be called, since we're interpreting a union field). If it's not UB, it's because the StandardGate
type isn't constructed by the (internally generated) mask and shift of the getter, but the definition of the struct
certainly makes it look like you should think about the value as having been constructed.
I think I was wrong initially and it's not UB, but I'm still concerned about how non-obvious it is without having really thought about how bitfield_struct
must work internally, which is more mental load.
#[bitfield(u64, new = false)] | ||
struct PointerBits { | ||
#[bits(3, access = RO)] | ||
discriminant: PackedOperationType, | ||
#[bits(61, from = PointerBits::unpack_address, into = PointerBits::pack_address)] | ||
address: usize, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that bitfield_struct
automatically impls Copy
, and there's no way to disable this. I'm really nervous about that, because this PR is introducing a lot more types into the system here, and I think it's really easy to lose the ownership semantics in the middle of all of them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that it ought not to. The author has merged a PR I've made which allows the derived Clone
(and thus Copy
) to be disabled. Once included in a release, we should be able to get rid of the inner BitField
and make PackedOperation
the union type instead, if we should want to.
/// The bitfield layout used for standard gates. | ||
#[bitfield(u64)] | ||
struct StandardGateBits { | ||
#[bits(3, default = PackedOperationType::StandardGate, access = RO)] | ||
discriminant: PackedOperationType, | ||
#[bits(8)] | ||
standard_gate: StandardGate, | ||
#[bits(53)] | ||
__: u64, | ||
} | ||
|
||
/// The bitfield layout used for standard instructions. | ||
#[bitfield(u64)] | ||
struct StandardInstructionBits { | ||
#[bits(3, default = PackedOperationType::StandardInstruction, access = RO)] | ||
discriminant: PackedOperationType, | ||
#[bits(8)] | ||
standard_instruction: StandardInstructionType, | ||
#[bits(21)] | ||
_pad1: u32, | ||
#[bits(32)] | ||
value: ImmediateValue, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's odd to me that you went to all the trouble with the 32-bit pointers to make them aligned, even though they're the "terrible performance" path on platforms we don't even care much about, yet StandardGate
and StandardInstruction
(very much meant to be the absolute fastest paths) have their internal discriminant misaligned.
A second consideration: an alternative to having StandardInstructionBits
forcibly define the bitlayout of StandardInstruction
itself would be to have StandardInstruction
have a to_bits
method that is required to return a 61-bit value, and leave it up to StandardInstruction
. That would let the code be a lot more local to the rest of the type, while in this form, there's still co-operation necessary between StandardInstruction
and StandardInstructionBits
, but they're defined far away from each other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's odd to me that you went to all the trouble with the 32-bit pointers to make them aligned, even though they're the "terrible performance" path on platforms we don't even care much about, yet StandardGate and StandardInstruction (very much meant to be the absolute fastest paths) have their internal discriminant misaligned.
I did consider this, and we can certainly add a bit of padding to get them over the byte boundary. The 32-bit case was just what I was focusing on initially since I understood we would be impeding performance for that case the most. You didn't pad StandardGate
originally, so I figured I wasn't making anything worse by deferring that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'm onboard with any potential modification to StandardGate
being done in a separate PR - if nothing else, it'll let us see if the change is actually worth it.
When I wrote the initial PR, I think my thought process was "a bitshift will be so fast we don't notice it", and "we might need those padding bits later". Certainly worth challenging those assumptions - I didn't test them.
/// An inline value present within [StandardInstructionBits] layouts used to store arbitrary | ||
/// data to be interpreted according to the encoded standard instruction. | ||
#[derive(Clone, Copy, Debug)] | ||
#[repr(transparent)] | ||
struct ImmediateValue(u32); | ||
|
||
impl ImmediateValue { | ||
const fn into_bits(self) -> u32 { | ||
self.0 | ||
} | ||
|
||
const fn from_bits(value: u32) -> Self { | ||
Self(value) | ||
} | ||
|
||
#[inline] | ||
fn from_delay_unit(unit: DelayUnit) -> Self { | ||
Self(unit as u32) | ||
} | ||
|
||
#[inline] | ||
fn delay_unit(&self) -> DelayUnit { | ||
::bytemuck::checked::cast(self.0 as u8) | ||
} | ||
|
||
#[inline] | ||
fn u32(&self) -> u32 { | ||
self.0 | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This ImmediateValue
is largely indicative of the problem I meant above. I feel like there's assumptions and invariants that need to be upheld between two places now, and that makes things a lot more difficult:
- the maximum width of a
StandardInstruction
payload must be the same size asImmediateValue
. This PR already violates that from a typing perspective (Barrier(usize)
). ImmediateValue
has been marked as derivingClone
andCopy
, except the actual type that's been punned to au32
might not logically allow theClone
andCopy
. Obviouslyu32
would let us clone it unfairly too, but it's easier to reason about ownership semantics if the actual owner is the one producing the new type.- (much more minor and future looking) why do we need to apply an arbitrary limit of 32 bits anyway? What if there's a payload in the future that's 2x16 bits, and those don't need to be pushed up to the 32-bit boundary for aligned access anyway. It feels a bit neater to me to give
StandardInstruction
the whole payload if it wants it, and say it's required to return a 64-bit value with the low three bits 0, then let it put things where it wants them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With points 1 and 2: my main problem is that it's not clear at the site of definition of StandardInstruction
that these needs to be upheld, which I imagine is how Barrier(usize)
ended up like the PR state in the first place.
I think something like a trait implementation of the packing might help that a bit, because then the trait defines the interface the type is required to fulfil, and whether the type fulfils them can be audited and understood more locally. The need to have the unsafe bit saying "the low three bits must be 0" isn't super ideal, but it reduces the amount of co-operation we have to enforce.
StandardInstruction::Barrier(num_qubits) => { | ||
let num_qubits: u32 = num_qubits.try_into().expect( | ||
"The PackedOperation representation currently requires barrier size to be <= 32 bits." | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then why don't we enforce that in the typing?
Apologies for the delay in responding to your initial comments, @jakelishman! I did read them, and they were ultimately what led me to rewrite the initial draft using bitfields. I meant to reply to your comments at the same time I pushed up these changes, but didn't have enough time yesterday. Regarding the initial draft:
This is exactly what I was concerned about, though I was also trying to accomplish it in a way that the alignment of
This was true, and ultimately indicative of me trying to jam everything into a single instruction encoding / layout. I liked the Regarding the current version:
This I'm not convinced of. Type-punning is allowed in Rust and is not UB, as long as the access doesn't produce an invalid value, if I understand correctly. It's also of course important that we're not reading uninitialized memory, which we shouldn't be here. From https://doc.rust-lang.org/reference/items/unions.html#reading-and-writing-union-fields
If you're worried about the representations of the fields comprising the union, they are all
In the 64-bit pointer case, I was also thinking about it as "overlapping" a pointer with a discriminant originally, and that was the reason I hadn't thought to use bitfields. But really, there's no overlapping here. We're simply storing the upper 61 bits (the interesting ones!) of an address inside of a 61 bit field. I'd argue that the code is less complex in terms of bit manipulation: there's no explicit masking at all, we just shift an address down by 3 and store it.
The original approach was certainly the right one at the time. In earlier commits, I explored extending it for |
For the UB: I was concerned about the UB because I know (as your quote says) that even constructing an Re the pointer overlap: a pointer is 64 bits, not 61 bits, so what the bitfield is storing isn't a pointer, it's as you say, the informative bits of the pointer - the true value of the actual pointer really is overlapped with the discriminant. Strictly, x86_64 pointers are only 48 bits wide and required to be all 0s in the top 16 bits, so we might even want to use those bits too in the future, and they'd be overlapping. Then it'd really be clearer as a single mask, not a bitfield.
I mentioned before that imo, the fields do overlap for pointers, and we're fighting I can totally believe that the Fwiw, my preferred form probably isn't the struct PackedOperation(u64);
fn do_something(&self) {
let discriminant = self.0 & Self::DISCRIMINANT_MASK;
let payload = self.0 & !Self::DISCRIMINANT_MASK;
match discriminant {
PackedOperationType::StandardGate => <StandardGate as Packable>::unpack(payload),
PackedOperationType::StandardInstruction => <StandardInstruction as Packable>::unpack(payload),
PackedOperationType::PyGate => ...,
} and all the other bit-twiddling / whatever is local to the types themselves. In cases where a bitfield representation is appropriate (certainly I could be convinced by |
Summary
Adds a new Rust enum
StandardInstruction
representation for standard instructions (i.e.Barrier
,Delay
,Measure
andReset
) and updates the encoding ofPackedOperation
to support storing it compactly at rest.The size of a
PackedOperation
has now been anchored to always be 64 bits, even on 32-bit systems. This is necessary to encode the data payload of standard instructions likeBarrier
andDelay
. See the docstrings withinpacked_instruction.rs
for details.Details and comments
The implementation works very similarly to what we currently do for
StandardGate
, but with a bit of extra consideration for an additional data payload used by variadic/template instructions likeBarrier
andDelay
.Similarly to
StandardGate
, the newly addedStandardInstruction
enum serves as the first-class Rust interface for working with standard instructions. The existingOperationRef
enum returned byPackedOperation::view
has a new variantStandardInstruction
which exposes this.Unlike
StandardGate
,StandardInstruction
is not apyclass
because it is not directly used to tag instructions as standard in our Python class definitions. Instead, the simple integral enumStandardInstructionType
is used as the tag, andOperationFromPython::extract_bound
further queries the incoming Python object for variadic/template data when building the completeStandardInstruction
.Encoding
The
PackedOperation
encoding has been updated as follows:PackedOperation
is now always 64 bits wide even on 32 bit systems, and is now a wrapper around a custom union type,BitField
. This union contains threebitfield!(u64)
types,StandardGateBits
,StandardInstructionBits
, andPointerBits
, defined using thebitfield-struct
crate, which gives us compile-time validation of bitfield layout and eliminates the need for manual masking and bit manipulation. Similar to a CPU ISA, the discriminant identifies which bitfield / layout should be used when decoding the rest of the operation. See inline doc strings for more detail.StandardInstruction
.Todo: