Port standard instructions to Rust. #13486

Open
wants to merge 29 commits into
base: main

kevinhartman
Contributor

@kevinhartman kevinhartman commented Nov 25, 2024

Summary

Adds a new Rust enum, StandardInstruction, to represent standard instructions (i.e. Barrier, Delay, Measure and Reset), and updates the encoding of PackedOperation to store it compactly at rest.

The size of a PackedOperation has now been anchored to always be 64 bits, even on 32-bit systems. This is necessary to encode the data payload of standard instructions like Barrier and Delay. See the docstrings within packed_instruction.rs for details.

Details and comments

The implementation works very similarly to what we currently do for StandardGate, but with a bit of extra consideration for an additional data payload used by variadic/template instructions like Barrier and Delay.

Similarly to StandardGate, the newly added StandardInstruction enum serves as the first-class Rust interface for working with standard instructions. The existing OperationRef enum returned by PackedOperation::view has a new variant StandardInstruction which exposes this.

Unlike StandardGate, StandardInstruction is not a pyclass because it is not directly used to tag instructions as standard in our Python class definitions. Instead, the simple integral enum StandardInstructionType is used as the tag, and OperationFromPython::extract_bound further queries the incoming Python object for variadic/template data when building the complete StandardInstruction.

Encoding

The PackedOperation encoding has been updated as follows:

  • A PackedOperation is now always 64 bits wide, even on 32-bit systems, and is now a wrapper around a custom union type, BitField. This union contains three #[bitfield(u64)] types (StandardGateBits, StandardInstructionBits, and PointerBits) defined using the bitfield-struct crate, which gives us compile-time validation of the bitfield layout and eliminates the need for manual masking and bit manipulation. Similar to a CPU ISA, the discriminant identifies which bitfield layout should be used when decoding the rest of the operation. See the inline docstrings for more detail.
  • The discriminant has been widened from 2 to 3 bits (it is now at its maximum width, but we still have room for 3 more variants).
  • The discriminant now has an additional variant, StandardInstruction.
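
As a rough illustration of the layout described above, here is a sketch that uses plain constants instead of the bitfield-struct crate. The field widths (a 3-bit discriminant, an 8-bit instruction type, 21 padding bits and a 32-bit value) are taken from the StandardInstructionBits definition quoted later in this conversation; every constant and function name is hypothetical, and the discriminant values passed in are made up.

```rust
// Hypothetical constants mirroring the StandardInstructionBits layout.
const DISCRIMINANT_BITS: u32 = 3;
const INSTRUCTION_TYPE_BITS: u32 = 8;
const PADDING_BITS: u32 = 21;
const VALUE_BITS: u32 = 32;

// Compile-time check that the fields exactly fill a u64.
const _: () = assert!(
    DISCRIMINANT_BITS + INSTRUCTION_TYPE_BITS + PADDING_BITS + VALUE_BITS == u64::BITS
);

const INSTRUCTION_TYPE_SHIFT: u32 = DISCRIMINANT_BITS;
const VALUE_SHIFT: u32 = DISCRIMINANT_BITS + INSTRUCTION_TYPE_BITS + PADDING_BITS;

/// Packs the three fields into one 64-bit word.
pub fn encode(discriminant: u8, instruction_type: u8, value: u32) -> u64 {
    assert!(u32::from(discriminant) < (1u32 << DISCRIMINANT_BITS));
    u64::from(discriminant)
        | (u64::from(instruction_type) << INSTRUCTION_TYPE_SHIFT)
        | (u64::from(value) << VALUE_SHIFT)
}

/// Splits a 64-bit word back into (discriminant, instruction type, value).
pub fn decode(bits: u64) -> (u8, u8, u32) {
    let discriminant = (bits & ((1u64 << DISCRIMINANT_BITS) - 1)) as u8;
    let instruction_type = ((bits >> INSTRUCTION_TYPE_SHIFT) & 0xff) as u8;
    let value = (bits >> VALUE_SHIFT) as u32;
    (discriminant, instruction_type, value)
}
```

The constant assertion is only a stand-in for the layout validation that the crate performs; it checks the total width and nothing else.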

Todo:

  • Add more detail to PR description.
  • Actually test this on 32-bit systems and a big-endian 64-bit architecture.

@jakelishman
Member

jakelishman commented Nov 25, 2024

I'm concerned that the LoHi struct and PackedOperation union are making it easier to make mistakes on BE and/or 32-bit systems, because the encodings now change between them.

I feel like the bitshift and masking operations could be extended over the whole u64, and that'll all automatically work on BE and 32-bit systems, especially with the size of PackedOperation now fixed to 64 bits - we can even have everything be aligned. The shifts of the payload of instructions can be set such that a u32 payload is always in the high bits, with the padding bits between it and the rest of the discriminants if you're concerned about aligned access, because then the bitshifts will be compiled out into a single register load (just read the u32 from the right place) and the compiler will handle endianness for us. The pointer doesn't need to be handled any differently to any other payload - everything's just shifts and masks anyway (LoHi can't avoid that), and I think introducing the union makes that harder to follow.

The LoHi struct to me seems to be forcing every method of PackedOperation to think about the partial register reads/loads. If we use shifts and masks on a u64 with const masks and shifts, there's no logic split where some of it is done with shifts and some with endianness-dependent partial register access in our own code.
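
To make the endianness argument concrete, here is a minimal sketch with hypothetical names. A u32 payload kept in the high half of the u64 is read with a constant mask and shift, which operates on the value and so gives the same answer on every target. Reading "the upper four bytes" from memory instead means picking byte offsets by endianness, which is the kind of split logic the comment above is warning about.

```rust
const PAYLOAD_SHIFT: u32 = 32;
const PAYLOAD_MASK: u64 = (u32::MAX as u64) << PAYLOAD_SHIFT;

/// Reads the u32 payload by value; endianness never enters the picture.
pub fn payload_by_shift(bits: u64) -> u32 {
    ((bits & PAYLOAD_MASK) >> PAYLOAD_SHIFT) as u32
}

/// Reads the same payload from the in-memory bytes. The offset of the
/// high half depends on the target's endianness, so the code must branch.
pub fn payload_by_bytes(bits: u64) -> u32 {
    let bytes = bits.to_ne_bytes();
    let high: [u8; 4] = if cfg!(target_endian = "little") {
        [bytes[4], bytes[5], bytes[6], bytes[7]]
    } else {
        [bytes[0], bytes[1], bytes[2], bytes[3]]
    };
    u32::from_ne_bytes(high)
}
```

Both functions return the same value on any target; only the second one has to know which target it is running on.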

@jakelishman
Member

jakelishman commented Nov 25, 2024

I guess my main point is this: the 64-bit PackedOperation can always be thought about as a collection of payloads each of which is stored in a particular mask and needs a particular shift. These are:

  • the PackedOperation discriminant at (0b111, 0)
  • a pointer at (usize::MAX as u64 & !0b111, 0) (regardless of 32-bit or 64-bit)
  • a StandardGate discriminant at (0xff << 3, 3)
  • a StandardInstruction discriminant at (0xf << 3, 3)
  • a DelayUnit payload at (for example) (0x7 << 32, 32)
  • a u32 payload at (u32::MAX as u64 << 32, 32)

Introducing LoHi doesn't remove the need to bitshift and mask for most items, it just means that some of my above list are done one way, some are done another, and the union means that the pointer now has more ways to access it. LoHi also restricts the payload size to u32, when we already have payloads that exceed that (the pointers).

If you want a shade more encapsulation than my loose const shift and mask associated items, would something like this look better to you?

use std::marker::PhantomData;

struct PayloadLocation<T: SomeBytemuckOrCustomTrait> {
    mask: u64,
    shift: u32,
    datatype: PhantomData<T>,
}
impl<T: SomeBytemuckOrCustomTrait> PayloadLocation<T> {
    // Note we _mustn't_ accept `PackedOperation` until we've got a valid one,
    // because holding a `PackedOperation` implies ownership over its stored
    // pointer, and it mustn't `Drop` or attempt to interpret partial data.

    #[inline]
    fn store(&self, src: T, target: u64) -> u64 {
        let src: u64 = bytemuck::cast(src);
        target | ((src << self.shift) & self.mask)
    }
    #[inline]
    fn load(&self, target: u64) -> T {
        bytemuck::cast((target & self.mask) >> self.shift)
    }
}

const PACKED_OPERATION_DISCRIMINANT: PayloadLocation<PackedOperationType> =
    PayloadLocation { mask: 0b111, shift: 0, datatype: PhantomData };
const STANDARD_GATE_DISCRIMINANT: PayloadLocation<StandardGate> =
    PayloadLocation { mask: 0xff << 3, shift: 3, datatype: PhantomData };
const POINTER_PAYLOAD: PayloadLocation<usize> =
    PayloadLocation { mask: usize::MAX as u64 & !0b111, shift: 0, datatype: PhantomData };
// and so on

(Bear in mind I just typed that raw without trying it, and I didn't think all that hard about what the trait bound should be.)

If everything's inlined and constant, the compiler absolutely should be able to compile out 32 bit shifts and all-ones masks into partial register reads/loads itself, so there oughtn't to be any overhead.
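
For reference, one way the PayloadLocation idea could be made to compile is sketched below. It is an assumption-laden variant and not the code from this PR: the BitPayload trait is a hypothetical stand-in for the unspecified trait bound, plain u8 and u32 stand in for the real payload types, and store clears the target bits before writing, which the sketch above does not do.

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for the unspecified trait bound: a lossless
/// round trip between a payload type and the low bits of a `u64`.
pub trait BitPayload: Copy {
    fn into_bits(self) -> u64;
    fn from_bits(bits: u64) -> Self;
}

impl BitPayload for u8 {
    fn into_bits(self) -> u64 {
        u64::from(self)
    }
    fn from_bits(bits: u64) -> Self {
        bits as u8
    }
}

impl BitPayload for u32 {
    fn into_bits(self) -> u64 {
        u64::from(self)
    }
    fn from_bits(bits: u64) -> Self {
        bits as u32
    }
}

pub struct PayloadLocation<T: BitPayload> {
    mask: u64,
    shift: u32,
    datatype: PhantomData<T>,
}

impl<T: BitPayload> PayloadLocation<T> {
    pub const fn new(mask: u64, shift: u32) -> Self {
        Self { mask, shift, datatype: PhantomData }
    }

    /// Returns `target` with this payload's bits replaced by `src`.
    #[inline]
    pub fn store(&self, src: T, target: u64) -> u64 {
        (target & !self.mask) | ((src.into_bits() << self.shift) & self.mask)
    }

    /// Extracts this payload from `target`.
    #[inline]
    pub fn load(&self, target: u64) -> T {
        T::from_bits((target & self.mask) >> self.shift)
    }
}

pub const PACKED_OPERATION_DISCRIMINANT: PayloadLocation<u8> = PayloadLocation::new(0b111, 0);
pub const STANDARD_GATE_DISCRIMINANT: PayloadLocation<u8> = PayloadLocation::new(0xff << 3, 3);
pub const U32_PAYLOAD: PayloadLocation<u32> = PayloadLocation::new((u32::MAX as u64) << 32, 32);
```

Because the locations operate on a bare u64, nothing here takes ownership of a stored pointer, which matches the note in the sketch about not accepting a PackedOperation until it is fully built.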

Trying out a neat crate for Rust bitfields. The caveats are:

* Custom enums used in the bitfield must specify const funcs for bit
  conversion, and bytemuck's cast functions aren't const.
* The bitfield crate implements Clone for you, but does not seem to have
  a way to disable this. We can't rely on their clone, since for pointer
  types we need to allocate a new Box. To get around this,
  PackedOperation is a wrapper around an internal bitfield struct
  (rather than being a bitfield struct itself).

(Note: I'm not yet happy with this. Specifically, I think the
abstraction may be cleaner if OpBitField is defined entirely in terms of
raw native types and any reinterpretation / transmutation is done from
PackedOperation. Consider this a first pass.)
@1ucian0 added the Rust label Dec 11, 2024
I can't easily test this on my own machine, but I think it'll work.
@kevinhartman kevinhartman marked this pull request as ready for review December 17, 2024 20:10
@kevinhartman kevinhartman requested a review from a team as a code owner December 17, 2024 20:10
@qiskit-bot
Collaborator

One or more of the following people are relevant to this code:

  • @Qiskit/terra-core

Member

@jakelishman jakelishman left a comment

I haven't thought hugely about all the implications of this all the way through, but here are comments on the implementation.

The rough summary is:

  • this does look like a successful way of doing it
  • I agree that you can kind of view these as bitfield structs
  • I'm very nervous that bitfield-struct is causing us to invoke undefined behaviour, introduce more unsafe, and I'm not certain it's making the code a huge amount clearer.

My biggest worry about bitfield-struct as written here is about accessing the PackedOperation discriminant - getting it from the StandardGateBits representation is, I believe, UB. We can swap that to the pointer form, but the fact that it's so easy to produce UB like that makes me nervous.

My other hesitation is that generally these aren't entirely bitfields, because of the overlap. All the extra code needed to support the masking out of the low bits of the pointer feels more complex to me than it was before, and I'm not convinced that bitfield_struct is offering us enough in return. If anything, it feels like we might have spread the dangerous assumptions further around across more types and more (type-level) indirection than there was before.

That said, I'm not really strongly against it. I worry that it's using a pneumatic drill to solve a hammer-and-nail problem here since there really are not many fields at all in our structs, and these structs should not need much manipulation other than construction. My perception of the complexity of manual bitshift-and-mask operations compared to this might be different to other people's.

@@ -695,6 +701,37 @@ impl<'py> FromPyObject<'py> for OperationFromPython {
extra_attrs: extract_extra()?,
});
}
'standard_instr: {
Member

If you're adding this label as a separate one, you probably want to rename the 'standard label above to 'standard_gate.

Comment on lines +705 to +710
// Our Python standard instructions have a `_standard_instruction_type` field at the
// class level so we can quickly identify them here without an `isinstance` check.
// Once we know the type, we query the object for any type-specific fields we need to
// read (e.g. a Barrier's number of qubits) to build the Rust representation.
let Some(standard_type) = ob_type
.getattr(intern!(py, "_standard_instruction_type"))
Member

Why _standard_instruction_type and not _standard_instruction, which is consistent with _standard_gate?

Contributor Author

I can change this to _standard_instruction, if you prefer. I did this because there's also a Rust-side StandardInstruction enum, and even though it isn't exposed to Python, my thinking was that this would be more consistent to readers of the Qiskit codebase if the name here aligned to the StandardInstructionType Rust enum that backs it.

Member

I can only speak to my interpretation, but to me, the mismatched attribute names jumped out as odd. I get that it's not super pleasant with needing the StandardInstruction/StandardInstructionType split anyway, and I don't feel strongly about which way you want to keep it.

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[pyclass(module = "qiskit._accelerate.circuit", eq, eq_int)]
#[repr(u8)]
pub(crate) enum StandardInstructionType {
Member

@jakelishman jakelishman Dec 18, 2024

Personally, I don't think I've seen any compelling reasons to use pub(crate) yet. Either it should be entirely private to this module, or it's useful outside the module, in which case it's likely to be useful outside the crate too. Feels to me that almost every time we've introduced a pub(crate), we've had a PR shortly after making it a pub, and in this case, it's logically exposed publicly to Python space anyway.

Should this type live here at all, or is it just an implementation detail of PackedOperation, and could move there?

(Though see other questions about the new PackedOperation implementation - maybe other stuff should move here.)

Contributor Author

It is pub(crate) here only so that we can expose it to Python from lib.rs, IIRC. Perhaps that is not necessary and I was mistaken.

Member

If it's needed to export for Python, it's fine (though if it's public to Python, is there no chance that it's useful elsewhere in Rust too?). I mostly just knee-jerk against pub(crate), I think, because it's caused us a fair amount of churn in the last few months.

Comment on lines +468 to +490
if label.is_some() || unit.is_some() || duration.is_some() || condition.is_some() {
    let mut mutable = false;
    if let Some(condition) = condition {
        if !mutable {
            out = out.call_method0("to_mutable")?;
            mutable = true;
        }
        out.setattr("condition", condition)?;
    }
    if let Some(duration) = duration {
        if !mutable {
            out = out.call_method0("to_mutable")?;
            mutable = true;
        }
        out.setattr("_duration", duration)?;
    }
    if let Some(unit) = unit {
        if !mutable {
            out = out.call_method0("to_mutable")?;
        }
        out.setattr("_unit", unit)?;
    }
}
Member

How does the unit/duration stuff here interact with Delay, which expects those fields already?

Comment on lines +551 to +552
// Remember to update StandardGate::is_valid_bit_pattern below
// if you add or remove this enum's variants!
Member

Weird that you added a comment here, but not on the StandardInstructionType introduced in this PR with the same consideration, but thanks for catching it!

Contributor Author

This is more an artifact of my workflow, which tends to be fairly iterative. I noticed these were out of sync, so I left a comment and assumed I'd fix up the rest of the docs / comments for the PR in accordance once everything else was fleshed out.

Comment on lines +210 to +212
// SAFETY: we read (just!) the discriminant from any of the union's members,
// since we guarantee it is found in the same place for all bitfields.
unsafe { self.gate.discriminant() }
Member

I'm really nervous about this, because I think this is already invoking undefined behaviour, which wasn't happening before the bitfield structs were introduced.

Not all 8-bit patterns are valid StandardGate, and doing self.gate interprets bits 3 to 11 directly as a StandardGate in order to produce the StandardGateBits struct. If this object is actually a pointer, those 8 bits may well not be a StandardGate, and that's UB, even if we don't explicitly access them.

Contributor Author

The StandardGateBits struct is actually just a transparent tuple struct around a u64. And that u64 is zero-initialized, so we should be safe here.

Member

I can buy that it isn't UB, though I don't think it's because of 0 initialisation (the u64 of self here isn't 0 - there's no new to be called, since we're interpreting a union field). If it's not UB, it's because the StandardGate type isn't constructed by the (internally generated) mask and shift of the getter, but the definition of the struct certainly makes it look like you should think about the value as having been constructed.

I think I was wrong initially and it's not UB, but I'm still concerned about how non-obvious it is without having really thought about how bitfield_struct must work internally, which is more mental load.

Comment on lines +340 to +346
#[bitfield(u64, new = false)]
struct PointerBits {
    #[bits(3, access = RO)]
    discriminant: PackedOperationType,
    #[bits(61, from = PointerBits::unpack_address, into = PointerBits::pack_address)]
    address: usize,
}
Member

I believe that bitfield_struct automatically impls Copy, and there's no way to disable this. I'm really nervous about that, because this PR is introducing a lot more types into the system here, and I think it's really easy to lose the ownership semantics in the middle of all of them.

Contributor Author

I agree that it ought not to. The author has merged a PR I've made which allows the derived Clone (and thus Copy) to be disabled. Once included in a release, we should be able to get rid of the inner BitField and make PackedOperation the union type instead, if we should want to.

Comment on lines +279 to +301
/// The bitfield layout used for standard gates.
#[bitfield(u64)]
struct StandardGateBits {
    #[bits(3, default = PackedOperationType::StandardGate, access = RO)]
    discriminant: PackedOperationType,
    #[bits(8)]
    standard_gate: StandardGate,
    #[bits(53)]
    __: u64,
}

/// The bitfield layout used for standard instructions.
#[bitfield(u64)]
struct StandardInstructionBits {
    #[bits(3, default = PackedOperationType::StandardInstruction, access = RO)]
    discriminant: PackedOperationType,
    #[bits(8)]
    standard_instruction: StandardInstructionType,
    #[bits(21)]
    _pad1: u32,
    #[bits(32)]
    value: ImmediateValue,
}
Member

It's odd to me that you went to all the trouble with the 32-bit pointers to make them aligned, even though they're the "terrible performance" path on platforms we don't even care much about, yet StandardGate and StandardInstruction (very much meant to be the absolute fastest paths) have their internal discriminant misaligned.

A second consideration: an alternative to having StandardInstructionBits forcibly define the bitlayout of StandardInstruction itself would be to have StandardInstruction have a to_bits method that is required to return a 61-bit value, and leave it up to StandardInstruction. That would let the code be a lot more local to the rest of the type, while in this form, there's still co-operation necessary between StandardInstruction and StandardInstructionBits, but they're defined far away from each other.

Contributor Author

It's odd to me that you went to all the trouble with the 32-bit pointers to make them aligned, even though they're the "terrible performance" path on platforms we don't even care much about, yet StandardGate and StandardInstruction (very much meant to be the absolute fastest paths) have their internal discriminant misaligned.

I did consider this, and we can certainly add a bit of padding to get them over the byte boundary. The 32-bit case was just what I was focusing on initially since I understood we would be impeding performance for that case the most. You didn't pad StandardGate originally, so I figured I wasn't making anything worse by deferring that.

Member

Yeah, I'm onboard with any potential modification to StandardGate being done in a separate PR - if nothing else, it'll let us see if the change is actually worth it.

When I wrote the initial PR, I think my thought process was "a bitshift will be so fast we don't notice it", and "we might need those padding bits later". Certainly worth challenging those assumptions - I didn't test them.

Comment on lines +303 to +332
/// An inline value present within [StandardInstructionBits] layouts used to store arbitrary
/// data to be interpreted according to the encoded standard instruction.
#[derive(Clone, Copy, Debug)]
#[repr(transparent)]
struct ImmediateValue(u32);

impl ImmediateValue {
    const fn into_bits(self) -> u32 {
        self.0
    }

    const fn from_bits(value: u32) -> Self {
        Self(value)
    }

    #[inline]
    fn from_delay_unit(unit: DelayUnit) -> Self {
        Self(unit as u32)
    }

    #[inline]
    fn delay_unit(&self) -> DelayUnit {
        ::bytemuck::checked::cast(self.0 as u8)
    }

    #[inline]
    fn u32(&self) -> u32 {
        self.0
    }
}
Member

This ImmediateValue is largely indicative of the problem I meant above. I feel like there's assumptions and invariants that need to be upheld between two places now, and that makes things a lot more difficult:

  1. the maximum width of a StandardInstruction payload must be the same size as ImmediateValue. This PR already violates that from a typing perspective (Barrier(usize)).
  2. ImmediateValue has been marked as deriving Clone and Copy, except the actual type that's been punned to a u32 might not logically allow the Clone and Copy. Obviously u32 would let us clone it unfairly too, but it's easier to reason about ownership semantics if the actual owner is the one producing the new type.
  3. (much more minor and future looking) why do we need to apply an arbitrary limit of 32 bits anyway? What if there's a payload in the future that's 2x16 bits, and those don't need to be pushed up to the 32-bit boundary for aligned access anyway. It feels a bit neater to me to give StandardInstruction the whole payload if it wants it, and say it's required to return a 64-bit value with the low three bits 0, then let it put things where it wants them.

Member

With points 1 and 2: my main problem is that it's not clear at the site of definition of StandardInstruction that these need to be upheld, which I imagine is how Barrier(usize) ended up like the PR state in the first place.

I think something like a trait implementation of the packing might help that a bit, because then the trait defines the interface the type is required to fulfil, and whether the type fulfils them can be audited and understood more locally. The need to have the unsafe bit saying "the low three bits must be 0" isn't super ideal, but it reduces the amount of co-operation we have to enforce.

Comment on lines +497 to +500
StandardInstruction::Barrier(num_qubits) => {
let num_qubits: u32 = num_qubits.try_into().expect(
"The PackedOperation representation currently requires barrier size to be <= 32 bits."
);
Member

Then why don't we enforce that in the typing?
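
One possible reading of "enforce that in the typing" is sketched below: the enum carries a u32 directly, so the narrowing from usize happens once at construction and packing can no longer fail. This is a simplified, hypothetical stand-in for the PR's StandardInstruction (Delay is omitted), and the helper names barrier and pack_value are invented for the example.

```rust
/// Simplified stand-in for the PR's `StandardInstruction`; the barrier
/// width is a `u32` so an unpackable value cannot be represented.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum StandardInstruction {
    Barrier(u32),
    Measure,
    Reset,
}

/// Builds a barrier, rejecting widths that would not fit in the payload.
pub fn barrier(num_qubits: usize) -> Result<StandardInstruction, std::num::TryFromIntError> {
    u32::try_from(num_qubits).map(StandardInstruction::Barrier)
}

/// Packing is infallible: the payload already fits in 32 bits.
pub fn pack_value(instruction: StandardInstruction) -> u32 {
    match instruction {
        StandardInstruction::Barrier(num_qubits) => num_qubits,
        StandardInstruction::Measure | StandardInstruction::Reset => 0,
    }
}
```

With this shape the expect in the packing code disappears, and the failure moves to the boundary where a usize first enters the type.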

@kevinhartman
Contributor Author

Apologies for the delay in responding to your initial comments, @jakelishman! I did read them, and they were ultimately what led me to rewrite the initial draft using bitfields. I meant to reply to your comments at the same time I pushed up these changes, but didn't have enough time yesterday.

Regarding the initial draft:

The shifts of the payload of instructions can be set such that a u32 payload is always in the high bits, with the padding bits between it and the rest of the discriminants if you're concerned about aligned access, because then the bitshifts will be compiled out into a single register load (just read the u32 from the right place) and the compiler will handle endianness for us.

This is exactly what I was concerned about, though I was also trying to accomplish it in a way that the alignment of PackedOperation could be 4 on 32-bit systems, which is partly why I'd ended up with LoHi. In the current approach with bitfields, I've forgone this. The alignment of PackedOperation is now always 8 bytes, even on 32-bit.

Introducing LoHi doesn't remove the need to bitshift and mask for most items, it just means that some of my above list are done one way, some are done another, and the union means that the pointer now has more ways to access it. LoHi also restricts the payload size to u32, when we already have payloads that exceed that (the pointers).

This was true, and ultimately indicative of me trying to jam everything into a single instruction encoding / layout.

I liked the PayloadLocation example you provided, and I actually spent some time locally building it out. I couldn't shake the feeling though that we were implementing a poor-man's bitfield library (and, one without static layout validation at that), so I started poking around and found the bitfield-struct crate. It's really excellent. The author has also already merged a PR I made to optionally disable the derived Clone implementation on its bitfields.

Regarding the current version:

My biggest worry about bitfield-struct as written here is about accessing the PackedOperation discriminant - getting it from the StandardGateBits representation is, I believe, UB. We can swap that to the pointer form, but the fact that it's so easy to produce UB like that makes me nervous.

This I'm not convinced of. Type-punning is allowed in Rust and is not UB, as long as the access doesn't produce an invalid value, if I understand correctly. It's also of course important that we're not reading uninitialized memory, which we shouldn't be here.

From https://doc.rust-lang.org/reference/items/unions.html#reading-and-writing-union-fields

Unions have no notion of an “active field”. Instead, every union access just interprets the storage as the type of the field used for the access. Reading a union field reads the bits of the union at the field’s type. [..] It is the programmer’s responsibility to make sure that the data is valid at the field’s type. Failing to do so results in undefined behavior.

If you're worried about the representations of the fields comprising the union, they are all #[repr(transparent)] structs around a u64. So no matter which field we access, the data gets interpreted as a u64, which is valid. And, the ::new generated by bitfield-struct starts off by setting the inner u64 to 0, so there shouldn't be any case where we are interpreting uninitialized data. The discriminant is located (and statically asserted to be) in the same place for all bitfields, so that should be safe too.

My other hesitation is that generally these aren't entirely bitfields, because of the overlap. All the extra code needed to support the masking out of the low bits of the pointer feels more complex to me than it was before, and I'm not convinced that bitfield_struct is offering us enough in return. If anything, it feels like we might have spread the dangerous assumptions further around across more types and more (type-level) indirection than there was before.

In the 64-bit pointer case, I was also thinking about it as "overlapping" a pointer with a discriminant originally, and that was the reason I hadn't thought to use bitfields. But really, there's no overlapping here. We're simply storing the upper 61 bits (the interesting ones!) of an address inside of a 61 bit field. I'd argue that the code is less complex in terms of bit manipulation: there's no explicit masking at all, we just shift an address down by 3 and store it.
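
A minimal sketch of the "shift the address down by 3 and store it" view, written with plain integer operations and no bitfield crate. The function names are hypothetical and the sketch assumes an 8-byte-aligned address, so the low three bits are zero before packing.

```rust
const DISCRIMINANT_BITS: u32 = 3;
const DISCRIMINANT_MASK: u64 = (1 << DISCRIMINANT_BITS) - 1;

/// Stores the upper 61 bits of an 8-byte-aligned address above a
/// 3-bit discriminant.
pub fn pack_address(address: usize, discriminant: u8) -> u64 {
    let address = address as u64;
    assert_eq!(address & DISCRIMINANT_MASK, 0, "address must be 8-byte aligned");
    assert!(u64::from(discriminant) <= DISCRIMINANT_MASK);
    // Shifting down drops the always-zero bits (the 61-bit field value);
    // shifting back up places that field above the discriminant.
    let field = address >> DISCRIMINANT_BITS;
    (field << DISCRIMINANT_BITS) | u64::from(discriminant)
}

/// Recovers the original address by clearing the discriminant bits.
pub fn unpack_address(bits: u64) -> usize {
    (bits & !DISCRIMINANT_MASK) as usize
}
```

For an aligned address the packed word is bit-for-bit equal to the address OR-ed with the discriminant, so the "61-bit field" description and the "mask out the low bits" description refer to the same encoding.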

I worry that it's using a pneumatic drill to solve a hammer-and-nail problem here since there really are not many fields at all in our structs, and these structs should not need much manipulation other than construction. My perception of the complexity of manual bitshift-and-mask operations compared to this might be different to other people's.

The original approach was certainly the right one at the time. In earlier commits, I explored extending it for StandardInstruction, but it felt very easy to read/write the wrong part of the field or make the wrong cast with the added complexity. The bitfield approach validates at compile-time that fields aren't overlapping and lets us model PackedOperation's layouts declaratively. I think this is actually quite elegant, overall.

@jakelishman
Member

jakelishman commented Dec 18, 2024

For the UB: bytemuck made it near impossible to get this wrong. This implementation with bitfield_struct may be right, because we don't call the problematic getter, and the value isn't constructed til then, but it's nowhere near as obvious to me with the new form, and it feels a lot more dangerous to even access an incorrect union type - much easier to change the code to get it wrong when you've already "constructed" the wrapper type that would allow the invalid access, as opposed to still explicitly treating it as "this is a bag of bits".

I was concerned about the UB because I know (as your quote says) that even constructing an enum value with an invalid discriminant is UB, even if you never read the value. Just constructing it in memory is UB. I guess that probably doesn't happen here, but to know that, you need to really think about how the bitfield_struct crate works, which is more mental load - from the struct definition, you're defining a struct that looks like it's automatically constructed, so you have to fight standard Rust intuition to realise that it's not UB (which obviously I didn't get right the first time round).
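
The distinction being drawn here can be shown with a hand-written sketch. It is not how bitfield-struct generates its accessors, which I have not reproduced; all types below (Gate, GateBits, PointerBits, BitField) are hypothetical. The union members are transparent wrappers around u64, so reading either member is always valid, and the enum is only ever produced through a checked conversion.

```rust
/// Hypothetical three-variant stand-in for a gate enum.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
pub enum Gate {
    H = 0,
    X = 1,
    Z = 2,
}

impl Gate {
    /// Checked conversion: never materialises an invalid `Gate`.
    pub fn from_bits(bits: u8) -> Option<Self> {
        match bits {
            0 => Some(Gate::H),
            1 => Some(Gate::X),
            2 => Some(Gate::Z),
            _ => None,
        }
    }
}

#[derive(Clone, Copy)]
#[repr(transparent)]
pub struct GateBits(u64);

#[derive(Clone, Copy)]
#[repr(transparent)]
pub struct PointerBits(u64);

pub union BitField {
    pub gate: GateBits,
    pub pointer: PointerBits,
}

impl GateBits {
    pub fn discriminant(self) -> u8 {
        (self.0 & 0b111) as u8
    }

    /// The enum is only constructed here, and only after validation.
    pub fn gate(self) -> Option<Gate> {
        Gate::from_bits((self.0 >> 3) as u8)
    }
}

impl BitField {
    pub fn from_raw(bits: u64) -> Self {
        BitField { pointer: PointerBits(bits) }
    }

    pub fn discriminant(&self) -> u8 {
        // SAFETY: both members are transparent wrappers around `u64` and the
        // union is fully initialised, so either member is valid to read.
        unsafe { self.gate.discriminant() }
    }

    pub fn gate(&self) -> Option<Gate> {
        // SAFETY: as above; the bits are validated before a `Gate` exists.
        unsafe { self.gate.gate() }
    }
}
```

A pointer-like bit pattern read through the gate member yields None instead of an invalid enum value, which is the property that has to hold for the union access to be sound.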


Re the pointer overlap: a pointer is 64 bits, not 61 bits, so what the bitfield is storing isn't a pointer, it's as you say, the informative bits of the pointer - the true value of the actual pointer really is overlapped with the discriminant. Strictly, x86_64 pointers are only 48 bits wide and required to be all 0s in the top 16 bits, so we might even want to use those bits too in the future, and they'd be overlapping. Then it'd really be clearer as a single mask, not a bitfield.


The bitfield approach validates at compile-time that fields aren't overlapping and lets us model PackedOperation's layouts declaratively. I think this is actually quite elegant, overall

I mentioned before that imo, the fields do overlap for pointers, and we're fighting bitfield_struct in those cases because it's trying to assert that nothing actually overlaps. About elegance: I do appreciate that you can think about it as an extension of bitfields, but my reading of the resulting code here is that there's more boilerplate accessors/whatever than there was before, and there's a lot more complexity in the type system which is requiring more assumptions to be spread between different objects in the code. To me, the reasoning about this new system is less local (it's the subject of a few of my comments).

I can totally believe that the bitfield_struct crate is really neat. My problem is that the pointer object just isn't a true bitfield - it's a really simple "mask out the low bits" trick, and we're adding a lot of complexity around that to use an abstraction that doesn't fit quite right. It might be super obvious to you, having played around with the bitfield_struct crate a load, but it adds more complexity for me reading the code, and a whole other library that I need to know the intricacies of.

Fwiw, my preferred form probably isn't the PayloadLocation encapsulation I attempted above. As a rough sketch, I was thinking more like: define an unsafe sealed trait that StandardGate and StandardInstruction implement, which supertraits Copy + !Drop, and has methods that encode them into a u64 and back again, that's defined on them (so the reasoning about the data representation is local to those types, including the choices of how to encode themselves), and has a safety comment that they must carry no information in the low three bits. The Copy + !Drop requirement is what PackedOperation really requires for those right now (though tbh there's probably ways to avoid that), and other than that, the "low 3 bits will be overwritten" is the only thing that the type itself needs to know to implement things. In that form, there's no shift operations needed within PackedOperation itself, so little to get wrong - PackedOperation does

struct PackedOperation(u64);

impl PackedOperation {
    fn do_something(&self) {
        let discriminant = self.0 & Self::DISCRIMINANT_MASK;
        let payload = self.0 & !Self::DISCRIMINANT_MASK;
        match discriminant {
            PackedOperationType::StandardGate => <StandardGate as Packable>::unpack(payload),
            PackedOperationType::StandardInstruction => <StandardInstruction as Packable>::unpack(payload),
            PackedOperationType::PyGate => ...,
        }
    }
}

and all the other bit-twiddling / whatever is local to the types themselves. In cases where a bitfield representation is appropriate (certainly I could be convinced by StandardInstruction in that case), then we could do that. I don't think the current implementations of the pointer-like objects or StandardGate are served by the bitfields, though.
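
A compilable sketch of that trait-based shape, under several assumptions: the trait is not sealed here, the StandardInstruction enum is a reduced hypothetical version, and the tag values and discriminant constant are made up. Negative bounds such as !Drop cannot be written in stable Rust, but a Copy bound already rules out a Drop implementation, so the sketch uses Copy alone.

```rust
const DISCRIMINANT_MASK: u64 = 0b111;
// Hypothetical discriminant value for standard instructions.
const STANDARD_INSTRUCTION_TAG: u64 = 0b001;

/// # Safety
/// `pack` must return a value whose low three bits are zero, and
/// `unpack(pack(x))` must return `x`.
pub unsafe trait Packable: Copy {
    fn pack(self) -> u64;
    fn unpack(payload: u64) -> Self;
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum StandardInstruction {
    Barrier(u32),
    Measure,
    Reset,
}

// SAFETY: the tag lives in bits 3..11 and the barrier width in bits
// 32..64, so the low three bits of the packed value are always zero.
unsafe impl Packable for StandardInstruction {
    fn pack(self) -> u64 {
        match self {
            // Tag 0 in bits 3..11, width in the high half.
            StandardInstruction::Barrier(num_qubits) => u64::from(num_qubits) << 32,
            StandardInstruction::Measure => 1 << 3,
            StandardInstruction::Reset => 2 << 3,
        }
    }

    fn unpack(payload: u64) -> Self {
        match (payload >> 3) & 0xff {
            0 => StandardInstruction::Barrier((payload >> 32) as u32),
            1 => StandardInstruction::Measure,
            2 => StandardInstruction::Reset,
            tag => panic!("invalid standard instruction tag {}", tag),
        }
    }
}

#[derive(Clone, Copy)]
pub struct PackedOperation(u64);

impl PackedOperation {
    pub fn from_instruction(instruction: StandardInstruction) -> Self {
        let payload = instruction.pack();
        debug_assert_eq!(payload & DISCRIMINANT_MASK, 0);
        Self(payload | STANDARD_INSTRUCTION_TAG)
    }

    /// Returns the instruction if this operation holds one.
    pub fn instruction(&self) -> Option<StandardInstruction> {
        let discriminant = self.0 & DISCRIMINANT_MASK;
        let payload = self.0 & !DISCRIMINANT_MASK;
        match discriminant {
            STANDARD_INSTRUCTION_TAG => Some(StandardInstruction::unpack(payload)),
            _ => None,
        }
    }
}
```

PackedOperation only masks the discriminant in and out; every decision about where the payload bits live stays inside the Packable implementation for the type being stored.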
