Idea: guaranteed-zero padding for niches #70230
Comments
LLVM has no concept of "padding" outside of "first-class aggregates", which we only use for multiple (well, 2) return values (via As long as a type is EDIT: oh and there's passing values around in calls - that's also not LLVM, it's us, we'd have to make sure we don't apply architecture-specific call ABI rules (e.g. |
I don't know what you mean. When I define a LLVM is permitted to turn a copy of struct-type into a field-by-field copy, and AFAIK we had examples where it did that? This is the kind of transformation that becomes illegal with "zero-padding" (unless we somehow already know that the target memory is zero, e.g. because it is already valid at the given type). |
What would be the overhead of actually zeroing the padding? Especially for cg_clif, which performs barely any memory optimizations, I imagine the overhead will be significant. The only memory optimization cg_clif performs yields a ~15% improvement. However, that optimization is currently so dumb that I think zeroing padding will prevent it from applying. For cg_llvm it will just mean more IR for the optimizations to churn through (a compile-time slowdown), or a runtime slowdown without optimizations. |
Assuming the backend does naive copying (i.e., it always copies the entire struct and doesn't try to be clever by only copying the fields and ignoring padding), then the only overhead is in the constructor: |
Yes
That is exactly what I am afraid of: a >5% overhead on compile time and, when not using optimizations, on runtime. Especially for enums it may result in a lot more stores being generated. |
Also for the mir returned by |
I was thinking of struct padding only so far, where it should be pretty trivial to write some zeroes between the fields. But I also don't see the issue with enums. In a constructor, the enum variant is statically known. So you just look at the fields of the initial variant, and zero the gaps between them.
That sounds like a wild guess to me -- do you have any foundation for it being 5% rather than, say, 0.5%? I'd expect the impact to be tiny, since most structs will have way more data bytes than padding bytes and most code is not running constructors. |
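The "zero the gaps between the fields" step can be sketched outside the compiler. This is only an illustration, not how rustc or any backend implements it: `padding_ranges` and `zero_padding` are hypothetical helpers that take each field's (offset, size) in bytes, where a real implementation would read those from the computed layout.

```rust
use std::ops::Range;

/// Hypothetical helper: given each field's (offset, size) in bytes and the
/// total size of the type, return the byte ranges not covered by any field.
fn padding_ranges(fields: &[(usize, usize)], total_size: usize) -> Vec<Range<usize>> {
    // Visit fields in increasing offset order, whatever order they were given in.
    let mut sorted: Vec<(usize, usize)> = fields.to_vec();
    sorted.sort_by_key(|&(offset, _)| offset);

    let mut gaps = Vec::new();
    let mut cursor = 0; // first byte not yet known to be covered by a field
    for (offset, size) in sorted {
        if offset > cursor {
            gaps.push(cursor..offset); // gap before this field
        }
        cursor = cursor.max(offset + size);
    }
    if cursor < total_size {
        gaps.push(cursor..total_size); // trailing padding
    }
    gaps
}

/// Hypothetical helper: zero every padding range in a raw byte image of a value.
fn zero_padding(bytes: &mut [u8], fields: &[(usize, usize)]) {
    for gap in padding_ranges(fields, bytes.len()) {
        bytes[gap].fill(0);
    }
}
```

For a `#[repr(C)]` struct with an `i16`, an `i32` and an `i64`, the fields would be passed as `[(0, 2), (4, 4), (8, 8)]` with a total size of 16, and the only gap is bytes 2..4. For an enum constructor the same computation would run over the fields of the statically known variant.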
I guess zeroing during the codegen of
Not much, but given the huge speed win of the very very dumb memory optimization I implemented (bail out on address taken, more than one store possibly preceding a load, ...) I expect the cost to be relatively high. I can try to benchmark it for cg_clif though. |
Another potential optimization: if reading padding is undefined, the compiler may be able to vectorize writes even when the vectorized write would overwrite the padding area. If padding is defined to be zero, that optimization may be less feasible. I'm not sure whether this actually happens in practice, though. |
Only if you use those types as "first class aggregates", as in, by-value. When used for field access only, like we do, it's just a silly way to specify a field offset, nothing more.
Also, if you look at this example's LLVM IR, this is how we lower non-
#[repr(C)] // To avoid field reorder.
pub struct S(i16, i32, i64);
%S = type { [0 x i16], i16, [1 x i16], i32, [0 x i32], i64, [0 x i64] }
Note how we've made the padding fields explicit, and fill them in even when LLVM would compute the right offsets without them anyway? Well, it's because LLVM has no way to specify a custom alignment. And without tracking "alignment from LLVM leaves" for every layout, we can't easily know in the general case whether LLVM will compute the correct offset or not, so we always force it with explicit padding. Anyway, all that "padding" is just there to make LLVM compute the right offset numbers. We should seriously consider just using one zero-length array to force an alignment and one byte array for offsets.
@RalfJung I suppose the only way LLVM knows something like If they were, it wouldn't matter how you expressed the offsets, and unless used by value (which you shouldn't do because support is dismal) LLVM struct (tuple) "types" are a red herring. |
Oh I see, so LLVM itself couldn't even do the "replace copy of struct by copy of fields skipping padding" transformation, because it does not know which fields are padding. Nothing would have to change here for guaranteed-zero padding. Btw, with nightly features we can already "manually" zero-pad a
#![feature(rustc_attrs)]
#[rustc_layout_scalar_valid_range_start(0)]
#[rustc_layout_scalar_valid_range_end(0)]
struct ZeroPad(u8);
impl ZeroPad {
const ZERO: ZeroPad = unsafe { ZeroPad(0) };
}
struct Test(u8, ZeroPad, u16);
fn main() {
println!("{}", std::mem::size_of::<Option<Test>>());
} |
@RalfJung We can get even simpler, just use an enum!
#[repr(u8)]
enum ZeroPad {
Zero = 0
} |
@KrishnaSannasi indeed, that works just as well. Nice catch :) |
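The enum trick also works on stable Rust, so its effect on layout can be checked directly. The sketch below reuses `ZeroPad` and `Test` from the comments above; `Plain` and `sizes` are hypothetical names added for comparison. The sizes it reports are what current rustc is expected to choose and are not a language guarantee.

```rust
use std::mem::size_of;

/// One-variant enum: the only valid bit pattern is 0, so every other byte
/// value is available to the compiler as a niche.
#[allow(dead_code)]
#[repr(u8)]
enum ZeroPad {
    Zero = 0,
}

/// `Test` from the thread: an explicit `ZeroPad` field sits where the
/// padding byte would otherwise be.
#[allow(dead_code)]
struct Test(u8, ZeroPad, u16);

/// Hypothetical comparison type: the same data with ordinary padding.
#[allow(dead_code)]
struct Plain(u8, u16);

/// Returns the sizes of `Test`, `Option<Test>`, `Plain` and `Option<Plain>`.
fn sizes() -> [usize; 4] {
    [
        size_of::<Test>(),
        size_of::<Option<Test>>(),
        size_of::<Plain>(),
        size_of::<Option<Plain>>(),
    ]
}
```

With the explicit field, `Option<Test>` should be no larger than `Test`, because `None` can be stored as a nonzero value in the `ZeroPad` byte. `Option<Plain>` needs a separate tag, since ordinary padding is not usable as a niche.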
I am currently trying to implement zero padding on |
It should be easier for structs though, right? As mentioned before, I mostly had structs in mind, where padding is much simpler than for enums. It may make more sense to say that |
The enum variant is as hard as the struct variant if you take the route of zeroing during SetDiscriminant. I just used layout.for_variant(). |
Not struct variants. Structs. As in |
Yes, I meant structs. "struct variant" as in the variant of zeroing codegen for structs. I can understand the confusion though. By the way, turns out I was zeroing scalars as they get FieldPlacement::Arbitrary without fields. |
But there is no |
Also, for your |
I thought there was a |
I don't think this is a task for codegen. MIR has to make it clear when zeroing is necessary. That is the case when |
Part of the plan to get rid of the You could make |
The cost of zeroing the padding seems to be less than I expected. Both during
Zeroing code for cg_clif
let variant_layout = place.layout().for_variant(fx, variant_idx); // Remove `for_variant` when zeroing during the prologue
let mut padding_ranges = vec![];
match &variant_layout.fields {
layout::FieldPlacement::Union(field_count) => {
padding_ranges.push(
(0..*field_count).map(|field| variant_layout.field(fx, field).size).max().unwrap_or(Size::ZERO)
..
place.layout().size,
);
}
layout::FieldPlacement::Array { stride, count } => {
for field in 0..*count as usize {
padding_ranges.push(
variant_layout.fields.offset(field) + variant_layout.field(fx, field).size
..
variant_layout.fields.offset(field) + *stride
);
}
}
layout::FieldPlacement::Arbitrary { .. } => {
let mut last_field = None;
for field in variant_layout.fields.index_by_increasing_offset() {
if let Some(last_field) = last_field {
padding_ranges.push(
variant_layout.fields.offset(last_field) + variant_layout.field(fx, last_field).size
..
variant_layout.fields.offset(field)
);
}
last_field = Some(field);
}
if let Some(last_field) = last_field {
padding_ranges.push(
variant_layout.fields.offset(last_field) + variant_layout.field(fx, last_field).size
..
variant_layout.size
);
} else {
//padding_ranges.push(Size::ZERO..variant_layout.size);
}
}
}
if let ty::Adt(adt_def, _) = place.layout().ty.kind {
//if adt_def.is_struct() {
//} else {
// padding_ranges.clear();
//}
} else {
padding_ranges.clear();
}
for padding_range in padding_ranges {
let addr = place.to_ptr(fx).offset_i64(fx, i64::try_from(padding_range.start.bytes()).unwrap()).get_addr(fx);
let size = padding_range.end.bytes() - padding_range.start.bytes();
fx.bcx.emit_small_memset(
fx.module.target_config(),
addr,
0,
size,
std::cmp::min(/*greatest_divisible_power_of_two*/(size as i64 & -(size as i64)) as u64, 8u64) as u8, /*FIXME*/
);
} |
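The last argument passed to `emit_small_memset` in the snippet above is derived from the size of the padding range: `size & -size` isolates the lowest set bit, which is the greatest power of two dividing `size`, and the result is capped at 8. A standalone sketch of that arithmetic follows; `memset_align` is a hypothetical name, and using the size as a stand-in for the real alignment is the part marked FIXME in the original.

```rust
/// Hypothetical helper: greatest power of two dividing `size`, capped at 8.
/// Returns 0 for a zero-sized range, which has no set bit to isolate.
fn memset_align(size: u64) -> u8 {
    // `x & -x` in two's complement keeps only the lowest set bit of `x`.
    let lowest_set_bit = size & size.wrapping_neg();
    lowest_set_bit.min(8) as u8
}
```

A 6-byte gap gives 2, a 3-byte gap gives 1, and anything divisible by 8 gives 8.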
Full benchmark details:
|
Possibly relevant prior art, approaching guarantees of zero-initialized padding from a security perspective, and avoiding various accidents that might occur due to zero-initialized padding: |
Could it be
|
What about edit: @comex also mentioned this in rust-lang/unsafe-code-guidelines#174 (comment). |
This assumption would not be true for Something has to give, we won't get more niches for free. |
I see, I guess that's what you meant here:
So |
Small nit: But also I would expect something like |
I totally meant this syntax as a strawman. :) |
As suggested here, one option to gain more layout optimizations would be (either for all repr(Rust) types or with per-type opt-in) to use a different kind of padding.
Right now, padding bytes may contain arbitrary data, and can be omitted when copying as they are "unstable" in a typed copy. Instead, we could also imagine a kind of padding ("zero-padding") that is guaranteed to be zero. This has the disadvantage that padding bytes need to be initialized on struct creation and copied when the struct is copied, but it has the advantage that padding bytes can become "niches" for layout optimizations.
LLVM's padding is of the "unstable" kind, but we could just add explicit fields ourselves (fields that are guaranteed to be zero) when translating zero-padded types to LLVM.
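To make the "padding as niche" idea concrete, here is a hand-rolled sketch of what the compiler would do automatically. It is an illustration under assumptions, not a proposed implementation: the 4-byte image, the position of the padding byte, and the `encode`/`decode` helpers are all hypothetical.

```rust
/// Byte image of a hypothetical struct holding a `u8` and a `u16`:
/// byte 0 is the `u8`, byte 1 is padding, bytes 2..4 are the `u16`
/// (stored little-endian in this sketch).
type Image = [u8; 4];

/// Index of the padding byte in the image.
const PAD: usize = 1;

/// Store an `Option` in the same 4 bytes as the payload alone.
fn encode(value: Option<(u8, u16)>) -> Image {
    match value {
        Some((a, b)) => {
            let [lo, hi] = b.to_le_bytes();
            [a, 0, lo, hi] // padding is written as zero on construction
        }
        None => [0, 1, 0, 0], // a nonzero padding byte marks `None`
    }
}

/// Read the `Option` back; this relies on `Some` always having zero padding.
fn decode(image: Image) -> Option<(u8, u16)> {
    if image[PAD] == 0 {
        Some((image[0], u16::from_le_bytes([image[2], image[3]])))
    } else {
        None
    }
}
```

The decoder is only correct if every copy of a `Some` value preserves the zero padding byte, which is exactly the cost described above: padding has to be initialized on creation and carried along on copies.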
Cc @eddyb