Make Rc and Arc mem::forget safe #27174
Conversation
(rust_highfive has picked a reviewer for you, use r? to override)
Is it worth evaluating the performance impact and only doing these checks on 32-bit?
I'm also not clear on the impact on 64-bit platforms; the original report says it's a problem on both.
Can you add test cases to prove that these do what they say?
You can't do this only on 32-bit because LLVM will optimize mem::forget loops into a single add.
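To make the scenario concrete, here is a minimal sketch (my own illustration, not code from the PR) of the kind of loop being described; without an overflow check, LLVM is free to turn the repeated increments into one large addition to the strong count:

```rust
use std::mem;
use std::rc::Rc;

// Hypothetical illustration: each iteration clones the Rc (bumping the strong count)
// and then mem::forgets the clone, so the count only ever goes up. A compiler that
// collapses the loop body into `count += N` can overflow the count almost instantly,
// which is why the check can't simply be skipped on 64-bit.
fn overflow_strong_count(rc: Rc<u32>) {
    loop {
        mem::forget(rc.clone());
    }
}
```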
I was going to say that it seems weird to have a test that counts to usize::MAX (so long), but maybe LLVM can take us straight to the hoop.
I can see how the previous "bump refcount" code might optimize to this, but with the overflow check in place I'd expect that optimization to be blocked.
I don't understand what you're proposing. That this change prevents llvm from doing that optimization?
Oh, right! Duh. May not optimize to one addition on the checked version.
Yes I'll do some tests. That said, I'm not comfortable with leaving memory safety holes in Rust on the premise that they're impractical (edit: particularly if they rely on a "sufficiently stupid compiler" argument) and that preventing them has overhead. I would be interested in including a test for this.
Arc does indeed fail to be optimized to a single add today. With some back of the envelope math, that means that on 64-bit with a single thread just incrementing (probably honestly the best case -- the contention from multiple threads incrementing is probably worse) on a 4GHz processor it would take 150 years to overflow. Assuming a sufficiently stupid compiler.
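For reference, the back-of-the-envelope math restated (my own arithmetic, assuming one increment per clock cycle at 4 GHz, which is roughly the best case named above):

```rust
// Rough restatement of the estimate above; the numbers are illustrative assumptions.
fn main() {
    let increments = u64::MAX as f64;      // ~1.8e19 increments needed to overflow a 64-bit count
    let per_second = 4.0e9;                // 4 GHz, one increment per cycle
    let seconds = increments / per_second; // ~4.6e9 seconds
    let years = seconds / (365.25 * 24.0 * 3600.0);
    println!("~{years:.0} years to overflow"); // prints roughly 146 years
}
```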
However, that said, do we want to include a test for this if compiler optimizations shift and platform-specific details mean this could become a 150-year test?
I'd like to see the performance measured, but I'm not sure what's a reasonable benchmark. Perhaps something in servo that uses Rc heavily. (That said, I'm not sure I find the arguments about "sufficiently smart compilers" all that persuasive: it assumes a mem::forget in a loop, right? That doesn't seem like a very realistic scenario, as compared to say some kind of bug that causes a leak for every web connection. But I don't know, it could happen.)
@pcwalton has suggested we test the perf impact by pointing Servo at Wikipedia and profiling.
The results on Servo (Wikipedia home): https://gist.github.com/Gankro/bed5fee96667c25a4f1c. Talking on #servo this seemed to be "in the noise". Micro benchmarks from the original thread by niko: https://gist.github.com/nikomatsakis/8ad11f16fd2d24b51b25. I can run any benchmarks people want to see, but I must admit I am not the best at creating "good" performance evaluations.
CC @larsberg |
Thanks for doing the analysis @gankro! I think this is fine to merge, but I'd like to briefly discuss it in triage next week.
I think you've got the wrong lars berg, maybe you want @larsbergstrom?
Oops, sorry @larsberg!
As discussed in the internals thread, the saturating solution means that wider-than-address-space stores that accept arbitrary types cannot offer a safe API. Is this really a good trade-off? For example, I could imagine a vector that transparently compresses its contents and can therefore hold more elements than the address space has bytes. If we don't create and document a rule forbidding such wider-than-address-space stores in safe Rust, the assumption that a saturated refcount can never again reach zero is invalid, and the saturating approach is unsound.
@dgrunwald I don't really see why anyone would assume they could do such a thing.
Why not? If I have ownership of a value and can move it, why shouldn't I be able to move its bits to disk, and back later?
Is this something many (any) languages support at all? Is this something anyone actually does in practice? It sounds like a big headache for not much gain. Note that I don't believe this prevents you from mem-mapping files and doing magic compression with a custom FS or whatever. You just have to have everything uniquely addressable by a pointer, which is 2^64 bytes (18 exabytes) on modern platforms!
Do you have a citation on this? Has anybody ever written such a container? I'm really wary of introducing an extra test-and-branch for every single Rc manipulation for a theoretical use case. My inclination is to tell anybody who wants to write such a container to just use trait bounds to whitelist the kinds of values that can be stored within it.
I don't think the exact example of a zipping vector exists anywhere, but that's just one example from a large number of possible larger-than-memory stores (a more likely use case would be a large memory-mapped file where only sections of it are mapped into memory). I have actually written a generic vector with run-length compression in C#. Edit: Also, I don't think test-and-predictable-branch is going to be any more expensive than test-and-conditional-move (which is what a saturating add compiles to on x64).
Discussing it today, while we concluded that no one cares about the external memory thing, we do think it makes sense to be able to identify two values as identical and compress them by storing them once while maintaining a count of how many times they're stored. As such I will update this to instead abort -- if you get to this point your code is tragically broken anyway. We can later decide to instead permanently saturate (adding branches to Drop) if this proves to be onerous. In practice this is all irrelevant since triggering this code will basically never happen: if you have a legit leak you'll OOM first. You need to be calling mem::forget.
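For readers following along, the abort strategy amounts to something like the following sketch (hypothetical code, not the actual std implementation; the function name and use of Cell are illustrative):

```rust
use std::cell::Cell;
use std::process;

// Hypothetical checked increment for an Rc-style strong count: if the count would wrap
// past usize::MAX, abort the process instead of letting it wrap to zero, since a count
// that reaches zero while clones still exist would allow a use-after-free.
fn inc_strong(count: &Cell<usize>) {
    let new = count.get().wrapping_add(1);
    if new == 0 {
        process::abort();
    }
    count.set(new);
}
```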
We talked about this at the libs triage meeting today, and the consensus was that we should merge with the abort strategy (as explained by @gankro).
Sounds good
@bors r=alexcrichton
📌 Commit 8fda0b0 has been approved by alexcrichton
(force-pushed from 1d7c787 to c9effc8)
@bors r=alexcrichton
📌 Commit 22e2100 has been approved by alexcrichton
See https://internals.rust-lang.org/t/rc-is-unsafe-mostly-on-32-bit-targets-due-to-overflow/2120 for detailed discussion of this problem.
@gankro Any plans to refine this to (safely) avoid the check on 64-bit ABIs?
@comex there is no safe way to avoid the check on Rc, llvm is too smart -- it's possible to ditch it on Arc if anyone can provide me concrete docs on the C11 atomics saying that the compiler isn't allowed to combine several fetch_adds into one.
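To spell out the concern (my own illustration, not from the thread): the question is whether a compiler may fuse adjacent relaxed increments, for example:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// If the compiler may rewrite these two relaxed fetch_adds as one fetch_add(2, Relaxed),
// then by extension a mem::forget loop could be fused into a single huge addition,
// letting an Arc count overflow without actually executing usize::MAX iterations.
fn bump_twice(count: &AtomicUsize) {
    count.fetch_add(1, Ordering::Relaxed);
    count.fetch_add(1, Ordering::Relaxed);
}
```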
I believe it's possible to do it with the same trick as test::black_box (an empty inline asm block).
@comex I might be happy with this as long as we had a test that verifies that llvm never starts optimizing away empty asm... the black-box thing is a tenuous hack to my knowledge.
@comex it sounds like that might cause more damage than it does good (preventing optimizations).
@bluss It shouldn't; as long as the function containing asm is inlined, the only constraint given to LLVM is that the pointer should be in a register, which it's almost certain to be anyway. Without any other extended effects (e.g. "memory"), clobber registers, or outputs, it should be able to safely assume that nothing other than that register is changed.
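For reference, a sketch of the empty-asm trick under discussion (hypothetical, written with today's asm! syntax; not something this PR adds):

```rust
use std::arch::asm;
use std::sync::atomic::{AtomicUsize, Ordering};

// The empty asm block takes the counter's address as an input operand, so LLVM must
// assume the value may be observed there and cannot coalesce repeated increments
// across it. With no outputs, clobbers, or "memory" effects, the only constraint is
// that the pointer live in a register, as described in the comment above.
#[inline(always)]
fn inc_strong_opaque(count: &AtomicUsize) {
    count.fetch_add(1, Ordering::Relaxed);
    unsafe {
        asm!("", in(reg) count as *const AtomicUsize, options(nostack, preserves_flags));
    }
}
```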
Incidentally, I recall a discussion way back when about the possibility of optimizing space usage by using smaller (32-bit) reference counts. This is tangentially related to rust-lang/rfcs#1232 in that both changes have the effect of halving the space overhead on 64-bit platforms. It also occurs to me that if we were to consider optimizing this overhead important enough to pursue, it would interact with the overflow check discussed here. (My personal sense, as I also commented on the other thread, is that these would all be premature optimizations until and unless there were empirical evidence showing concrete gains to be had. But the possibility is there.)
@glaebhoerl I agree with your analysis -- YAGNI