
Make Rc and Arc mem::forget safe #27174

Merged
merged 2 commits into from
Jul 30, 2015
Conversation

Gankra
Contributor

@Gankra Gankra commented Jul 21, 2015

@rust-highfive
Collaborator

r? @nikomatsakis

(rust_highfive has picked a reviewer for you, use r? to override)

@alexcrichton
Member

Is it worth evaluating the performance impact and only doing these checks on 32-bit?

@alexcrichton alexcrichton added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Jul 21, 2015
@brson
Contributor

brson commented Jul 21, 2015

I'm also not clear on the impact on 64-bit platforms; the original report says it's a problem on both.

@brson
Contributor

brson commented Jul 21, 2015

Can you add test cases to prove that these do what they say?

@brson brson added the relnotes Marks issues that should be documented in the release notes of the next release. label Jul 21, 2015
@Gankra
Contributor Author

Gankra commented Jul 21, 2015

You can't do this only on 32-bit because LLVM will optimize mem::forget loops into a single add.
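The attack under discussion is just `mem::forget` on clones in a loop; a minimal sketch (using the stable `Rc::strong_count` accessor) of how each forgotten clone permanently leaks a strong reference:

```rust
use std::mem;
use std::rc::Rc;

fn main() {
    let rc = Rc::new(0u8);
    // Each forgotten clone bumps the strong count and never decrements it.
    // With an unchecked `+= 1`, LLVM can collapse this whole loop into a
    // single `count += 1000`, so the same pattern can reach usize::MAX
    // almost instantly at any iteration count.
    for _ in 0..1000 {
        mem::forget(rc.clone());
    }
    assert_eq!(Rc::strong_count(&rc), 1001); // 1 original + 1000 leaked
}
```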

@Gankra
Contributor Author

Gankra commented Jul 21, 2015

I was going to say that it seems weird to have a test that counts to usize::MAX (so long), but maybe LLVM can take us straight to the hoop.

@alexcrichton
Member

You can't do this only on 32-bit because LLVM will optimize mem::forget loops into a single add.

I can see how the previous "bump refcount" code might optimize to this, but with the saturating_add instead of just += 1 this may be enough to throw off LLVM's optimizations (especially in the Arc case).
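The behavioral difference is easy to see in isolation; a small sketch of why `saturating_add` is harder to collapse than `+= 1`:

```rust
fn main() {
    // `+= 1` repeated n times is just `+ n`, which LLVM happily
    // strength-reduces. `saturating_add` pins at the maximum, so the
    // result of n increments is no longer a linear function of the
    // starting count.
    assert_eq!((usize::MAX - 2).saturating_add(1), usize::MAX - 1);
    assert_eq!(usize::MAX.saturating_add(1), usize::MAX);
    assert_eq!(usize::MAX.saturating_add(1).saturating_add(1), usize::MAX);
}
```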

@Gankra
Contributor Author

Gankra commented Jul 21, 2015

I don't understand what you're proposing. That this change prevents llvm
from fast-forwarding the exploit to a practical level, and therefore it's
fine to not make this change on some platforms...?

@alexcrichton
Member

Oh, right! Duh.

It may not optimize to one addition in the Arc case, though?

@Gankra
Contributor Author

Gankra commented Jul 21, 2015

Yes I'll do some tests.

That said, I'm not comfortable with leaving memory safety holes in Rust on the premise that they're impractical (edit: particularly if they rely on a "sufficiently stupid compiler" argument) and that preventing them has overhead.

I would be interested in including a cfg(allow_preposterous_unsafety) for those who Just Don't Care.

@Gankra
Contributor Author

Gankra commented Jul 21, 2015

Arc does indeed fail to be optimized to a single add today. With some back-of-the-envelope math, that means that on 64-bit, with a single thread just incrementing (honestly probably the best case -- the contention from multiple threads incrementing is probably worse) on a 4 GHz processor, it would take 150 years to overflow.

Assuming a sufficiently stupid compiler.
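For reference, the 150-year figure roughly checks out under the stated assumptions (a 64-bit count, one increment per cycle at 4 GHz):

```rust
fn main() {
    let increments = u64::MAX as f64;        // ~1.8e19 increments to overflow
    let per_second = 4.0e9;                  // 4 GHz, one increment per cycle
    let seconds_per_year = 60.0 * 60.0 * 24.0 * 365.0;
    let years = increments / per_second / seconds_per_year;
    assert!(years > 140.0 && years < 150.0); // ~146 years
}
```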

@Gankra
Contributor Author

Gankra commented Jul 21, 2015

That said, do we want to include a test for this if shifting compiler optimizations and platform-specific details mean it could become a 150-year test?

@nikomatsakis
Contributor

I'd like to see the performance measured, but I'm not sure what's a reasonable benchmark. Perhaps something in servo uses Arc? Microbenchmarks certainly showed small (but not negligible) impact. I think all things considered I'm in favor of landing this, I feel like I can imagine this being a problem in some code somewhere, particularly on 32 bit. But it always seems better to go in with data.

(That said, I'm not sure I find the arguments about "sufficiently smart compilers" all that persuasive: it assumes a mem-forget in a loop, right? That doesn't seem like a very realistic scenario, as compared to say some kind of bug that causes a leak for every web connection. But I don't know, it could happen.)

@alexcrichton alexcrichton added the I-needs-decision Issue: In need of a decision. label Jul 21, 2015
@Gankra
Contributor Author

Gankra commented Jul 21, 2015

@pcwalton has suggested we test the perf impact by pointing Servo at Wikipedia and profiling with -p 1.

@Gankra
Contributor Author

Gankra commented Jul 23, 2015

The results on Servo:

Wikipedia home: https://gist.github.com/Gankro/bed5fee96667c25a4f1c
Wikipedia Guardians of the Galaxy: https://gist.github.com/Gankro/61b2141c205b3ceb4d9d

Talking on #servo this seemed to be "in the noise".

Micro-benchmarks from the original thread by niko (checking if x < 0 instead, which presumably shouldn't make a difference?) showed noticeable but fairly low impact (10-20% in the Arc case):

https://gist.github.com/nikomatsakis/8ad11f16fd2d24b51b25

I can run any benchmarks people want to see but I must admit I am not the best at creating "good" performance evaluations.

@Gankra
Contributor Author

Gankra commented Jul 23, 2015

CC @larsberg

@alexcrichton
Member

Thanks for doing the analysis @gankro! I think this is fine to merge, but I'd like to briefly discuss it in triage next week.

@larsberg

I think you've got the wrong lars berg, maybe you want @larsbergstrom?

@Gankra
Contributor Author

Gankra commented Jul 24, 2015

Oops, sorry @larsberg!

@dgrunwald
Contributor

As discussed in the internals thread, the saturating solution means that wider-than-address-space stores accepting arbitrary types cannot offer a safe API. Is this really a good trade-off?

For example, I could imagine a CompressingVec<T> that offers a public API similar to Vec<T> but uses gzip compression on the bytes stored in it. Do we want to forbid people from writing types like CompressingVec<T> and offering a safe API?

If we don't create and document a rule forbidding such wider-than-address-space stores in safe rust, the assumption that a saturated refcount can never again reach zero is invalid, and Rc stays unsafe.

@Gankra
Contributor Author

Gankra commented Jul 26, 2015

@dgrunwald I don't really see why anyone would assume they could do such a thing.

@dgrunwald
Contributor

Why not? If I have ownership of a value and can move it, why shouldn't I be able to move its bits to disk, and back later?
An out-of-memory (on-disk) store seems like a reasonable thing to write in a systems programming language; I don't think it's obvious that we want to forbid those just so that Rc can make an assumption about the number of live values.

@Gankra
Contributor Author

Gankra commented Jul 27, 2015

Is this something many (any) languages support at all? Is this something anyone actually does in practice? It sounds like a big headache for not much gain.

Note that I don't believe this prevents you from mem-mapping files and doing magic compression with a custom FS or whatever. You just have to have everything uniquely addressable by a pointer, which gives you 2^64 bytes (18 exabytes) on modern platforms!

@pcwalton
Contributor

Do you have a citation on this? Has anybody ever written a std::vector that transparently compresses its contents? A quick Google search brings up no such example of C++ code in existence.

I'm really wary of introducing an extra test-and-branch for every single Rc manipulation for a theoretical use case. My inclination is to tell anybody who wants to write such a container to just use trait bounds to whitelist the kinds of values that can be stored within it.

@dgrunwald
Contributor

I don't think the exact example of a zipping vector exists anywhere, but that's just one example from a large number of possible larger-than-memory stores (a more likely use case would be a large memory-mapped file where only sections of it are mapped into memory at a time).

I have actually written a generic vector with run-length compression in C#. That one used object.Equals() and cloned C# references, and would be doable in safe Rust using Eq + Clone -- but in cases without Clone, an equivalent type that works by comparing the underlying bits (and thus doesn't need any bounds) might be nice to have.

Edit: Also, I don't think test-and-predictable-branch is going to be any more expensive than test-and-conditional-move (which is what a saturating add compiles to on x64).

@Gankra
Contributor Author

Gankra commented Jul 29, 2015

Discussing it today, we concluded that no one cares about the external-memory thing, but we do think it makes sense to be able to identify two values as identical and compress them by storing them once and maintaining a count of how many times each is stored. As such, I will update this to abort instead -- if you get to this point your code is tragically broken anyway. We can later decide to permanently saturate instead (adding branches to Drop) if this proves onerous. In practice this is all irrelevant, since triggering this code will basically never happen: if you have a legit leak you'll OOM first. You need to be calling mem::forget.
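A minimal sketch of the abort strategy (not the actual std implementation; `inc_strong` is a hypothetical stand-in for the refcount bump):

```rust
use std::process;

// Bump the strong count, aborting if it would overflow. A live Rc
// guarantees the count is at least 1, so a wrapped result of 0 can
// only mean the count overflowed.
fn inc_strong(count: &mut usize) {
    *count = count.wrapping_add(1);
    if *count == 0 {
        // Aborting is safe-by-construction: a use-after-free from a
        // wrapped count would be far worse than killing the process.
        process::abort();
    }
}

fn main() {
    let mut count = 1usize;
    inc_strong(&mut count);
    assert_eq!(count, 2);
}
```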

@alexcrichton
Member

We talked about this at the libs triage meeting today and the consensus was that we should merge with the abort strategy (as explained by @gankro).

@alexcrichton alexcrichton removed the I-needs-decision Issue: In need of a decision. label Jul 29, 2015
@bluss
Member

bluss commented Jul 29, 2015

Sounds good

@Gankra
Contributor Author

Gankra commented Jul 29, 2015

@bors r=alexcrichton

@bors
Contributor

bors commented Jul 29, 2015

📌 Commit 8fda0b0 has been approved by alexcrichton

@Gankra Gankra force-pushed the rc-sat branch 2 times, most recently from 1d7c787 to c9effc8 Compare July 30, 2015 00:01
@Gankra
Contributor Author

Gankra commented Jul 30, 2015

@bors r=alexcrichton

@bors
Contributor

bors commented Jul 30, 2015

📌 Commit 22e2100 has been approved by alexcrichton

@bors
Contributor

bors commented Jul 30, 2015

⌛ Testing commit 22e2100 with merge db2af71...

@bors bors merged commit 22e2100 into rust-lang:master Jul 30, 2015
@comex
Contributor

comex commented Aug 4, 2015

@gankro Any plans to refine this to (safely) avoid the check on 64-bit ABIs?

@Gankra
Contributor Author

Gankra commented Aug 4, 2015

@comex There is no safe way to avoid the check on Rc; LLVM is too smart. It's possible to ditch it on Arc if anyone can provide me concrete docs on the C11 atomics showing that the compiler isn't allowed to combine several fetch_adds into one.

@comex
Contributor

comex commented Aug 4, 2015

I believe it's possible to do it with the same trick as test::black_box (or an inlined version of that function): an empty asm! block. For instance, if you passed the pointer through there, LLVM would have no way of knowing if multiple increases went to the same Rc object, so it couldn't combine them.
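A sketch of the opacity idea using `std::hint::black_box` (the since-stabilized descendant of `test::black_box`; that it lowers to the same empty-asm trick is an assumption here). Passing the reference through an opaque function means the optimizer can no longer prove consecutive increments hit the same counter, so it cannot fuse them:

```rust
use std::hint::black_box;

fn main() {
    let mut count = 0usize;
    for _ in 0..4 {
        // The optimizer must treat the returned reference as unknown,
        // so each `+= 1` has to be kept as a separate increment.
        let c = black_box(&mut count);
        *c += 1;
    }
    assert_eq!(count, 4);
}
```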

@Gankra
Contributor Author

Gankra commented Aug 4, 2015

@comex I might be happy with this as long as we had a test verifying that LLVM never starts optimizing away empty asm... the black-box thing is a tenuous hack, to my knowledge.

@bluss
Member

bluss commented Aug 4, 2015

@comex it sounds like that might cause more damage than it does good (preventing optimizations).

@comex
Contributor

comex commented Aug 4, 2015

@bluss It shouldn't; as long as the function containing asm is inlined, the only constraint given to LLVM is that the pointer should be in a register, which it's almost certain to be anyway. Without any other extended effects (e.g. "memory"), clobber registers, or outputs, it should be able to safely assume that nothing other than that register is changed.

@glaebhoerl
Contributor

Incidentally, I recall a discussion way back when about the possibility of optimizing space usage by using (u32, u32) instead of (usize, usize) for the reference counts. Back then, this idea was rejected on the grounds that it would be memory-unsafe if the counts were to overflow. Now that it turns out we need to explicitly check for overflow anyway, though, this reasoning no longer applies. The question is now only whether we consider it important to allow a reference-counted box to have over 4 billion references. (It's quite possible that we do; I just wanted to point out the possibility.)

This is tangentially related to rust-lang/rfcs#1232 in that both changes have the effect of halving the space overhead on 64-bit platforms. It also occurs to me that if we were to consider optimizing this overhead important enough to make Rc generic over whether it supports weak references, then perhaps we should also consider making it generic over the type used for the reference counts - so you could, for example, use Rc<SomeTypeWithAlignof1, NoWeak, u8> with an overhead of only a single byte per box if you didn't need weak references and were completely sure that there would never be more than 255 references to a given box at runtime.

(My personal sense, as I also commented on the other thread, is that these would all be premature optimizations until and unless there were empirical evidence showing concrete gains to be had. But the possibility is there.)
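The space point is concrete: two `usize` counts cost 16 bytes of header per box on a 64-bit target, while two `u32` counts would cost 8. A quick check (the tuple layouts here have no padding):

```rust
use std::mem::size_of;

fn main() {
    // Header cost of the (strong, weak) counts under each scheme.
    assert_eq!(size_of::<(usize, usize)>(), 2 * size_of::<usize>());
    assert_eq!(size_of::<(u32, u32)>(), 8);
}
```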

@Gankra
Contributor Author

Gankra commented Aug 8, 2015

@glaebhoerl I agree with your analysis -- YAGNI
