Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

what happens if we exceed isize::MAX on 32-bit platforms? #69

Open
oconnor663 opened this issue Aug 24, 2018 · 10 comments
Open

what happens if we exceed isize::MAX on 32-bit platforms? #69

oconnor663 opened this issue Aug 24, 2018 · 10 comments

Comments

@oconnor663
Copy link

oconnor663 commented Aug 24, 2018

The ptr::offset function, which underlies safe slice indexing, says in its docs:

If any of the following conditions are violated, the result is Undefined Behavior...The computed offset, in bytes, cannot overflow an isize...memory acquired directly from allocators or memory mapped files may be too large to handle with this function.

Since the map function is unsafe, it's arguably fine for it to expose possible UB in this way. But I think most people reading the docs won't have any idea that this is a requirement. Maybe it would be better for Deref to panic rather than to return a slice that's "unsoundly large"? (Edit: Probably just return an error if we try to mmap something larger than isize::MAX?)

@danburkert
Copy link
Owner

@oconnor663 can you be more specific with your concern? Do you have a specific potentially buggy line of code in mind, or perhaps a reproducible example?

@oconnor663
Copy link
Author

I'm worried about what happens in the case where someone on a 32-bit system memory maps a 3 GB file (that is, longer than std::isize::MAX but shorter than std::usize::MAX), and then tries to read one of the bytes near the end. Here's what I believe will happen:

  • memmap checks that the file isn't too long for usize, and it'll assemble a &[u8] slice that's 3 GB long.
  • Next, the caller indexes into that slice to read a byte, somewhere near the end. Maybe just to print it or something.
  • That indexing operation translates into usize's SliceIndex::get_unchecked() implementation, which in turn calls std::ptr::add, with a usize argument that's larger than std::isize::MAX.

According to the ptr::add / ptr::offset docs I linked above, that's all it takes to trigger undefined behavior. And those docs specifically note that memory mapping is one way to get your hands on a slice large enough to hit this.

That said, I have no idea what actually happens in this case. It could be that, even though the usize gets cast into an isize and becomes negative, the result of all the pointer arithmetic still ends up being what the caller thought they were asking for? Maybe everything appears to work? Even if that's the case, I don't think we want to rely on it remaining the case.

My instinct is that MmapOptions::get_len, which is currently making sure the length doesn't overflow usize, should instead be making sure the length doesn't overflow isize. But maybe someone who knows more about why these limitations were originally put in place could give better advice.

@danburkert
Copy link
Owner

@oconnor663 thanks for the detail. I put together #74 to try and reproduce this scenario; we'll see whether it crashes and burns on the 32bit CI targets. If you don't mind, please take a look and check if that test accurately reflects the scenario you're concerned about.

@danburkert
Copy link
Owner

Early indications are that test fails with ENOMEM on 32bit platforms, which seems like it might be the correct behavior.

@oconnor663
Copy link
Author

Hmm, maybe we should test whether a 2^31 - 2 byte mmap (just under isize::MAX) succeeds? If not, it could be that we're blowing past some per-process limit on the system.

@oconnor663
Copy link
Author

Some more discussion here: https://www.reddit.com/r/rust/comments/9ghwuv/hey_rustaceans_got_an_easy_question_ask_here/e6fx4h2/
Relevant article: https://trust-in-soft.com/objects-larger-than-ptrdiff_max-bytes/

I get the feeling that these are problems we don't want to touch with a 10 foot pole :)

oconnor663 added a commit to oconnor663/memmap-rs that referenced this issue Sep 22, 2018
I think there were two separate issues with the previous version:

1. The main issue, which we've been discussing in danburkert#69, is that it's
   unsound to construct a slice larger that isize::MAX. That said, it
   might be impossible to trigger this behavior on some 32-bit systems.
   The comment linked to in raw_vec.rs says it requires "a platform
   which can use all 4GB in user-space. e.g. PAE or x32."

2. The previous version was subtracting `self.offset` from the file
   length without checking for negative overflow. Very large offsets
   could have turned into an accidentally valid section of the file, in
   release mode when the overflow didn't panic.
@oconnor663
Copy link
Author

Update: As of rust-lang/rust#53784, slice::from_raw_parts will panic in debug mode, if the length in bytes exceeds isize::MAX.

@Gankra
Copy link

Gankra commented Oct 2, 2018

Note that stdlib debug asserts don’t ever run in official compiler builds. Only for like the rust-lang test suite.

@danburkert
Copy link
Owner

@oconnor663 The only places that use ptr::offset in the code use it either with an alignment, which never exceeds the length of a page (4069), or the offset provided to MmapOptions::offset(usize). I agree that if this offset usize overflows isize::MAX it will cause issues, but I'm not seeing any reason why the length exceeding isize::MAX would be problematic, since slice::from_raw_parts is used for creating the slice.

@danburkert
Copy link
Owner

Woops, sorry I completely missed the link between ptr::offset and from_raw_parts (rust-lang/rust#53784 makes it clear).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants