Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking Issue for RFC 2930 (read-buf) #78485

Open
1 of 5 tasks
nikomatsakis opened this issue Oct 28, 2020 · 51 comments
Open
1 of 5 tasks

Tracking Issue for RFC 2930 (read-buf) #78485

nikomatsakis opened this issue Oct 28, 2020 · 51 comments
Assignees
Labels
A-io Area: `std::io`, `std::fs`, `std::net` and `std::path` B-RFC-approved Blocker: Approved by a merged RFC but not yet implemented. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC Libs-Tracked Libs issues that are tracked on the team's project board. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@nikomatsakis
Copy link
Contributor

nikomatsakis commented Oct 28, 2020

This is a tracking issue for the RFC "2930" (rust-lang/rfcs#2930).
The feature gate for the issue is #![feature(read_buf)].

This is now called BorrowedBuf rather than ReadBuf.

About tracking issues

Tracking issues are used to record the overall progress of implementation.
They are also used as hubs connecting to other relevant issues, e.g., bugs or open design questions.
A tracking issue is however not meant for large scale discussion, questions, or bug reports about a feature.
Instead, open a dedicated issue for the specific matter and add the relevant feature gate label.

Steps

Unresolved Questions

Implementation history

@nikomatsakis nikomatsakis added B-RFC-approved Blocker: Approved by a merged RFC but not yet implemented. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC labels Oct 28, 2020
@nikomatsakis
Copy link
Contributor Author

( cc @rust-lang/libs )

@beepster4096
Copy link
Contributor

I'm interested in working on this.

@rustbot claim

@KodrAus KodrAus added the Libs-Tracked Libs issues that are tracked on the team's project board. label Nov 6, 2020
@beepster4096
Copy link
Contributor

beepster4096 commented Nov 8, 2020

I am slightly confused by the API of ReadBufs. What return type should methods like initialized have: &[u8] or &[IoSlice]? If its the former, how do you know which slices are initialized/filled? If it's the latter, what happens if a slice is partially initialized/filled?

edit: probably I should ask this on zulip instead of github

@sfackler
Copy link
Member

sfackler commented Nov 8, 2020

It would return &[IoSliceMut], and only include the slices that are fully initialized.

@programmerjake
Copy link
Member

It would return &[IoSliceMut], and only include the slices that are fully initialized.

I would have expected it to return (&[IoSliceMut], &[u8]) where it is some number of fully initialized buffers and the initialized portion of the partially initialized buffer.

@sfackler
Copy link
Member

sfackler commented Nov 8, 2020

Sure, that makes sense.

@sfackler
Copy link
Member

sfackler commented Dec 5, 2020

@drmeepster are you still working on this?

@beepster4096
Copy link
Contributor

Yeah, I am. Although I'm currently working on #79607 because I needed it for this.

@beepster4096
Copy link
Contributor

Okay, I can continue working on this now that we have MaybeUninit::write_slice

@sunshowers
Copy link
Contributor

I have a PR to add an inner_mut method to Tokio's implementation of ReadBuf: tokio-rs/tokio#3443. As far as I can tell it's a valid use case that there's no other way to do short of pointer arithmetic, so it may make sense to have this be in upstream Rust's ReadBuf as well.

@erickt
Copy link
Contributor

erickt commented Apr 28, 2021

Have we considered extending ReadBuf to be generic on the type, rather than be constrained to u8? I'm guessing much of ReadBuf is generic over the type.

This came up because I'm looking into fixing some UB in rayon, which is passing around uninitialized &mut [T] in its collect iterator. I think the main thing making rayon use this over a &mut Vec<T> is that it wants to split the output buffer across threads, but there's no safe way to do this without initializing the slice. I'd like to replace this with a safe abstraction that's probably quite similar to the design proposed for ReadBuf (plus a split_at method), so I thought maybe there are other people potentially interested in this functionality.

Going further, it would also be interesting to see if we could rewrite Vec to sit upon ReadBuf.

@djc
Copy link
Contributor

djc commented Apr 28, 2021

@Amanieu
Copy link
Member

Amanieu commented Apr 28, 2021

Note that we now have a spare_capacity_mut method on Vec which gives you a &mut [MaybeUninit<T>] for the uninitialized part of a vector.

fmease added a commit to fmease/rust that referenced this issue Nov 8, 2023
…nic, r=dtolnay

Don't panic in `<BorrowedCursor as io::Write>::write`

Instead of panicking if the BorrowedCursor does not have enough capacity for the whole buffer, just return a short write, [like `<&mut [u8] as io::Write>::write` does](https://doc.rust-lang.org/src/std/io/impls.rs.html#349).

(cc `@ChayimFriedman2` rust-lang#78485 (comment))

(I'm not sure if this needs an ACP? since it's not changing the "API", just what the function does)
bors added a commit to rust-lang-ci/rust that referenced this issue Nov 8, 2023
…c, r=dtolnay

Don't panic in `<BorrowedCursor as io::Write>::write`

Instead of panicking if the BorrowedCursor does not have enough capacity for the whole buffer, just return a short write, [like `<&mut [u8] as io::Write>::write` does](https://doc.rust-lang.org/src/std/io/impls.rs.html#349).

(cc `@ChayimFriedman2` rust-lang#78485 (comment))

(I'm not sure if this needs an ACP? since it's not changing the "API", just what the function does)
github-actions bot pushed a commit to rust-lang/miri that referenced this issue Nov 15, 2023
Don't panic in `<BorrowedCursor as io::Write>::write`

Instead of panicking if the BorrowedCursor does not have enough capacity for the whole buffer, just return a short write, [like `<&mut [u8] as io::Write>::write` does](https://doc.rust-lang.org/src/std/io/impls.rs.html#349).

(cc `@ChayimFriedman2` rust-lang/rust#78485 (comment))

(I'm not sure if this needs an ACP? since it's not changing the "API", just what the function does)
@the8472
Copy link
Member

the8472 commented Nov 23, 2023

The BorrowedCursor documentation could use some polish. It says some slightly contradictory or misleading things. The docs start with

A writeable view of the unfilled portion of a BorrowedBuf.

but the very next paragraph:

Provides access to the initialized and uninitialized parts of the underlying BorrowedBuf.

Looking at the actual implementations makes it obvious that they mostly pass-through and grab slices from the underlying BorrowedBuf and totally ignore the write position (start). The write position is only relevant for written().

@tgross35
Copy link
Contributor

Does core_io_borrowed_buf really need to be a separate feature gate, or could it be rolled into read_buf? Bit confusing that the BorrowedBuf/BorrowedCursor docs all point to #117693 rather than this issue, assuming the same types are designed to work in core.

Also related to @the8472's comment, docs need examples.

lnicola pushed a commit to lnicola/rust-analyzer that referenced this issue Apr 7, 2024
Don't panic in `<BorrowedCursor as io::Write>::write`

Instead of panicking if the BorrowedCursor does not have enough capacity for the whole buffer, just return a short write, [like `<&mut [u8] as io::Write>::write` does](https://doc.rust-lang.org/src/std/io/impls.rs.html#349).

(cc `@ChayimFriedman2` rust-lang/rust#78485 (comment))

(I'm not sure if this needs an ACP? since it's not changing the "API", just what the function does)
@a1phyr
Copy link
Contributor

a1phyr commented Apr 9, 2024

There is something "fun" with read_buf as it is defined today, but I am not sure it was done on purpose: unlike read, it is possible to return both data and an error.
This is possible because the cursor keeps tracks of the read bytes itself, so returning an error is not incompatible with reading bytes.

I don't know if this is expected and it is probably useful, but it might be surprising if some implementations start doing that.

In fact, read_buf_exact documents that it has such a behavior:

If this function returns an error, all bytes read will be appended to cursor.

Something more problematic is that it is impossible to write a correct read implementation on top of read_buf (either the written bytes or the error would have to be discarded).

@ChrisDenton
Copy link
Member

Tbh, I think read is in the wrong here. For example, I/O reads very much depends on the behaviour of the underlying OS which we have no control over. Currently we are indeed forced to discard any OS errors if any bytes are read.

RalfJung pushed a commit to RalfJung/rust-analyzer that referenced this issue Apr 27, 2024
Don't panic in `<BorrowedCursor as io::Write>::write`

Instead of panicking if the BorrowedCursor does not have enough capacity for the whole buffer, just return a short write, [like `<&mut [u8] as io::Write>::write` does](https://doc.rust-lang.org/src/std/io/impls.rs.html#349).

(cc `@ChayimFriedman2` rust-lang/rust#78485 (comment))

(I'm not sure if this needs an ACP? since it's not changing the "API", just what the function does)
@a1phyr
Copy link
Contributor

a1phyr commented May 21, 2024

It would be great to have a final word from libs-team on this:

  • Say that this undesirable and change the API to prevent this
  • Say that this undesirable and document about such behavior (and that such read_buf implementations are buggy)
  • Say that this fine (or even great), document that this may happen because it is not obvious and change the few uses of read_buf in std (that don't take this possibility into account).
  • Some other choice that I didn't think about ?

Currently we are indeed forced to discard any OS errors if any bytes are read.

I don't about Windows at all but on Unix at least I don't think this is possible. Can you link such an implementation ?

@ChrisDenton ChrisDenton added the I-libs-api-nominated Nominated for discussion during a libs-api team meeting. label May 21, 2024
@ChrisDenton
Copy link
Member

See the post above for why this is nominated.

Currently we are indeed forced to discard any OS errors if any bytes are read.

I don't about Windows at all but on Unix at least I don't think this is possible. Can you link such an implementation ?

Neither the Windows API nor the documentation guarantees that no bytes will be read in an error case. One documented case is pipes in message mode:

If a named pipe is being read in message mode and the next message is longer than the nNumberOfBytesToRead parameter specifies, ReadFile returns FALSE and GetLastError returns ERROR_MORE_DATA. The remainder of the message can be read by a subsequent call to the ReadFile or PeekNamedPipe function.

To be clear, I'd have to check if we actually do handle this (I suspect not) but if we're going strictly by the Read trait documentation then we should test if any bytes are read and return success if so. The other option, of course, would be to weaken the std documentation.

@de-vri-es
Copy link
Contributor

de-vri-es commented May 21, 2024

See the post above for why this is nominated.

Currently we are indeed forced to discard any OS errors if any bytes are read.

I don't about Windows at all but on Unix at least I don't think this is possible. Can you link such an implementation ?

Neither the Windows API nor the documentation guarantees that no bytes will be read in an error case. One documented case is pipes in message mode:

If a named pipe is being read in message mode and the next message is longer than the nNumberOfBytesToRead parameter specifies, ReadFile returns FALSE and GetLastError returns ERROR_MORE_DATA. The remainder of the message can be read by a subsequent call to the ReadFile or PeekNamedPipe function.

To be clear, I'd have to check if we actually do handle this (I suspect not) but if we're going strictly by the Read trait documentation then we should test if any bytes are read and return success if so. The other option, of course, would be to weaken the std documentation.

Does this case really matter? The Read trait is for byte streams, not message streams. For example, UdpSocket doesn't implement the Read trait (I assume for this reason).

The example you give is very similar to the MSG_TRUNC flag in a recvmsg() call. But for a bytestream this can never happen. For Read, reading a partial message, swallowing the error and performing another read is the correct thing to do.

@ChrisDenton
Copy link
Member

The Read trait is implemented for File which means it needs to work for anything you can open with File::open, The read method also has zero context for the read. All this means it's completely oblivious to what it is reading and has to work (for some definition of "work") in all situations, especially when we provide a "guarantee".

Maybe swallowing any error is fine. But I think in an ideal world we'd have some way for the user to access this information.

@Amanieu
Copy link
Member

Amanieu commented May 21, 2024

We discussed this in the libs-api meeting today. We agree that this can be surprising behavior, but don't see a good way to change the API to avoid it. Our conclusion was to update the documentation to say that returning an error after reading some bytes is allowed, but strongly discouraged.

@Amanieu Amanieu removed the I-libs-api-nominated Nominated for discussion during a libs-api team meeting. label May 21, 2024
@Pr0methean
Copy link

Maybe the return type should be Result<usize, (usize, io::Error)> so that the length of the partial read is available?

@fintelia
Copy link
Contributor

I disagree that the signature of read is to blame here. That method is intended to be called repeatedly in a loop to get all the data you need. Thus if an implementation wants to return both data and an error it'll return the data the first time read is called and cache the error to return on the second call.

The problem comes in because a default implementation of read in terms of read_buf cannot implement caching behavior because it doesn't have access to fields within the reader. Thus, another option would be for the standard library to document that the default implementation of read swallows errors if any bytes are produced, and recommend that if read_buf is capable of producing both data and an error that the implementer should also provide a read implementation that caches errors.

@farnz
Copy link
Contributor

farnz commented Jul 11, 2024

On internals.r-l.o, keepsimple1 commented that they didn't realise that ReadBuf has been renamed to BorrowedBuf rather than removed, because it's not obvious from the summary.

Could you update the summary to include something like:

This is now called BorrowedBuf rather than ReadBuf.

Thanks!

@Amanieu
Copy link
Member

Amanieu commented Jul 12, 2024

I've added a note in the summary, hopefully that should help.

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Sep 28, 2024
…gjubilee

Fix `read_buf` uses in `std`

Following lib-team decision here: rust-lang#78485 (comment)

Guard against the pathological behavior of both returning an error and performing a read.
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Sep 28, 2024
Rollup merge of rust-lang#125404 - a1phyr:fix-read_buf-uses, r=workingjubilee

Fix `read_buf` uses in `std`

Following lib-team decision here: rust-lang#78485 (comment)

Guard against the pathological behavior of both returning an error and performing a read.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-io Area: `std::io`, `std::fs`, `std::net` and `std::path` B-RFC-approved Blocker: Approved by a merged RFC but not yet implemented. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC Libs-Tracked Libs issues that are tracked on the team's project board. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests