Correctly flush when persisting temporary files. #2668
Operations on `TempFile` that persisted buffered contents to disk used tokio's `File` struct to write the contents to disk. Unfortunately, dropping a `File` at the end of the scope does not wait for the writes to complete, since this could block.

When reading from the file immediately, it is possible for the contents to be missing and for the file to appear empty. This is actually more likely than not if the read is done synchronously.

This can easily be reproduced with the following snippet:

```rust
let mut f = TempFile::Buffered { content: b"Hello" };
f.persist_to("file.txt").await.unwrap();
assert_eq!(std::fs::read("file.txt").unwrap(), b"Hello");
```

This behaviour is mentioned in [tokio's documentation](https://docs.rs/tokio/latest/tokio/fs/struct.File.html):

> A file will not be closed immediately when it goes out of scope if
> there are any IO operations that have not yet completed. To ensure that
> a file is closed immediately when it is dropped, you should call `flush`
> before dropping it

An awaited call to `flush` would solve the issue. However, the `fs::write` function performs the same sequence of operations in fewer lines of code, and removes this particular footgun.
To be clear, here we're depending on the fact that |
Roughly yeah, except it's not Synchronous Rust doesn't have that problem because it is allowed to block in |
Do you know where this is documented for std::fs::File? I understand the differences between flushes and syncs, and I only alluded to the latter as that's the only reference to either flushing or syncing the std docs make when referring to operations called on close. That I was able to find, at least. |
Actually I might have been wrong about the specifics, and in the case of Regardless, while I don’t know of any authoritative source on this, I think it is generally accepted that writing and closing a file in sync Rust is sufficient for the writes to be visible immediately, regardless of the platform. Even if buffering was involved, I’d expect it to be automatically performed on drop, as is the case for BufWriter for example https://doc.rust-lang.org/src/std/io/buffered/bufwriter.rs.html#669 |
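A minimal sketch of the synchronous behaviour described in this comment, using only the standard library. The function names `write_via_bufwriter` and `write_then_read` are made up for this example. It relies on `BufWriter` attempting a flush in its `Drop` implementation; errors from that implicit flush are silently discarded, so real code would normally call `flush` explicitly.

```rust
use std::fs::{self, File};
use std::io::{self, BufWriter, Write};
use std::path::Path;

/// Writes `content` through a `BufWriter` and returns without an
/// explicit flush; the buffer is flushed when `writer` is dropped.
fn write_via_bufwriter(path: &Path, content: &[u8]) -> io::Result<()> {
    let mut writer = BufWriter::new(File::create(path)?);
    writer.write_all(content)?;
    // `writer` is dropped here: its Drop impl flushes the buffer
    // (ignoring any error), then the inner `File` is closed.
    Ok(())
}

/// Writes the file, then reads it back synchronously right away.
fn write_then_read(path: &Path, content: &[u8]) -> io::Result<Vec<u8>> {
    write_via_bufwriter(path, content)?;
    fs::read(path)
}
```

Calling `write_then_read` with a scratch path and `b"Hello"` should return the same bytes, because dropping the writer blocks until the buffered data has been handed to the operating system.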
In your original comment, you say:
This seems to ignore the fact that we are It seems that this exact concern was raised in tokio-rs/tokio#4296 which culminated in tokio-rs/tokio#4316 which seems to resolve the issue entirely. Unless I'm missing something, tokio-rs/tokio#4316 being merged means that this PR is unnecessary, and the code was correct as it was. Do you believe this to not be the case? |
I did come across tokio-rs/tokio#4296 when researching this, and I believe this to be a slightly different problem. In that Tokio issue, In the present issue I had with Looking at the tokio implementation, You could argue that is actually a bug in tokio's semantics for The way around this is to either call |
This is remarkably surprising. Especially because the docs for the method say:
The docs for
I would indeed argue such a thing. What's worse is that I can't find this documented anywhere. |
Thank you for bringing this issue to light. I've merged your PR in 67ad831 with an updated commit message representing our findings here. Fantastic find. |
Wonderful, thanks. |