Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correctly flush when persisting temporary files. #2668

Closed
wants to merge 1 commit into from

Conversation

plietar
Copy link
Contributor

@plietar plietar commented Dec 8, 2023

Operations on TempFile that persisted buffered contents to disk used tokio's File struct to write the contents to disk. Unfortunately, dropping a File at the end of the scope does not wait for the writes to complete, since this could block.

When reading from the file immediately, it is possible for the contents to be missing and for the file to appear empty. This is actually more likely than not if the read is done synchronously.

This can easily be reproduced with the following snippet:

let mut f = TempFile::Buffered { content: b"Hello" };
f.persist_to("file.txt").await.unwrap();
assert_eq!(std::fs::read("file.txt").unwrap(), b"Hello");

This is behaviour is mentioned in tokio's documentation:

A file will not be closed immediately when it goes out of scope if
there are any IO operations that have not yet completed. To ensure that
a file is closed immediately when it is dropped, you should call flush
before dropping it

An awaited call to flush would solve the issue. However the fs::write function performs the same sequence of operations in fewer lines of code, and removes this particular footgun.

Operations on `TempFile` that persisted buffered contents to disk used
tokio's `File` struct to write the contents to disk. Unfortunately,
dropping a `File` at the end of the scope does not wait for the writes
to complete, since this could block.

When reading from the file immediately, it is possible for the contents
to be missing and for the file to appear empty. This is actually more
likely than not if the read is done synchronously.

This can easily be reproduced with the following snippet:
```rust
let mut f = TempFile::Buffered { content: b"Hello" };
f.persist_to("file.txt").await.unwrap();
assert_eq!(std::fs::read("file.txt").unwrap(), b"Hello");
```

This is behaviour is mentioned in [tokio's documentation](https://docs.rs/tokio/latest/tokio/fs/struct.File.html):
> A file will not be closed immediately when it goes out of scope if
> there are any IO operations that have not yet completed. To ensure that
> a file is closed immediately when it is dropped, you should call `flush`
> before dropping it

An awaited call to flush would solve the issue. However the fs::write
function performs the same sequence of operations in fewer lines of
code, and removes this particular footgun.
@SergioBenitez
Copy link
Member

An awaited call to flush would solve the issue. However the fs::write function performs the same sequence of operations in fewer lines of code, and removes this particular footgun.

To be clear, here we're depending on the fact that Drop on std::fs::File calls sync_all() (or some moral equivalent), and that the drop occurs inside a spawn_blocking(), meaning the file is closed by the time write().await returns, is that right?

@plietar
Copy link
Contributor Author

plietar commented Dec 8, 2023

To be clear, here we're depending on the fact that Drop on std::fs::File calls sync_all() (or some moral equivalent), and that the drop occurs inside a spawn_blocking(), meaning the file is closed by the time write().await returns, is that right?

Roughly yeah, except it's not sync_all but a simple flush. Flush makes sure the writes have been issued to the OS and are visible to other applications (and other reads from the same application). It's possible that after a flush the writes would only be in the OS's buffers but not written to physical disk. sync_all does more by asking the OS to actually write the data to the hardware.

Synchronous Rust doesn't have that problem because it is allowed to block in Drop. Tokio is limited by the fact that destructors can't be async and therefore you would need an explicit flush call.

@SergioBenitez
Copy link
Member

SergioBenitez commented Dec 8, 2023

To be clear, here we're depending on the fact that Drop on std::fs::File calls sync_all() (or some moral equivalent), and that the drop occurs inside a spawn_blocking(), meaning the file is closed by the time write().await returns, is that right?

Roughly yeah, except it's not sync_all but a simple flush. Flush makes sure the writes have been issued to the OS and are visible to other applications (and other reads from the same application). It's possible that after a flush the writes would only be in the OS's buffers but not written to physical disk. sync_all does more by asking the OS to actually write the data to the hardware.

Do you know where this is documented for std::fs::File? I understand the differences between flushes and syncs, and I only alluded to the latter as that's the only reference to either flushing or syncing the std docs make when referring to operations called on close. That I was able to find, at least.

@plietar
Copy link
Contributor Author

plietar commented Dec 8, 2023

Actually I might have been wrong about the specifics, and in the case of std::fs::File it might be that no buffering is done anywhere, thus calls to flush may be unnecessary. Looking at the implementation, flush is actually a no-op, on unix at least (and a few other platforms I’ve looked at).

Regardless, while I don’t know of any authoritative source on this, I think it is generally accepted that writing and closing a file in sync Rust is sufficient for the writes to be visible immediately, regardless of the platform. Even if buffering was involved, I’d expect it the be automatically performed on drop, as is the case for BufWriter for example https://doc.rust-lang.org/src/std/io/buffered/bufwriter.rs.html#669

@SergioBenitez
Copy link
Member

In your original comment, you say:

Unfortunately, dropping a File at the end of the scope does not wait for the writes to complete, since this could block. When reading from the file immediately, it is possible for the contents to be missing and for the file to appear empty.

This seems to ignore the fact that we are .awaiting the future returned by write_all(). The point of async/await syntax is that the operations are happening asynchronously and yet the program flow appears synchronous. That is, write_all().await is (should be) logically equivalent to write_all() // blocking. What happens on drop after this should be irrelevant: we are not merely scheduling operations to happen, we are actually making them happen, and we are waiting for them with .await. In other words, we don't need to block on Drop when we're already "blocking" via .await.

It seems that this exact concern was raised in tokio-rs/tokio#4296 which culminated in tokio-rs/tokio#4316 which seems to resolve the issue entirely. Unless I'm missing something, tokio-rs/tokio#4316 being merged means that this PR is unnecessary, and the code was correct as it was. Do you believe this to not be the case?

@plietar
Copy link
Contributor Author

plietar commented Dec 12, 2023

I did come across tokio-rs/tokio#4296 when researching this, and I believe this to be a slightly different problem.

In that Tokio issue, write_all call is issued, the main function exits, the tokio runtime is shutdown. The write is never even issued to the OS, and the bytes never get written.

In the present issue I had with TempFile, write_all is called, the data is passed to a background task to be written, the File object is dropped but the tokio runtime stays alive and processes the write in the background task. The bytes are not lost and do eventually make it to the OS no matter what. The problem is that write_all(..).await can return before that happens. My application tries to read the file back immediately after calling persist_to(...).await and sees an empty file (as showed in the snippet of code in the first comment). I'm sure if I had a delay between the persist_to call and the reading back I would in fact see the data fine.

Looking at the tokio implementation, write_all is just a sequence of poll_write calls (tokio/src/io/util/write_all.rs#L34-L55), and File's poll_write implementation spawns a background task to do the write and returns Poll::Ready(Ok(n)); immediately, storing a handle to the background task in the File (tokio/src/fs/file.rs#L674-L689). Therefore when the write_all future completes, the bytes have only been handed off to the background task, but may not have been written to the OS yet.

You could argue that is actually a bug in tokio's semantics for write/write_all, but that just seems to be how it is. I found tokio-rs/tokio#5531 which discusses it and was closed as "not planned".

The way around this is to either call flush(...).await, which waits for the last inflight request to complete (tokio/src/fs/file.rs#L894), or to use tokio::fs::write function which spawns std::fs::write in a background task and, crucially, waits for it to complete. I've gone for the second option in the PR because it was more concise and stops someone from accidentally removing the flush call at a later point, but I'm happy with either way.

@SergioBenitez
Copy link
Member

Looking at the tokio implementation, write_all is just a sequence of poll_write calls (tokio/src/io/util/write_all.rs#L34-L55), and File's poll_write implementation spawns a background task to do the write and returns Poll::Ready(Ok(n)); immediately, storing a handle to the background task in the File (tokio/src/fs/file.rs#L674-L689). Therefore when the write_all future completes, the bytes have only been handed off to the background task, but may not have been written to the OS yet.

This is remarkably surprising. Especially because the docs for the method say:

This method will continuously call write until there is no more data to be written. This method will not return until the entire buffer has been successfully written or such an error occurs. The first error generated from this method will be returned.

The docs for write() don't allude to this queueing behavior either.

You could argue that is actually a bug in tokio's semantics for write/write_all, but that just seems to be how it is. I found tokio-rs/tokio#5531 which discusses it and was closed as "not planned".

I would indeed argue such a thing. What's worse is that I can't find this documented anywhere.

@SergioBenitez
Copy link
Member

Thank you for bringing this issue to light. I've merged your PR in 67ad831 with an update commit message representing our findings here. Fantastic find.

@SergioBenitez SergioBenitez added the pr: merged This pull request was merged manually. label Dec 20, 2023
@plietar plietar deleted the fix-tempfile branch December 20, 2023 01:16
@plietar
Copy link
Contributor Author

plietar commented Dec 20, 2023

Wonderful, thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr: merged This pull request was merged manually.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants