
Clear PG_writeback after zil_commit() for sync I/O #912

Closed

Conversation

behlendorf (Contributor)

When writing via ->writepage() the writeback bit was always cleared
as part of the txg commit callback. However, when the I/O is being
done synchronously to a slog device we can safely run the callback
immediately after zil_commit().

This will significantly reduce the delay for calls such as fsync()
and msync() which block until the data is safe on disk.

Issue #700
Issue #907

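For illustration, here is a minimal sketch of the timing change in the ->writepage() path. This is hypothetical and heavily simplified, not the literal patch; identifiers such as `zsb`, `zp`, `pp`, `tx`, and `wbc` stand in for the state already set up earlier in `zfs_putpage()`:

```c
/* Runs when the page data is known to be safe on disk. */
static void
zfs_putpage_commit_cb(void *arg)
{
	struct page *pp = arg;

	ClearPageError(pp);
	end_page_writeback(pp);		/* clears PG_writeback, wakes waiters */
}

/* ... inside zfs_putpage(), after the page data is in the tx ... */
	if (wbc->sync_mode != WB_SYNC_NONE) {
		/*
		 * Sync I/O (fsync(2)/msync(2)): zil_commit() blocks until
		 * the log record is on stable storage (a slog device, if
		 * present), so PG_writeback can be cleared immediately.
		 */
		dmu_tx_commit(tx);
		zil_commit(zsb->z_log, zp->z_id);
		zfs_putpage_commit_cb(pp);
	} else {
		/*
		 * Async I/O: as before, clear PG_writeback only when the
		 * txg carrying this write commits.
		 */
		dmu_tx_callback_register(tx, zfs_putpage_commit_cb, pp);
		dmu_tx_commit(tx);
	}
```

The win comes from the sync branch no longer waiting out the txg commit, which can take seconds, before fsync() or msync() returns.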
@dechamps (Contributor)

One thing worries me: are you sure there are no side effects to clearing the writeback bit early? Sure, the data has been written to the ZIL, but if it has been written in immediate mode, then the buffer is still dirty and it's still waiting to be written to the pool in the next TXG sync. I'm not familiar with how buffers and writeback work in Linux memory management, but if clearing the writeback bit means "you can reclaim this", then maybe it would be wrong to do it at this point. Just wondering.

@behlendorf (Contributor, Author)

> One thing worries me: are you sure there are no side effects to clearing the writeback bit early?

Yes, I'm sure, and here's why.

The zfs_putpage() function will make a copy of the page in an ARC buffer which is attached to the TX. So even if the VM were to release the page immediately after we cleared the writeback bit (which isn't that unlikely), the ARC buffer will still get written as part of the next TXG. The page at this point is clean and can be discarded safely whenever the VM likes. It can also be immediately re-dirtied and the process repeated.

Obviously we're safe against a crash here too since the ZIL will get replayed.
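To make the safety argument concrete, the copy step looks roughly like this (again a hypothetical sketch modeled on the ZFS on Linux putpage path, not the verbatim code; `pp`, `zsb`, `zp`, `pgoff`, `pglen`, and `tx` come from the surrounding function):

```c
	void *va = kmap(pp);	/* map the page so its data can be read */

	/*
	 * dmu_write() copies pglen bytes out of the page into an ARC
	 * buffer attached to this tx; after it returns, ZFS no longer
	 * references the struct page at all.
	 */
	dmu_write(zsb->z_os, zp->z_id, pgoff, pglen, va, tx);

	kunmap(pp);

	/*
	 * The ARC copy is what the next txg writes to the pool (and what
	 * zil_commit() makes durable for sync I/O), so the page may be
	 * reclaimed or re-dirtied as soon as PG_writeback is cleared.
	 */
```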

@behlendorf (Contributor, Author)

2b28613 Clear PG_writeback after zil_commit() for sync I/O

behlendorf closed this Aug 31, 2012
pcd1193182 pushed a commit to pcd1193182/zfs that referenced this pull request Sep 26, 2023
`zcache hits` displays the hits-by-size histogram, showing the predicted
cache hit rate with varying cache size.  The data is based on cache
accesses since the last `zcache hits --clear`.

The problem is that the relevant time period for evaluating cache
effectiveness may not match the time since the histogram was cleared.

This commit makes the zettacache store a history of hits-by-size
histograms, by default one each day.  The `zcache hits` command will now
sum up the histograms for the requested time/date range, which by
default is the entire history (since the cache size was last changed by
`zcache add` or `zcache remove`).  The new `--begin`, `--end`, and
`--duration` flags can be used to select a more specific time range.

The `zcache hits --clear` command will create a new histogram, saving
the current one in the history.

The internal representation of the history can be dumped with `zcache
hits --raw`.  This can be fed into `zcache hits --from-raw` to query and
display the histogram from a different system.

The implementation required a new kind of block-based log to store the
large, variable-size histograms.  The JsonBlockBasedLog stores Rust
structs that are serializable by Serde (as opposed to a normal
BlockBasedLog which stores fixed-size C structs, annotated with
`#[repr(C)]`).

There is no mechanism to discard the historical hits-by-size histograms.
Performance evaluation found good performance even with enough entries
to cover many decades of history.  Therefore, reducing the number of
entries is left to future work.