Metaslab max_size should be persisted while unloaded #9055
Merged
Conversation
conflicts with #9045

@@ -2519,9 +2594,16 @@ metaslab_segment_weight(metaslab_t *msp)
  * weights we rely on the entire weight (excluding the weight-type bit).
  */
 boolean_t
-metaslab_should_allocate(metaslab_t *msp, uint64_t asize)
+metaslab_should_allocate(metaslab_t *msp, uint64_t asize, boolean_t try_hard)
 {
-	if (msp->ms_loaded) {
+	/*
+	 * If the metaslab is loaded, ms_max_size is definitive and we can use
+	 * the fast check. If it's not, the ms_max_size is a lower bound (once
+	 * set), and we should use the fast check unless we're in try_hard.
+	 */
+	if (msp->ms_loaded ||
+	    (msp->ms_max_size != 0 && !try_hard &&
+	    gethrtime() < msp->ms_unload_time + SEC2NSEC(max_size_cache_sec)))
 		return (msp->ms_max_size >= asize);
-	} else {
-		ASSERT0(msp->ms_max_size);
ahrens reviewed Jul 24, 2019
behlendorf reviewed Jul 25, 2019
Codecov Report

@@            Coverage Diff             @@
##           master    #9055      +/-   ##
==========================================
+ Coverage   70.83%   78.77%   +7.94%
==========================================
  Files         351      400      +49
  Lines      115431   121049    +5618
==========================================
+ Hits        81760    95352   +13592
+ Misses      33671    25697    -7974

Continue to review full report at Codecov.
Signed-off-by: Paul Dagnelie <[email protected]>
behlendorf approved these changes Jul 30, 2019
ahrens approved these changes Aug 2, 2019
pcd1193182 added the "Status: Accepted (Ready to integrate - reviewed, tested)" label and removed the "Status: Code Review Needed (Ready for review and testing)" label Aug 2, 2019
allanjude pushed a commit to allanjude/zfs that referenced this pull request Aug 12, 2019

When we unload metaslabs today in ZFS, the cached max_size value is discarded. We instead use the histogram to determine whether or not we think we can satisfy an allocation from the metaslab. This can result in situations where, if we're doing I/Os of a size not aligned to a histogram bucket, a metaslab is loaded even though it cannot satisfy the allocation we think it can. For example, a metaslab with 16 entries in the 16k-32k bucket may have entirely 16kB entries. If we try to allocate a 24kB buffer, we will load that metaslab because we think it should be able to handle the allocation. Doing so is expensive in CPU time, disk reads, and average IO latency. This is exacerbated if the write being attempted is a sync write.

This change makes ZFS cache the max_size after the metaslab is unloaded. If we ever get a free (or a coalesced group of frees) larger than the max_size, we will update it. Otherwise, we leave it as is. When attempting to allocate, we use the max_size as a lower bound, and respect it unless we are in try_hard. However, we do age the max_size out at some point, since we expect the actual max_size to increase as we do more frees. A more sophisticated algorithm here might be helpful, but this works reasonably well.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Matt Ahrens <[email protected]>
Signed-off-by: Paul Dagnelie <[email protected]>
Closes openzfs#9055
allanjude pushed a commit to allanjude/zfs that referenced this pull request Oct 13, 2019
allanjude pushed a commit to KlaraSystems/zfs that referenced this pull request Apr 28, 2020
Reviewed-by: Matt Ahrens [email protected]
Reviewed-by: Sebastien Roy [email protected]
Reviewed-by: Serapheim Dimitropoulos [email protected]
Motivation and Context
When we unload metaslabs today in ZFS, the cached max_size value is discarded. We instead use the histogram to determine whether we think we can satisfy an allocation from the metaslab. This can result in situations where, if we're doing I/Os of a size not aligned to a histogram bucket, a metaslab is loaded even though it cannot satisfy the allocation we think it can. For example, a metaslab with 16 entries in the 16k-32k bucket may have entirely 16kB entries. If we try to allocate a 24kB buffer, we will load that metaslab because we think it should be able to handle the allocation. Doing so is expensive in CPU time, disk reads, and average IO latency. This is exacerbated if the write being attempted is a sync write.
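The over-approximation described above can be sketched in a few lines of C. This is illustrative code, not ZFS source: the `ms_sketch_t` type and function names are hypothetical, and the point is only that a power-of-two size histogram (bucket i counts free segments in [2^i, 2^(i+1))) cannot distinguish a 16kB segment from a 31kB one, so a 24kB request looks satisfiable when it is not.

```c
/*
 * Illustrative sketch (not ZFS code): why a power-of-two free-segment
 * histogram over-approximates the largest satisfiable allocation.
 */
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HIST_BUCKETS 64

typedef struct {
	uint64_t hist[HIST_BUCKETS];	/* bucket i: segments in [2^i, 2^(i+1)) */
} ms_sketch_t;

/* Index of the highest set bit (assumes v != 0). */
static int
highbit64(uint64_t v)
{
	int b = -1;
	while (v != 0) {
		v >>= 1;
		b++;
	}
	return (b);
}

/* Record a free segment of the given size in the histogram. */
static void
sketch_add_free(ms_sketch_t *ms, uint64_t size)
{
	ms->hist[highbit64(size)]++;
}

/*
 * Histogram-based estimate: treat any non-empty bucket as able to
 * satisfy any size below the bucket's exclusive upper bound. With 16
 * entries that are all exactly 16kB, this claims a 24kB allocation
 * fits, even though the true largest free segment is only 16kB.
 */
static int
hist_says_fits(const ms_sketch_t *ms, uint64_t asize)
{
	for (int i = HIST_BUCKETS - 1; i >= 0; i--) {
		if (ms->hist[i] != 0)
			return (asize < (1ULL << (i + 1)));
	}
	return (0);
}
```

Acting on this estimate is what forces the spurious metaslab load: the allocator only discovers the mismatch after paying for the load.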
Description
We cache the max_size after the metaslab is unloaded. If we ever get a free (or a coalesced group of frees) larger than the max_size, we will update it. Otherwise, we leave it as is. When attempting to allocate, we use the max_size as a lower bound, and respect it unless we are in try_hard. However, we do age the max_size out at some point, since we expect the actual max_size to increase as we do more frees. A more sophisticated algorithm here might be helpful, but this works reasonably well.
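The lifecycle described above can be sketched as follows. This is a minimal model, not the ZFS implementation: the `ms_cache_t` type, the `max_size_cache_sec` default, and the optimistic fallback when the cache is unusable are all assumptions made for illustration (the real code falls back to the histogram-derived weight check).

```c
/*
 * Illustrative sketch (not the ZFS implementation) of the cached
 * max_size lifecycle: the value survives unload, only grows when a
 * larger free arrives, is a lower bound that is trusted unless we are
 * in try_hard, and is aged out after a configurable window.
 */
#include <assert.h>
#include <stdint.h>

#define SEC2NSEC(s)	((uint64_t)(s) * 1000000000ULL)

typedef struct {
	int		ms_loaded;
	uint64_t	ms_max_size;	/* cached largest free segment */
	uint64_t	ms_unload_time;	/* nanosecond timestamp of unload */
} ms_cache_t;

static uint64_t max_size_cache_sec = 3600;	/* aging window (assumed) */

/* On unload, record the time but keep ms_max_size instead of discarding it. */
static void
ms_unload(ms_cache_t *ms, uint64_t now)
{
	ms->ms_loaded = 0;
	ms->ms_unload_time = now;
}

/* A free larger than the cached bound raises it; smaller frees don't. */
static void
ms_note_free(ms_cache_t *ms, uint64_t size)
{
	if (size > ms->ms_max_size)
		ms->ms_max_size = size;
}

/*
 * Loaded: ms_max_size is exact. Unloaded: it is a lower bound, trusted
 * unless try_hard is set or the cached value has aged out, in which
 * case we fall back to an optimistic answer (stand-in for the real
 * histogram-based check).
 */
static int
ms_should_allocate(const ms_cache_t *ms, uint64_t asize, int try_hard,
    uint64_t now)
{
	if (ms->ms_loaded ||
	    (ms->ms_max_size != 0 && !try_hard &&
	    now < ms->ms_unload_time + SEC2NSEC(max_size_cache_sec)))
		return (ms->ms_max_size >= asize);
	return (1);
}
```

Note that because frees only ever raise the cached value, it stays a safe lower bound between unload and the aging deadline; the aging window exists precisely because frees accumulating while unloaded make the stale bound increasingly pessimistic.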
How Has This Been Tested?
Extensive zfs-test suite and zloop testing have been run. For performance testing, tests were run on illumos. Under normal circumstances with a random read/write workload, the change produced a small (~5%) performance increase. In situations with heavy fragmentation, performance increased on the order of 30%. Metaslab loads resulting in failed allocation attempts dropped to approximately zero with the change in place.
Types of changes
Checklist:
Signed-off-by