Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BRT: Skip getting length in brt_entry_lookup() #15950

Merged
merged 1 commit into from
Mar 26, 2024

Conversation

amotin
Copy link
Member

@amotin amotin commented Mar 1, 2024

Unlike DDT, where ZAP values may have different lengths due to compression, all BRT entries are identical 8-byte counters. It does not make sense to first fetch the length only to assert it. zap_lookup_uint64() is specifically designed to work with counters of different size and should return error if something odd found. Calling it straight allows to save some measurable CPU time.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@amotin amotin added the Status: Code Review Needed Ready for review and testing label Mar 1, 2024
Unlike DDT, where ZAP values may have different lengths due to
compression, all BRT entries are identical 8-byte counters.  It
does not make sense to first fetch the length only to assert it.
zap_lookup_uint64() is specifically designed to work with counters
of different size and should return error if something odd found.
Calling it straight allows to save some measurable CPU time.

Signed-off-by:	Alexander Motin <[email protected]>
Sponsored by:	iXsystems, Inc.
error == 0 ? (u_longlong_t)bre->bre_refcount : 0, error);
}
error = zap_lookup_uint64(brt->brt_mos, mos_entries, &bre->bre_offset,
BRT_KEY_WORDS, 1, sizeof (bre->bre_refcount), &bre->bre_refcount);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't bother me too much, but should we keep the BRT_DEBUG in there?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my previous tests I've noticed that excessive per-block logging makes this code very slow in debug builds, plus wipes out other messages from the log. I was going to clean it up any way, but refactoring in this set of patches just provided me an excuse to do it.

@amotin
Copy link
Member Author

amotin commented Mar 3, 2024

While working on this PR, I've noticed more serious issue. BRT code specifies to ZAP functions integer_size = 1 and num_integers = 8 (sizeof (bre->bre_refcount)), that in my understanding stores/reads the value as binary data in host byte order. Same time, unlike DDT, which stores byte order in the header and doing explicit byteswap, BRT doesn't do it. Unless I miss something, it looks like BRT currently is not endian-safe. The proper solution IMO would be to specify to ZAP integer_size = 8 and num_integers = 1, that would make ZAP to store counters as big-endian and handle byte swaps, but that would change on-disk format and would probably require another feature flag. On read we could just retry reading in old format if we get EOVERFLOW, but on write we'd need to know what format to write for reader compatibility.

@amotin
Copy link
Member Author

amotin commented Mar 5, 2024

As alternative to adding new feature and changing ZAP format, we could reuse bv_need_byteswap logic. It would break compatibility for any big-endian systems, if already existing, keeping little-endian just as-is. Though I am not sure it is a good way.

@behlendorf
Copy link
Contributor

Unless I miss something, it looks like BRT currently is not endian-safe. The proper solution IMO would be to specify to ZAP integer_size = 8 and num_integers = 1, that would make ZAP to store counters as big-endian and handle byte swaps.

I agree this doesn't look to be endian-safe unless there's some subtle cleverness here we're overlooking. Maybe @pjd can shed some light on while it was done this way initially. It'd be nice to avoid a second feature flag for this if at all possible.

@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Mar 26, 2024
@behlendorf behlendorf merged commit 8cd8ccc into openzfs:master Mar 26, 2024
22 of 26 checks passed
@amotin amotin deleted the brt_no_length branch March 26, 2024 00:16
amotin added a commit to amotin/zfs that referenced this pull request Apr 17, 2024
Unlike DDT, where ZAP values may have different lengths due to
compression, all BRT entries are identical 8-byte counters.  It
does not make sense to first fetch the length only to assert it.
zap_lookup_uint64() is specifically designed to work with counters
of different size and should return error if something odd found.
Calling it straight allows to save some measurable CPU time.

Reviewed-by: Pawel Jakub Dawidek <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Rob Norris <[email protected]>
Signed-off-by:	Alexander Motin <[email protected]>
Sponsored by:	iXsystems, Inc.
Closes openzfs#15950
behlendorf pushed a commit that referenced this pull request Apr 19, 2024
Unlike DDT, where ZAP values may have different lengths due to
compression, all BRT entries are identical 8-byte counters.  It
does not make sense to first fetch the length only to assert it.
zap_lookup_uint64() is specifically designed to work with counters
of different size and should return error if something odd found.
Calling it straight allows to save some measurable CPU time.

Reviewed-by: Pawel Jakub Dawidek <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Rob Norris <[email protected]>
Signed-off-by:	Alexander Motin <[email protected]>
Sponsored by:	iXsystems, Inc.
Closes #15950
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Sep 4, 2024
Unlike DDT, where ZAP values may have different lengths due to
compression, all BRT entries are identical 8-byte counters.  It
does not make sense to first fetch the length only to assert it.
zap_lookup_uint64() is specifically designed to work with counters
of different size and should return error if something odd found.
Calling it straight allows to save some measurable CPU time.

Reviewed-by: Pawel Jakub Dawidek <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Rob Norris <[email protected]>
Signed-off-by:	Alexander Motin <[email protected]>
Sponsored by:	iXsystems, Inc.
Closes openzfs#15950
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants