Improve performance of flushing single objects #4017

Merged
5 commits merged on Feb 23, 2024

Conversation

fortnern
Member

Previously the algorithm for flushing a single object was:

  1. Use a hash table to quickly look up cache entries with the matching tag, and set the "flush_marker" field in the entry
  2. Iterate over ALL cache entries, adding ALL dirty entries to the (initially empty) main skip list
  3. Iterate over the skip list, flushing entries that have the flush_marker field set
  4. Destroy the skip list

This performs poorly when there are many cache entries: it iterates over all entries multiple times (instead of only the tagged entries) and builds the skip list in O(N log N) time, where N is the total number of entries. This PR changes the algorithm to:

  1. Use a hash table to quickly look up cache entries with the matching tag, adding each to the main skip list
  2. Iterate over the skip list, flushing all entries
  3. Destroy the skip list

Since this algorithm finds only the tagged cache entries and does not operate on any others, it is much faster (verified in benchmarks). With this change, flush markers are no longer used, so they have been removed, which simplifies the code in some places (especially in the cache test).
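The three steps above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `entry_t`, `tag_index_t`, `tag_index_insert`, `ordered_insert` and `flush_tagged` are hypothetical names, the tag index is a plain bucket array, and an address-ordered linked list stands in for the skip list. Adding only dirty entries to the list is also an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define N_BUCKETS 64

/* Hypothetical, simplified cache entry (not the real HDF5 structure). */
typedef struct entry {
    uint64_t      addr;       /* file address; entries flush in increasing order */
    uint64_t      tag;        /* tag of the object that owns this entry */
    bool          dirty;      /* entry still needs to be written out */
    struct entry *tag_next;   /* next entry in the same tag bucket */
    struct entry *flush_next; /* next entry in the ordered flush list */
} entry_t;

/* Hypothetical tag index: tag -> chain of entries hashed to that bucket. */
typedef struct {
    entry_t *buckets[N_BUCKETS];
} tag_index_t;

static void tag_index_insert(tag_index_t *idx, entry_t *e)
{
    size_t b = (size_t)(e->tag % N_BUCKETS);

    e->tag_next     = idx->buckets[b];
    idx->buckets[b] = e;
}

/* Insert into an address-ordered list; a stand-in for the skip list. */
static void ordered_insert(entry_t **head, entry_t *e)
{
    while (*head != NULL && (*head)->addr < e->addr)
        head = &(*head)->flush_next;
    e->flush_next = *head;
    *head         = e;
}

/* Flush the dirty entries carrying `tag` in increasing address order.
 * The first `cap` flushed addresses are recorded in `flushed`.
 * Returns the number of entries flushed. */
static size_t flush_tagged(tag_index_t *idx, uint64_t tag, uint64_t *flushed, size_t cap)
{
    entry_t *list = NULL;
    size_t   n    = 0;

    assert(idx != NULL);

    /* Step 1: visit only the bucket for this tag and add matches to the list. */
    for (entry_t *e = idx->buckets[tag % N_BUCKETS]; e != NULL; e = e->tag_next) {
        if (e->tag == tag && e->dirty)
            ordered_insert(&list, e);
    }

    /* Steps 2 and 3: flush every listed entry, tearing the list down as we go. */
    while (list != NULL) {
        entry_t *e = list;

        list          = e->flush_next;
        e->flush_next = NULL;
        e->dirty      = false; /* a real cache would write the entry here */
        if (n < cap)
            flushed[n] = e->addr;
        n++;
    }
    return n;
}
```

In this sketch the work scales with the length of one bucket chain rather than with the total number of cache entries, which is the point of the change described above.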

This should cause a minor change in behavior: if a new cache entry is created during the flush, it will now be flushed. Previously this would not happen because the flush marker was not set on the new entry. I believe this is closer to the intent of single object flushes.

It should be possible to improve performance further by using a data structure other than a skip list here. The skip list is only used to sort entries so they are flushed in increasing address order, and there are more efficient ways to sort an array. I believe this change improves performance enough for now, though replacing the skip list would also improve the performance of full file flushes (as on file close).
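As a rough illustration of that alternative, the entries to flush could be gathered into an array of pointers and sorted once by address with `qsort()` from the C standard library. This is only a sketch of the idea and is not implemented in this PR; `flush_item_t`, `cmp_by_addr` and `sort_for_flush` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a cache entry that is waiting to be flushed. */
typedef struct {
    uint64_t addr;  /* file address of the entry */
    int      dirty; /* nonzero if the entry needs to be written */
} flush_item_t;

/* qsort comparator: orders pointers to items by increasing address.
 * The subtraction-free form avoids overflow on 64-bit addresses. */
static int cmp_by_addr(const void *a, const void *b)
{
    const flush_item_t *x = *(const flush_item_t *const *)a;
    const flush_item_t *y = *(const flush_item_t *const *)b;

    return (x->addr > y->addr) - (x->addr < y->addr);
}

/* Sort an array of item pointers in place so it can be flushed front to back. */
static void sort_for_flush(flush_item_t **items, size_t n)
{
    assert(items != NULL || n == 0);
    if (n > 1)
        qsort(items, n, sizeof *items, cmp_by_addr);
}
```

Sorting once after collection avoids the per-insert cost of keeping an ordered structure, but it only fits if entries are not added to or removed from the set while the flush is in progress. That is an open design question for the real cache, not something this sketch resolves.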

@qkoziol
Contributor

qkoziol commented Feb 14, 2024

Sounds very cool! I'll take a look at the code tomorrow (Thurs) morning.

@fortnern
Member Author

Addresses #3084

Contributor

@qkoziol qkoziol left a comment

Nice optimization & code cleanup!

@derobins derobins added the labels "Priority - 1. High 🔼" and "Component - C Library" on Feb 16, 2024
@fortnern
Member Author

I also removed the "clear_slist" parameter from H5C_set_slist_enabled(). It used to be needed for the old way of flushing tagged entries and was also used for H5C_evict(), but it turns out to be unnecessary in the latter case: while pinned entries aren't evicted, they are still flushed and therefore removed from the skip list. It is also apparently unnecessary in the place in the test where it was used.

@derobins derobins merged commit 560e80c into HDFGroup:develop Feb 23, 2024
lrknox pushed a commit to lrknox/hdf5 that referenced this pull request Mar 4, 2024
Improve performance of flushing a single object, and remove metadata
cache flush markers
lrknox added a commit that referenced this pull request Mar 5, 2024
* Remove oneapi return value warning. (#4028)

* Replaced last sprintf with snprintf (#4007)

* Replaced last sprintf with snprintf

Making the buffer size available required changing a function signature and updating all of its callers.

In most cases, determining the buffer size wasn't trivial, so SIZE_MAX is passed. But at least this improves the infrastructure. Someone can later figure out the correct sizes.

* Test vlen sequence IO in API tests (#4027)

* Check argument for CMake REGEX FCMangle.h. (#4029)

* Replace deprecated Fortran 'include mpif.h' with 'USE mpi' (#4031)

With MPI 4.1 the use of the mpif.h include file has been deprecated. Codes
should transition to USE mpi or USE mpi_f08.

Signed-off-by: Christoph Niethammer <[email protected]>

* Fix H5F_get_access_plist to copy file locking settings (#4030)

H5F_get_access_plist previously did not copy over the file locking settings
from a file into the new File Access Property List that it creates. This would
make it difficult to match the file locking settings between an external file
and its parent file.

* Fix missing NOT from if check in HL folder (#4036)

* Fix the datatype passed to H5*exists_async APIs in tests. (#4033)

Add a new testing function to verify C_BOOL values.

* Add deb and rpm binaries to snapshots (#4035)

* Update and Add general INSTALL (#4016)

* Improve performance of flushing single objects (#4017)

Improve performance of flushing a single object, and remove metadata
cache flush markers

* Fix memory leak in H5LTopen_file_image when H5LT_FILE_IMAGE_DONT_COPY flag is used (#4021)

When the H5LT_FILE_IMAGE_DONT_COPY flag is passed to H5LTopen_file_image, the internally-allocated
udata structure gets leaked as the core file driver doesn't have a way to determine when or if it
needs to call the 'udata_free' callback. This has been fixed by freeing the udata structure when
the 'image_free' callback gets made during file close, where the file is holding the last reference
to the udata structure.

* Fix allocating too much memory in dset API test (#4041)

* Don't try to load general-19 warnings file for icc (#4042)

The Autotools Classic Intel compiler configuration attempts to load a file
named `general-19` from the intel-warnings/classic directory, which does
not exist.

This removes the attempted load of the file.

* Remove unused AIX cross-compile cache overrides (#4043)

The ibm-aix Autotools config file had some unmaintained and unnecessary
Autoconf cache overrides. These have been removed.

* Consolidate Autotools linux files (#4044)

There are many architecture-specific linux files in the config
directory, all of which simply redirect to linux-gnulibc1.

This change renames linux-gnulibc1 to linux-gnu and deletes the more
specific files.

* Remove check for gettimeofday + tz in CMake (#4045)

This is not used in the library

* Remove limitations on preset generators (#4051)

* Fix issue with FAPL file locking setting inheriting test (#4053)

Fixes an issue where the HDF5_USE_FILE_LOCKING environment variable being
set can interfere with the file locking setting that the test expects to
be returned.

* Bump the github-actions group with 2 updates (#4054)

Bumps the github-actions group with 2 updates: [actions/download-artifact](https://github.com/actions/download-artifact) and [github/codeql-action](https://github.com/github/codeql-action).

* Fix VOL-compatibility issues in External Link API test  (#4039)

Fix link API tests with incorrect filename

* Add updated cmake tools from source location (#4040)

* Add options to allow tools type selection and naming (#4046)

* Improve error messages when tools attempt to use non-enabled S3 and HDFS VFDs. (#4047)

* Correct several 1.15/1.15.0 references to 1.14/1.14.4.

* Ignore HDF5Examples/CMakeUserPresets.json