Improve performance of flushing single objects #4017
Sounds very cool! I'll take a look at the code tomorrow (Thurs) morning.
Addresses #3084
Nice optimization & code cleanup!
I also removed the "clear_slist" parameter from H5C_set_slist_enabled(). It was needed for the old way of flushing tagged entries and was also used for H5C_evict(), but it turns out to be unnecessary in the latter case: pinned entries aren't evicted, but they are still flushed and therefore removed from the skip list. It also appears to be unnecessary in the place in the test where it was used.
Improve performance of flushing a single object, and remove metadata cache flush markers
* Remove oneapi return value warning. (#4028)
* Replaced last sprintf with snprintf (#4007)
  To have the size of the buffer, it was required to change a function signature, and change all users of it. In most cases, determining the buffer size wasn't trivial and so SIZE_MAX is passed. But at least this improves the infrastructure. Someone can later figure out the correct sizes.
* Test vlen sequence IO in API tests (#4027)
* Check argument for CMake REGEX FCMangle.h. (#4029)
* Replace deprecated Fortran 'include mpif.h' with 'USE mpi' (#4031)
  With MPI 4.1 the use of the mpif.h include file has been deprecated. Codes should transition to USE mpi or USE mpi_f08.
  Signed-off-by: Christoph Niethammer <[email protected]>
* Fix H5F_get_access_plist to copy file locking settings (#4030)
  H5F_get_access_plist previously did not copy over the file locking settings from a file into the new File Access Property List that it creates. This would make it difficult to match the file locking settings between an external file and its parent file.
* Fix missing NOT from if check in HL folder (#4036)
* Fix the datatype passed to H5*exists_async APIs in tests. (#4033)
  Add a new testing function to verify C_BOOL values.
* Add deb and rpm binaries to snapshots (#4035)
* Update and Add general INSTALL (#4016)
* Improve performance of flushing single objects (#4017)
  Improve performance of flushing a single object, and remove metadata cache flush markers
* Fix memory leak in H5LTopen_file_image when H5LT_FILE_IMAGE_DONT_COPY flag is used (#4021)
  When the H5LT_FILE_IMAGE_DONT_COPY flag is passed to H5LTopen_file_image, the internally-allocated udata structure gets leaked as the core file driver doesn't have a way to determine when or if it needs to call the 'udata_free' callback. This has been fixed by freeing the udata structure when the 'image_free' callback gets made during file close, where the file is holding the last reference to the udata structure.
* Fix allocating too much memory in dset API test (#4041)
* Don't try to load general-19 warnings file for icc (#4042)
  The Autotools Classic Intel compiler configuration attempts to load a file named `general-19` from the intel-warnings/classic directory, which does not exist. This removes the attempted load of the file.
* Remove unused AIX cross-compile cache overrides (#4043)
  The ibm-aix Autotools config file had some unmaintained and unnecessary Autoconf cache overrides. These have been removed.
* Consolidate Autotools linux files (#4044)
  There are many architecture-specific linux files in the config directory, all of which simply redirect to linux-gnulibc1. This change renames linux-gnulibc1 to linux-gnu and deletes the more specific files.
* Remove check for gettimeofday + tz in CMake (#4045)
  This is not used in the library
* Remove limitations on preset generators (#4051)
* Fix issue with FAPL file locking setting inheriting test (#4053)
  Fixes an issue where the HDF5_USE_FILE_LOCKING environment variable being set can interfere with the file locking setting that the test expects to be returned.
* Bump the github-actions group with 2 updates (#4054)
  Bumps the github-actions group with 2 updates: [actions/download-artifact](https://github.com/actions/download-artifact) and [github/codeql-action](https://github.com/github/codeql-action).
* Fix VOL-compatibility issues in External Link API test (#4039)
  Fix link API tests with incorrect filename
* Add updated cmake tools from source location (#4040)
* Add options to allow tools type selection and naming (#4046)
* Improve error messages when tools attempt to use non-enabled S3 and HDFS VFDs. (#4047)
* Correct several 1.15/1.15.0 references to 1.14/1.14.4.
* Ignore HDF5Examples/CMakeUserPresets.json
Previously the algorithm for flushing a single object was:
This causes obvious performance problems when there are many cache entries: it iterates multiple times over all entries (instead of only the tagged entries) and builds the skip list in O(N log N) time, where N is the total number of entries. This PR changes the algorithm to:
Since this algorithm quickly finds only the tagged cache entries and does not operate on other entries, it is much faster (verified in benchmarks). With this change, flush markers are no longer used, so they have been removed, which simplifies the code in some places (especially in the cache test).
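To illustrate why visiting only the tagged entries helps, here is a toy sketch in C. It is not the HDF5 code: `toy_entry`, `toy_cache`, `toy_insert`, and `toy_flush_tag` are hypothetical names, the per-tag linked list is an assumed way of indexing entries by tag, and a small sorted array stands in for the skip list. The `visited` counter shows that entries belonging to other objects are never touched.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified cache entry (not HDF5's real entry struct). */
typedef struct toy_entry {
    unsigned long     addr;     /* file address of the entry */
    int               tag;      /* object the entry belongs to */
    int               dirty;    /* nonzero if the entry needs writing */
    struct toy_entry *tag_next; /* next entry carrying the same tag */
} toy_entry;

#define TOY_MAX_TAGS 8

/* Hypothetical cache: one singly linked list of entries per tag. */
typedef struct {
    toy_entry *tag_head[TOY_MAX_TAGS];
    size_t     visited; /* entries examined by the last flush */
} toy_cache;

/* Link an entry at the head of its tag's list. */
static void toy_insert(toy_cache *c, toy_entry *e)
{
    assert(e->tag >= 0 && e->tag < TOY_MAX_TAGS);
    e->tag_next         = c->tag_head[e->tag];
    c->tag_head[e->tag] = e;
}

/* "Flush" the dirty entries of one tag in increasing address order.
 * order[] (capacity cap) receives the flushed entries; returns how many. */
static size_t toy_flush_tag(toy_cache *c, int tag, toy_entry **order, size_t cap)
{
    size_t n = 0;

    assert(tag >= 0 && tag < TOY_MAX_TAGS);
    c->visited = 0;

    /* Walk only this tag's list; other objects' entries are never examined. */
    for (toy_entry *e = c->tag_head[tag]; e != NULL; e = e->tag_next) {
        c->visited++;
        if (!e->dirty || n == cap)
            continue;

        /* Sorted insert by address (stand-in for the skip list). */
        size_t i = n++;
        while (i > 0 && order[i - 1]->addr > e->addr) {
            order[i] = order[i - 1];
            i--;
        }
        order[i] = e;
    }

    /* Clearing the dirty flag stands in for the actual write. */
    for (size_t i = 0; i < n; i++)
        order[i]->dirty = 0;

    return n;
}
```

In this sketch the cost of a single-object flush depends on the number of entries carrying that tag rather than on the total number of entries in the cache.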
This should cause a minor change in behavior: if a new cache entry is created during the flush, it will now be flushed. Previously this would not happen because the flush marker was not set on the new entry. I believe this is closer to the intent of single object flushes.
It should be possible to improve performance further by replacing the skip list with another data structure: it is only used to sort entries so they are flushed in increasing address order, and there are more efficient ways to sort an array. I believe this change improves performance enough for now, though replacing the skip list would also speed up full file flushes (as on file close).
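As a rough illustration of the array-based alternative mentioned above, the sketch below collects entries into a contiguous array and sorts them by address with the standard `qsort`. The `flush_item` type and `sort_for_flush` function are hypothetical and are not part of HDF5. The C standard does not specify the complexity of `qsort`, so this only shows the shape of the approach, not a measured improvement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record for an entry waiting to be flushed. */
typedef struct {
    unsigned long addr; /* file address used as the sort key */
    int           id;   /* illustrative handle back to the cache entry */
} flush_item;

/* qsort comparator: increasing address, written to avoid overflow. */
static int flush_item_cmp(const void *a, const void *b)
{
    const flush_item *x = a;
    const flush_item *y = b;

    return (x->addr > y->addr) - (x->addr < y->addr);
}

/* Sort the pending entries in place so they can be written in address order. */
static void sort_for_flush(flush_item *items, size_t n)
{
    assert(items != NULL || n == 0);
    if (n > 1)
        qsort(items, n, sizeof items[0], flush_item_cmp);
}
```

After `sort_for_flush` returns, a caller would walk the array from index 0 upward and write each entry in turn.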