Fix some issues with collective metadata reads for chunked datasets (HDFGroup#3716)

Add functions/callbacks for explicit control over chunk index open/close

Add functions/callbacks to check whether the chunk index is open,
so that it can be opened if necessary before temporarily disabling
collective metadata reads in the library

Add functions/callbacks for requesting the loading of additional
chunk index metadata beyond the chunk index structure itself
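
The callbacks above are described only at a high level. As a rough sketch of the shape such a callback table could take, here is a hypothetical C illustration; the struct and member names are invented for this note and are not HDF5's internal H5D_chunk_ops_t definition.

    #include <stdbool.h>

    /* Hypothetical illustration only: member names and signatures are
     * invented and do not match HDF5's internal chunk index ops table. */
    typedef struct chunk_index_ops_sketch_t {
        /* Explicit control over opening/closing the chunk index */
        int (*idx_open)(void *idx_info);
        int (*idx_close)(void *idx_info);

        /* Check whether the chunk index is open, so it can be opened
         * before collective metadata reads are temporarily disabled */
        int (*idx_is_open)(const void *idx_info, bool *is_open);

        /* Request loading of additional chunk index metadata (e.g.,
         * fixed array data blocks) beyond the chunk index itself */
        int (*idx_load_metadata)(void *idx_info);
    } chunk_index_ops_sketch_t;
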
jhendersonHDF committed Oct 24, 2023
1 parent e1437b1 commit cec993e
Showing 10 changed files with 1,292 additions and 624 deletions.
40 changes: 40 additions & 0 deletions release_docs/RELEASE.txt
@@ -238,6 +238,46 @@ Bug Fixes since HDF5-1.14.2 release
===================================
Library
-------
- Fixed some issues with chunk index metadata not getting read
collectively when collective metadata reads are enabled

When looking up dataset chunks during I/O, the parallel library
temporarily disables collective metadata reads since it's generally
unlikely that the application will read the same chunks from all
MPI ranks. Leaving collective metadata reads enabled during
chunk lookups can lead to hangs or other bad behavior depending
on the chunk indexing structure used for the dataset in question.
However, because dataset chunk index metadata was previously
loaded in a deferred manner, the metadata for the main chunk
index structure, or its accompanying pieces of metadata (e.g.,
fixed array data blocks), could end up being read independently
if these chunk lookups were the first chunk index-related
operations performed on a dataset. This behavior is typically
observed when opening a dataset whose metadata is not yet in the
metadata cache and then immediately performing I/O on that
dataset. It is not typically observed when creating a dataset and
then performing I/O on it, since the relevant metadata will
usually already be in the metadata cache as a side effect of
creating the chunk index structures during dataset creation.
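
For context, the access pattern described above can be reproduced with standard parallel HDF5 calls: enable collective metadata reads on the file access property list, open an existing file and chunked dataset, and immediately perform collective I/O. This is a minimal sketch assuming a parallel (MPI) build of HDF5; the file name "chunked.h5", the dataset path "/data", and the 100-element integer dataset are illustrative, and error checking is omitted.

    #include <mpi.h>
    #include <hdf5.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        /* File access property list: MPI-IO driver with collective
         * metadata reads enabled */
        hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
        H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
        H5Pset_all_coll_metadata_ops(fapl, 1);

        /* Open a pre-existing file and chunked dataset; the dataset's
         * chunk index metadata is not yet in the metadata cache */
        hid_t file = H5Fopen("chunked.h5", H5F_ACC_RDONLY, fapl);
        hid_t dset = H5Dopen2(file, "/data", H5P_DEFAULT);

        /* Immediately perform collective I/O on the dataset
         * (assumes "/data" holds 100 ints) */
        hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
        H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);

        int buf[100];
        H5Dread(dset, H5T_NATIVE_INT, H5S_ALL, H5S_ALL, dxpl, buf);

        H5Pclose(dxpl);
        H5Dclose(dset);
        H5Fclose(file);
        H5Pclose(fapl);
        MPI_Finalize();
        return 0;
    }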

This issue has been fixed by adding callbacks to the different
chunk indexing structure classes that allow more explicit control
over when chunk index metadata is loaded. When collective
metadata reads are enabled, the necessary index metadata will now
be loaded collectively by all MPI ranks at the start of dataset
I/O to ensure that the ranks don't unintentionally read this
metadata independently later on. These changes fix collective
loading of the main chunk index structure, as well as v2 B-tree
root nodes, extensible array index blocks, and fixed array data
blocks. However, some pieces of metadata still cannot be loaded
collectively, such as extensible array data blocks, data block
pages, and super blocks, as well as fixed array
data block pages. These pieces of metadata are not necessarily
read in by all MPI ranks since this depends on which chunks the
ranks have selected in the dataset. Therefore, reading of these
pieces of metadata remains an independent operation.
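
To make the ordering concrete, the following C sketch outlines the control flow this paragraph describes. The helper names are hypothetical and do not correspond to HDF5's internal functions; they only indicate when the collective and independent phases occur.

    /* Hypothetical helpers, declared only to make the sketch compile;
     * they are not HDF5 internals. */
    typedef struct dataset dataset_t;

    void chunk_index_open_collective(dataset_t *dset);
    void chunk_index_load_metadata_collective(dataset_t *dset);
    void disable_collective_metadata_reads(dataset_t *dset);
    void restore_collective_metadata_reads(dataset_t *dset);
    void lookup_selected_chunks(dataset_t *dset);

    void begin_chunked_dataset_io(dataset_t *dset, int coll_md_reads_enabled)
    {
        if (coll_md_reads_enabled) {
            /* All ranks collectively load the chunk index plus its
             * companion metadata (v2 B-tree root node, extensible array
             * index block, fixed array data block) up front. */
            chunk_index_open_collective(dset);
            chunk_index_load_metadata_collective(dset);
        }

        /* Per-chunk lookups still run with collective metadata reads
         * temporarily disabled, but they no longer fault in that index
         * metadata independently because it is already cached. */
        disable_collective_metadata_reads(dset);
        lookup_selected_chunks(dset);
        restore_collective_metadata_reads(dset);
    }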

- Fixed potential hangs in parallel library during collective I/O with
independent metadata writes

442 changes: 268 additions & 174 deletions src/H5Dbtree.c

Large diffs are not rendered by default.
