From 73157768f929072902971449a73845650c6356d7 Mon Sep 17 00:00:00 2001 From: Allen Byrne <50328838+byrnHDF@users.noreply.github.com> Date: Wed, 20 Nov 2024 19:55:32 -0600 Subject: [PATCH] Convert external chunking documentation to doxygen (#5131) --- doxygen/dox/DSChunkingIssues.dox | 139 +++++++++++ doxygen/dox/IntroParHDF5.dox | 2 +- doxygen/dox/LearnBasics3.dox | 10 +- doxygen/dox/LearnHDFView.dox | 3 +- doxygen/dox/UsersGuide.dox | 2 +- doxygen/dox/chunking_in_hdf5.dox | 398 +++++++++++++++++++++++++++++++ doxygen/img/Chunk_f1.gif | Bin 0 -> 3664 bytes doxygen/img/Chunk_f2.gif | Bin 0 -> 3986 bytes doxygen/img/Chunk_f3.gif | Bin 0 -> 6815 bytes doxygen/img/Chunk_f4.gif | Bin 0 -> 5772 bytes doxygen/img/Chunk_f5.gif | Bin 0 -> 5455 bytes doxygen/img/Chunk_f6.gif | Bin 0 -> 4949 bytes doxygen/img/chunking1and2.png | Bin 0 -> 160528 bytes doxygen/img/chunking3and4.png | Bin 0 -> 332358 bytes doxygen/img/chunking5.png | Bin 0 -> 108034 bytes doxygen/img/chunking6.png | Bin 0 -> 246707 bytes doxygen/img/chunking7.png | Bin 0 -> 161937 bytes doxygen/img/chunking8.png | Bin 0 -> 7627 bytes src/H5Gpublic.h | 2 +- src/H5Tmodule.h | 1 + 20 files changed, 547 insertions(+), 10 deletions(-) create mode 100644 doxygen/dox/DSChunkingIssues.dox create mode 100644 doxygen/dox/chunking_in_hdf5.dox create mode 100644 doxygen/img/Chunk_f1.gif create mode 100644 doxygen/img/Chunk_f2.gif create mode 100644 doxygen/img/Chunk_f3.gif create mode 100644 doxygen/img/Chunk_f4.gif create mode 100644 doxygen/img/Chunk_f5.gif create mode 100644 doxygen/img/Chunk_f6.gif create mode 100644 doxygen/img/chunking1and2.png create mode 100644 doxygen/img/chunking3and4.png create mode 100644 doxygen/img/chunking5.png create mode 100644 doxygen/img/chunking6.png create mode 100644 doxygen/img/chunking7.png create mode 100644 doxygen/img/chunking8.png diff --git a/doxygen/dox/DSChunkingIssues.dox b/doxygen/dox/DSChunkingIssues.dox new file mode 100644 index 00000000000..a6477db5e2c --- /dev/null +++ b/doxygen/dox/DSChunkingIssues.dox @@ -0,0 +1,139 @@ +/** \page hdf5_chunk_issues Dataset Chunking Issues + * + * \section sec_hdf5_chunk_issues_intro Introduction + * Chunking refers to a storage layout where a dataset is partitioned into fixed-size multi-dimensional chunks. + * The chunks cover the dataset but the dataset need not be an integral number of chunks. If no data is ever written + * to a chunk then that chunk isn't allocated on disk. Figure 1 shows a 25x48 element dataset covered by nine 10x20 chunks + * and 11 data points written to the dataset. No data was written to the region of the dataset covered by three of the chunks + * so those chunks were never allocated in the file -- the other chunks are allocated at independent locations in the file + * and written in their entirety. + * + * + * + * + * + *
+ * \image html Chunk_f1.gif "Figure 1" + *
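A minimal C sketch of the layout in Figure 1: a 25x48 dataset stored as 10x20 chunks, requested through #H5Pset_chunk on the dataset creation property list. The file name, dataset name, and element type are assumptions made for the example, and error checking is omitted.

\code
#include "hdf5.h"

int main(void)
{
    hsize_t dims[2]  = {25, 48};   /* dataset extent from Figure 1              */
    hsize_t chunk[2] = {10, 20};   /* chunk shape: nine chunks cover the extent */

    hid_t file  = H5Fcreate("chunked.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    hid_t space = H5Screate_simple(2, dims, NULL);

    /* Request a chunked layout on the dataset creation property list. */
    hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(dcpl, 2, chunk);

    hid_t dset = H5Dcreate2(file, "dset", H5T_NATIVE_INT, space,
                            H5P_DEFAULT, dcpl, H5P_DEFAULT);

    /* No chunk occupies file space yet; chunks are allocated only when
       data is first written to them, as described above. */

    H5Dclose(dset);
    H5Pclose(dcpl);
    H5Sclose(space);
    H5Fclose(file);
    return 0;
}
\endcode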
+ * + * The HDF5 library treats chunks as atomic objects -- disk I/O is always in terms of complete chunks (parallel versions + * of the library can access individual bytes of a chunk when the underlying file uses MPI-IO.). This allows data filters + * to be defined by the application to perform tasks such as compression, encryption, checksumming, etc. on entire chunks. + * As shown in Figure 2, if #H5Dwrite touches only a few bytes of the chunk, the entire chunk is read from the file, the + * data passes upward through the filter pipeline, the few bytes are modified, the data passes downward through the filter + * pipeline, and the entire chunk is written back to the file. + * + * + * + * + * + *
+ * \image html Chunk_f2.gif "Figure 2" + *
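The read-modify-write cycle in Figure 2 is triggered by any #H5Dwrite that touches only part of a chunk. A hedged sketch: the 2x2 hyperslab below updates four elements of the dataset created earlier, yet the library still moves the enclosing 10x20 chunk through the filter pipeline in both directions. The file and dataset names are the same assumptions used above.

\code
#include "hdf5.h"

int update_corner(void)
{
    int     vals[4]  = {1, 2, 3, 4};
    hsize_t start[2] = {0, 0};   /* top-left corner of the first chunk      */
    hsize_t count[2] = {2, 2};   /* a 2x2 patch, far smaller than one chunk */
    hsize_t mdims[1] = {4};

    hid_t file   = H5Fopen("chunked.h5", H5F_ACC_RDWR, H5P_DEFAULT);
    hid_t dset   = H5Dopen2(file, "dset", H5P_DEFAULT);
    hid_t fspace = H5Dget_space(dset);
    hid_t mspace = H5Screate_simple(1, mdims, NULL);

    /* Select the few elements to modify; the whole chunk is still read,
       patched in memory, and written back. */
    H5Sselect_hyperslab(fspace, H5S_SELECT_SET, start, NULL, count, NULL);
    H5Dwrite(dset, H5T_NATIVE_INT, mspace, fspace, H5P_DEFAULT, vals);

    H5Sclose(mspace);
    H5Sclose(fspace);
    H5Dclose(dset);
    H5Fclose(file);
    return 0;
}
\endcode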
+ * + * \section sec_hdf5_chunk_issues_data The Raw Data Chunk Cache + * It's obvious from Figure 2 that calling #H5Dwrite many times from the application would result in poor performance even + * if the data being written all falls within a single chunk. A raw data chunk cache layer was added between the top of + * the filter stack and the bottom of the byte modification layer. + * By default, the chunk cache will store 521 chunks + * or 1MB of data (whichever is less) but these values can be modified with #H5Pset_cache. + * + * The preemption policy for the cache favors certain chunks and tries not to preempt them. + * \li Chunks that have been accessed frequently in the near past are favored. + * \li A chunk which has just entered the cache is favored. + * \li A chunk which has been completely read or completely written but not partially read or written is penalized according + * to some application specified weighting between zero and one. + * \li A chunk which is larger than the maximum cache size is not eligible for caching. + * + * \section sec_hdf5_chunk_issues_effic Cache Efficiency + * Now for some real numbers... A 2000x2000 element dataset is created and covered by a 20x20 array of chunks (each chunk is + * 100x100 elements). The raw data cache is adjusted to hold at most 25 chunks by setting the maximum number of bytes to 25 + * times the chunk size in bytes. Then the application creates a square, two-dimensional memory buffer and uses it as a window + * into the dataset, first reading and then rewriting in row-major order by moving the window across the dataset (the read and + * write tests both start with a cold cache). + * + * The measure of efficiency in Figure 3 is the number of bytes requested by the application divided by the number of bytes + * transferred from the file. There are at least a couple ways to get an estimate of the cache performance: one way is to turn + * on cache debugging and look at the number of cache misses. A more accurate and specific way is to register a data filter whose + * sole purpose is to count the number of bytes that pass through it (that's the method used below). + * + * + * + * + * + *
+ * \image html Chunk_f3.gif "Figure 3" + *
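The cache configuration used in this experiment can be expressed with #H5Pset_cache on a file access property list before the file is opened. This is only a sketch under stated assumptions: the element type (8-byte double) and file name are not given in the text above, and the metadata-cache argument is ignored by recent library versions.

\code
#include "hdf5.h"

hid_t open_with_sized_cache(void)
{
    size_t chunk_bytes = 100 * 100 * sizeof(double); /* one 100x100 chunk (type assumed) */
    size_t rdcc_nbytes = 25 * chunk_bytes;           /* hold at most 25 chunks           */
    size_t rdcc_nslots = 521;                        /* default number of hash slots     */
    double rdcc_w0     = 0.75;                       /* default preemption penalty       */

    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    /* The second argument (metadata cache elements) is ignored in HDF5 1.8 and later. */
    H5Pset_cache(fapl, 0, rdcc_nslots, rdcc_nbytes, rdcc_w0);

    hid_t file = H5Fopen("cache_test.h5", H5F_ACC_RDWR, fapl); /* file name assumed */
    H5Pclose(fapl);
    return file;
}
\endcode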
+ * + * The read efficiency is less than one for two reasons: collisions in the cache are handled by preempting one of + * the colliding chunks, and the preemption algorithm occasionally preempts a chunk which hasn't been referenced for + * a long time but is about to be referenced in the near future. + * + * The write test results in lower efficiency for most window sizes because HDF5 is unaware that the application is about + * to overwrite the entire dataset and must read in most chunks before modifying parts of them. + * + * There is a simple way to improve efficiency for this example. It turns out that any chunk that has been completely + * read or written is a good candidate for preemption. If we increase the penalty for such chunks from the default 0.75 + * to the maximum 1.00 then efficiency improves. + * + * + * + * + * + *
+ * \image html Chunk_f4.gif "Figure 4" + *
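Raising the penalty from 0.75 to 1.00 is the last argument to #H5Pset_cache; everything else below is left at its default. This is only a sketch of the change behind Figure 4, and it must be applied to the file access property list before the file is opened.

\code
#include "hdf5.h"

/* Fully penalize chunks that have been completely read or written so they
   are preempted first, as in the Figure 4 run. */
void penalize_fully_accessed_chunks(hid_t fapl)
{
    H5Pset_cache(fapl, 0, 521 /* slots */, 1024 * 1024 /* bytes */, 1.0 /* rdcc_w0 */);
}
\endcode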
+ * + * The read efficiency is still less than one because of collisions in the cache. The number of collisions can often + * be reduced by increasing the number of slots in the cache. Figure 5 shows what happens when the maximum number of + * slots is increased by an order of magnitude from the default (this change has no major effect on memory used by + * the test since the byte limit was not increased for the cache). + * + * + * + * + * + *
+ * \image html Chunk_f5.gif "Figure 5" + *
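The slot count can also be raised for a single dataset through its dataset access property list with #H5Pset_chunk_cache, an API not discussed in the text above but usable for the same purpose. The factor of ten mirrors the Figure 5 configuration; the dataset name is assumed.

\code
#include "hdf5.h"

hid_t open_with_more_slots(hid_t file)
{
    hid_t dapl = H5Pcreate(H5P_DATASET_ACCESS);
    H5Pset_chunk_cache(dapl,
                       5210,         /* ten times the default 521 slots     */
                       1024 * 1024,  /* byte limit left at the 1MB default  */
                       H5D_CHUNK_CACHE_W0_DEFAULT);

    hid_t dset = H5Dopen2(file, "dset", dapl); /* dataset name assumed */
    H5Pclose(dapl);
    return dset;
}
\endcode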
+ * + * Although the application eventually overwrites every chunk completely the library has no way of knowing this + * beforehand since most calls to #H5Dwrite modify only a portion of any given chunk. Therefore, the first modification of a + * chunk will cause the chunk to be read from disk into the chunk buffer through the filter pipeline. Eventually HDF5 might + * contain a dataset transfer property that can turn off this read operation resulting in write efficiency which is equal + * to read efficiency. + * + * \section sec_hdf5_chunk_issues_frag Fragmentation + * Even if the application transfers the entire dataset contents with a single call to #H5Dread or #H5Dwrite it's + * possible the request will be broken into smaller, more manageable pieces by the library. This is almost certainly + * true if the data transfer includes a type conversion. + * + * + * + * + * + *
+ * \image html Chunk_f6.gif "Figure 6" + *
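The pieces in Figure 6 correspond to the type-conversion buffer, whose default size (noted just below) is 1MB and which is controlled by #H5Pset_buffer on a dataset transfer property list. A hedged sketch; the buffer size passed in is the caller's choice, for example 4MB.

\code
#include "hdf5.h"

herr_t read_with_larger_buffer(hid_t dset, double *out, size_t bufsize)
{
    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    /* NULL pointers ask the library to allocate the conversion and
       background buffers of the requested size itself. */
    H5Pset_buffer(dxpl, bufsize, NULL, NULL);

    herr_t status = H5Dread(dset, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL, dxpl, out);

    H5Pclose(dxpl);
    return status;
}
\endcode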
+ * + * By default the strip size is 1MB but it can be changed by calling #H5Pset_buffer. + * + * \section sec_hdf5_chunk_issues_store File Storage Overhead + * The chunks of the dataset are allocated at independent locations throughout the HDF5 file and a B-tree maps chunk + * N-dimensional addresses to file addresses. The more chunks that are allocated for a dataset the larger the B-tree. + * + * Large B-trees have two disadvantages: + * \li The file storage overhead is higher and more disk I/O is required to traverse the tree from root to leaves. + * \li The increased number of B-tree nodes will result in higher contention for the metadata cache. + * There are three ways to reduce the number of B-tree nodes. The obvious way is to reduce the number of chunks by + * choosing a larger chunk size (doubling the chunk size will cut the number of B-tree nodes in half). Another method + * is to adjust the split ratios for the B-tree by calling #H5Pset_btree_ratios, but this method typically results in only a + * slight improvement over the default settings. Finally, the out-degree of each node can be increased by calling + * #H5Pset_istore_k (increasing the out degree actually increases file overhead while decreasing the number of nodes). + * + * \section sec_hdf5_chunk_issues_comp Chunk Compression + * Dataset chunks can be compressed through the use of filters. See the chapter \ref subsec_dataset_filters in the \ref UG. + * + * Reading and rewriting compressed chunked data can result in holes in an HDF5 file. In time, enough such holes can increase + * the file size enough to impair application or library performance when working with that file. See @ref H5TOOL_RP_UG. + */ diff --git a/doxygen/dox/IntroParHDF5.dox b/doxygen/dox/IntroParHDF5.dox index 58a6e7958b0..a02cbbb5253 100644 --- a/doxygen/dox/IntroParHDF5.dox +++ b/doxygen/dox/IntroParHDF5.dox @@ -35,7 +35,7 @@ The following shows the Parallel HDF5 implementation layers: This tutorial assumes that you are somewhat familiar with parallel programming with MPI (Message Passing Interface). If you are not familiar with parallel programming, here is a tutorial that may be of interest: -Tutorial on HDF5 I/O tuning at NERSC. +Tutorial on HDF5 I/O tuning at NERSC (PDF). (NOTE: As of 2024, the specific systems described in this tutorial are outdated.) Some of the terms that you must understand in this tutorial are: diff --git a/doxygen/dox/LearnBasics3.dox b/doxygen/dox/LearnBasics3.dox index 13cb4f43abd..6e569aa41bb 100644 --- a/doxygen/dox/LearnBasics3.dox +++ b/doxygen/dox/LearnBasics3.dox @@ -181,8 +181,7 @@ created the dataset layout cannot be changed. The h5repack utility can be used t to a new with a new layout. \section secLBDsetLayoutSource Sources of Information -Chunking in HDF5 -(See the documentation on Advanced Topics in HDF5) +\ref hdf5_chunking \see \ref sec_plist in the HDF5 \ref UG.
@@ -201,7 +200,7 @@ certain initial dimensions, then to later increase the size of any of the initia HDF5 requires you to use chunking to define extendible datasets. This makes it possible to extend datasets efficiently without having to excessively reorganize storage. (To use chunking efficiently, -be sure to see the advanced topic, Chunking in HDF5.) +be sure to see the advanced topic, \ref hdf5_chunking.) The following operations are required in order to extend a dataset: \li Declare the dataspace of the dataset to have unlimited dimensions for all dimensions that might eventually be extended. @@ -243,7 +242,7 @@ Navigate back: \ref index "Main" / \ref GettingStarted / \ref LearnBasics \section secLBComDsetCreate Creating a Compressed Dataset HDF5 requires you to use chunking to create a compressed dataset. (To use chunking efficiently, -be sure to see the advanced topic, Chunking in HDF5.) +be sure to see the advanced topic, \ref hdf5_chunking.) The following operations are required in order to create a compressed dataset: \li Create a dataset creation property list. @@ -251,7 +250,8 @@ The following operations are required in order to create a compressed dataset: \li Create the dataset. \li Close the dataset creation property list and dataset. -For more information on compression, see the FAQ question on Using Compression in HDF5. +For more information on troubleshooting compression issues, see the + HDF5 Compression Troubleshooting (PDF). \section secLBComDsetProg Programming Example diff --git a/doxygen/dox/LearnHDFView.dox b/doxygen/dox/LearnHDFView.dox index 3b9afdb1ef1..298ae52ca4c 100644 --- a/doxygen/dox/LearnHDFView.dox +++ b/doxygen/dox/LearnHDFView.dox @@ -245,8 +245,7 @@ dataset must be stored with a chunked dataset layout (as multiple chunksChunking in HDF5 documentation. +information on chunking and specifying an appropriate chunk size, see the \ref hdf5_chunking documentation. Also see the HDF5 Tutorial topic on \ref secLBComDsetCreate.
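Tying together the extendible-dataset and compressed-dataset steps listed in the LearnBasics3 changes above, the sketch below creates a chunked dataset with an unlimited first dimension and gzip compression, then extends it. File name, dataset name, chunk shape, and compression level are assumptions made for the example, and error checking is omitted.

\code
#include "hdf5.h"

int main(void)
{
    hsize_t dims[2]    = {0, 1024};              /* start empty                 */
    hsize_t maxdims[2] = {H5S_UNLIMITED, 1024};  /* extendible first dimension  */
    hsize_t chunk[2]   = {256, 1024};            /* chunk shape (assumed)       */
    hsize_t newsize[2] = {512, 1024};            /* size after one extension    */

    hid_t file  = H5Fcreate("compressed.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    hid_t space = H5Screate_simple(2, dims, maxdims);

    /* Both compression and extendibility require a chunked layout. */
    hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(dcpl, 2, chunk);
    H5Pset_deflate(dcpl, 6);                     /* gzip, level 6 (assumed) */

    hid_t dset = H5Dcreate2(file, "data", H5T_NATIVE_FLOAT, space,
                            H5P_DEFAULT, dcpl, H5P_DEFAULT);

    /* Grow the dataset later as data arrives, then write via hyperslab selections. */
    H5Dset_extent(dset, newsize);

    H5Dclose(dset);
    H5Pclose(dcpl);
    H5Sclose(space);
    H5Fclose(file);
    return 0;
}
\endcode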