From dec53146340aec4725c3043441fdcc14c9e13bf2 Mon Sep 17 00:00:00 2001 From: Allen Byrne <50328838+byrnHDF@users.noreply.github.com> Date: Fri, 13 Dec 2024 10:31:55 -0600 Subject: [PATCH] Split doxygen pages from spec and TN into files (#5165) --- doxygen/dox/GettingStarted.dox | 10 +- doxygen/dox/Specifications.dox | 122 +--- doxygen/dox/TechnicalNotes.dox | 1028 -------------------------------- 3 files changed, 13 insertions(+), 1147 deletions(-) diff --git a/doxygen/dox/GettingStarted.dox b/doxygen/dox/GettingStarted.dox index 75a6b017279..a37b197afea 100644 --- a/doxygen/dox/GettingStarted.dox +++ b/doxygen/dox/GettingStarted.dox @@ -50,7 +50,7 @@ The high-level HDF5 library includes several sets of convenience and standard-us -\ref IntroParHDF5 +@ref IntroParHDF5 A brief introduction to Parallel HDF5. If you are new to HDF5 please see the @ref LearnBasics topic first. @@ -58,7 +58,7 @@ A brief introduction to Parallel HDF5. If you are new to HDF5 please see the @re -\ref ViewTools +@ref ViewTools \li @ref LearnHDFView @@ -71,8 +71,8 @@ A brief introduction to Parallel HDF5. 
If you are new to HDF5 please see the @re New Features since HDF5-1.10 -\li \ref VDSTN -\li \ref SWMRTN +\li @ref VDSTN +\li @ref SWMRTN @@ -80,7 +80,7 @@ New Features since HDF5-1.10 Example Programs -\ref HDF5Examples +@ref HDF5Examples diff --git a/doxygen/dox/Specifications.dox b/doxygen/dox/Specifications.dox index b0d9c750742..84cead020b9 100644 --- a/doxygen/dox/Specifications.dox +++ b/doxygen/dox/Specifications.dox @@ -9,145 +9,39 @@ \section File Format -\li \ref FMT1 -\li \ref FMT11 -\li \ref FMT2 -\li \ref FMT3 +\li \ref FMT1SPEC +\li \ref FMT11SPEC +\li \ref FMT2SPEC +\li \ref FMT3SPEC \section Other \li \ref IMG -<<<<<<< Upstream, based on branch 'develop-doxy-fformat' of https://github.com/byrnHDF/hdf5.git \li \ref TBLSPEC \li \ref sec_dim_scales_spec -======= -\li \ref TBL -\li - HDF5 Dimension Scale Specification */ -/** \page FMT3 HDF5 File Format Specification Version 3.0 +/** \page FMT3SPEC HDF5 File Format Specification Version 3.0 \htmlinclude H5.format.html */ -/** \page FMT2 HDF5 File Format Specification Version 2.0 +/** \page FMT2SPEC HDF5 File Format Specification Version 2.0 \htmlinclude H5.format.2.0.html */ -/** \page FMT11 HDF5 File Format Specification Version 1.1 +/** \page FMT11SPEC HDF5 File Format Specification Version 1.1 \htmlinclude H5.format.1.1.html */ -/** \page FMT1 HDF5 File Format Specification Version 1.0 +/** \page FMT1SPEC HDF5 File Format Specification Version 1.0 \htmlinclude H5.format.1.0.html */ - -/** \page TBL HDF5 Table Specification Version 1.0 -The HDF5 specification defines the standard objects and storage for the standard HDF5 -objects. (For information about the HDF5 library, model and specification, see the HDF -documentation.) This document is an additional specification do define a standard profile -for how to store tables in HDF5. Table data in HDF5 is stored as HDF5 datasets with standard -attributes to define the properties of the tables. 
- -\section sec_tab_spec_intro Introduction -A generic table is a sequence of records, each record has a name and a type. Table data -is stored as an HDF5 one dimensional compound dataset. A table is defined as a collection -of records whose values are stored in fixed-length fields. All records have the same structure -and all values in each field have the same data type. - -The dataset for a table is distinguished from other datasets by giving it an attribute -"CLASS=TABLE". Optional attributes allow the storage of a title for the Table and for -each column, and a fill value for each column. - -\section sec_tab_spec_attr Table Attributes -The attributes for the Table are strings. They are written with the #H5LTset_attribute_string -Lite API function. "Required" attributes must always be used. "Optional" attributes must be -used when required. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-<table>
-<caption>Table 1. Attributes of a Table Dataset</caption>
-<tr>
-<th>Attribute Name</th>
-<th>Required<br>Optional</th>
-<th>Type</th>
-<th>String Size</th>
-<th>Value</th>
-<th>Description</th>
-</tr>
-<tr>
-<td>CLASS</td>
-<td>Required</td>
-<td>String</td>
-<td>5</td>
-<td>"TABLE"</td>
-<td>This attribute is of type #H5T_C_S1, with size 5. For all Tables, the value of this attribute is
-TABLE. This attribute identifies this data set as intended to be interpreted as a Table that
-conforms to the specifications on this page.</td>
-</tr>
-<tr>
-<td>VERSION</td>
-<td>Required</td>
-<td>String</td>
-<td>3</td>
-<td>"0.2"</td>
-<td>This attribute is of type #H5T_C_S1, with size corresponding to the length of the version string.
-This attribute identifies the version number of this specification to which it conforms. The current
-version number is "0.2".</td>
-</tr>
-<tr>
-<td>TITLE</td>
-<td>Optional</td>
-<td>String</td>
-<td></td>
-<td></td>
-<td>The TITLE is an optional String that is to be used as the informative title of the whole table.
-The TITLE is set with the parameter table_title of the function #H5TBmake_table.</td>
-</tr>
-<tr>
-<td>FIELD_(n)_NAME</td>
-<td>Required</td>
-<td>String</td>
-<td></td>
-<td></td>
-<td>The FIELD_(n)_NAME is a required String that is to be used as the informative title of column n
-of the table. For each of the fields the word FIELD_ is concatenated with the zero based field (n)
-index together with the name of the field.</td>
-</tr>
-<tr>
-<td>FIELD_(n)_FILL</td>
-<td>Optional</td>
-<td>String</td>
-<td></td>
-<td></td>
-<td>The FIELD_(n)_FILL is an optional String that is the fill value for column n of the table.
-For each of the fields the word FIELD_ is concatenated with the zero based field (n) index
-together with the fill value, if present. This value is written only when a fill value is defined
-for the table.</td>
-</tr>
-</table>
- -The following section of code shows the calls necessary to the creation of a table. -\code -// Create a new HDF5 file using default properties. -file_id = H5Fcreate("my_table.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT); - -// Call the make table function -H5TBmake_table("Table Title", file_id, "Table1", NFIELDS, NRECORDS, dst_size, field_names, dst_offset, field_type, chunk_size, fill_data, compress, p_data); - -// Close the file. -status = H5Fclose(file_id); -\endcode - -For more information see the @ref H5TB reference manual page and the @ref H5TB_UG, which includes examples. - ->>>>>>> c7cb5ae Convert ImageSpec html file to doxygen (#5163) - -*/ diff --git a/doxygen/dox/TechnicalNotes.dox b/doxygen/dox/TechnicalNotes.dox index 3e3a0edfb7d..47ba029cc92 100644 --- a/doxygen/dox/TechnicalNotes.dox +++ b/doxygen/dox/TechnicalNotes.dox @@ -24,1034 +24,6 @@ */ -<<<<<<< Upstream, based on branch 'develop-doxy-fformat' of https://github.com/byrnHDF/hdf5.git -======= -/** \page VFL HDF5 Virtual File Layer - -\section sec_vfl_intro Introduction -The HDF5 file format describes how HDF5 data structures and dataset raw data are mapped -to a linear format address space and the HDF5 library implements that bidirectional mapping -in terms of an API. However, the HDF5 format specifications do not indicate how the format -address space is mapped onto storage and HDF (version 5 and earlier) simply mapped the format -address space directly onto a single file by convention. - -Since early versions of HDF5 it became apparent that users want the ability to map the -format address space onto different types of storage (a single file, multiple files, local -memory, global memory, network distributed global memory, a network protocol, etc.) with -various types of maps. 
For instance, some users want to be able to handle very large format
-address spaces on operating systems that support only 2GB files by partitioning the format
-address space into equal-sized parts each served by a separate file. Other users want the
-same multi-file storage capability but want to partition the address space according to
-purpose (raw data in one file, object headers in another, global heap in a third, etc.)
-in order to improve I/O speeds.
-
-In fact, the number of storage variations is probably larger than the number of methods
-that the HDF5 team is capable of implementing and supporting. Therefore, a Virtual File
-Layer API is being implemented which will allow application teams or departments to design
-and implement their own mapping between the HDF5 format address space and storage, with each
-mapping being a separate file driver (possibly written in terms of other file drivers). The
-HDF5 team will provide a small set of useful file drivers which will also serve as examples
-for those who wish to write their own:
-
-<table>
-<tr>
-<td>#H5FD_SEC2</td>
-<td>This is the default driver which uses POSIX file-system functions
-like read and write to perform I/O to a single file. All I/O requests are unbuffered
-although the driver does optimize file seeking operations to some extent.
-</td>
-</tr>
-<tr>
-<td>#H5FD_STDIO</td>
-<td>This driver uses functions from 'stdio.h' to perform buffered I/O to a single file.
-</td>
-</tr>
-<tr>
-<td>#H5FD_CORE</td>
-<td>This driver performs I/O directly to memory and can be
-used to create small temporary files that never exist on permanent storage. This
-type of storage is generally very fast since the I/O consists only of memory-to-memory copy operations.
-</td>
-</tr>
-<tr>
-<td>#H5FD_MPIO</td>
-<td>This is the driver of choice for accessing files in parallel
-using MPI and MPI-IO. It is only predefined if the library is compiled with parallel I/O support.
-</td>
-</tr>
-<tr>
-<td>#H5FD_FAMILY</td>
-<td>Large format address spaces are partitioned into more
-manageable pieces and sent to separate storage locations using an underlying driver
-of the user's choice. \ref H5TOOL_RT_UG can be used to change the sizes of the family
-members when stored as files or to convert a family of files to a single file or vice versa.
-</td>
-</tr>
-</table>
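The partitioning described for the family driver can be pictured with a small sketch. This is an illustration only, not the library's implementation: the type `family_loc_t` and the function `family_locate` are hypothetical names, and the sketch assumes equal-sized members so that a format address reduces to a member index plus an offset inside that member.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: which family member holds an address,
 * and where inside that member the address falls. */
typedef struct {
    uint64_t member; /* zero-based index of the family member */
    uint64_t offset; /* byte offset within that member */
} family_loc_t;

/* Map a format address onto one of several equal-sized members.
 * memb_size must be non-zero. */
static family_loc_t
family_locate(uint64_t addr, uint64_t memb_size)
{
    family_loc_t loc;

    assert(memb_size > 0);
    loc.member = addr / memb_size;
    loc.offset = addr % memb_size;
    return loc;
}
```

With a member size of 100 bytes, format address 250 would land in member 2 at offset 50. Because the mapping is a single division, the member and offset can be found in constant time for each I/O request.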
- -\section sec_vfl_use Using a File Driver -Most application writers will use a driver defined by the HDF5 library or contributed by another -programming team. This chapter describes how existing drivers are used. - -\subsection subsec_vfl_use_hdr Driver Header Files -Each file driver is defined in its own public header file which should be included by any -application which plans to use that driver. The predefined drivers are in header files whose -names begin with 'H5FD' followed by the driver name and '.h'. The 'hdf5.h' header file includes -all the predefined driver header files. - -Once the appropriate header file is included a symbol of the form 'H5FD_' followed by the -upper-case driver name will be the driver identification number.(The driver name is by convention -and might not apply to drivers which are not distributed with HDF5.) However, the value may -change if the library is closed (e.g., by calling #H5close) and the symbol is referenced again. - -\subsection subsec_vfl_use_create Creating and Opening Files -In order to create or open a file one must define the method by which the storage is -accessed(The access method also indicates how to translate the storage name to a storage server -such as a file, network protocol, or memory.) and does so by creating a file access property -list(The term "file access property list" is a misnomer since storage isn't required to be a file.) -which is passed to the #H5Fcreate or #H5Fopen function. 
A default file access property list is created -by calling #H5Pcreate and then the file driver information is inserted by calling a driver initialization -function such as #H5Pset_fapl_family: -\code -hid_t fapl = H5Pcreate(H5P_FILE_ACCESS); -size_t member_size = 100*1024*1024; /*100MB*/ -H5Pset_fapl_family(fapl, member_size, H5P_DEFAULT); -hid_t file = H5Fcreate("foo%05d.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl); -H5Pclose(fapl); -\endcode - -Each file driver will have its own initialization function whose name is H5Pset_fapl_ followed by -the driver name and which takes a file access property list as the first argument followed by additional -driver-dependent arguments. - -An alternative to using the driver initialization function is to set the driver directly using the -#H5Pset_driver function.(This function is overloaded to operate on data transfer property lists also, as described below.) -Its second argument is the file driver identifier, which may have a different numeric value from run to run -depending on the order in which the file drivers are registered with the library. The third argument encapsulates -the additional arguments of the driver initialization function. This method only works if the file driver -writer has made the driver-specific property list structure a public datatype, which is often not the case. 
-\code -hid_t fapl = H5Pcreate(H5P_FILE_ACCESS); -static H5FD_family_fapl_t fa = {100*1024*1024, H5P_DEFAULT}; -H5Pset_driver(fapl, H5FD_FAMILY, &fa); -hid_t file = H5Fcreate("foo.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl); -H5Pclose(fapl); -\endcode - -It is also possible to query the file driver information from a file access property list by -calling #H5Pget_driver to determine the driver and then calling a driver-defined query function -to obtain the driver information: -\code -hid_t driver = H5Pget_driver(fapl); -if (H5FD_SEC2==driver) { - /*nothing further to get*/ -} else if (H5FD_FAMILY==driver) { - hid_t member_fapl; - haddr_t member_size; - H5Pget_fapl_family(fapl, &member_size, &member_fapl); -} else if (....) { - .... -} -\endcode - -\subsection subsec_vfl_use_per Performing I/O -The #H5Dread and #H5Dwrite functions transfer data between application memory and the file. They both take -an optional data transfer property list which has some general driver-independent properties and optional -driver-defined properties. An application will typically perform I/O in one of three styles via the -#H5Dread or #H5Dwrite function: - -Like file access properties in the previous section, data transfer properties can be set using a driver -initialization function or a general purpose function. 
For example, to set the MPI-IO driver to use
-independent access for I/O operations one would say:
-\code
-hid_t dxpl = H5Pcreate(H5P_DATA_XFER);
-H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_INDEPENDENT);
-// The transfer property list precedes the buffer in the H5Dread argument list
-H5Dread(dataset, type, mspace, fspace, dxpl, buffer);
-H5Pclose(dxpl);
-\endcode
-
-The alternative is to initialize a driver-defined C struct and pass it to the #H5Pset_driver function:
-\code
-hid_t dxpl = H5Pcreate(H5P_DATA_XFER);
-static H5FD_mpio_dxpl_t dx = {H5FD_MPIO_INDEPENDENT};
-H5Pset_driver(dxpl, H5FD_MPIO, &dx);
-H5Dread(dataset, type, mspace, fspace, dxpl, buffer);
-H5Pclose(dxpl);
-\endcode
-
-The transfer property list can be queried in a manner similar to the file access property list: the driver
-provides a function (or functions) to return various information about the transfer property list:
-\code
-hid_t driver = H5Pget_driver(dxpl);
-if (H5FD_MPIO==driver) {
-    H5FD_mpio_xfer_t xfer_mode;
-    H5Pget_dxpl_mpio(dxpl, &xfer_mode);
-} else {
-    ....
-}
-\endcode
-
-\subsection subsec_vfl_use_inter File Driver Interchangeability
-The HDF5 specifications describe two things: the mapping of data onto a linear format address
-space and the C API which performs the mapping. However, the mapping of the format address space
-onto storage intentionally falls outside the scope of the HDF5 specs. This is a direct result of the
-fact that it is not generally possible to store information about how to access storage inside the
-storage itself. For instance, given only the file name '/arborea/1225/work/f%03d' the HDF5 library
-is unable to tell whether the name refers to a file on the local file system, a family of files on
-the local file system, a file on host 'arborea' port 1225, a family of files on a remote system, etc.
-
-There are two ways in which the library could figure out where the storage is located: storage access
-information can be provided by the user, or the library can try all known file access methods. This
-implementation uses the former method.
- -In general, if a file was created with one driver then it isn't possible to open it with another driver. -There are of course exceptions: a file created with MPIO could probably be opened with the sec2 driver, -any file created by the sec2 driver could be opened as a family of files with one member, etc. In fact, -sometimes a file must not only be opened with the same driver but also with the same driver properties. -The predefined drivers are written in such a way that specifying the correct driver is sufficient for -opening a file. - -\section sec_vfl_imp Implementation of a Driver -A driver is simply a collection of functions and data structures which are registered with the HDF5 -library at runtime. The functions fall into these categories: -\li Functions which operate on modes -\li Functions which operate on files -\li Functions which operate on the address space -\li Functions which operate on data -\li Functions for driver initialization -\li Optimization functions - -\subsection subsec_vfl_imp_mode Mode Functions -Some drivers need information about file access and data transfers which are very specific to the driver. -The information is usually implemented as a pair of pointers to C structs which are allocated and -initialized as part of an HDF5 property list and passed down to various driver functions. There are two -classes of settings: file access modes that describe how to access the file through the driver, and -data transfer modes which are settings that control I/O operations. Each file opened by a particular -driver may have a different access mode; each dataset I/O request for a particular file may have a -different data transfer mode. - -Since each driver has its own particular requirements for various settings, each driver is responsible -for defining the mode structures that it needs. Higher layers of the library treat the structures as -opaque but must be able to copy and free them. 
Thus, the driver provides either the size of the
-structure or a pair of function pointers for each of the mode types.
-
-Example: The family driver needs to know how the format address space is partitioned and the file
-access property list to use for the family members.
-\code
-// Driver-specific file access properties
-typedef struct H5FD_family_fapl_t {
-    hsize_t memb_size;    // size of each family member
-    hid_t   memb_fapl_id; // file access property list for each family member
-} H5FD_family_fapl_t;
-
-// Driver-specific data transfer properties
-typedef struct H5FD_family_dxpl_t {
-    hid_t memb_dxpl_id; // data xfer property list of each member
-} H5FD_family_dxpl_t;
-\endcode
-In order to copy or free one of these structures the member file access or data transfer properties must
-also be copied or freed. This is done by providing a copy and close function for each structure:
-
-Example: The file access property list copy and close functions for the family driver:
-\code
-static void *
-H5FD_family_fapl_copy(const void *_old_fa)
-{
-    const H5FD_family_fapl_t *old_fa = (const H5FD_family_fapl_t*)_old_fa;
-    H5FD_family_fapl_t *new_fa = malloc(sizeof(H5FD_family_fapl_t));
-    assert(new_fa);
-
-    // Copy the plain fields, then replace the member list with its own copy
-    memcpy(new_fa, old_fa, sizeof(H5FD_family_fapl_t));
-    new_fa->memb_fapl_id = H5Pcopy(old_fa->memb_fapl_id);
-    return new_fa;
-}
-
-static herr_t
-H5FD_family_fapl_free(void *_fa)
-{
-    H5FD_family_fapl_t *fa = (H5FD_family_fapl_t*)_fa;
-    H5Pclose(fa->memb_fapl_id);
-    free(fa);
-    return 0;
-}
-\endcode
-
-Generally when a file is created or opened the file access properties for the driver are copied into the
-file pointer which is returned and they may be modified from their original value (for instance, the file
-family driver modifies the member size property when opening an existing family). In order to support the
-#H5Fget_access_plist function the driver must provide a fapl_get callback which creates a copy of the
-driver-specific properties based on a particular file.
-
-Example: The file family driver copies the member size and the member file access property list into the return value:
-\code
-static void *
-H5FD_family_fapl_get(H5FD_t *_file)
-{
-    H5FD_family_t *file = (H5FD_family_t*)_file;
-    // Allocate the whole struct, not just the size of a pointer to it
-    H5FD_family_fapl_t *fa = calloc(1, sizeof(H5FD_family_fapl_t));
-
-    fa->memb_size = file->memb_size;
-    fa->memb_fapl_id = H5Pcopy(file->memb_fapl_id);
-    return fa;
-}
-\endcode
-
-\subsection subsec_vfl_imp_file File Functions
-The higher layers of the library expect files to have a name and allow the file to be accessed in various modes.
-The driver must be able to create a new file, replace an existing file, or open an existing file. Opening or
-creating a file should return a handle, a pointer to a specialization of the H5FD_t struct, which allows read-only
-or read-write access and which will be passed to the other driver functions as they are called. (Read-only access is
-only appropriate when opening an existing file.)
-\code
-typedef struct {
-    // Public fields
-    H5FD_class_t *cls; // class data defined below
-
-    // Private fields -- driver-defined
-
-} H5FD_t;
-\endcode
-
-Example: The family driver requires handles to the underlying storage, the size of the members for this
-particular file (which might be different than the member size specified in the file access property list
-if an existing file family is being opened), the name used to open the file in case additional members
-must be created, and the flags to use for creating those additional members. The eoa member caches the
-size of the format address space so the family members don't have to be queried in order to find it.
-\code
-// The description of a file belonging to this driver.
-typedef struct H5FD_family_t { - H5FD_t pub; // public stuff, must be first - hid_t memb_fapl_id; // file access property list for members - hsize_t memb_size; // maximum size of each member file - int nmembs; // number of family members - int amembs; // number of member slots allocated - H5FD_t **memb; // dynamic array of member pointers - haddr_t eoa; // end of allocated addresses - char *name; // name generator printf format - unsigned flags; // flags for opening additional members -} H5FD_family_t; -\endcode - -Example: The sec2 driver needs to keep track of the underlying Unix file descriptor and also the -end of format address space and current Unix file size. It also keeps track of the current file -position and last operation (read, write, or unknown) in order to optimize calls to lseek. The -device and inode fields are defined on Unix in order to uniquely identify the file and will be -discussed below. -\code -typedef struct H5FD_sec2_t { - H5FD_t pub; // public stuff, must be first - int fd; // the unix file - haddr_t eoa; // end of allocated region - haddr_t eof; // end of file; current file size - haddr_t pos; // current file I/O position - int op; // last operation - dev_t device; // file device number - ino_t inode; // file i-node number -} H5FD_sec2_t; -\endcode - -\subsection subsec_vfl_imp_open Open Files -All drivers must define a function for opening/creating a file. This function should have a prototype which is: - - - - - -
-<table>
-<tr>
-<td>static H5FD_t * open (const char *name, unsigned flags, hid_t fapl, haddr_t maxaddr)</td>
-<td>The file name name and file access property list fapl are the same as were specified in the #H5Fcreate
-or #H5Fopen call. The flags are also the same as in those calls, except that the flag #H5F_ACC_CREAT is
-present if the call was to #H5Fcreate; they are documented in the 'H5Fpublic.h' file. The maxaddr
-argument is the maximum format address that the driver should be prepared to handle (the minimum address is always zero).</td>
-</tr>
-</table>
- -Example: The sec2 driver opens a Unix file with the requested name and saves information which -uniquely identifies the file (the Unix device number and inode). -\code -static H5FD_t * -H5FD_sec2_open(const char *name, unsigned flags, hid_t fapl_id/*unused*/, - haddr_t maxaddr) -{ - unsigned o_flags; - int fd; - struct stat sb; - H5FD_sec2_t *file=NULL; - - // Check arguments - if (!name || !*name) return NULL; - if (0==maxaddr || HADDR_UNDEF==maxaddr) return NULL; - if (ADDR_OVERFLOW(maxaddr)) return NULL; - - // Build the open flags - o_flags = (H5F_ACC_RDWR & flags) ? O_RDWR : O_RDONLY; - if (H5F_ACC_TRUNC & flags) o_flags |= O_TRUNC; - if (H5F_ACC_CREAT & flags) o_flags |= O_CREAT; - if (H5F_ACC_EXCL & flags) o_flags |= O_EXCL; - - // Open the file - if ((fd=open(name, o_flags, 0666))<0) return NULL; - if (fstat(fd, &sb)<0) { - close(fd); - return NULL; - } - - // Create the new file struct - file = calloc(1, sizeof(H5FD_sec2_t)); - file->fd = fd; - file->eof = sb.st_size; - file->pos = HADDR_UNDEF; - file->op = OP_UNKNOWN; - file->device = sb.st_dev; - file->inode = sb.st_ino; - - return (H5FD_t*)file; -} -\endcode - -\subsection subsec_vfl_imp_close Closing Files -Closing a file simply means that all cached data should be flushed to the next lower layer, the -file should be closed at the next lower layer, and all file-related data structures should be -freed. All information needed by the close function is already present in the file handle. - - - - - -
-<table>
-<tr>
-<td>static herr_t close (H5FD_t *file)</td>
-<td>The file argument is the handle which was returned by the open function, and the close should
-free only memory associated with the driver-specific part of the handle (the public parts will
-have already been released by HDF5's virtual file layer).</td>
-</tr>
-</table>
-
-Example: The sec2 driver just closes the underlying Unix file, making sure that the actual
-file size is the same as that known to the library by writing a zero to the last file position
-if it hasn't been written by some previous operation (which happens in the same code which flushes
-the file contents and is shown below).
-\code
-static herr_t
-H5FD_sec2_close(H5FD_t *_file)
-{
-    H5FD_sec2_t *file = (H5FD_sec2_t*)_file;
-
-    if (H5FD_sec2_flush(_file)<0) return -1;
-    if (close(file->fd)<0) return -1;
-    free(file);
-    return 0;
-}
-\endcode
-
-\subsection subsec_vfl_imp_key File Keys
-Occasionally an application will attempt to open a single file more than one time in order
-to obtain multiple handles to the file. HDF5 allows the files to share information (for instance,
-writing data to one handle will cause the data to be immediately visible on the other handle),
-but in order to accomplish this HDF5 must be able to tell when two names refer to the same file.
-It does this by associating a driver-defined key with each file opened by a driver and comparing
-the key for an open request with the keys for all other files currently open by the same driver.
-
-<table>
-<tr>
-<td>const int cmp (const H5FD_t *f1, const H5FD_t *f2)</td>
-<td>The driver may provide a function which compares two files f1 and f2 belonging to the same
-driver and returns a negative, positive, or zero value in the manner of the strcmp function. (The ordering
-is arbitrary as long as it's consistent within a particular file driver.) If this function is
-not provided then HDF5 assumes that all calls to the open callback return unique files regardless
-of the arguments, and it is up to the application to avoid opening the same file more than once if
-that assumption is incorrect.</td>
-</tr>
-</table>
- -Each time a file is opened the library calls the cmp function to compare that file with all other files -currently open by the same driver and if one of them matches (at most one can match) then the file -which was just opened is closed and the previously opened file is used instead. - -Opening a file twice with incompatible flags will result in failure. For instance, opening a file with -the truncate flag is a two step process which first opens the file without truncation so keys can be -compared, and if no matching file is found already open then the file is closed and immediately reopened -with the truncation flag set (if a matching file is already open then the truncating open will fail). - -Example: The sec2 driver uses the Unix device and i-node as the key. They were initialized when -the file was opened. -\code -static int -H5FD_sec2_cmp(const H5FD_t *_f1, const H5FD_t *_f2) -{ - const H5FD_sec2_t *f1 = (const H5FD_sec2_t*)_f1; - const H5FD_sec2_t *f2 = (const H5FD_sec2_t*)_f2; - - if (f1->device < f2->device) return -1; - if (f1->device > f2->device) return 1; - - if (f1->inode < f2->inode) return -1; - if (f1->inode > f2->inode) return 1; - - return 0; -} -\endcode - -\subsection subsec_vfl_imp_save Saving Modes Across Opens -Some drivers may also need to store certain information in the file superblock in order -to be able to reliably open the file at a later date. This is done by three functions: -one to determine how much space will be necessary to store the information in the superblock, -one to encode the information, -and one to decode the information. These functions are optional, but if any one is defined -then the other two must also be defined. - - - - - - - - - - - - - - - - - -
-<table>
-<tr>
-<th>Function</th>
-<th>Description</th>
-</tr>
-<tr>
-<td>static hsize_t sb_size (H5FD_t *file)</td>
-<td>The sb_size function returns the number of bytes necessary to encode
-information needed later if the file is reopened.</td>
-</tr>
-<tr>
-<td>static herr_t sb_encode (H5FD_t *file, char *name, unsigned char *buf)</td>
-<td>The sb_encode function encodes information from the file into buffer buf
-allocated by the caller. It also writes an 8-character string (plus null termination) into
-the name argument, which should be a unique identification for the driver.</td>
-</tr>
-<tr>
-<td>static herr_t sb_decode (H5FD_t *file, const char *name, const unsigned char *buf)</td>
-<td>The sb_decode function looks at the name, decodes data from the buffer buf, and
-updates the file argument with the new information, advancing *p in the process.</td>
-</tr>
-</table>
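The round trip through the three superblock callbacks can be sketched in plain C. This is a minimal illustration under stated assumptions, not code from the library: the `demo_` names, the driver name "DEMO", and the choice to store a single 64-bit member size in little-endian order are all hypothetical, and the signatures are simplified so that the sketch compiles without the HDF5 headers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical driver state: only the member size is persisted. */
typedef struct {
    uint64_t memb_size;
} demo_file_t;

/* Number of bytes demo_sb_encode() writes into buf. */
static size_t
demo_sb_size(const demo_file_t *file)
{
    (void)file;
    return 8;
}

/* Write the driver name (at most 8 characters plus the terminating null,
 * so name needs room for 9 bytes) and the member size as 8 little-endian bytes. */
static int
demo_sb_encode(const demo_file_t *file, char *name, unsigned char *buf)
{
    int i;

    assert(file && name && buf);
    strcpy(name, "DEMO");
    for (i = 0; i < 8; i++)
        buf[i] = (unsigned char)((file->memb_size >> (8 * i)) & 0xff);
    return 0;
}

/* Check the driver name, then rebuild the member size from the buffer. */
static int
demo_sb_decode(demo_file_t *file, const char *name, const unsigned char *buf)
{
    int i;

    if (strcmp(name, "DEMO") != 0)
        return -1; /* information was written by some other driver */
    file->memb_size = 0;
    for (i = 0; i < 8; i++)
        file->memb_size |= (uint64_t)buf[i] << (8 * i);
    return 0;
}
```

Encoding byte by byte keeps the stored form independent of the host's byte order, which matters when a file written on one machine is reopened on another.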
-The part of this which is somewhat tricky is that the file must be readable before the
-superblock information is decoded. File access modes fall outside the scope of the HDF5
-file format, but they are placed inside the superblock for convenience. (File access modes
-do not describe data, but rather describe how the HDF5 format address space is mapped to
-the underlying file(s). Thus, in general the mapping must be known before the file
-superblock can be read. However, the user usually knows enough about the mapping for
-the superblock to be readable and once the superblock is read the library can fill
-in the missing parts of the mapping.)
-
-\section sec_vfl_address Address Space Functions
-HDF5 does not assume that a file is a linear address space of bytes. Instead, the library
-will call functions to allocate and free portions of the HDF5 format address space, which
-in turn map onto functions in the file driver to allocate and free portions of file address
-space. The library tells the file driver how much format address space it wants to allocate
-and the driver decides what format address to use and how that format address is mapped
-onto the file address space. Usually the format address is chosen so that the file address
-can be calculated in constant time for data I/O operations (which are always specified by format addresses).
-
-\subsection subsec_vfl_address_blk Userblock and Superblock
-The HDF5 format allows an optional userblock to appear before the actual HDF5 data in such
-a way that if the userblock is removed from the file and everything remaining is
-shifted downward in the file address space, then the file is still a valid HDF5 file.
-The userblock size can be zero or any power of two greater than or equal to 512 and
-the file superblock begins immediately after the userblock.
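The userblock size rule can be checked with a few lines of C. This is a sketch, assuming the rule as documented for #H5Pset_userblock (zero, or a power of two that is at least 512); the function names `userblock_size_is_valid` and `superblock_address` are hypothetical and not part of the HDF5 API.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if size is acceptable under the assumed rule
 * (zero, or a power of two that is at least 512), otherwise 0. */
static int
userblock_size_is_valid(uint64_t size)
{
    if (size == 0)
        return 1;
    if (size < 512)
        return 0;
    return (size & (size - 1)) == 0; /* true when exactly one bit is set */
}

/* The superblock begins immediately after the userblock, so its
 * file address equals the userblock size. */
static uint64_t
superblock_address(uint64_t userblock_size)
{
    assert(userblock_size_is_valid(userblock_size));
    return userblock_size;
}
```

Under this rule 512 and 1024 are accepted, while 256 (too small) and 768 (not a power of two) are rejected.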
-
-HDF5 allocates space for the userblock and superblock by calling an allocation function
-defined below, which must return a chunk of memory at format address zero on the first call.
-
-\subsection subsec_vfl_address_alloc Allocation of Format Regions
-The library makes many types of allocation requests:
-
-<table>
-<tr>
-<td>#H5FD_MEM_SUPER</td>
-<td>userblock</td>
-</tr>
-<tr>
-<td>#H5FD_MEM_BTREE</td>
-<td>An allocation request for a node of a B-tree.
-</td>
-</tr>
-<tr>
-<td>#H5FD_MEM_DRAW</td>
-<td>An allocation request for the raw data of a dataset.
-</td>
-</tr>
-<tr>
-<td>#H5FD_MEM_GHEAP</td>
-<td>An allocation request for a global heap collection. Global
-heaps are used to store certain types of references such as dataset region references.
-The set of all global heap collections can become quite large.
-</td>
-</tr>
-<tr>
-<td>#H5FD_MEM_LHEAP</td>
-<td>An allocation request for a local heap. Local heaps are used
-to store the names which are members of a group. The combined size of all local heaps is
-a function of the number of object names in the file.
-</td>
-</tr>
-<tr>
-<td>#H5FD_MEM_OHDR</td>
-<td>An allocation request for (part of) an object header. Object
-headers are relatively small and include meta information about objects (like the data
-space and type of a dataset) and attributes.
-</td>
-</tr>
-</table>
- -When a chunk of memory is freed the library adds it to a free list and allocation requests -are satisfied from the free list before requesting memory from the file driver. Each type of -allocation request enumerated above has its own free list, but the file driver can specify that -certain object types can share a free list. It does so by providing an array which maps a -request type to a free list. If any value of the map is H5MF_DEFAULT (zero) then the object's -own free list is used. The special value H5MF_NOLIST indicates that the library should not -attempt to maintain a free list for that particular object type, instead calling the file driver -each time an object of that type is freed. - -Mappings predefined in the 'H5FDpublic.h' file are: - - - - - - - - - - -
#H5FD_FLMAP_SINGLE: All memory usage types are mapped to a single free list. -
#H5FD_FLMAP_DICHOTOMY: Memory usage is segregated into meta data and raw data -for the purposes of memory management. -
#H5FD_FLMAP_DEFAULT: Each memory usage type has its own free list. -
- -Example: To make a map that manages object headers on one free list and everything else on -another free list, one might initialize the map with the following code (the use of #H5FD_MEM_SUPER is arbitrary): -\code -H5FD_mem_t mt, map[H5FD_MEM_NTYPES]; - -for (mt = 0; mt < H5FD_MEM_NTYPES; mt++) { - map[mt] = (H5FD_MEM_OHDR == mt) ? mt : H5FD_MEM_SUPER; -} -\endcode - -If an allocation request cannot be satisfied from the free list then one of two things happens. -If the driver defines an allocation callback then it is used to allocate space; otherwise new -memory is allocated from the end of the format address space by incrementing the end-of-address marker. - - - - -
static haddr_t alloc (H5FD_t *file, H5MF_type_t type, hsize_t size): The file argument is the file from which space is to be allocated, type is the type of -memory being requested (from the list above) without being mapped according to the freelist -map and size is the number of bytes being requested. The library is allowed to allocate large -chunks of storage and manage them in a layer above the file driver (although the current library -doesn't do that). The allocation function should return a format address for the first byte -allocated. The allocated region extends from that address for size bytes. If the request cannot -be honored then the undefined address value is returned (#HADDR_UNDEF). The first call to this -function for a file which has never had memory allocated must return a format address of zero -or #HADDR_UNDEF since this is how the library allocates space for the userblock and/or superblock.
- -\subsection subsec_vfl_address_free Freeing Format Regions -When the library is finished using a certain region of the format address space it will return the -space to the free list according to the type of memory being freed and the free list map described above. -If the free list has been disabled for a particular memory usage type (according to the free list map) -and the driver defines a free callback then it will be invoked. The free callback is also invoked for -all entries on the free list when the file is closed. - - - - - - -
static herr_t free (H5FD_t *file, H5MF_type_t type, haddr_t addr, hsize_t size): The file argument is the file for which space is being freed; type is the type of object being -freed (from the list above) without being mapped according to the freelist map; addr is the first -format address to free; and size is the size in bytes of the region being freed. The region being -freed may refer to just part of the region originally allocated and/or may cross allocation boundaries -provided all regions being freed have the same usage type. However, the library will never attempt -to free regions which have already been freed or which have never been allocated.
-A driver may choose to not define the free function, in which case format addresses will be leaked. -This isn't normally a huge problem since the library contains a simple free list of its own and freeing -parts of the format address space is not a common occurrence. - -\subsection subsec_vfl_address_query Querying the Address Range -Each file driver must have some mechanism for setting and querying the end of address, or -EOA, marker. The EOA marker is the first format address after the last format address ever allocated. -If the last part of the allocated address range is freed then the driver may optionally decrease the eoa marker. - - - - - -
static haddr_t get_eoa (H5FD_t *file): This function returns the current value of the EOA marker for the specified file.
- -Example: The sec2 driver just returns the current eoa marker value which is cached in the file structure: -\code -static haddr_t -H5FD_sec2_get_eoa(H5FD_t *_file) -{ - H5FD_sec2_t *file = (H5FD_sec2_t*)_file; - return file->eoa; -} -\endcode - -The eoa marker is initially zero when a file is opened and the library may set it to some other value -shortly after the file is opened (after the superblock is read and the saved eoa marker is determined) -or when allocating additional memory in the absence of an alloc callback (described above). - -Example: The sec2 driver simply caches the eoa marker in the file structure and does not extend the -underlying Unix file. When the file is flushed or closed then the Unix file size is extended to match -the eoa marker. -\code -static herr_t -H5FD_sec2_set_eoa(H5FD_t *_file, haddr_t addr) -{ - H5FD_sec2_t *file = (H5FD_sec2_t*)_file; - file->eoa = addr; - return 0; -} -\endcode - -\section sec_vfl_data Data Functions -These functions operate on data, transferring a region of the format address space between memory and files. - -\subsection subsec_vfl_data_cont Contiguous I/O Functions -A driver must specify two functions to transfer data from the library to the file and vice versa. - - - - - - - - - -
static herr_t read (H5FD_t *file, H5FD_mem_t type, hid_t dxpl, haddr_t addr, hsize_t size, void *buf): The read function reads data from file file beginning at address addr and continuing -for size bytes into the buffer buf supplied by the caller.
static herr_t write (H5FD_t *file, H5FD_mem_t type, hid_t dxpl, haddr_t addr, hsize_t size, const void *buf): The write function transfers data -in the opposite direction.
-\li Both functions take a data transfer property list dxpl which -indicates the fine points of how the data is to be transferred and which comes directly -from the #H5Dread or #H5Dwrite function. -\li Both functions receive the type of data being transferred, -which may allow a driver to tune its behavior for different kinds of data. -\li Both functions should return -a negative value if they fail to transfer the requested data, or non-negative if they -succeed. The library will never attempt to read from unallocated regions of the format address space. - -Example: The sec2 driver just makes system calls. It tries not to call lseek if the current operation -is the same as the previous operation and the file position is correct. It also fills the output buffer -with zeros when reading between the current EOF and EOA markers and restarts system calls which were interrupted. -\code -static herr_t -H5FD_sec2_read(H5FD_t *_file, H5FD_mem_t type/*unused*/, hid_t dxpl_id/*unused*/, - haddr_t addr, hsize_t size, void *buf/*out*/) -{ - H5FD_sec2_t *file = (H5FD_sec2_t*)_file; - ssize_t nbytes; - - assert(file && file->pub.cls); - assert(buf); - - /* Check for overflow conditions */ - if (REGION_OVERFLOW(addr, size)) return -1; - if (addr+size>file->eoa) return -1; - - /* Seek to the correct location */ - if ((addr!=file->pos || OP_READ!=file->op) && - file_seek(file->fd, (file_offset_t)addr, SEEK_SET)<0) { - file->pos = HADDR_UNDEF; - file->op = OP_UNKNOWN; - return -1; - } - - /* - * Read data, being careful of interrupted system calls, partial results, - * and the end of the file.
- */ - while (size>0) { - do nbytes = read(file->fd, buf, size); - while (-1==nbytes && EINTR==errno); - if (-1==nbytes) { - /* error */ - file->pos = HADDR_UNDEF; - file->op = OP_UNKNOWN; - return -1; - } - if (0==nbytes) { - /* end of file but not end of format address space */ - memset(buf, 0, size); - size = 0; - } - assert(nbytes>=0); - assert((hsize_t)nbytes<=size); - size -= (hsize_t)nbytes; - addr += (haddr_t)nbytes; - buf = (char*)buf + nbytes; - } - - /* Update current position */ - file->pos = addr; - file->op = OP_READ; - return 0; -} -\endcode -Example: The sec2 write callback is similar except it updates the file EOF marker when extending the file. - -\subsection subsec_vfl_data_flush Flushing Cached Data -Some drivers may desire to cache data in memory in order to make larger I/O requests to the -underlying file and thus improve bandwidth. Such drivers should register a cache flushing -function so that the library can ensure that data has been flushed out of the drivers in -response to the application calling #H5Fflush. - - - - - -
static herr_t flush (H5FD_t *file): Flush all data for file file to storage.
- -Example: The sec2 driver doesn't cache any data but it also doesn't extend the Unix file as -aggressively as it should. Therefore, when finalizing a file it should write a zero to the last -byte of the allocated region so that when reopening the file later the EOF marker will be at -least as large as the EOA marker saved in the superblock (otherwise HDF5 will refuse to open -the file, claiming that the data appears to be truncated). -\code -static herr_t -H5FD_sec2_flush(H5FD_t *_file) -{ - H5FD_sec2_t *file = (H5FD_sec2_t*)_file; - - if (file->eoa>file->eof) { - if (-1==file_seek(file->fd, file->eoa-1, SEEK_SET)) return -1; - if (write(file->fd, "", 1)!=1) return -1; - file->eof = file->eoa; - file->pos = file->eoa; - file->op = OP_WRITE; - } - - return 0; -} -\endcode - -\section sec_vfl_opt Optimization Functions -The library is capable of performing several generic optimizations on I/O, but these types of -optimizations may not be appropriate for a given VFL driver. - -Each driver may provide a query function to allow the library to query whether to enable these -optimizations. If a driver lacks a query function, the library will disable all types of -optimizations which can be queried. - - - - - - -
static herr_t query (const H5FD_t *file, unsigned long *flags): This function is called by the library to query which optimizations to enable for I/O to this driver.
- -These are the flags which are currently defined: - - - - - - - - - - - - - -
H5FD_FEAT_AGGREGATE_METADATA (0x00000001): Defining the H5FD_FEAT_AGGREGATE_METADATA for a VFL driver means that the library will attempt to allocate -a larger block for metadata and then sub-allocate each metadata request from that larger block.
H5FD_FEAT_ACCUMULATE_METADATA (0x00000002): Defining the H5FD_FEAT_ACCUMULATE_METADATA for a VFL driver means that the library will attempt to cache -metadata as it is written to the file and build up a larger block of metadata to eventually pass to the -VFL 'write' routine.
H5FD_FEAT_DATA_SIEVE (0x00000004): Defining the H5FD_FEAT_DATA_SIEVE for a VFL driver means that the library will attempt to cache raw data - as it is read from/written to a file in a "data sieve" buffer.
- -See Rajeev Thakur's papers: -http://www.mcs.anl.gov/~thakur/papers/romio-coll.ps.gz -http://www.mcs.anl.gov/~thakur/papers/mpio-high-perf.ps.gz - -\section sec_vfl_reg Registration of a Driver -Before a driver can be used the HDF5 library needs to be told of its existence. This is done by -registering the driver, which results in a driver identification number. Instead of passing many -arguments to the registration function, the driver information is entered into a structure and the -address of the structure is passed to the registration function where it is copied. This allows -the HDF5 API to be extended while providing backward compatibility at the source level. - - - - - - -
hid_t H5FDregister (H5FD_class_t *cls): The driver described by struct cls is registered with the library and an ID number for the driver is returned.
- -The H5FD_class_t type is a struct with the following fields: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
const char *name: A pointer to a constant, null-terminated driver name to be used for debugging purposes.
size_t fapl_size: The size in bytes of the file access mode structure or zero if the driver supplies a copy function -or doesn't define the structure.
void *(*fapl_copy)(const void *fapl): An optional function which copies a driver-defined file access mode structure. This field takes -precedence over fapl_size when both are defined.
void (*fapl_free)(void *fapl): An optional function to free the driver-defined file access mode structure. If null, then the -library calls the C free function to free the structure.
size_t dxpl_size: The size in bytes of the data transfer mode structure or zero if the driver supplies a copy -function or doesn't define the structure.
void *(*dxpl_copy)(const void *dxpl): An optional function which copies a driver-defined data transfer mode structure. This field -takes precedence over dxpl_size when both are defined.
void (*dxpl_free)(void *dxpl): An optional function to free the driver-defined data transfer mode structure. If null, then -the library calls the C free function to free the structure.
H5FD_t *(*open)(const char *name, unsigned flags, hid_t fapl, haddr_t maxaddr): The function which opens or creates a new file.
herr_t (*close)(H5FD_t *file): The function which ends access to a file.
int (*cmp)(const H5FD_t *f1, const H5FD_t *f2): An optional function to determine whether two open files have the same key. If this function -is not present then the library assumes that two files will never be the same.
int (*query)(const H5FD_t *f, unsigned long *flags): An optional function to determine which library optimizations a driver can support.
haddr_t (*alloc)(H5FD_t *file, H5FD_mem_t type, hsize_t size): An optional function to allocate space in the file.
herr_t (*free)(H5FD_t *file, H5FD_mem_t type, haddr_t addr, hsize_t size): An optional function to free space in the file.
haddr_t (*get_eoa)(H5FD_t *file): A function to query how much of the format address space has been allocated.
herr_t (*set_eoa)(H5FD_t *file, haddr_t): A function to set the end of address space.
haddr_t (*get_eof)(H5FD_t *file): A function to return the current end-of-file marker value.
herr_t (*read)(H5FD_t *file, H5FD_mem_t type, hid_t dxpl, haddr_t addr, hsize_t size, void *buffer): A function to read data from a file.
herr_t (*write)(H5FD_t *file, H5FD_mem_t type, hid_t dxpl, haddr_t addr, hsize_t size, const void *buffer): A function to write data to a file.
herr_t (*flush)(H5FD_t *file): A function which flushes cached data to the file.
H5FD_mem_t fl_map[H5FD_MEM_NTYPES]: An array which maps a file allocation request type to a free list.
- -Example: The sec2 driver would be registered as: -\code -static const H5FD_class_t H5FD_sec2_g = { - "sec2", /*name */ - MAXADDR, /*maxaddr */ - NULL, /*sb_size */ - NULL, /*sb_encode */ - NULL, /*sb_decode */ - 0, /*fapl_size */ - NULL, /*fapl_get */ - NULL, /*fapl_copy */ - NULL, /*fapl_free */ - 0, /*dxpl_size */ - NULL, /*dxpl_copy */ - NULL, /*dxpl_free */ - H5FD_sec2_open, /*open */ - H5FD_sec2_close, /*close */ - H5FD_sec2_cmp, /*cmp */ - H5FD_sec2_query, /*query */ - NULL, /*alloc */ - NULL, /*free */ - H5FD_sec2_get_eoa, /*get_eoa */ - H5FD_sec2_set_eoa, /*set_eoa */ - H5FD_sec2_get_eof, /*get_eof */ - H5FD_sec2_read, /*read */ - H5FD_sec2_write, /*write */ - H5FD_sec2_flush, /*flush */ - H5FD_FLMAP_SINGLE, /*fl_map */ -}; - -hid_t -H5FD_sec2_init(void) -{ - if (!H5FD_SEC2_g) { - H5FD_SEC2_g = H5FDregister(&H5FD_sec2_g); - } - return H5FD_SEC2_g; -} -\endcode - -A driver can be removed from the library by unregistering it - - - - - -
herr_t H5FDunregister (hid_t driver): Where driver is the ID number returned when the driver was registered.
-Unregistering a driver makes it unusable for creating new file access or data transfer property -lists but doesn't affect any property lists or files that already use that driver. - -\subsection subsec_vfl_reg_prog Programming Note for C++ Developers Using C Functions -If a C routine that takes a function pointer as an argument is called from within C++ code, -the C routine should be returned from normally. - -Examples of this kind of routine include callbacks such as #H5Pset_elink_cb -and #H5Pset_type_conv_cb and functions such as #H5Tconvert and #H5Ewalk2. - -Exiting the routine in its normal fashion allows the HDF5 C Library to clean up -its work properly. In other words, if the C++ application jumps out of the routine -back to the C++ “catch” statement, the library is not given the opportunity to close -any temporary data structures that were set up when the routine was called. The C++ -application should save some state as the routine is started so that any problem that -occurs might be diagnosed. - -\section sec_vfl_query Querying Driver Information - - - - - -
void * H5Pget_driver_data (hid_t fapl)
void * H5Pget_driver_data (hid_t fxpl)
This function is intended to be used by driver functions, not applications. It returns a pointer -directly into the file access property list fapl which is a copy of the driver's file access mode -originally provided to the H5Pset_driver function. If its argument is a data transfer property list -fxpl then it returns a pointer to the driver-specific data transfer information instead. -
- -\section sec_vfl_misc Miscellaneous -The various private H5F_low_* functions will be replaced by public H5FD* functions so they -can be called from drivers. - -All private functions H5F_addr_* which operate on addresses will be renamed as public functions -by removing the first underscore so they can be called by drivers. - -The haddr_t address data type will be passed by value throughout the library. The original -intent was that this type would eventually be a union of file address types for the various -drivers and may become quite large, but that was back when drivers were part of HDF5. It will -become an alias for an unsigned integer type (32 or 64 bits depending on how the library was configured). - -The various H5F*.c driver files will be renamed H5FD*.c and each will have a corresponding header -file. All driver functions except the initializer and API will be declared static. - -This documentation didn't cover optimization functions which would be useful to drivers like MPI-IO. -Some drivers may be able to perform data pipeline operations more efficiently than HDF5 and need to -be given a chance to override those parts of the pipeline. The pipeline would be designed to call -various H5FD optimization functions at various points which return one of three values: the operation -is not implemented by the driver, the operation is implemented but failed in a non-recoverable manner, -or the operation is implemented and succeeded. - -Various parts of HDF5 check only the top-level file driver and do something special if it is -the MPI-IO driver. However, we might want to be able to put the MPI-IO driver under other drivers -such as the raw part of a split driver or under a debug driver whose sole purpose is to accumulate -statistics as it passes all requests through to the MPI-IO driver.
Therefore we will probably need -a function which takes a format address and/or object type and returns the driver which would have -been used at the lowest level to process the request. - -*/ - ->>>>>>> ae27da2 Convert Library Version html to doxygen (#5162) /** \page FMTDISC HDF5 File Format Discussion \htmlinclude FileFormat.html