Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement support for complex number datatypes #4630

Merged
merged 4 commits into from
Oct 31, 2024

Conversation

jhendersonHDF
Copy link
Collaborator

Adds the new datatype class H5T_COMPLEX

Adds the new API function H5Tcomplex_create which creates a complex number datatype from an ID of a base floating-point datatype

Adds the new feature check macros H5_HAVE_COMPLEX_NUMBERS and H5_HAVE_C99_COMPLEX_NUMBERS

Adds the new datatype size macros H5_SIZEOF_FLOAT_COMPLEX, H5_SIZEOF_DOUBLE_COMPLEX and H5_SIZEOF_LONG_DOUBLE_COMPLEX

Adds the new datatype ID macros H5T_NATIVE_FLOAT_COMPLEX, H5T_NATIVE_DOUBLE_COMPLEX, H5T_NATIVE_LDOUBLE_COMPLEX, H5T_CPLX_IEEE_F16LE, H5T_CPLX_IEEE_F16BE,
H5T_CPLX_IEEE_F32LE, H5T_CPLX_IEEE_F32BE,
H5T_CPLX_IEEE_F64LE and H5T_CPLX_IEEE_F64BE

Adds hard and soft datatype conversion paths between complex number datatypes and all the integer and floating-point datatypes, as well as between other complex number datatypes

Adds a special conversion path between complex number datatypes and array or compound datatypes where the in-memory layout of data is the same between the datatypes and data can be converted directly

Adds support for complex number datatypes to the h5dump, h5ls and h5diff/ph5diff tools. Allows h5dump '-m' option to change floating-point printing format for float complex and double complex datatypes, as well as long double complex if it has the same size as double complex

Adds minimal support to the h5watch and h5import tools

Adds support for the predefined complex number datatypes and H5Tcomplex_create function to the Java wrappers. Also adds initial, untested support to the JNI for future use with HDFView

Adds support for just the H5T_COMPLEX datatype class to the Fortran wrappers

Adds support for the predefined complex number datatypes and H5Tcomplex_create function to the high level library H5LT interface for use with the H5LTtext_to_dtype and H5LTdtype_to_text functions

Changes some usages of "complex" in the library since it conflicts with the "complex" keyword from the complex.h header. Also changes various usages of the word "complex" throughout the library to distinguish compound datatypes from complex datatypes.

@jhendersonHDF jhendersonHDF added Merge - Develop Only This cannot be merged to previous versions of HDF5 (file format or breaking API changes) Priority - 0. Blocker ⛔ This MUST be merged for the release to happen Component - C Library Core C library issues (usually in the src directory) Component - Tools Command-line tools like h5dump, includes high-level tools Component - High-Level Library Code in the hl directory Component - Documentation Doxygen, markdown, etc. Component - Java Java wrappers Component - Fortran Fortran wrappers Component - Testing Code in test or testpar directories, GitHub workflows Component - Build CMake, Autotools Type - New Feature Add a new API call, functionality, or tool Component - Misc Anything else (CODEOWNERS, etc.) labels Jul 8, 2024
@jhendersonHDF
Copy link
Collaborator Author

Additional complex number features that haven't been implemented yet are mentioned in #4631.

if (c_classtype == H5T_OPAQUE) *classtype = H5T_OPAQUE_F;
if (c_classtype == H5T_COMPOUND) *classtype = H5T_COMPOUND_F;
if (c_classtype == H5T_REFERENCE) *classtype = H5T_REFERENCE_F;
if (c_classtype == H5T_ENUM) *classtype = H5T_ENUM_F;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't look like this commented out section was useful, but can be put back.

@@ -3915,7 +3915,7 @@ test_nbit_compound_2(hid_t file)
unsigned int v;
char b[2][2];
atomic d[2][2];
} complex;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This structure had to be renamed since "complex" is a keyword from complex.h and conflicted

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

h5dump XML support was added and I verified it with a couple files, but no XML tests were added because the XML output always prints data (either dataset data or fill values) and long double/long double _Complex data displays differently across platforms due to sizeof(long double) == sizeof(double) with MSVC. h5dump would need some work for these tests to be portable.

*
* Return: void
*-------------------------------------------------------------------------
*/

static void
aux_tblinsert_filter(pack_opttbl_t *table, unsigned int I, filter_info_t filt)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to be changed since it is a keyword from the complex.h header and conflicted.

@@ -1051,11 +1053,41 @@ parse_command_line(int argc, const char *const *argv)
h5tools_nCols = 0;
break;

case 'm':
case 'm': {
char *format_specifier, *plus_flag;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

h5dump's "-m" option allows specifying the printf specifier for float and double data, as well as long double data if sizeof(long double) == sizeof(double). Rather than having a separate flag for float/double _Complex data and a possible additional one for long double _Complex data, I just re-used this flag. A better solution is likely needed if XML output will continue to be supported, since the "-m" flag applying to long double (_Complex) data or not, depending on the platform, makes portable h5dump tests difficult.

goto padding;

if (except_ret == H5T_CONV_UNHANDLED)
H5T__bit_set(d, (dst_atomic.offset + dst_atomic.prec - 1), (size_t)1, true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're using the H5T__bit_set, etc. routines so heavily now, I suggest moving them into one of the H5T headers and making them 'static inline', to speed them up.

@@ -185,27 +177,47 @@ H5T_get_force_conv(const H5T_t *dt)
*-------------------------------------------------------------------------
*/
herr_t
H5T__reverse_order(uint8_t *rev, uint8_t *s, size_t size, H5T_order_t order)
H5T__reverse_order(uint8_t *rev, uint8_t *s, const H5T_t *dtype)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After looking at this again, the function name is very misleading. It should be "H5T__to_le_order" or similar. If you are on a little-endian machine, it definitely doesn't reverse the order.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed and ideally we would be able to refactor the conversion functions to not have to byte-swap data into little-endian order in the first place, removing the need for reversing the buffer before calling the exception callback function.

src/H5Tpkg.h Show resolved Hide resolved
src/H5Tpkg.h Outdated Show resolved Hide resolved
tools/lib/h5tools.h Outdated Show resolved Hide resolved
"b" +4 IEEE 64-bit little-endian float
} 12 bytes
"b" +8 IEEE 64-bit little-endian float
} 16 bytes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did this change?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My guess is that the file must have previously been generated on some system where the compiler used different structure padding.

Copy link
Contributor

@qkoziol qkoziol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing amount of great work here!

Four changes seem important to me:

  • Add the fuzzing guards on the complex datatype's encoding.
  • Add the version field to the complex datatype encoding, so that if we ever add heterogeneous complex types, we don't have to write as complex of a parser.
  • Switch H5Tset_size's behavior
  • Remove the 'homogeneous' flag from the complex datatype struct.

@jhendersonHDF
Copy link
Collaborator Author

* Add the fuzzing guards on the complex datatype's encoding.

The previous comment was made in the decode helper routine. Do we want checks both during encode and decode?

* Add the version field to the complex datatype encoding, so that if we ever add heterogeneous complex types, we don't have to write as complex of a parser.

What's the purpose of having a complex number-specific encoding version field here? Is it just to check that no heterogeneous complex types are allowed in, e.g. version 0?

@qkoziol
Copy link
Contributor

qkoziol commented Jul 10, 2024

* Add the fuzzing guards on the complex datatype's encoding.

The previous comment was made in the decode helper routine. Do we want checks both during encode and decode?

Sorry, I meant decoding.

* Add the version field to the complex datatype encoding, so that if we ever add heterogeneous complex types, we don't have to write as complex of a parser.

What's the purpose of having a complex number-specific encoding version field here? Is it just to check that no heterogeneous complex types are allowed in, e.g. version 0?

🤔 I worked through all the permutations I can think of again, and I think it'll be OK. I was concerned about the possible addition of another parent datatype for the heterogeneous case, but I think the existing flag will be OK for the forward/backward compatibility So, I think we can skip adding the version field.

@jhendersonHDF
Copy link
Collaborator Author

Most of the major previous comments should be addressed now. I haven't made the change to H5Tset_size yet as I still feel fairly strongly about that and I don't think we necessarily came to any real conclusion on it. There are various other comments on other possible optimizations or refactoring that could be made, but many of them seem like things that could fall outside this PR.

@derobins derobins added this to the 1.16.0 milestone Aug 15, 2024
@qkoziol
Copy link
Contributor

qkoziol commented Aug 26, 2024

Most of the major previous comments should be addressed now. I haven't made the change to H5Tset_size yet as I still feel fairly strongly about that and I don't think we necessarily came to any real conclusion on it. There are various other comments on other possible optimizations or refactoring that could be made, but many of them seem like things that could fall outside this PR.

Perhaps we should have a short, small audience meeting to discuss the H5Tset_size() issue?

@jhendersonHDF jhendersonHDF force-pushed the feature/complex_number_support branch from aeff840 to 0b0b1dc Compare October 2, 2024 23:55
@jhendersonHDF jhendersonHDF removed the Merge - Develop Only This cannot be merged to previous versions of HDF5 (file format or breaking API changes) label Oct 3, 2024
@jhendersonHDF jhendersonHDF force-pushed the feature/complex_number_support branch 2 times, most recently from eee1cef to 756db47 Compare October 9, 2024 21:13
Adds the new datatype class H5T_COMPLEX

Adds the new API function H5Tcomplex_create which creates a complex
number datatype from an ID of a base floating-point datatype

Adds the new feature check macros H5_HAVE_COMPLEX_NUMBERS and
H5_HAVE_C99_COMPLEX_NUMBERS

Adds the new datatype size macros H5_SIZEOF_FLOAT_COMPLEX,
H5_SIZEOF_DOUBLE_COMPLEX and H5_SIZEOF_LONG_DOUBLE_COMPLEX

Adds the new datatype ID macros H5T_NATIVE_FLOAT_COMPLEX,
H5T_NATIVE_DOUBLE_COMPLEX, H5T_NATIVE_LDOUBLE_COMPLEX,
H5T_CPLX_IEEE_F16LE, H5T_CPLX_IEEE_F16BE,
H5T_CPLX_IEEE_F32LE, H5T_CPLX_IEEE_F32BE,
H5T_CPLX_IEEE_F64LE and H5T_CPLX_IEEE_F64BE

Adds hard and soft datatype conversion paths between complex number
datatypes and all the integer and floating-point datatypes, as well as
between other complex number datatypes

Adds a special conversion path between complex number datatypes and
array or compound datatypes where the in-memory layout of data is the
same between the datatypes and data can be converted directly

Adds support for complex number datatypes to the h5dump, h5ls and
h5diff/ph5diff tools. Allows h5dump '-m' option to change floating-point
printing format for float complex and double complex datatypes, as well
as long double complex if it has the same size as double complex

Adds minimal support to the h5watch and h5import tools

Adds support for the predefined complex number datatypes and
H5Tcomplex_create function to the Java wrappers. Also adds initial,
untested support to the JNI for future use with HDFView

Adds support for just the H5T_COMPLEX datatype class to the Fortran
wrappers

Adds support for the predefined complex number datatypes and
H5Tcomplex_create function to the high level library H5LT interface
for use with the H5LTtext_to_dtype and H5LTdtype_to_text functions

Changes some usages of "complex" in the library since it conflicts with
the "complex" keyword from the complex.h header. Also changes various
usages of the word "complex" throughout the library to distinguish
compound datatypes from complex datatypes.
@derobins derobins merged commit 90429f5 into develop Oct 31, 2024
132 checks passed
brtnfld pushed a commit to brtnfld/hdf5 that referenced this pull request Nov 4, 2024
* Implement support for complex number datatypes

Adds the new datatype class H5T_COMPLEX

Adds the new API function H5Tcomplex_create which creates a complex
number datatype from an ID of a base floating-point datatype

Adds the new feature check macros H5_HAVE_COMPLEX_NUMBERS and
H5_HAVE_C99_COMPLEX_NUMBERS

Adds the new datatype size macros H5_SIZEOF_FLOAT_COMPLEX,
H5_SIZEOF_DOUBLE_COMPLEX and H5_SIZEOF_LONG_DOUBLE_COMPLEX

Adds the new datatype ID macros H5T_NATIVE_FLOAT_COMPLEX,
H5T_NATIVE_DOUBLE_COMPLEX, H5T_NATIVE_LDOUBLE_COMPLEX,
H5T_CPLX_IEEE_F16LE, H5T_CPLX_IEEE_F16BE,
H5T_CPLX_IEEE_F32LE, H5T_CPLX_IEEE_F32BE,
H5T_CPLX_IEEE_F64LE and H5T_CPLX_IEEE_F64BE

Adds hard and soft datatype conversion paths between complex number
datatypes and all the integer and floating-point datatypes, as well as
between other complex number datatypes

Adds a special conversion path between complex number datatypes and
array or compound datatypes where the in-memory layout of data is the
same between the datatypes and data can be converted directly

Adds support for complex number datatypes to the h5dump, h5ls and
h5diff/ph5diff tools. Allows h5dump '-m' option to change floating-point
printing format for float complex and double complex datatypes, as well
as long double complex if it has the same size as double complex

Adds minimal support to the h5watch and h5import tools

Adds support for the predefined complex number datatypes and
H5Tcomplex_create function to the Java wrappers. Also adds initial,
untested support to the JNI for future use with HDFView

Adds support for just the H5T_COMPLEX datatype class to the Fortran
wrappers

Adds support for the predefined complex number datatypes and
H5Tcomplex_create function to the high level library H5LT interface
for use with the H5LTtext_to_dtype and H5LTdtype_to_text functions

Changes some usages of "complex" in the library since it conflicts with
the "complex" keyword from the complex.h header. Also changes various
usages of the word "complex" throughout the library to distinguish
compound datatypes from complex datatypes.
brtnfld pushed a commit to brtnfld/hdf5 that referenced this pull request Nov 4, 2024
* Implement support for complex number datatypes

Adds the new datatype class H5T_COMPLEX

Adds the new API function H5Tcomplex_create which creates a complex
number datatype from an ID of a base floating-point datatype

Adds the new feature check macros H5_HAVE_COMPLEX_NUMBERS and
H5_HAVE_C99_COMPLEX_NUMBERS

Adds the new datatype size macros H5_SIZEOF_FLOAT_COMPLEX,
H5_SIZEOF_DOUBLE_COMPLEX and H5_SIZEOF_LONG_DOUBLE_COMPLEX

Adds the new datatype ID macros H5T_NATIVE_FLOAT_COMPLEX,
H5T_NATIVE_DOUBLE_COMPLEX, H5T_NATIVE_LDOUBLE_COMPLEX,
H5T_CPLX_IEEE_F16LE, H5T_CPLX_IEEE_F16BE,
H5T_CPLX_IEEE_F32LE, H5T_CPLX_IEEE_F32BE,
H5T_CPLX_IEEE_F64LE and H5T_CPLX_IEEE_F64BE

Adds hard and soft datatype conversion paths between complex number
datatypes and all the integer and floating-point datatypes, as well as
between other complex number datatypes

Adds a special conversion path between complex number datatypes and
array or compound datatypes where the in-memory layout of data is the
same between the datatypes and data can be converted directly

Adds support for complex number datatypes to the h5dump, h5ls and
h5diff/ph5diff tools. Allows h5dump '-m' option to change floating-point
printing format for float complex and double complex datatypes, as well
as long double complex if it has the same size as double complex

Adds minimal support to the h5watch and h5import tools

Adds support for the predefined complex number datatypes and
H5Tcomplex_create function to the Java wrappers. Also adds initial,
untested support to the JNI for future use with HDFView

Adds support for just the H5T_COMPLEX datatype class to the Fortran
wrappers

Adds support for the predefined complex number datatypes and
H5Tcomplex_create function to the high level library H5LT interface
for use with the H5LTtext_to_dtype and H5LTdtype_to_text functions

Changes some usages of "complex" in the library since it conflicts with
the "complex" keyword from the complex.h header. Also changes various
usages of the word "complex" throughout the library to distinguish
compound datatypes from complex datatypes.
@jhendersonHDF jhendersonHDF deleted the feature/complex_number_support branch November 26, 2024 19:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component - Build CMake, Autotools Component - C Library Core C library issues (usually in the src directory) Component - Documentation Doxygen, markdown, etc. Component - Fortran Fortran wrappers Component - High-Level Library Code in the hl directory Component - Java Java wrappers Component - Misc Anything else (CODEOWNERS, etc.) Component - Testing Code in test or testpar directories, GitHub workflows Component - Tools Command-line tools like h5dump, includes high-level tools Priority - 0. Blocker ⛔ This MUST be merged for the release to happen Type - New Feature Add a new API call, functionality, or tool
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants