
HDF5 Working Group

Dana Robinson edited this page Dec 31, 2024 · 135 revisions

The HDF5 Working Group meets weekly, on Thursdays at 10 am Central time. This meeting is for HDF5 library developers and anyone is welcome to attend. The purpose of this meeting is to discuss HDF5 library development. It is NOT intended for providing technical support.

Zoom link: https://us06web.zoom.us/j/89601195963

The agenda and any cancellations will be posted below and on the forum, usually on Monday. If you have any action items to discuss, please email derobins at hdfgroup dot org to get them added to the agenda. If there are no pressing issues by Monday, the meeting may be cancelled.

9 January 2025

Agenda

  • Review threadsafe H5FL package changes and testing (Quincey)

Notes

2 January 2025 - No Meeting

26 December 2024 - No Meeting

19 December 2024

Agenda

  • No meeting 26 December
  • Threadsafe object locking

Notes

12 December 2024

Agenda

  • Hosted by Neil Fortner, Dana is at AGU24 this week.
  • Please feel free to add your agenda items here!
  • Multithreaded collective I/O operations (Quincey)

Notes

5 December 2024

Agenda

  • HDF5 2.0 goings on
    • Autotools delete pending
    • The C++ wrappers will survive for another day (useful for RAII)
  • zlib default behavior?
    • On or off?
    • Fail or not?
  • Do we need H5Eprint3()? GH #4698 (https://github.com/HDFGroup/hdf5/issues/4698)
  • Discuss metadata performance issues:
    • Metadata cache config defaults
    • Maximum metadata cache size
    • Maximum "est_num_entries" size
    • Cache image disallowed in read only mode

Notes

HDF5 2.0 goings on

zlib default behavior

  • Current behavior:
    • Default ON
    • Not a configure failure if not found
    • If you rely on the default ON and zlib isn't found, you will create an HDF5 library with no compression
  • If we make the default OFF, people who rely on the defaults will create HDF5 libraries with no compression
  • Decisions:
    • CMake will fail if a selected option cannot be built
    • We'll leave zlib ON and try to do the best we can on platforms like Windows

H5Eprint3()

  • Nobody complained, we'll version

Let's decide on the _debug suffix (Scot)

  • 100%
  • Could be different in Windows/POSIX

Let's see what we can do about filters having to link to the HDF5 library

  • Still issues with Windows Debug/Release

Need to bump the default file format version in 2.0

Discuss metadata performance issues:

  • Problem with max number of external links in EFC (16-bit limit?)
    • Library complains when more than 64k, Werner has 140k
    • HDF5 should not arbitrarily have limits, even if high values will typically make a file system sad
    • Should change the library MD cache size limits
    • Definitely should try paged aggregation and page buffering
    • Are there crashes when you set the default cache size to the max?
      • John said there may be issues with going more than 32 MiB (max is 64 MiB?)
    • Should bump MD cache sizes (memory size, element counts)
      • John suggests that we may want to make the cache more aggressive about adapting to larger size working sets
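
To make the numbers above concrete: the notes only ask whether the limit is 16-bit, but if it is, 140k entries cannot be represented. The following is a minimal sketch in plain C, not HDF5 code. Both function names are hypothetical and exist only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: does `count` fit in an unsigned 16-bit field? */
static bool demo_fits_in_u16(uint32_t count)
{
    return count <= UINT16_MAX; /* UINT16_MAX is 65535 */
}

/* What a 16-bit field would hold if the count were stored without a check. */
static uint16_t demo_store_u16(uint32_t count)
{
    return (uint16_t)count; /* unsigned conversion wraps modulo 65536 */
}
```

With these helpers, a count of 140000 fails the range check and, if stored unchecked, would wrap to 8928. That is one reason an explicit error is preferable to silent truncation.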

28 November 2024

No Meeting - Happy Thanksgiving! 🦃

21 November 2024

This session is cancelled due to staff and community members attending SC24.

14 November 2024

7 November 2024

Agenda

  • HDF5 2.0 goings on (recent merges, etc.)
  • HDF5 2.0 planning (wiki, parent/child issues)
  • PR #5015 (concurrency feature)

HDF5 2.0

  • Send any feedback about 2.0 decisions ASAP
  • We turned on a few branch protection rules:
    • Dismiss stale pull request approvals when new commits are pushed
    • Require branches to be up to date before merging
  • We'll turn this branch protection rule on soon:
    • Require signed commits

PR #5015

  • Adds support for future concurrency
  • Everyone should review this so we can merge it before the next meeting
  • Note: CMake has trouble with C11 threads (see #5034)

31 October 2024

Agenda

24 October 2024

Agenda

Notes

Version numbers, branch names, etc.

  • General agreement that the

Sparse data API changes

  • Quincey suggests:
    • That we separate sparsity from the structured chunk implementation so other VOL connectors can do their own special sparse thing. Proposed an H5Pset_density() API call.
    • H5Dget_defined() should also work for any dataset (e.g., think of missing chunks)
    • Ditto H5Derase()
    • Native VOL connector API calls should be namespaced (e.g., H5Dnative_foo())

17 October 2024

Agenda

Notes

Misc

  • The direct VFD may not be SWMR-compatible due to direct I/O page machinations under the hood
  • Everyone should look over the sparse RFC for next week
    • Especially think about namespacing native VOL connector API calls
  • C++ wrappers, in general, would NOT be thread-safe (either ours or HighFive, etc.)
  • Elena says that Fortran should be thread-safe

HDF5 1.16 --> HDF5 2.0

  • General agreement with the slides shown at Oct 15 CtD
  • Surprisingly little pushback on removing the Autotools
  • Should check with the BlueBrain folks about C++ header implementation support
    • We can put HighFive in our CI
    • Might take it over and seek funding if it turns into a community project
  • Need to clearly describe what our new versioning scheme means for the file format
  • Changes a lot of heap allocations to stack allocations
  • Cleanup fixes several problems
  • Pulls H5FL out of the startup code, handy for initialization
  • Fixes a deficiency in the VOL API where context pointers were missing

10 October 2024 - Plugin working group

3 October 2024

Agenda

Notes

1.14.5 Release

  • Do we still support 32-bit operating systems?
  • This should be ready to merge now

Tentative VFD initialization changes

  • Discussed pro/con of init-at-once v. lazy, refactoring, other changes (qkoziol/refactor_h5fd_and_packages)
  • New private files for internal things in "demo" VFDs that use public API calls only
  • Definitely needs performance testing
  • WIP - will revisit when ready for a PR

26 September 2024

Dana is out at NOBUGS 2024 this week so Neil Fortner will host this meeting.

Agenda

Notes

  • 1.14.5 is available for testing. See forum post.
  • Quincey Koziol and Matt Larson: Thread safety shutdown issues: threads that have interacted with the library and acquired resources of some sort may still be executing when the library is attempting to shut down.
    • What if a thread is executing and holds file-related resources (assume it is holding dirty metadata; there are lots of special cases there)? If the process dies, we'll have a corrupted file. How do we get the data to the file safely so it doesn't corrupt?
    • Tell user: "don't do that."
    • See documentation in develop: https://github.com/HDFGroup/hdf5/blob/develop/doc/threadsafety-warning.md
    • Goes in 1.14.5, needs to be in release notes.
  • Sparse chunk API - Elena and Quincey have discussion items. Need to firm up schedule. Neil will follow-up.
  • File Format - Elena will have discussion items and will work with Dana to add to agenda.
  • Plug-in Working Group meeting on 3rd Thursday - October 17

19 September 2024

This was originally intended as an HDF5 Plugin Working Group meeting, though there is only one action item

Agenda here: https://github.com/HDFGroup/hdf5_plugins/wiki/HDF5-Plugin-Working-Group

So we'll also discuss the 1.14.5 release

Agenda

Notes

  • Not aware of any 1.14.5 issues
    • There was a minor Windows issue and this was fixed (still needs to go into 1.14 and 1.14.5)
    • ttsafe still has problems on slow machines, even with the latest patch
    • Some minor issues, but nothing that would hold the release (will be a release note)
  • PR #4856
    • Some nice VOL cleanup, very reasonable, needs review
  • Other PRs
    • Neil's #4843 - adapts the external dataset code to use the sec2 VFD trick for dealing with torn I/O
  • Python vs Perl
    • This is fine if someone wants to donate their time
    • It'd be nice to also replace the shell scripts with Python (assuming we keep the Autotools)
  • Misc issues
    • NVHPC has dt_arith problems with long doubles (Quincey/Nvidia will investigate)
    • Java tests may be encountering uncaught failures
    • Quincey says he's seen problems with gcc 14 and MacOS

12 September 2024

Agenda

  • 1.14.5 Release
    • Docs are a mess. Need to fix.
  • Discuss H5Tset_size() behavior in complex number PR
    • Decision: Disable H5Tset_size() for array, complex, and variable-length datatypes.

05 September 2024

Agenda

  • 1.14.5 release
  • Structured chunk discussion (follow-up from 22 Aug meeting)
  • H5Tset_size discussion

1.14.5 release

  • Still targeting late September
  • Any changes need to be in by Sep 13th (NEXT FRIDAY)
  • Pre-release the following week - PLEASE TEST!
  • There are some issues with MacOS signing w/ plugins, but these should be resolved before the release
  • Allen says HDF4 dmg files work and people should test
  • ttsafe failures are develop only and will not move to 1.14
  • Elena says:
    • Broken links
    • Missing docs (subfiling, etc.)

Structured chunk discussion

  • Don't bump B-tree, etc. versions (only messages change)
  • There is no new layout. Storage is still chunked. Only need to revise the chunked storage.
    • "Typo" in that final chunk dimension is datatype size (NEEDS TO BE DOCUMENTED)
    • Chunked storage property needs a version number field
    • Probably need to think more about the layout property
    • Some discussion about locality of sparse chunk descriptions in the file format doc (Quincey argues it needs to be closer to the structured chunk property section)
    • Much discussion about checksums, mandatory vs optional, metadata vs data, etc. - needs more discussion
    • Next discussion on October 10th

H5Tset_size discussion

29 August 2024

Agenda

  • 1.14.5 release
  • Structured chunk "working group"
  • Improved docs/website
  • oss-fuzz action discussion (#4784)
  • CMake examples threads detection
  • Mac OS dmg construction CI issue

1.14.5 release

  • Still targeting late September
  • Any changes need to be in by Sep 13th
  • Pre-release the following week - PLEASE TEST!
  • There are some issues with MacOS signing, but these should be resolved before the release
  • ttsafe failures are develop only and will not move to 1.14

Structured chunk work

  • Small working group will meet next week to discuss some ideas, will present findings during next week's HDF5 WG

Docs website

  • Should be ready to go today (please review #4718)

oss-fuzz

  • Testing sanitizers in CI is a great idea
  • Weird that we have known sanitizer errors, but the sanitizer checks pass
  • Should not run oss-fuzz as a part of CI

CMake examples threads detection

  • PR #4746
  • Confusing logic
  • Spirited discussion, but we should just remove this for now since there are no examples with threads
  • Should also rename HDF_ variables to HDF5_EXAMPLES_ for clarity

Mac OS CI failures

22 August 2024

Agenda

1.14.5 release

  • Still targeting late September
  • Any changes need to be in by Sep 13th
  • Pre-release the following week - PLEASE TEST!
  • There are some issues with MacOS signing, but these should be resolved before the release

Structured Chunk Changes

  • Everyone should look at the structured chunk RFC, RM entries, file format changes, etc. (link above)
  • We'll discuss in two weeks (Sept 5)

8 August 2024 - NO MEETING (Staff vacations)

1 August 2024 - NO MEETING (Staff vacations)

25 July 2024 - NO MEETING (ESIP)

18 July 2024 - Plugin working group

11 July 2024 - NO MEETING

4 July 2024 - NO MEETING

This is a holiday in the US.

27 June 2024

Agenda

  • Filter plugin working group
  • Git submodules - okay or no? (Dana)
  • A PEP-like process for library modifications (Dana)
  • Safely initializing global variables in MT-HDF5 (Quincey)

Filter plugin working group

  • The 2nd HDF5 WG meeting of the month will be the Filter Plugin Working Group (no other business that meeting)
  • Starts July 11th
  • Will announce on the forum and my Call the Doctor session next week
  • I'll email anyone who expressed interest at the last HUG (sadly, that session was not recorded)
  • Dana will send a link to the original filter working group whitepaper from 10+ years ago
  • Let's change the name to External Plugin Working Group and cover VOL connectors, VFDs, and filters

Git submodules

  • PR #4604 would bring in Doxygen Awesome and the recommended way to do this is via a submodule
  • Submodules can be difficult for people new to them
    • We probably wouldn't ever modify the code in the submodule so it'd be less of a burden
    • The only thing we'd ever have to do is update the target to point to newer versions (and there might be actions that keep it in sync, like dependabot)
  • Definitely need to update the docs and alert/educate developers
  • (Jordan) CMake can pull things for you. Best for things that rarely change.
  • DECISION: Let's just copy the files over. There are not a lot.

HDF5 PEPs - HEPs?

  • PEP 1 describes the process (https://peps.python.org/pep-0001/)
  • They also have a coding style PEP (https://peps.python.org/pep-0008/)
  • They have a lot of infrastructure in their peps repo that we could probably modify (https://github.com/python/peps)
  • I mainly want to use the PEP concept and infrastructure, but modify the governance to suit our project (so no intention to bring over unmodified)
  • I want to announce this at the HUG, so we have a month to argue about specifics
  • Aleksandar suggests we use Markdown instead of rst (there are also fancier versions of Markdown)
  • We'll need to see how the PEP docs handle more complicated documents (is rst good enough?)
  • John Mainzer suggests we talk to the PEP guys to see what works and what does not (Aleksandar has some experience here)

Initializing global state

20 June 2024 - NO MEETING

13 June 2024 - NO MEETING

6 June 2024

Agenda

  • Poisoning errors discussion (Nvidia)
  • Wrapping user callbacks (Nvidia)

Poisoning errors discussion

  • NOTE: This is all tentative and work in progress
  • Would allow assert-like behavior in production builds
  • Would add "poisoned" checks to public API calls via the FUNC_ENTER macros
  • Would prevent library state change when poisoned
  • Would simply return the error value (with no error stack, since creating error stacks changes library state)
  • Might want a mechanism to switch between abort() and returning an error
  • HPOISON; macro would poison the library
  • H5is_library_poisoned() call would return poison/not poisoned state w/o changing library state (just returns a global var)
  • Is this really necessary?
    • Might be useful when we can't rebuild the library
    • No obvious driver for this
    • Adds complexity
  • It would be nice to see other examples of poisoning in a library or have pointers to other discussions
  • Would probably be difficult to apply consistently in the library
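
The mechanism described above (a global flag, a check on API entry, and an error return that leaves state untouched) can be sketched in a few lines of standalone C. This is only an illustration under assumptions. Every `demo_`/`DEMO_` name is hypothetical and stands in for the HPOISON macro, the H5is_library_poisoned() call, and the FUNC_ENTER check mentioned in the notes. It is not the proposed implementation.

```c
#include <stdbool.h>

/* All names here are hypothetical stand-ins, not HDF5 symbols. */
static bool demo_poisoned = false; /* global "poisoned" flag */
static int  demo_state    = 0;     /* stands in for mutable library state */

/* Counterpart of the HPOISON macro described in the notes. */
#define DEMO_POISON do { demo_poisoned = true; } while (0)

/* Reports the flag without touching any other state. */
static bool demo_is_library_poisoned(void)
{
    return demo_poisoned;
}

/* Entry check, as a FUNC_ENTER-style macro might do it: return the error
 * value before any state is modified and without building an error stack. */
#define DEMO_API_ENTER(err_value) \
    do { if (demo_poisoned) return (err_value); } while (0)

/* A public-API-like call that mutates state. */
static int demo_api_increment(void)
{
    DEMO_API_ENTER(-1);
    demo_state++;
    return 0;
}
```

Before poisoning, demo_api_increment() succeeds and bumps the counter. After DEMO_POISON it returns -1 and the counter stays where it was. A concurrent build would presumably need an atomic flag, which this single-threaded sketch leaves out.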

Wrapping user callbacks

  • Needed when a callback can leave the library (not internal callbacks)
  • Hard to differentiate when we are doing library vs external calls
    • Adds a performance hit when we are doing internal calls
    • e.g., VOL callbacks
  • Add H5_BEFORE/AFTER_USER_CB() macros
    • No-ops when not a concurrency-safe library
  • Add H5TS_user_cb_prepare()
  • ~150 "callback points"
  • ~300 places where we need the macros
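
As a rough picture of the bracketing macros above, here is a standalone sketch in which the before/after pair compiles to no-ops unless a concurrency switch is defined. What the real prepare step does is not stated in these notes. The depth counter below is purely an assumption for illustration, and all `demo_`/`DEMO_` names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical build switch; remove it to model a non-concurrent build. */
#define DEMO_CONCURRENCY

/* Greater than zero while user code is running (illustrative bookkeeping only). */
static int demo_outside_depth = 0;

#ifdef DEMO_CONCURRENCY
static void demo_user_cb_prepare(void) { demo_outside_depth++; }
static void demo_user_cb_restore(void) { demo_outside_depth--; }
#define DEMO_BEFORE_USER_CB() demo_user_cb_prepare()
#define DEMO_AFTER_USER_CB()  demo_user_cb_restore()
#else
#define DEMO_BEFORE_USER_CB() ((void)0)
#define DEMO_AFTER_USER_CB()  ((void)0)
#endif

typedef int (*demo_user_cb_t)(void *udata);

/* A "callback point": the library hands control to user code. */
static int demo_call_user_cb(demo_user_cb_t cb, void *udata)
{
    int ret;

    DEMO_BEFORE_USER_CB();
    ret = cb(udata);
    DEMO_AFTER_USER_CB();
    return ret;
}

/* Example user callback: records the depth it observes. */
static int demo_record_depth(void *udata)
{
    *(int *)udata = demo_outside_depth;
    return 0;
}
```

Wrapping every call site this way is what makes the macro count roughly double the number of callback points. It also shows why purely internal callbacks would pay for bookkeeping they do not need.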

30 May 2024

Agenda

  • CMake discussion
  • Poisoning discussion
  • Should we have an HDF5 1.16.0 release in the fall?
  • What do we do about the C++ wrappers?

CMake Plan

See: https://forum.hdfgroup.org/t/community-input-hdf5-cmake-overhaul/12364/2

For HDF5 1.14.5, we’re planning on overhauling our CMake build code, to bring it more in line with modern CMake conventions.

We’re still putting together our assessment/plan, but we’d like to:

  • Consolidate and reorganize the existing CMake code, to make things easier to understand and find
  • Move to “modern CMake”
    • Convert macros to functions
    • Add additional functions to avoid code duplication
  • Ensure parity with the Autotools
  • Add verification/testing to CI to ensure builds are correct
  • Add documentation

Useful "modern CMake" links

Poisoning discussion

  • Adding a way to "poison" the library so subsequent API calls will just fail and not mutate state
    • Helpful for debugging
    • Could avoid making problems worse when library state gets corrupted
    • Concurrency is the driving force for this
  • Would be mid-way in impact between assert/abort and just returning an error code
  • Could be implemented easily via a global "poisoned" variable that is checked on API entry
  • Consensus is "maybe, but let's think about it"
  • Should probably be off by default, at least in release builds
  • We should set up an RFC/PEP-like process for HDF5 so things like this can be documented and discussed

HDF5 1.16.0 in fall 2024

  • Complex number support requires bumping a datatype version number and that should happen in a major release
  • Should we do a major release (probably 1.16.0) in the fall to get complex number support out?
  • Tentative 'yes' and we'll start planning (more news at the June 3 Call the Doctor)
  • Might bump some library defaults (e.g., to better work in the cloud)
  • Could make minor API changes (e.g., #3505 HDoff_t change)
  • Should think about bumping the minimum CMake version to get workflows (3.25?)
  • Might also think about making some build system changes (drop obsolete things, rename for consistency)
  • Probably not move to C11 to make the upgrade path easier, but maybe bump develop to C11 after creating the 1.16 branch
  • Would try to make upgrading from 1.14.x as easy as possible
  • Would no longer support 1.14.x (so 1.14.4 would be the last 1.14 release)

C++ Wrappers?

  • These have lagged far behind the library, feature-wise
  • Do we update them or deprecate and eventually remove them? There are header-only C++ wrappers that provide a more modern interface (not maintained by HDFG)
  • We'll ask the community (forum posts, CtD, state-of-HDF5 talks)

23 May 2024

Agenda

  • Clean out PR queue

Actions

NOTE: In general, PRs that will never be merged but contain useful ideas should be closed and an issue/discussion created instead. PRs that are a work in progress, are under active development, and should not be merged until important changes have been made should be marked draft while the work takes place. Everything else in the PR list should be ready for review and possible merging.

  • 1387 - (H5T optimization) - This PR will never be merged and was being kept as a reminder to investigate the useful bits of the PR. We will close it and create an issue.
  • 3505 - (HDoff_t) - This PR or an equivalent will be merged before June. It cannot go to 1.14 since it's an API change (which will be unversioned). Needs better docs and probably an issue to ensure docs are correct in next major release.
  • 4171 - (NVHPC update) - Will remain open while we investigate why long double conversions are failing.
  • 4266 - (uninit H5T memory) - Close and create an issue. Needs to be fundamentally fixed elsewhere.
  • 4315 - (datatype precision overflow) - Leave open. Needs minor tweaks.
  • 4347 - (vasprintf) - Leave open. Will be converted to use HD prefix.
  • 4469 - (big threading PR) - Needs review. Copyright okay now.
  • 4475 - (pause error checking) - Okay to review and merge. Needs an issue to update the docs, especially the VOL connector author guide. Also probably needs VOL connector upgrades. Might be a 2.0 thing.
  • 4487 - (zlib-ng) - Can remain open while we investigate the test failure.
  • 4488 - (URLs) - Can remain open as a draft but needs a plan for resolution.
  • 4500 - (CMake UNITY_BUILD) - Review and merge.

16 May 2024 - CANCELLED (no agenda items)

9 May 2024

Agenda

  • Make sure everyone can connect to Zoom
  • Complex number datatype creation API
    • Does it make sense to allow specifying member names for real and imaginary part of complex numbers and store that in the file?
    • Should we design the API to allow different datatypes for real and imaginary part, based on forum discussion?
  • When/if we move to C11, alignment checking of C types in the library can be replaced with _Alignof(type) / alignof(type) (the latter is a <stdalign.h> macro in C11 and becomes a keyword in C23)
  • Additional locking discussion (Quincey Koziol)
  • PRs and such

Complex number datatype discussion

  • Does it make sense to allow specifying member names for real and imaginary part of complex numbers and store that in the file?
    • Use case is mainly display
    • Could be contentious
    • Might be better to push this to the tools
  • Should we design the API to allow different datatypes for real and imaginary part, based on forum discussion?
    • Is there a compelling use case for this?
  • Jordan will push the feature to a branch soon
  • Could make complex numbers a datatype class and select between cartesian/polar
  • Alternatively, just support C complex type (like how the rest of the library basically works)

When/if we move to C11, alignment checking of C types in the library can be replaced with _Alignof(type) / alignof(type) (the latter is a <stdalign.h> macro in C11 and becomes a keyword in C23)

  • It makes sense to allow this when we move to C11 (in next major version?)
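
For comparison, here is a small sketch of the two approaches. One is the classic struct-offset idiom for probing alignment without C11. This is shown as a generic technique, not as a copy of what the library does. The other is the C11 `_Alignof` operator. The probe struct and function names are hypothetical.

```c
#include <stddef.h>

/* Pre-C11 idiom: the offset of a member placed right after a single char
 * equals that member type's alignment requirement inside a struct. */
struct demo_probe_double { char c; double x; };
struct demo_probe_int    { char c; int x; };

static size_t demo_align_double_by_offset(void)
{
    return offsetof(struct demo_probe_double, x);
}

static size_t demo_align_int_by_offset(void)
{
    return offsetof(struct demo_probe_int, x);
}

/* C11: ask the compiler directly. */
static size_t demo_align_double_c11(void)
{
    return _Alignof(double);
}

static size_t demo_align_int_c11(void)
{
    return _Alignof(int);
}
```

On common platforms the two approaches agree. The C11 form removes the need for a probe struct per type.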

Climate workshop (Elena)

  • Switched to Zarr from HDF5 due to lack of SWMR
  • LZ4 filter function issues - chunk can't be uncompressed directly

Zarr discussion

  • Need to address MT, etc. concerns
  • We should support FP8, which Zarr does not

Additional locking discussion (Quincey Koziol)

  • Sent along a doc (in chat)

PRs and such

2 May 2024

Agenda

Move to Zoom (starting next week)

  • Moving to Zoom so it's easier for outside people to join
  • Agenda will be listed here and announced on the forum on Mondays
  • I'll send a Zoom link to everyone who is on the existing Teams invite list
  • Might use one of the meetings per month for the filter working group

Governance

  • Will be looking at other projects to get some ideas for how to better manage the HDF5 community

PRs and such

  • Do we check in generated files? (#4453, which commits newly-added, generated files) - Quincey suggests having a script to generate the files as a convenience
  • Should the codestack removal go to 1.14? (#4454) - No strong feelings, definitely broken and nobody complained, just a configure change, let's remove

Removing Autotools support

  • Supporting two build systems is a lot of work
  • libtool is a mess and requires sed hacks to fix linker options (#4448)
  • How much screaming will there be if we drop the Autotools in the next major version of the library?
  • What about Ubuntu, etc.?
  • Look for complaints in the forum

Is it okay to allocate file space when there is a null dataspace?

  • An oversight, nothing gets allocated at zero bytes
  • Need to update the docs

Discuss locking protocol for multithreaded concurrency planning

25 April 2024

Agenda

  • PRs and such
  • Property lists (Lifeboat)

PRs and such

  • FUNC_ENTER has some controversies and needs investigation
  • Others can go in after review requirements are met

Property lists (Lifeboat)

  • Do we need to support modifying a property list class after properties exist? (yes)
  • Do we need to support deleting properties from a class? (maybe)
  • Existing library is buggy in that properties can be deleted from a property list class and this will affect existing property lists (John Mainzer will file an issue)

18 April 2024

Agenda

  • Yay, release
  • C11 in develop
  • Issues, etc.
  • Revisions to the error package (H5E)

Yay, release

  • It's out. Yay!

C11 in develop

  • Is it okay to require C11 in develop?
  • Tentative yes, but we need a policy
  • Also need to figure out where we are going to test and debug big-endian code

Issues, etc.

  • Not a lot going on
  • Let's get the H5Tconv refactoring in

Revisions to the error package (H5E)

  • NVidia thread-safety changes passing CI, PRs imminent
  • Trying to reduce H5I usage from H5E
  • H5E uses H5I for its data management, should revamp to store things in local data structures
  • Legit use cases for creating custom ID things (e.g., VOL connectors)
  • Will still use IDs, but only for public new classes that will be immediately converted to the internal things
  • So the public API will not change
  • Library internals should be refactored to avoid ID use internally, in general
  • Lifeboat says this is fine wrt what they are doing

11 April 2024

Agenda

  • Release goings on
  • (Pseudo-)random numbers in HDF5
  • What do we do with the last HDfoo() functions?
  • Do we keep the 'getting started' guide and similar docs in the GitHub wiki or with the code?
  • H5TRACE scheme removal
  • H5E changes && "have threads" vs "have threadsafe" macros (Quincey Koziol)

Release goings on

  • Release is TODAY 🎉

Pseudo-random numbers

What do we do with the last few HDfoo() functions?

Where to put helpful developer docs?

H5TRACE scheme removal

H5E changes

  • Quincey wants to not incr/decr library error IDs
  • Maybe cache error IDs in a table?
  • Will require further investigation

"have threads" vs "have threadsafe" macros

  • H5_HAVE_THREADS --> Has threads
  • H5_HAVE_THREADSAFE --> Has thread-safety (i.e., global lock)
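
The distinction between the two macros can be illustrated with a short preprocessor sketch. The `DEMO_` macros below are hypothetical stand-ins so the snippet compiles without the library. The rule that thread-safety implies threads is an assumption of this sketch, not something recorded in the notes.

```c
/* Hypothetical stand-ins for the two build-time macros; not HDF5 symbols. */
#define DEMO_HAVE_THREADS 1          /* a threading package is available */
/* #define DEMO_HAVE_THREADSAFE 1 */ /* API additionally guarded by a global lock */

/* Assumption of this sketch: thread-safety implies threads, not the reverse. */
#if defined(DEMO_HAVE_THREADSAFE) && !defined(DEMO_HAVE_THREADS)
#error "DEMO_HAVE_THREADSAFE requires DEMO_HAVE_THREADS"
#endif

/* Reports which of the three configurations this translation unit sees. */
static const char *demo_threading_mode(void)
{
#if defined(DEMO_HAVE_THREADSAFE)
    return "threads + global lock";
#elif defined(DEMO_HAVE_THREADS)
    return "threads only";
#else
    return "no threads";
#endif
}
```

Code that only needs threading primitives would test the first macro. Code that relies on the API being serialized would test the second.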

4 April 2024

Agenda

  • Release goings on
  • Locking protocols discussion (Quincey Koziol)

Release goings on

Locking protocols discussion

28 March 2024

Agenda

  • Release status
  • #3505: off_t --> HDoff_t
  • Quincey's threadsafety wrapper work
  • Do we mandate C11? (at least in develop)
  • PR/issue highlights
  • Crashproofing RFC

Release status

  • Delayed until early/mid-April (waiting on a security patch from Amazon)

#3505: off_t --> HDoff_t

  • We did not version anything when hid_t changed from int to int64_t between 1.8 and 1.10
  • Let's merge this PR (can't go to 1.14)
  • This is fine
  • Quincey sees off_t issues on his Mac w/ gcc 13

Do we mandate C11? (at least in develop)

PR/issue highlights

  • Big set of changes from Quincey / AWS has been merged (thanks!)
  • NOTE: These will be filed as Mitre CVEs

Quincey's threadsafety wrapper work

Crashproofing RFC

  • John Mainzer insists that VFD SWMR has many of these features
  • Will send out the RFC to meeting attendees
  • Will schedule a follow-up meeting to discuss
  • Neil will present at next week's call the doctor meeting

21 March 2024

Agenda

  • Upcoming release
  • PR/issue highlights

Upcoming release

  • Try to get your code in by Friday (March 22)
  • Release will be next Thursday (March 28)

PR/issue highlights

  • Issue #108 - Needs a file format bump, might want to boost to 64-bit chunks, could do as a part of sparse work
  • PR #4166 - Jordan will close, we'll revisit for 1.14.5 (possibly a doc change)

14 March 2024

Agenda

  • float16 goings on
  • PR/issue highlights

float16 goings on

  • Will merge PR on Friday

PRs and Issues

  • Not much exciting going on
  • John may have H5I lock-free code for us to look at in April
  • Mark can help us with HDF Java --> Maven in April
  • Quincey talked about the need for H5I locks even with lock-free data structures

7 March 2024

Agenda

  • float16 goings on
  • The TRACE macros and related
  • PR/issue highlights

float16 goings on

  • Going pretty well
  • Dealing with quirky platforms/compilers
  • Scot says there are some Fortran issues with Flang (REAL16 support is in Fortran 2023, and Flang is currently the only compiler that supports it)

TRACE macros

  • Discussion
  • Thoughts?
  • Jordan: Never found it to be useful at all
  • Decision is to merge existing PR, then tag in GH, and then create a PR to yank

PR/issue highlights

  • (#4070 - BE examples fail) What do we do about examples?
  • Jordan says h5ls and h5dump have opposite ideas about printing types (we should fix this)
  • (#4064 - ros3 VFD secret length) Can go in. Dana will fix the stack allocations later.
  • (#4062 - static CRT link) Can go in. Dana will update the PR to have a "don't use this!" comment
  • (#4060 - long double test) Can go in

29 February 2024 (No meeting - USENIX FAST)

22 February 2024

Agenda

  • PR #4017 (improves flush performance) <-- Click merge button if no objections
  • float16 goings on
  • Other library happenings

15 February 2024

Agenda

  • PR #4017 (improves flush performance)
  • Other library goings on

Notes

Flush performance PR

  • Looks good at a first glance
  • We'll discuss next week and merge after the HDF5WG

GitHub goings on

  • FAPL changes will propagate to sub-files in things like VDS, external links
  • Should change the behavior of H5Pset_elink_fapl() so you get the propagating behavior and only have to use the API call if you want something different

Quincey's NVidia todo list

  • ~20 project ideas regarding GDS, etc.

8 February 2024

Agenda

  • Float16 RFC update
  • Chat about chunk cache meeting

Notes

Float16, etc.

  • Brief discussion of complex datatypes, will keep on working on the RFC

Chunk cache discussion

  • Will kick this down the road until there are phase II funds (April?)

GitHub goings on

  • Need to have ABI checks when merging to 1.14
  • Need to be able to easily find ABI check results for develop, 1.14
  • Are build-only checks running tests?

Misc questions

  • Do the Autotools + gcc strip -g (or not pass it) to the compiler, linker, etc.? Is -s getting passed down and superseding -g? John M. will hassle Dana if --enable-symbols doesn't work.

1 February 2024

Agenda

  • Float16 RFC update
  • Citing HDF5 PR
  • HDF5 goings on
  • Debug API?

Notes

New types: float16, complex, Boolean

  • Do we need a recap of type conversion in the RFC?
  • We should definitely have examples in the docs for when a particular type is not supported
  • What about quad precision floats (128-bit; already supported by GNU, numpy, and Fortran; there's an IEEE standard)?

Citing HDF5 PR

  • Merged

HDF5 goings on

  • Nothing too exciting

Debug API

  • General agreement that some low-level API calls could go in a special debug header that would not be a formal part of the 'public API'

25 January 2024

Agenda

Notes

GCPL propagation

  • We'll propagate the properties to the intermediate groups
  • Might require some finagling to deal with non-group creation

New types: float16, complex, Boolean

  • We should look at FP8 as well, possibly other ML/DL convenience reduced types
  • Should also consider adding Boolean support from Jerome's RFC
  • Need to look at NVidia's compilers
  • NATIVE macros will map to H5I_INVALID_HID when there is no compiler support for a type
  • Complex should be a real atomic type, not a struct of two doubles or whatnot
  • Might add some conversions to common current complex number hacks, like a struct of two doubles, etc.
  • How will complex numbers be printed in h5dump, etc.?
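
The idea of a NATIVE macro that degrades to an invalid ID can be sketched without the library as follows. Everything prefixed `demo_`/`DEMO_` is hypothetical, including the placeholder ID value 42. The `__FLT16_MAX__` test is one way GCC and Clang expose `_Float16` availability, and is used here only as an example of a compiler-support check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for an ID type and its invalid value. */
typedef int64_t demo_hid_t;
#define DEMO_INVALID_HID ((demo_hid_t)-1)

/* Map the "native" type ID to the invalid ID when the compiler lacks the type. */
#ifdef __FLT16_MAX__
#define DEMO_NATIVE_FLOAT16 ((demo_hid_t)42) /* placeholder ID for illustration */
#else
#define DEMO_NATIVE_FLOAT16 DEMO_INVALID_HID
#endif

/* Callers check the ID at run time before using the type. */
static bool demo_type_supported(demo_hid_t type_id)
{
    return type_id != DEMO_INVALID_HID;
}
```

Application code can then call demo_type_supported(DEMO_NATIVE_FLOAT16) at run time and fall back to another type. This compiles everywhere, rather than failing to build on compilers that lack the type.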

18 January 2024

Agenda

  • Discuss how the chunk cache discussion went
  • Library PRs, issues, etc.

Notes

Discuss how the chunk cache discussion went

  • John Mainzer / Lifeboat will be reworking the chunk cache for sparse data and thread-safety
  • The chunk cache used to be per-file (currently per-dataset - performance was poor w/ per-file)
  • The per-file --> per-dataset change was in 1.6
  • John wants to go back to one cache per file or even one cache for many files
  • John is exploring data structures and algorithms (lock-free hash tables suitable?)
  • Actual work would only take place if Lifeboat gets a phase II grant (end of Feb)

Library PRs, issues, etc.

  • Jordan's type conversion fix is super important and needs to get into 1.14.4. Quincey will review.
  • The -shared CMake extensions for tools, libraries, etc. will be removed by default in 1.14.4. CMake will work more like the Autotools, though compatibility options will be provided if you want the old behavior.

11 January 2024

Agenda

  • Discuss https://github.com/HDFGroup/hdf5/pull/3927
  • Determine 2024 / 1.14.4 priorities
  • Quincey's warnhist improvements
  • John Mainzer has a question about chunk cache documentation
  • Elena Pourmal wants to discuss behavior when a compressed buffer is bigger after "compression"
  • We need an updated guide for updating across major/minor releases
  • Always need to reclaim vlen memory, even when using strings?

Notes

Citing HDF5

  • Tentative okay on #3927, but we should look into how other software does things. Is this the way people will do this in the future?
  • Do we have a DOI for HDF5? That should be a thing we add.

Priorities

  • Unicode
  • MinGW (especially w/ MSYS2 - AND DOCUMENT AND TEST)
  • Update to handle recent MPI session changes
  • Support modern C and ML/DL data types (float16, bool, complex, etc.)

Behavior when a buffer is bigger after "compression"

  • Existing behavior: Larger buffer is written out as "compressed"
  • Responsibility lies with filter
  • Need to fix the zlib deflate filter (Elena will file a bug report, may also affect szip)
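
To illustrate how the responsibility could sit with the filter, here is a toy run-length encoder that refuses to hand back output that is not smaller than its input. It is a standalone sketch, not the deflate filter and not the planned fix. The function name is hypothetical, and returning 0 for "do not use this output" is this sketch's own convention.

```c
#include <stddef.h>

/* Toy run-length encoder: writes (count, byte) pairs into `out`.
 * Returns the encoded size, or 0 if the result would not be smaller than
 * the input or would not fit in `out_cap`. */
static size_t demo_rle_filter(const unsigned char *in, size_t in_len,
                              unsigned char *out, size_t out_cap)
{
    size_t i = 0; /* read position in `in` */
    size_t o = 0; /* write position in `out` */

    while (i < in_len) {
        unsigned char byte = in[i];
        size_t        run  = 1;

        /* Extend the run, capped at 255 so the count fits in one byte. */
        while (i + run < in_len && in[i + run] == byte && run < 255)
            run++;

        if (o + 2 > out_cap)
            return 0; /* no room for another pair */

        out[o++] = (unsigned char)run;
        out[o++] = byte;
        i += run;
    }

    /* The size check discussed above: reject "compression" that grows the data. */
    return (o < in_len) ? o : 0;
}
```

A caller that sees 0 can store the original bytes instead. This is the behavior the notes suggest the filter, rather than the library, should be responsible for.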

Changes to bin/warnhist

  • Now can produce warning density and has more options

Chunk cache documentation?

  • Read the source, Luke
  • Rob Matzke may have an old document (location?)
  • John wants to discuss creating a more flexible chunk cache that can support sparse data better (will be in Champaign T/W next week)

Version updating

  • Version on web has dead links, is out of date

Vlen string cleanup

  • h5py not doing a great job at cleaning up after itself, has a memory leak
  • h5py needs updating to avoid the leak
  • Is h5py doing the right thing with new-style references? <-- New-style references in general probably need better documentation