Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pause recording errors instead of clearing the error stack #4475

Merged
merged 38 commits into from
Jun 18, 2024

Conversation

qkoziol
Copy link
Contributor

@qkoziol qkoziol commented May 10, 2024

An internal capability that's similar to the H5E_BEGIN_TRY / H5E_END_TRY macros in H5Epublic.h, but more efficient since we can avoid pushing errors on the stack entirely (and those macros use public API routines).

This capability (and other techniques) can be used to remove use of H5E_clear_stack() and H5E_BEGIN_TRY / H5E_END_TRY within library routines.

We want to remove H5E_clear_stack() because it can trigger calls to the H5I interface from within the H5E code, which creates a great deal of complexity for threadsafe code. And we want to remove H5E_BEGIN_TRY / H5E_END_TRY's because they make public API calls from within the library code.

Also some other minor tidying in routines related to removing the use of H5E_clear_stack() and H5E_BEGIN_TRY / H5E_END_TRY from H5Fint.c

qkoziol and others added 2 commits May 10, 2024 13:27
An internal capability that's similar to the H5E_BEGIN_TRY / H5E_END_TRY
macros in H5Epublic.h, but more efficient since we can avoid pushing errors on
the stack entirely (and those macros use public API routines).

This capability (and other techniques) can be used to remove use of
H5E_clear_stack() and H5E_BEGIN_TRY / H5E_END_TRY within library routines.

We want to remove H5E_clear_stack() because it can trigger calls to the H5I
interface from within the H5E code, which creates a great deal of complexity
for threadsafe code.  And we want to remove H5E_BEGIN_TRY / H5E_END_TRY's
because they make public API calls from within the library code.

Also some other minor tidying in routines related to removing the use of
H5E_clear_stack() and H5E_BEGIN_TRY / H5E_END_TRY from H5Fint.c

Signed-off-by: Quincey Koziol <[email protected]>
@qkoziol
Copy link
Contributor Author

qkoziol commented May 13, 2024

I have a change to make for this, please hold off on reviewing for the moment. I should have something up later today.

@derobins derobins added Merge - To 1.14 Priority - 1. High 🔼 These are important issues that should be resolved in the next release Component - C Library Core C library issues (usually in the src directory) Type - Improvement Improvements that don't add a new feature or functionality labels May 13, 2024
qkoziol added 2 commits May 13, 2024 11:34
Uses 'try open' routines now, down to the foundation of the operation

Signed-off-by: Quincey Koziol <[email protected]>
@qkoziol
Copy link
Contributor Author

qkoziol commented May 13, 2024

I have a change to make for this, please hold off on reviewing for the moment. I should have something up later today.

OK, I've updated things to a cleaner solution. Please go ahead with reviews now.

@qkoziol
Copy link
Contributor Author

qkoziol commented May 14, 2024

Interesting - removing the H5E_clear_stack() from the virtual dataset storage code has revealed a failure in the VDS SWMR test. I'm looking into it, but it's another motivator to get the clear stack calls out of the library.

@jhendersonHDF
Copy link
Collaborator

jhendersonHDF commented May 14, 2024

How does this interact with the old way of "pausing" the error stack by just removing the callback function temporarily? I'm thinking that this PR might need to add a way of publicly retrieving whether an error stack has been paused. The error macros I used in different VFDs/VOLs currently decide whether to suppress that driver's/connector's own error reporting based on whether the default error stack currently has a callback function set, so with these changes it seems I'd likely also need to check if the error stack has been paused to be correct.

@qkoziol
Copy link
Contributor Author

qkoziol commented May 14, 2024

How does this interact with the old way of "pausing" the error stack by just removing the callback function temporarily? I'm thinking that this PR might need to add a way of publicly retrieving whether an error stack has been paused. The error macros I used in different VFDs/VOLs currently decide whether to suppress that driver's/connector's own error reporting based on whether the default error stack currently has a callback function set, so with these changes it seems I'd likely also need to check if the error stack has been paused to be correct.

Setting the reporting callback to NULL just suppresses error reports from being printed when leaving an API routine, it doesn't pause actually pushing errors on the stack. This change allows the library to internally pause pushing errors on the stack in circumstances when it knows the operation could fail and would just call H5E_clear_stack() on a failure.

@qkoziol
Copy link
Contributor Author

qkoziol commented May 14, 2024

Currently, I don't see a need for an API for this, but I'm open to ideas around one.

brtnfld and others added 2 commits May 14, 2024 17:42
…up#4477)

* Properly clean up cache when failing to load an object header

* Don't check message type a second time in H5G__open_oid if the first attempt returns error

* Add more asserts to H5O__assert() to avoid segfaults

---------

Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
Copy link
Contributor

@bmribler bmribler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few minor comments

src/H5E.c Show resolved Hide resolved
test/error_test.c Outdated Show resolved Hide resolved
test/error_test.c Outdated Show resolved Hide resolved
qkoziol added 2 commits June 5, 2024 20:45
Signed-off-by: Quincey Koziol <[email protected]>
Signed-off-by: Quincey Koziol <[email protected]>
src/H5Fefc.c Show resolved Hide resolved
@derobins derobins marked this pull request as draft June 8, 2024 00:25
@derobins derobins marked this pull request as ready for review June 8, 2024 18:37
Copy link
Member

@derobins derobins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Jordan's comments need to be addressed before this can be merged

@jhendersonHDF
Copy link
Collaborator

Will do some more testing here to see if there's anything left to do for this PR.

@jhendersonHDF
Copy link
Collaborator

Minor new conflict from #4568, but this looks good to me now and local testing shows error stacks are retained at least from the H5F_open -> VFD path.

@qkoziol
Copy link
Contributor Author

qkoziol commented Jun 13, 2024

Minor new conflict from #4568, but this looks good to me now and local testing shows error stacks are retained at least from the H5F_open -> VFD path.

I've fixed the confilct; thanks for the review! Back to @derobins now :-)

@qkoziol
Copy link
Contributor Author

qkoziol commented Jun 17, 2024

@derobins - This should be ready to go, Jordan's approved.

@derobins
Copy link
Member

Needs a RELEASE.txt entry, but otherwise this can be merged

Signed-off-by: Quincey Koziol <[email protected]>
@derobins derobins merged commit 850d6c8 into HDFGroup:develop Jun 18, 2024
56 checks passed
@qkoziol qkoziol deleted the pause_errors branch June 18, 2024 21:24
byrnHDF pushed a commit to byrnHDF/hdf5 that referenced this pull request Jun 26, 2024
…4475)

An internal capability that's similar to the H5E_BEGIN_TRY / H5E_END_TRY
macros in H5Epublic.h, but more efficient since we can avoid pushing errors on
the stack entirely (and those macros use public API routines).

This capability (and other techniques) can be used to remove use of
H5E_clear_stack() and H5E_BEGIN_TRY / H5E_END_TRY within library routines.

We want to remove H5E_clear_stack() because it can trigger calls to the H5I
interface from within the H5E code, which creates a great deal of complexity
for threadsafe code.  And we want to remove H5E_BEGIN_TRY / H5E_END_TRY's
because they make public API calls from within the library code.

Also some other minor tidying in routines related to removing the use of
H5E_clear_stack() and H5E_BEGIN_TRY / H5E_END_TRY from H5Fint.c
lrknox pushed a commit to lrknox/hdf5 that referenced this pull request Jul 2, 2024
…4475)

An internal capability that's similar to the H5E_BEGIN_TRY / H5E_END_TRY
macros in H5Epublic.h, but more efficient since we can avoid pushing errors on
the stack entirely (and those macros use public API routines).

This capability (and other techniques) can be used to remove use of
H5E_clear_stack() and H5E_BEGIN_TRY / H5E_END_TRY within library routines.

We want to remove H5E_clear_stack() because it can trigger calls to the H5I
interface from within the H5E code, which creates a great deal of complexity
for threadsafe code.  And we want to remove H5E_BEGIN_TRY / H5E_END_TRY's
because they make public API calls from within the library code.

Also some other minor tidying in routines related to removing the use of
H5E_clear_stack() and H5E_BEGIN_TRY / H5E_END_TRY from H5Fint.c
lrknox added a commit that referenced this pull request Jul 3, 2024
* Fix typos in context/property documentation (#4550)

* Fix CI markdown link check http 500 errors (#4556)

Sites like GitLab can have internal problems that return http 500
errors while they fix their problems. Some sites also return http
200 OK, which is fine.

This PR adds a config file to the markdown link check so those
are considered "passing" and don't break the CI.

* Simplify property copying between lists internally (#4551)

* Add Python examples (#4546)

These examples are referred to from the replacement page of https://portal.hdfgroup.org/display/HDF5/Other+Examples.

* Correct property cb signatures in docs (#4554)

* Correct property cb signatures in docs
* Correct delete callback type name in docs
* add missing word to H5P__free_prop doc

* Move C++ and Fortran and examples to HDF5Examples folder (#4552)

* Document 'return-and-read' field in API context (#4560)

* Add compression includes to tests needing zlib support (#4561)

* Allow usage of page buffering for serial file access from parallel HDF5 builds (#4568)

* Remove old version of libaec (#4567)

* Add property names to context field docs (#4563)

* Document property shared name behavior (#4565)

* Clarify H5CX macro documentation (#4569)

* Document H5Punregister modifying default properties (#4570)

* Update NVHPC to 24.5 (#4171)

We don't test parallel in other GitHub actions, so this also converts the
NVHPC check to configure and build only while we discuss how we'll
test parallel HDF5 in GitHub.

There is a blocking GitHub issue to address the test failures for
HDF5 1.14.5 (#4571).

* Clean up comments in H5FDros3.c (#4572)

* Rename INSTALL_Auto.txt to INSTALL_Autotools.txt (#4575)

* Clean up ros3 VFD stats code (#4579)

* Removes printf debugging
* Simplifies and centralizes stats code
* Use #ifdef ROS3_STATS instead of #if
* Other misc tidying

* Turn off ros3 VFD stat collection by default (#4581)

Not a new change - an artifact from a previous check-in.

* Pause recording errors instead of clearing the error stack (#4475)

An internal capability that's similar to the H5E_BEGIN_TRY / H5E_END_TRY
macros in H5Epublic.h, but more efficient since we can avoid pushing errors on
the stack entirely (and those macros use public API routines).

This capability (and other techniques) can be used to remove use of
H5E_clear_stack() and H5E_BEGIN_TRY / H5E_END_TRY within library routines.

We want to remove H5E_clear_stack() because it can trigger calls to the H5I
interface from within the H5E code, which creates a great deal of complexity
for threadsafe code.  And we want to remove H5E_BEGIN_TRY / H5E_END_TRY's
because they make public API calls from within the library code.

Also some other minor tidying in routines related to removing the use of
H5E_clear_stack() and H5E_BEGIN_TRY / H5E_END_TRY from H5Fint.c

* Add page buffer cache command line option to tools (#4562)


Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>

* Clarify documentation for H5CX_get_data_transform (#4580)

* Correct comment for H5CX_get_data_transform

* Document why data transform ctx field doesnt use macro

* Remove public API call from ros3 VFD (#4583)

* Remove printf debugging from H5FDs3comms.c (#4584)

* Cleanup of ros3 test (#4587)

* Removed JS* macro scheme (replaced w/ h5test.h macros)
* Moved curl setup/teardown to main()
* A lot of cleanup and simplification

* Removed unused code from H5FDs3comms.c (#4588)

* H5FD_s3comms_nlowercase()
* H5FD_s3comms_trim()
* H5FD_s3comms_uriencode()

* Remove magic fields from s3comms structs (#4589)

* Remove dead H5FD_s3comms_percent_encode_char() (#4591)

* Rework the TestExpress usage and refactor dead code (#4590)

* Skip examples if running sanitizers (#4592)

* Clean up s3comms test code (#4594)

* Remove JS* macros
* Remove dead code
* Bring in line with other test code

* Add publish to bucket workflow (#4566)

* Update abi report CI workflow for last release (#4596)

* Update abi report workflow to handle 1.14.4.3 release

* Update name of java report

* Document that ctx VOL property isn't drawn from the FAPL (#4597)

* Update macos workflow to 14 (keep 13 as alternate) (#4603)

* Removed unnecessary call to H5E_clear_stack (#4607)

H5FO_opened and H5SL_search don't push errors on the stack

* Bring subfiling VFD code closer to typical library code (#4595)

Remove API calls, use FUNC_ENTER/LEAVE macros, use the library's error macros,
rename functions to have more standardized names, etc.

* Correct documentation for return-and-read fields (#4598)

* These two generators create strings without NUL for testing (#4608)

* Fix Fortran pkconfig to indicate full path of modules (#4593)

* Updated release schedule (#4615)

1.16 and 2.0 information

* Document VOL object wrapping context (#4611)

* Earray.c and farray.c in hdf5_1_14 still need time_t curr_time for HDsrandom.

* Remove line to use future 116_API from CMakeListat.txt files in HDF5
examples directories
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component - C Library Core C library issues (usually in the src directory) Priority - 1. High 🔼 These are important issues that should be resolved in the next release Type - Improvement Improvements that don't add a new feature or functionality
Projects
Status: Needs Merged
Development

Successfully merging this pull request may close these issues.

7 participants