Revamp some deadlock analysis for GH #6437

Merged 9 commits into sxs-collaboration:develop on Jan 16, 2025

Contributor

@knelli2 knelli2 commented Jan 16, 2025

Proposed changes

While debugging nodegroup deadlocks, I found these changes helpful because they give more, and better organized, information.

The last commit was useful because after a deadlock I sometimes want to run an event to get some extra information. We have a phase for doing that (PostFailureCleanup), but it wasn't run after a deadlock. If my approach to implementing this needs a lot of discussion, I'm OK with leaving it out for now. I think the other commits are more important.

Upgrade instructions

For the BBH and SingleBH executables, you must add the following block to the input file:

EventsRunAtCleanup:
  ObservationValue: -1000.0
  Events:
    - Event1
    - Event2

Code review checklist

  • The code is documented and the documentation renders correctly. Run
    make doc to generate the documentation locally into BUILD_DIR/docs/html.
    Then open index.html.
  • The code follows the stylistic and code quality guidelines listed in the
    code review guide.
  • The PR lists upgrade instructions and is labeled bugfix or
    new feature if appropriate.

Further comments

@knelli2 knelli2 requested a review from nilsdeppe January 16, 2025 01:34
Member

@nilsdeppe nilsdeppe left a comment

LGTM! Thanks for doing this! A few small suggestions on printing. Please squash immediately :)

}
}

Parallel::fprintf(file_name, "%s\n", ss.str());
Member

It might be easier to read the output file if you do this print outside the for_each. Then you could do something like

============== BEGIN CALLBACKS ON NODE # =====================
...
============== END CALLBACKS ON NODE # =====================

to make the file easier to parse.
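The suggestion above could be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code from this PR: the helper name `format_callbacks_on_node` and the `callbacks` vector are hypothetical stand-ins for whatever the real `for_each` iterates over.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of the PR): formats every callback on one
// node as a single delimited block of text.
std::string format_callbacks_on_node(
    const std::size_t node, const std::vector<std::string>& callbacks) {
  std::ostringstream ss;
  ss << "============== BEGIN CALLBACKS ON NODE " << node
     << " =====================\n";
  for (const std::string& callback : callbacks) {
    // Only accumulate inside the loop; nothing is printed here.
    ss << callback << "\n";
  }
  ss << "============== END CALLBACKS ON NODE " << node
     << " =====================\n";
  return ss.str();
}
```

The returned string would then be handed to a single print call after the loop (in the diff above, that is the `Parallel::fprintf(file_name, "%s\n", ss.str())` call), so each node's output lands in the file as one block.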


stream_points(temporal_id);

Parallel::fprintf(file_name, "%s\n", ss.str());
Member

I would move the stream to outside the for loop. Then the block of text is guaranteed to be contiguous. You can then also delineate it with blocks like

=========== BEGIN INTERPOLATION TARGET ========


ss << difference;

Parallel::fprintf(file_name, "%s\n", ss.str());
Member

I would move the stream to outside the loops. Then the block of text is guaranteed to be contiguous. You can then also delineate it with blocks like

=========== BEGIN INTERPOLATOR =============

Contributor Author

Moving the print outside the for loop is good, but I can't move it outside the for_each because I write a new file for each target tag (the output can be long).
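A rough sketch of how the two points could fit together: one file per target tag, with each file's text built up completely before a single write. Everything here is an assumption for illustration, including the `TargetInfo` map, the function names, the file naming scheme, and the END delimiter (the review only shows a BEGIN line). It uses plain `std::ofstream` rather than the PR's actual printing code.

```cpp
#include <fstream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in: target tag name -> lines of deadlock information.
using TargetInfo = std::map<std::string, std::vector<std::string>>;

// Builds the whole text for one target before anything is written.
std::string format_target(const std::string& name,
                          const std::vector<std::string>& lines) {
  std::ostringstream ss;
  ss << "=========== BEGIN INTERPOLATION TARGET " << name << " ========\n";
  for (const std::string& line : lines) {
    ss << line << "\n";  // accumulate only, no output inside the loop
  }
  ss << "=========== END INTERPOLATION TARGET " << name << " ========\n";
  return ss.str();
}

// The outer loop plays the role of the for_each over target tags: each
// target gets its own file, written with one call.
void write_target_files(const TargetInfo& targets, const std::string& prefix) {
  for (const auto& [name, lines] : targets) {
    std::ofstream file(prefix + name + ".txt");
    file << format_target(name, lines);
  }
}
```

Because the write happens once per target, each file contains one block of text even though the files themselves are still produced inside the per-target loop.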

try {
if (force) {
set_terminate(true);
Member

It's not obvious why terminate should be true. Could you add a code comment?

Contributor Author

knelli2 commented Jan 16, 2025

Squashed!

@nilsdeppe nilsdeppe merged commit a0e4198 into sxs-collaboration:develop Jan 16, 2025
19 of 24 checks passed