Fix bug where AsyncioRunnable hangs if process_one throws and the source is not emitting new values #523

Merged (10 commits) on Jan 13, 2025

@dagardner-nv dagardner-nv commented Dec 18, 2024

Description

  • Fixes a bug first observed in NVIDIA-AI-Blueprints/vulnerability-analysis and reported in [BUG]: Exceptions raised in an LLMNode don't always halt the pipeline Morpheus#2086
  • AsyncioRunnable will now call on_state_update(state_t::Kill) when an exception is caught
  • Replace the blocking call to await_read with await_read_until, allowing AsyncioRunnable to periodically check stop_source.stop_requested()
  • Define a new await_read_until method in IEdgeReadable. Unfortunately, this interface has numerous subclasses, all of which then needed their own await_read_until methods, even though EdgeChannelReader is the only class that really needs it. Alternatives:
    • In AsyncSink, perform a static cast of this->get_readable_edge() to EdgeChannelReader
    • Define the await_read_until method in IEdgeReadable, but give it a default implementation that throws a not-implemented exception (or asserts false)
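
For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of why a timed read fixes the hang. It is not the MRC code: `ToyChannel`, `ReadStatus`, `read_until_stopped`, and the 10 ms poll interval are all hypothetical names and values chosen for illustration, and a plain `std::atomic<bool>` stands in for the stop source. The point is only that a reader which wakes up on a deadline can notice a stop request even when the source never emits another value.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical read result; not MRC's channel status type.
enum class ReadStatus { success, timeout, closed };

// Toy thread-safe queue standing in for a channel reader (an assumption, not EdgeChannelReader).
template <typename T>
class ToyChannel
{
  public:
    void write(T value)
    {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_queue.push_back(std::move(value));
        }
        m_cv.notify_one();
    }

    void close()
    {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_closed = true;
        }
        m_cv.notify_all();
    }

    // Waits for a value, but gives up once the deadline passes instead of blocking forever.
    template <typename ClockT, typename DurationT>
    ReadStatus await_read_until(T& out, const std::chrono::time_point<ClockT, DurationT>& deadline)
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        const bool ready = m_cv.wait_until(lock, deadline, [this] { return !m_queue.empty() || m_closed; });
        if (!ready)
        {
            return ReadStatus::timeout;
        }
        if (!m_queue.empty())
        {
            out = std::move(m_queue.front());
            m_queue.pop_front();
            return ReadStatus::success;
        }
        return ReadStatus::closed;
    }

  private:
    std::mutex m_mutex;
    std::condition_variable m_cv;
    std::deque<T> m_queue;
    bool m_closed{false};
};

// Reader loop: each timeout is a chance to re-check the stop flag, so an idle source cannot hang it.
inline void read_until_stopped(ToyChannel<int>& channel,
                               const std::atomic<bool>& stop_requested,
                               std::atomic<int>& values_read)
{
    while (!stop_requested.load())
    {
        int value = 0;
        const auto deadline = std::chrono::steady_clock::now() + std::chrono::milliseconds(10);
        const auto status   = channel.await_read_until(value, deadline);
        if (status == ReadStatus::success)
        {
            ++values_read;
        }
        else if (status == ReadStatus::closed)
        {
            break;
        }
        // On timeout: fall through and re-evaluate stop_requested.
    }
}
```

With a purely blocking read, setting the stop flag from another thread would have no effect until the next value arrived. In this sketch the reader exits within roughly one poll interval of the flag being set, which is the behavior this PR is after when process_one throws while the source is idle.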

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@dagardner-nv added the bug, non-breaking, DO NOT MERGE, and skip-ci labels Dec 18, 2024
@dagardner-nv self-assigned this Dec 18, 2024
@dagardner-nv requested a review from a team as a code owner December 18, 2024 19:33
… been raised in process_one, but AsyncioRunnable is blocked on read_async in the situation where the source isn't emitting any values

TODO: Remove debug logging
TODO: Remove static_pointer_cast
@dagardner-nv requested a review from a team as a code owner December 19, 2024 00:38
@dagardner-nv removed the DO NOT MERGE and skip-ci labels Dec 19, 2024
@dagardner-nv changed the title from "Add test to reproduce Morpheus issue #2086" to "Fix bug where AsyncioRunnable hangs if process_one throws and the source is not emitting new values" Dec 19, 2024

@willkill07 willkill07 left a comment

One nit on consistency of unimplemented behavior. Otherwise LGTM.

@dagardner-nv

/merge

@rapids-bot merged commit aaf402a into nv-morpheus:branch-25.02 Jan 13, 2025
18 checks passed
@dagardner-nv deleted the david-async-gen-2086 branch January 13, 2025 23:29
codecov bot commented Jan 13, 2025

Codecov Report

Attention: Patch coverage is 19.23077% with 42 lines in your changes missing coverage. Please review.

Project coverage is 74.01%. Comparing base (7d5e48f) to head (ddffa07).
Report is 1 commit behind head on branch-25.02.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| python/mrc/_pymrc/include/pymrc/node.hpp | 0.00% | 26 Missing ⚠️ |
| cpp/mrc/include/mrc/edge/edge_readable.hpp | 0.00% | 12 Missing ⚠️ |
| cpp/mrc/include/mrc/node/sink_properties.hpp | 0.00% | 2 Missing ⚠️ |
| cpp/mrc/include/mrc/node/source_properties.hpp | 0.00% | 2 Missing ⚠️ |
Additional details and impacted files

@@                Coverage Diff                @@
##           branch-25.02     #523       +/-   ##
=================================================
+ Coverage         54.34%   74.01%   +19.66%     
=================================================
  Files               372      407       +35     
  Lines             12553    15104     +2551     
  Branches           1104     1199       +95     
=================================================
+ Hits               6822    11179     +4357     
+ Misses             5731     3925     -1806     
| Flag | Coverage Δ |
|---|---|
| cpp | 69.46% <29.41%> (+21.65%) ⬆️ |
| py | 44.08% <0.00%> (-0.17%) ⬇️ |

Flags with carried forward coverage won't be shown. Click here to find out more.

| Files with missing lines | Coverage Δ |
|---|---|
| cpp/mrc/include/mrc/edge/edge_channel.hpp | 86.36% <100.00%> (+1.36%) ⬆️ |
| ...thon/mrc/_pymrc/include/pymrc/asyncio_runnable.hpp | 95.58% <100.00%> (+0.35%) ⬆️ |
| cpp/mrc/include/mrc/node/sink_properties.hpp | 72.72% <0.00%> (-4.70%) ⬇️ |
| cpp/mrc/include/mrc/node/source_properties.hpp | 49.25% <0.00%> (-1.52%) ⬇️ |
| cpp/mrc/include/mrc/edge/edge_readable.hpp | 62.22% <0.00%> (-22.63%) ⬇️ |
| python/mrc/_pymrc/include/pymrc/node.hpp | 40.00% <0.00%> (-21.23%) ⬇️ |

... and 168 files with indirect coverage changes


Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data