Properly short circuit core worker Get() on exception #5672
Test PASSed.
Test FAILed.
LGTM, thanks. I left a minor comment.
@@ -30,7 +30,8 @@ Status CoreWorkerMemoryStoreProvider::Put(const RayObject &object,
 Status CoreWorkerMemoryStoreProvider::Get(
     const std::unordered_set<ObjectID> &object_ids, int64_t timeout_ms,
     const TaskID &task_id,
-    std::unordered_map<ObjectID, std::shared_ptr<RayObject>> *results) {
+    std::unordered_map<ObjectID, std::shared_ptr<RayObject>> *results,
+    bool *got_exception) {
`got_exception` can be removed since it can be determined from the result map when there's an exception?
It could, but that would require iterating over the full list of results even in success cases, which I would want to avoid, especially because the `IsException` check loops over a list of types itself.
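The trade-off discussed here can be sketched as follows. This is a minimal illustration, not the PR's code: `FakeObject`, `ResultMap`, `AnyException`, and `CollectResults` are hypothetical names, and the `is_exception` field stands in for whatever the real `IsException` check inspects.

```cpp
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for RayObject: it only carries an error marker.
struct FakeObject {
  bool is_exception;
};

using ResultMap = std::unordered_map<std::string, std::shared_ptr<FakeObject>>;

// Option A: derive the answer afterwards by scanning the result map.
// In the success case this visits every entry before returning false.
bool AnyException(const ResultMap &results) {
  for (const auto &entry : results) {
    if (entry.second->is_exception) return true;
  }
  return false;
}

// Option B: record the flag while the results are being inserted, so the
// caller never needs a second pass over the map.
void CollectResults(const std::vector<std::pair<std::string, bool>> &fetched,
                    ResultMap *results, bool *got_exception) {
  *got_exception = false;
  for (const auto &item : fetched) {
    auto object = std::make_shared<FakeObject>(FakeObject{item.second});
    if (object->is_exception) *got_exception = true;
    (*results)[item.first] = object;
  }
}
```

Both options give the same answer; Option B just folds the check into a loop that already runs once per fetched object.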
std::string error_string = std::to_string(ray::rpc::TASK_EXECUTION_EXCEPTION);
char error_buffer[error_string.size()];
size_t len = error_string.copy(error_buffer, error_string.size(), 0);
buffers_with_exception.emplace_back(
isn't the above just a very intricate way of doing `error_string.data()` below?
Unfortunately not, we need a non-const reference to the data. We could also do `const_cast` if you'd prefer.
ah no, let's not do that. Probably a const version of the constructor would be in order
Line 51 in d8f5804
LocalMemoryBuffer(uint8_t *data, size_t size, bool copy_data = false)
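One way the suggested const overload could look is sketched below. `SketchBuffer` and `MakeErrorBuffer` are hypothetical names used only for illustration; this is not Ray's `LocalMemoryBuffer`, and the assumption that a const overload should always copy is mine, chosen so the buffer never aliases memory it is not allowed to write to.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// Hypothetical buffer type used only to illustrate the overload.
class SketchBuffer {
 public:
  // Mutable pointer: either wrap the caller's memory or copy it.
  SketchBuffer(uint8_t *data, size_t size, bool copy_data = false)
      : size_(size) {
    if (copy_data) {
      owned_.assign(data, data + size);
      data_ = owned_.data();
    } else {
      data_ = data;
    }
  }

  // Const pointer: always copy, so no const_cast is needed at call sites.
  SketchBuffer(const uint8_t *data, size_t size)
      : owned_(data, data + size), data_(owned_.data()), size_(size) {}

  // data_ may point into owned_, so copying the object would dangle.
  SketchBuffer(const SketchBuffer &) = delete;
  SketchBuffer &operator=(const SketchBuffer &) = delete;

  uint8_t *Data() const { return data_; }
  size_t Size() const { return size_; }

 private:
  std::vector<uint8_t> owned_;  // Backing storage when the data was copied.
  uint8_t *data_ = nullptr;
  size_t size_ = 0;
};

// With the const overload, the error string can be passed directly.
std::shared_ptr<SketchBuffer> MakeErrorBuffer(const std::string &error_string) {
  return std::make_shared<SketchBuffer>(
      reinterpret_cast<const uint8_t *>(error_string.data()),
      error_string.size());
}
```

Because the const overload owns a copy, the buffer stays valid after the source string goes out of scope, which a stack-allocated scratch array would not guarantee.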
const TaskID &task_id,
std::unordered_map<ObjectID, std::shared_ptr<RayObject>> *results) = 0;
/// \param[out] got_exception Set to true if any of the fetched results were an
/// exception. \return Status.
newline before \return Status.
Test FAILed.
Why are these changes needed?
We should be short circuiting calls to `Get()` in the core worker when we see an exception, but this wasn't propagated out of the plasma store provider (this was causing some Java test errors).
Checks
`scripts/format.sh` to lint the changes in this PR.
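The short-circuit behavior described under "Why are these changes needed?" can be sketched roughly as below. This assumes the core worker consults several store providers in turn, which the description implies but does not spell out; `Provider`, `Results`, and `GetWithShortCircuit` are hypothetical names, and object IDs and values are plain strings for brevity.

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

using Results = std::unordered_map<std::string, std::string>;

// Hypothetical provider: fills in results for the IDs it can serve and
// reports through got_exception whether any of them was an exception.
using Provider = std::function<void(const std::unordered_set<std::string> &ids,
                                    Results *results, bool *got_exception)>;

// Queries providers in order and stops as soon as one reports an exception.
// Returns how many providers were consulted.
int GetWithShortCircuit(const std::vector<Provider> &providers,
                        const std::unordered_set<std::string> &ids,
                        Results *results) {
  int consulted = 0;
  for (const auto &provider : providers) {
    bool got_exception = false;
    provider(ids, results, &got_exception);
    ++consulted;
    // Once an exception is seen, fetching the remaining objects is wasted
    // work, so return early.
    if (got_exception) break;
  }
  return consulted;
}
```

If a provider never sets the flag, the loop cannot stop early, which is the failure mode the description attributes to the plasma store provider.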