[NLP] Catch exceptions thrown during inference and report as errors #2542

Merged: 4 commits into elastic:main on Jun 20, 2023

Conversation

davidkyle (Member) commented:

Inference was previously wrapped in a try/catch block, but that was lost during a refactoring; as a result, exceptions are swallowed by the executing thread and never propagated to the error handler.

4266662#diff-5984f43760a98454db757ddada7f1f9f82a87ffaa4aa1c8a89966f4fc94a8829
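For illustration only, here is a minimal sketch of the shape of the handling that was lost. The names handleRequestSketch, infer, writeResult and writeError are hypothetical stand-ins, not the actual pytorch_inference code; the point is simply that any exception escaping inference must be caught on the executing thread and reported rather than silently dropped.

```cpp
#include <exception>
#include <functional>
#include <string>

// Sketch only: hypothetical names, not the real ml-cpp code.
void handleRequestSketch(const std::string& requestId,
                         const std::function<std::string()>& infer,
                         const std::function<void(const std::string&, const std::string&)>& writeResult,
                         const std::function<void(const std::string&, const std::string&)>& writeError) {
    try {
        // Run inference and report the result on success.
        writeResult(requestId, infer());
    } catch (const std::exception& e) {
        // Report the failure to the error handler instead of letting the
        // executing thread swallow it.
        writeError(requestId, e.what());
    }
}
```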

Since the refactoring, inference is performed from the computeValue callback of the CCompressedLfuCache. As inference may throw, computeValue must handle failed computations. I've changed its signature to return a std::optional; if std::nullopt is returned, the cache does not cache the value.
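A minimal sketch of the changed contract follows; the signature and names are hypothetical stand-ins rather than the literal CCompressedLfuCache callback:

```cpp
#include <exception>
#include <functional>
#include <optional>
#include <string>

// Sketch of the new computeValue contract (hypothetical signature):
// returning std::nullopt signals a failed computation, so the cache
// stores nothing for this key.
std::optional<std::string>
computeValueSketch(const std::function<std::string()>& runInference,
                   const std::function<void(const std::string&)>& reportError) {
    try {
        return runInference();
    } catch (const std::exception& e) {
        reportError(e.what());   // surface the error to the handler
        return std::nullopt;     // failed computation: do not cache
    }
}
```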

The consequence is that readValue is never called if computeValue fails, so users must be aware that calling lookup on the cache may not result in a call to readValue. If this is considered too much of a trap, the alternative is to expose separate read and write methods on the cache and rewrite the calling code to first read the value, run inference if the result is not cached, and then write the inference result back to the cache, as sketched below.
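For comparison, here is a rough sketch of what that alternative could look like. The cache API shown (cacheRead/cacheWrite) is hypothetical and is not what this PR implements.

```cpp
#include <exception>
#include <functional>
#include <optional>
#include <string>

// Hypothetical alternative (not implemented by this PR): the cache exposes
// separate read and write methods, and the caller drives the flow.
std::optional<std::string> lookupOrInferSketch(
    const std::string& key,
    const std::function<std::optional<std::string>(const std::string&)>& cacheRead,
    const std::function<void(const std::string&, const std::string&)>& cacheWrite,
    const std::function<std::string()>& infer) {
    if (auto cached = cacheRead(key)) {
        return cached;                // hit: use the cached result
    }
    try {
        std::string result = infer(); // miss: run inference
        cacheWrite(key, result);      // only successful results are cached
        return result;
    } catch (const std::exception&) {
        return std::nullopt;          // failure: nothing is cached
    }
}
```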

Inference Failures

Evaluating the model with the wrong number of arguments is one example of a recoverable failure.

Example Error Message:

```
Expected at most 3 argument(s) for operator 'forward', but received 5 argument(s).
Declaration: forward(__torch__.transformers.modeling_distilbert.DistilBertForSequenceClassification self, Tensor input_ids, Tensor argument_2) -> ((Tensor))
Exception raised from checkAndNormalizeInputs at /Users/davidkyle/source/pytorch/aten/src/ATen/core/function_schema_inl.h:392 (most recent call first):
frame #0: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__1::basic_string, std::__1::allocator> const&) + 92 (0x1051f68bc in libc10.dylib)
frame #1: void c10::FunctionSchema::checkAndNormalizeInputs(std::__1::vector>&, std::__1::unordered_map, std::__1::allocator>, c10::IValue, std::__1::hash, std::__1::allocator>>, std::__1::equal_to, std::__1::allocator>>, std::__1::allocator, std::__1::allocator> const, c10::IValue>>> const&) const + 464 (0x11218a094 in libtorch_cpu.dylib)
frame #2: torch::jit::Method::operator()(std::__1::vector>, std::__1::unordered_map, std::__1::allocator>, c10::IValue, std::__1::hash, std::__1::allocator>>, std::__1::equal_to, std::__1::allocator>>, std::__1::allocator, std::__1::allocator> const, c10::IValue>>> const&) const + 540 (0x11552b078 in libtorch_cpu.dylib)
frame #3: torch::jit::Module::forward(std::__1::vector>, std::__1::unordered_map, std::__1::allocator>, c10::IValue, std::__1::hash, std::__1::allocator>>, std::__1::equal_to, std::__1::allocator>>, std::__1::allocator, std::__1::allocator> const, c10::IValue>>> const&) + 116 (0x104dd8540 in pytorch_inference)
frame #4: infer(torch::jit::Module&, ml::torch::CCommandParser::SRequest&) + 1268 (0x104dd77a8 in pytorch_inference)
frame #5: std::__1::optional, std::__1::allocator>> handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0::operator()()::'lambda'(auto)::operator()(auto) const + 48 (0x104df1528 in pytorch_inference)
frame #6: decltype(static_cast(fp)(static_cast(fp0))) std::__1::__invoke(auto&&, ml::torch::CCommandParser::SRequest&&) + 68 (0x104df14c0 in pytorch_inference)
frame #7: std::__1::optional, std::__1::allocator>> std::__1::__invoke_void_return_wrapper, std::__1::allocator>>, false>::__call(handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0::operator()()::'lambda'(auto)&, ml::torch::CCommandParser::SRequest&&) + 40 (0x104df144c in pytorch_inference)
frame #8: std::__1::__function::__alloc_func, std::__1::optional, std::__1::allocator>> (ml::torch::CCommandParser::SRequest)>::operator()(ml::torch::CCommandParser::SRequest&&) + 48 (0x104df1418 in pytorch_inference)
frame #9: std::__1::__function::__func, std::__1::optional, std::__1::allocator>> (ml::torch::CCommandParser::SRequest)>::operator()(ml::torch::CCommandParser::SRequest&&) + 44 (0x104df0244 in pytorch_inference)
frame #10: std::__1::__function::__value_func, std::__1::allocator>> (ml::torch::CCommandParser::SRequest)>::operator()(ml::torch::CCommandParser::SRequest&&) const + 88 (0x104e1e1e0 in pytorch_inference)
frame #11: std::__1::function, std::__1::allocator>> (ml::torch::CCommandParser::SRequest)>::operator()(ml::torch::CCommandParser::SRequest) const + 32 (0x104e1e0cc in pytorch_inference)
frame #12: ml::torch::CCommandParser::CRequestCacheStub::lookup(ml::torch::CCommandParser::SRequest, std::__1::function, std::__1::allocator>> (ml::torch::CCommandParser::SRequest)> const&, std::__1::function, std::__1::allocator> const&, bool)> const&) + 68 (0x104e1e00c in pytorch_inference)
frame #13: handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0::operator()() + 212 (0x104defa68 in pytorch_inference)
frame #14: decltype(static_cast(fp)()) std::__1::__invoke(handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0&) + 24 (0x104def988 in pytorch_inference)
frame #15: std::__1::__bind_return, std::__1::tuple<>, __is_valid_bind_return, std::__1::tuple<>>::value>::type std::__1::__apply_functor, std::__1::tuple<>>(handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0&, std::__1::tuple<>&, std::__1::__tuple_indices<>, std::__1::tuple<>&&) + 32 (0x104def964 in pytorch_inference)
frame #16: std::__1::__bind_return, std::__1::tuple<>, __is_valid_bind_return, std::__1::tuple<>>::value>::type std::__1::__bind::operator()<>() + 36 (0x104def938 in pytorch_inference)
frame #17: decltype(static_cast&>(fp)()) std::__1::__invoke&>(std::__1::__bind&) + 24 (0x104def908 in pytorch_inference)
frame #18: void std::__1::__invoke_void_return_wrapper::__call&>(std::__1::__bind&) + 24 (0x104def8e4 in pytorch_inference)
frame #19: std::__1::enable_if, std::__1::tuple<>, __is_valid_bind_return, std::__1::tuple<>>::value>::type, void>::value || std::__1::integral_constant::value, void>::type std::__1::__bind_r::operator()<>() + 24 (0x104def8c0 in pytorch_inference)
frame #20: void ml::core::concurrency_detail::invokeAndWriteResultToPromise, std::__1::shared_ptr>>(std::__1::__bind_r&, std::__1::shared_ptr>&, std::__1::integral_constant const&) + 32 (0x104def808 in pytorch_inference)
frame #21: std::__1::future::type>::type> ml::core::async(ml::core::CExecutor&, handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0&&)::'lambda'()::operator()() + 36 (0x104def7dc in pytorch_inference)
frame #22: decltype(static_cast(fp)()) std::__1::__invoke::type>::type> ml::core::async(ml::core::CExecutor&, handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0&&)::'lambda'()&>(handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0&&) + 24 (0x104def7ac in pytorch_inference)
frame #23: void std::__1::__invoke_void_return_wrapper::__call::type>::type> ml::core::async(ml::core::CExecutor&, handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0&&)::'lambda'()&>(std::__1::future::type>::type> ml::core::async(ml::core::CExecutor&, handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0&&)::'lambda'()&) + 24 (0x104def764 in pytorch_inference)
frame #24: std::__1::__function::__alloc_func::type>::type> ml::core::async(ml::core::CExecutor&, handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0&&)::'lambda'(), std::__1::allocator::type>::type> ml::core::async(ml::core::CExecutor&, handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0&&)::'lambda'()>, void ()>::operator()() + 28 (0x104def740 in pytorch_inference)
frame #25: std::__1::__function::__func::type>::type> ml::core::async(ml::core::CExecutor&, handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0&&)::'lambda'(), std::__1::allocator::type>::type> ml::core::async(ml::core::CExecutor&, handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0&&)::'lambda'()>, void ()>::operator()() + 28 (0x104ded710 in pytorch_inference)
frame #26: std::__1::__function::__value_func::operator()() const + 68 (0x104e30ab0 in pytorch_inference)
frame #27: std::__1::function::operator()() const + 24 (0x104e30780 in pytorch_inference)
frame #28: ml::core::CStaticThreadPool::CWrappedTask::operator()() + 52 (0x1073fb9d0 in libMlCore.dylib)
frame #29: ml::core::CStaticThreadPool::worker(unsigned long) + 448 (0x1073fb510 in libMlCore.dylib)
frame #30: ml::core::CStaticThreadPool::CStaticThreadPool(unsigned long, unsigned long)::$_0::operator()() const + 32 (0x10740184c in libMlCore.dylib)
frame #31: decltype(static_cast(fp)()) std::__1::__invoke(ml::core::CStaticThreadPool::CStaticThreadPool(unsigned long, unsigned long)::$_0&&) + 24 (0x1074017f8 in libMlCore.dylib)
frame #32: void std::__1::__thread_execute>, ml::core::CStaticThreadPool::CStaticThreadPool(unsigned long, unsigned long)::$_0>(std::__1::tuple>, ml::core::CStaticThreadPool::CStaticThreadPool(unsigned long, unsigned long)::$_0>&, std::__1::__tuple_indices<>) + 28 (0x107401794 in libMlCore.dylib)
frame #33: void* std::__1::__thread_proxy>, ml::core::CStaticThreadPool::CStaticThreadPool(unsigned long, unsigned long)::$_0>>(void*) + 84 (0x107401124 in libMlCore.dylib)
frame #34: _pthread_start + 148 (0x187f3bfa8 in libsystem_pthread.dylib)
frame #35: thread_start + 8 (0x187f36da0 in libsystem_pthread.dylib)
```
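For context, a minimal standalone sketch of how such a failure surfaces through the libtorch C++ API and can be recovered from: c10::Error derives from std::exception, so the caller can catch it, report the message, and keep serving other requests instead of crashing the process. The model path and tensor shapes below are illustrative only.

```cpp
#include <torch/script.h>

#include <exception>
#include <iostream>
#include <string>
#include <vector>

int main() {
    try {
        // Illustrative path: load a TorchScript model.
        torch::jit::Module module = torch::jit::load("model.pt");

        // Deliberately pass more inputs than the model's forward() accepts.
        std::vector<torch::jit::IValue> inputs(5, torch::ones({1, 8}));
        torch::jit::IValue output = module.forward(inputs);
        std::cout << "inference succeeded" << std::endl;
    } catch (const c10::Error& e) {
        // Recoverable: report the schema-mismatch message and carry on.
        std::cerr << "inference failed: " << e.what() << std::endl;
    } catch (const std::exception& e) {
        std::cerr << "inference failed: " << e.what() << std::endl;
    }
    return 0;
}
```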

Runtime Exceptions

TODO: runtime exceptions may not be recoverable and should probably fail the inference process. Investigate as a follow-up.
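One possible shape for that follow-up, sketched purely as an illustration; the split between recoverable and fatal errors is an assumption of this sketch, not something this PR decides.

```cpp
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical follow-up shape (not what this PR does): report failures that
// look recoverable, but let anything else propagate so the inference process
// fails fast instead of carrying on with a possibly broken model.
void runOneInferenceSketch(const std::function<std::string()>& infer,
                           const std::function<void(const std::string&)>& reportError) {
    try {
        infer();
    } catch (const std::invalid_argument& e) {
        // e.g. an argument/schema mismatch: report it and keep serving.
        reportError(e.what());
    } catch (...) {
        // Unknown runtime failure: rethrow so the process terminates and the
        // problem is surfaced immediately.
        throw;
    }
}
```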

@droberts195 (Contributor) left a comment:


LGTM

Review comment on bin/pytorch_inference/Main.cc (outdated, resolved)
@davidkyle merged commit 4b769f8 into elastic:main on Jun 20, 2023