[NLP] Catch exceptions thrown during inference and report as errors #2542
Inference was previously wrapped in a try/catch statement, but that was lost during a refactoring (4266662#diff-5984f43760a98454db757ddada7f1f9f82a87ffaa4aa1c8a89966f4fc94a8829), which means errors thrown during inference are swallowed by the executing thread and never propagated to the error handler.
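To illustrate the failure mode, here is a minimal, generic sketch (not the actual pytorch_inference code): when a request runs on a worker thread and nothing rethrows or inspects the stored exception, the error simply disappears.

```cpp
// Minimal illustration of how an exception thrown on a worker thread is
// silently lost when nothing rethrows or reports it. This is a generic
// std::async example, not the actual pytorch_inference task scheduling.
#include <future>
#include <iostream>
#include <stdexcept>

int main() {
    std::future<void> task = std::async(std::launch::async, [] {
        throw std::runtime_error("inference failed"); // stored in the future...
    });
    task.wait(); // ...but never rethrown via get() or reported, so it is swallowed
    std::cout << "request 'completed' with no error written to the results stream\n";
    return 0;
}
```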
Since the refactoring, inference is performed from the computeValue callback of CCompressedLfuCache. As inference may throw, computeValue must handle failed computations. I've changed its signature to return a std::optional; if nullopt is returned, the cache does not cache the value. The consequence of this is that the readValue function is never called when computeValue fails, so users must be aware that calling lookup on the cache may result in readValue not being invoked (see the sketch below). If this is considered a trap-y pattern, the alternative is to expose read and write methods on the cache and rewrite the code to first read the value, call infer if the result is not cached, then cache and write the inference result.
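To make the new contract concrete, here is a minimal, self-contained sketch. The SketchCache class, callback signatures and error reporting are illustrative assumptions, not the real CCompressedLfuCache interface; it only demonstrates that a computeValue callback which catches a failure and returns std::nullopt causes lookup to neither cache a value nor call readValue.

```cpp
// Sketch of the changed contract, NOT the real CCompressedLfuCache API.
// computeValue now returns std::optional; returning std::nullopt signals a
// failed computation, so nothing is cached and readValue is never called.
#include <functional>
#include <iostream>
#include <optional>
#include <stdexcept>
#include <string>
#include <unordered_map>

class SketchCache {
public:
    using TComputeValue = std::function<std::optional<std::string>(const std::string&)>;
    using TReadValue = std::function<void(const std::string&, bool /*wasCached*/)>;

    // Returns false if computeValue failed, in which case readValue is never invoked.
    bool lookup(const std::string& key,
                const TComputeValue& computeValue,
                const TReadValue& readValue) {
        auto hit = m_Store.find(key);
        if (hit != m_Store.end()) {
            readValue(hit->second, true);
            return true;
        }
        std::optional<std::string> value = computeValue(key);
        if (!value.has_value()) {
            return false; // failed computation: do not cache, do not call readValue
        }
        m_Store.emplace(key, *value);
        readValue(*value, false);
        return true;
    }

private:
    std::unordered_map<std::string, std::string> m_Store;
};

int main() {
    SketchCache cache;
    auto computeValue = [](const std::string& request) -> std::optional<std::string> {
        try {
            if (request == "bad") {
                throw std::runtime_error("wrong number of arguments for 'forward'");
            }
            return "result for " + request;
        } catch (const std::exception& e) {
            std::cerr << "inference failed: " << e.what() << '\n'; // report as error
            return std::nullopt;                                   // nothing is cached
        }
    };
    auto readValue = [](const std::string& result, bool cached) {
        std::cout << result << (cached ? " (cached)" : "") << '\n';
    };
    cache.lookup("good", computeValue, readValue);
    cache.lookup("bad", computeValue, readValue); // readValue is not called here
    return 0;
}
```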
Inference Failures
Evaluating the model with the wrong number of arguments is one example of a recoverable failure.
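In the spirit of this change, such a failure can be caught around the forward call and converted into an error report instead of escaping the worker thread. A rough sketch follows; the reportError helper and the exact wrapping are illustrative assumptions, not the actual pytorch_inference code.

```cpp
#include <torch/script.h> // torch::jit::Module, c10::IValue, c10::Error
#include <iostream>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Illustrative stand-in for the real result writer: just logs the error.
void reportError(const std::string& message) {
    std::cerr << "{\"error\": \"" << message << "\"}\n";
}

// Run the model and convert any exception into an error report, returning
// std::nullopt so the caller (e.g. the cache's computeValue) knows it failed.
std::optional<c10::IValue> inferOrReportError(torch::jit::Module& module,
                                              std::vector<c10::IValue> inputs) {
    try {
        return module.forward(std::move(inputs));
    } catch (const c10::Error& e) {
        // e.g. "Expected at most 3 argument(s) for operator 'forward', but received 5"
        reportError(e.what_without_backtrace());
        return std::nullopt;
    } catch (const std::exception& e) {
        reportError(e.what());
        return std::nullopt;
    }
}
```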
Example Error Message
Expected at most 3 argument(s) for operator 'forward', but received 5 argument(s).
Declaration: forward(__torch__.transformers.modeling_distilbert.DistilBertForSequenceClassification self, Tensor input_ids, Tensor argument_2) -> ((Tensor))
Exception raised from checkAndNormalizeInputs at /Users/davidkyle/source/pytorch/aten/src/ATen/core/function_schema_inl.h:392 (most recent call first):
frame #0: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__1::basic_string, std::__1::allocator> const&) + 92 (0x1051f68bc in libc10.dylib)
frame #1: void c10::FunctionSchema::checkAndNormalizeInputs(std::__1::vector>&, std::__1::unordered_map, std::__1::allocator>, c10::IValue, std::__1::hash, std::__1::allocator>>, std::__1::equal_to, std::__1::allocator>>, std::__1::allocator, std::__1::allocator> const, c10::IValue>>> const&) const + 464 (0x11218a094 in libtorch_cpu.dylib)
frame #2: torch::jit::Method::operator()(std::__1::vector>, std::__1::unordered_map, std::__1::allocator>, c10::IValue, std::__1::hash, std::__1::allocator>>, std::__1::equal_to, std::__1::allocator>>, std::__1::allocator, std::__1::allocator> const, c10::IValue>>> const&) const + 540 (0x11552b078 in libtorch_cpu.dylib)
frame #3: torch::jit::Module::forward(std::__1::vector>, std::__1::unordered_map, std::__1::allocator>, c10::IValue, std::__1::hash, std::__1::allocator>>, std::__1::equal_to, std::__1::allocator>>, std::__1::allocator, std::__1::allocator> const, c10::IValue>>> const&) + 116 (0x104dd8540 in pytorch_inference)
frame #4: infer(torch::jit::Module&, ml::torch::CCommandParser::SRequest&) + 1268 (0x104dd77a8 in pytorch_inference)
frame #5: std::__1::optional, std::__1::allocator>> handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0::operator()()::'lambda'(auto)::operator()(auto) const + 48 (0x104df1528 in pytorch_inference)
frame #6: decltype(static_cast(fp)(static_cast(fp0))) std::__1::__invoke(auto&&, ml::torch::CCommandParser::SRequest&&) + 68 (0x104df14c0 in pytorch_inference)
frame #7: std::__1::optional, std::__1::allocator>> std::__1::__invoke_void_return_wrapper, std::__1::allocator>>, false>::__call(handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0::operator()()::'lambda'(auto)&, ml::torch::CCommandParser::SRequest&&) + 40 (0x104df144c in pytorch_inference)
frame #8: std::__1::__function::__alloc_func, std::__1::optional, std::__1::allocator>> (ml::torch::CCommandParser::SRequest)>::operator()(ml::torch::CCommandParser::SRequest&&) + 48 (0x104df1418 in pytorch_inference)
frame #9: std::__1::__function::__func, std::__1::optional, std::__1::allocator>> (ml::torch::CCommandParser::SRequest)>::operator()(ml::torch::CCommandParser::SRequest&&) + 44 (0x104df0244 in pytorch_inference)
frame #10: std::__1::__function::__value_func, std::__1::allocator>> (ml::torch::CCommandParser::SRequest)>::operator()(ml::torch::CCommandParser::SRequest&&) const + 88 (0x104e1e1e0 in pytorch_inference)
frame #11: std::__1::function, std::__1::allocator>> (ml::torch::CCommandParser::SRequest)>::operator()(ml::torch::CCommandParser::SRequest) const + 32 (0x104e1e0cc in pytorch_inference)
frame #12: ml::torch::CCommandParser::CRequestCacheStub::lookup(ml::torch::CCommandParser::SRequest, std::__1::function, std::__1::allocator>> (ml::torch::CCommandParser::SRequest)> const&, std::__1::function, std::__1::allocator> const&, bool)> const&) + 68 (0x104e1e00c in pytorch_inference)
frame #13: handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0::operator()() + 212 (0x104defa68 in pytorch_inference)
frame #14: decltype(static_cast(fp)()) std::__1::__invoke(handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0&) + 24 (0x104def988 in pytorch_inference)
frame #15: std::__1::__bind_return, std::__1::tuple<>, __is_valid_bind_return, std::__1::tuple<>>::value>::type std::__1::__apply_functor, std::__1::tuple<>>(handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0&, std::__1::tuple<>&, std::__1::__tuple_indices<>, std::__1::tuple<>&&) + 32 (0x104def964 in pytorch_inference)
frame #16: std::__1::__bind_return, std::__1::tuple<>, __is_valid_bind_return, std::__1::tuple<>>::value>::type std::__1::__bind::operator()<>() + 36 (0x104def938 in pytorch_inference)
frame #17: decltype(static_cast&>(fp)()) std::__1::__invoke&>(std::__1::__bind&) + 24 (0x104def908 in pytorch_inference)
frame #18: void std::__1::__invoke_void_return_wrapper::__call&>(std::__1::__bind&) + 24 (0x104def8e4 in pytorch_inference)
frame #19: std::__1::enable_if, std::__1::tuple<>, __is_valid_bind_return, std::__1::tuple<>>::value>::type, void>::value || std::__1::integral_constant::value, void>::type std::__1::__bind_r::operator()<>() + 24 (0x104def8c0 in pytorch_inference)
frame #20: void ml::core::concurrency_detail::invokeAndWriteResultToPromise, std::__1::shared_ptr>>(std::__1::__bind_r&, std::__1::shared_ptr>&, std::__1::integral_constant const&) + 32 (0x104def808 in pytorch_inference)
frame #21: std::__1::future::type>::type> ml::core::async(ml::core::CExecutor&, handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0&&)::'lambda'()::operator()() + 36 (0x104def7dc in pytorch_inference)
frame #22: decltype(static_cast(fp)()) std::__1::__invoke::type>::type> ml::core::async(ml::core::CExecutor&, handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0&&)::'lambda'()&>(handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0&&) + 24 (0x104def7ac in pytorch_inference)
frame #23: void std::__1::__invoke_void_return_wrapper::__call::type>::type> ml::core::async(ml::core::CExecutor&, handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0&&)::'lambda'()&>(std::__1::future::type>::type> ml::core::async(ml::core::CExecutor&, handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0&&)::'lambda'()&) + 24 (0x104def764 in pytorch_inference)
frame #24: std::__1::__function::__alloc_func::type>::type> ml::core::async(ml::core::CExecutor&, handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0&&)::'lambda'(), std::__1::allocator::type>::type> ml::core::async(ml::core::CExecutor&, handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0&&)::'lambda'()>, void ()>::operator()() + 28 (0x104def740 in pytorch_inference)
frame #25: std::__1::__function::__func::type>::type> ml::core::async(ml::core::CExecutor&, handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0&&)::'lambda'(), std::__1::allocator::type>::type> ml::core::async(ml::core::CExecutor&, handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0&&)::'lambda'()>, void ()>::operator()() + 28 (0x104ded710 in pytorch_inference)
frame #26: std::__1::__function::__value_func::operator()() const + 68 (0x104e30ab0 in pytorch_inference)
frame #27: std::__1::function::operator()() const + 24 (0x104e30780 in pytorch_inference)
frame #28: ml::core::CStaticThreadPool::CWrappedTask::operator()() + 52 (0x1073fb9d0 in libMlCore.dylib)
frame #29: ml::core::CStaticThreadPool::worker(unsigned long) + 448 (0x1073fb510 in libMlCore.dylib)
frame #30: ml::core::CStaticThreadPool::CStaticThreadPool(unsigned long, unsigned long)::$_0::operator()() const + 32 (0x10740184c in libMlCore.dylib)
frame #31: decltype(static_cast(fp)()) std::__1::__invoke(ml::core::CStaticThreadPool::CStaticThreadPool(unsigned long, unsigned long)::$_0&&) + 24 (0x1074017f8 in libMlCore.dylib)
frame #32: void std::__1::__thread_execute>, ml::core::CStaticThreadPool::CStaticThreadPool(unsigned long, unsigned long)::$_0>(std::__1::tuple>, ml::core::CStaticThreadPool::CStaticThreadPool(unsigned long, unsigned long)::$_0>&, std::__1::__tuple_indices<>) + 28 (0x107401794 in libMlCore.dylib)
frame #33: void* std::__1::__thread_proxy>, ml::core::CStaticThreadPool::CStaticThreadPool(unsigned long, unsigned long)::$_0>>(void*) + 84 (0x107401124 in libMlCore.dylib)
frame #34: _pthread_start + 148 (0x187f3bfa8 in libsystem_pthread.dylib)
frame #35: thread_start + 8 (0x187f36da0 in libsystem_pthread.dylib)

Runtime Exceptions
TODO: runtime exceptions may not be recoverable and should fail the inference process. Investigate as a follow-up.