[NLP] Catch exceptions thrown during inference and report as errors #2542
Inference was previously wrapped in a try/catch statement, but that was lost during a refactoring (4266662#diff-5984f43760a98454db757ddada7f1f9f82a87ffaa4aa1c8a89966f4fc94a8829), which means errors thrown during inference are swallowed by the executing thread and never propagated to the error handler.
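To illustrate the failure mode, here is a minimal, generic sketch (not the actual pytorch_inference code): when a request runs on a worker thread and nothing rethrows or inspects the stored exception, the error simply disappears.

```cpp
// Minimal illustration of how an exception thrown on a worker thread is
// silently lost when nothing rethrows or reports it. This is a generic
// std::async example, not the actual pytorch_inference task scheduling.
#include <future>
#include <iostream>
#include <stdexcept>

int main() {
    std::future<void> task = std::async(std::launch::async, [] {
        throw std::runtime_error("inference failed"); // stored in the future...
    });
    task.wait(); // ...but never rethrown via get() or reported, so it is swallowed
    std::cout << "request 'completed' with no error written to the results stream\n";
    return 0;
}
```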
Since the refactoring, inference is performed from the computeValue callback of CCompressedLfuCache. As inference may throw, computeValue must handle failed computations. I've changed its signature to return a std::optional; if nullopt is returned, the cache does not cache the value. The consequence of this is that the readValue function is never called when computeValue fails, so users must be aware that calling lookup on the cache may result in readValue not being invoked (see the sketch below). If this is considered a trap-y pattern, the alternative is to expose read and write methods on the cache and rewrite the code to first read the value, call infer if the result is not cached, then cache and write the inference result.
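To make the new contract concrete, here is a minimal, self-contained sketch. The SketchCache class, callback signatures and error reporting are illustrative assumptions, not the real CCompressedLfuCache interface; it only demonstrates that a computeValue callback which catches a failure and returns std::nullopt causes lookup to neither cache a value nor call readValue.

```cpp
// Sketch of the changed contract, NOT the real CCompressedLfuCache API.
// computeValue now returns std::optional; returning std::nullopt signals a
// failed computation, so nothing is cached and readValue is never called.
#include <functional>
#include <iostream>
#include <optional>
#include <stdexcept>
#include <string>
#include <unordered_map>

class SketchCache {
public:
    using TComputeValue = std::function<std::optional<std::string>(const std::string&)>;
    using TReadValue = std::function<void(const std::string&, bool /*wasCached*/)>;

    // Returns false if computeValue failed, in which case readValue is never invoked.
    bool lookup(const std::string& key,
                const TComputeValue& computeValue,
                const TReadValue& readValue) {
        auto hit = m_Store.find(key);
        if (hit != m_Store.end()) {
            readValue(hit->second, true);
            return true;
        }
        std::optional<std::string> value = computeValue(key);
        if (!value.has_value()) {
            return false; // failed computation: do not cache, do not call readValue
        }
        m_Store.emplace(key, *value);
        readValue(*value, false);
        return true;
    }

private:
    std::unordered_map<std::string, std::string> m_Store;
};

int main() {
    SketchCache cache;
    auto computeValue = [](const std::string& request) -> std::optional<std::string> {
        try {
            if (request == "bad") {
                throw std::runtime_error("wrong number of arguments for 'forward'");
            }
            return "result for " + request;
        } catch (const std::exception& e) {
            std::cerr << "inference failed: " << e.what() << '\n'; // report as error
            return std::nullopt;                                   // nothing is cached
        }
    };
    auto readValue = [](const std::string& result, bool cached) {
        std::cout << result << (cached ? " (cached)" : "") << '\n';
    };
    cache.lookup("good", computeValue, readValue);
    cache.lookup("bad", computeValue, readValue); // readValue is not called here
    return 0;
}
```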
Inference Failures
Evaluating the model with the wrong number of arguments is one example of a recoverable failure.
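In the spirit of this change, such a failure can be caught around the forward call and converted into an error report instead of escaping the worker thread. A rough sketch follows; the reportError helper and the exact wrapping are illustrative assumptions, not the actual pytorch_inference code.

```cpp
#include <torch/script.h> // torch::jit::Module, c10::IValue, c10::Error
#include <iostream>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Illustrative stand-in for the real result writer: just logs the error.
void reportError(const std::string& message) {
    std::cerr << "{\"error\": \"" << message << "\"}\n";
}

// Run the model and convert any exception into an error report, returning
// std::nullopt so the caller (e.g. the cache's computeValue) knows it failed.
std::optional<c10::IValue> inferOrReportError(torch::jit::Module& module,
                                              std::vector<c10::IValue> inputs) {
    try {
        return module.forward(std::move(inputs));
    } catch (const c10::Error& e) {
        // e.g. "Expected at most 3 argument(s) for operator 'forward', but received 5"
        reportError(e.what_without_backtrace());
        return std::nullopt;
    } catch (const std::exception& e) {
        reportError(e.what());
        return std::nullopt;
    }
}
```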
Example Error Message
Expected at most 3 argument(s) for operator 'forward', but received 5 argument(s).
Declaration: forward(__torch__.transformers.modeling_distilbert.DistilBertForSequenceClassification self, Tensor input_ids, Tensor argument_2) -> ((Tensor))
Exception raised from checkAndNormalizeInputs at /Users/davidkyle/source/pytorch/aten/src/ATen/core/function_schema_inl.h:392 (most recent call first):
frame #0: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__1::basic_string, std::__1::allocator> const&) + 92 (0x1051f68bc in libc10.dylib)
frame #1: void c10::FunctionSchema::checkAndNormalizeInputs(std::__1::vector>&, std::__1::unordered_map, std::__1::allocator>, c10::IValue, std::__1::hash, std::__1::allocator>>, std::__1::equal_to, std::__1::allocator>>, std::__1::allocator, std::__1::allocator> const, c10::IValue>>> const&) const + 464 (0x11218a094 in libtorch_cpu.dylib)
frame #2: torch::jit::Method::operator()(std::__1::vector>, std::__1::unordered_map, std::__1::allocator>, c10::IValue, std::__1::hash, std::__1::allocator>>, std::__1::equal_to, std::__1::allocator>>, std::__1::allocator, std::__1::allocator> const, c10::IValue>>> const&) const + 540 (0x11552b078 in libtorch_cpu.dylib)
frame #3: torch::jit::Module::forward(std::__1::vector>, std::__1::unordered_map, std::__1::allocator>, c10::IValue, std::__1::hash, std::__1::allocator>>, std::__1::equal_to, std::__1::allocator>>, std::__1::allocator, std::__1::allocator> const, c10::IValue>>> const&) + 116 (0x104dd8540 in pytorch_inference)
frame #4: infer(torch::jit::Module&, ml::torch::CCommandParser::SRequest&) + 1268 (0x104dd77a8 in pytorch_inference)
frame #5: std::__1::optional, std::__1::allocator>> handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0::operator()()::'lambda'(auto)::operator()(auto) const + 48 (0x104df1528 in pytorch_inference)
frame #6: decltype(static_cast(fp)(static_cast(fp0))) std::__1::__invoke(auto&&, ml::torch::CCommandParser::SRequest&&) + 68 (0x104df14c0 in pytorch_inference)
frame #7: std::__1::optional, std::__1::allocator>> std::__1::__invoke_void_return_wrapper, std::__1::allocator>>, false>::__call(handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0::operator()()::'lambda'(auto)&, ml::torch::CCommandParser::SRequest&&) + 40 (0x104df144c in pytorch_inference)
frame #8: std::__1::__function::__alloc_func, std::__1::optional, std::__1::allocator>> (ml::torch::CCommandParser::SRequest)>::operator()(ml::torch::CCommandParser::SRequest&&) + 48 (0x104df1418 in pytorch_inference)
frame #9: std::__1::__function::__func, std::__1::optional, std::__1::allocator>> (ml::torch::CCommandParser::SRequest)>::operator()(ml::torch::CCommandParser::SRequest&&) + 44 (0x104df0244 in pytorch_inference)
frame #10: std::__1::__function::__value_func, std::__1::allocator>> (ml::torch::CCommandParser::SRequest)>::operator()(ml::torch::CCommandParser::SRequest&&) const + 88 (0x104e1e1e0 in pytorch_inference)
frame #11: std::__1::function, std::__1::allocator>> (ml::torch::CCommandParser::SRequest)>::operator()(ml::torch::CCommandParser::SRequest) const + 32 (0x104e1e0cc in pytorch_inference)
frame #12: ml::torch::CCommandParser::CRequestCacheStub::lookup(ml::torch::CCommandParser::SRequest, std::__1::function, std::__1::allocator>> (ml::torch::CCommandParser::SRequest)> const&, std::__1::function, std::__1::allocator> const&, bool)> const&) + 68 (0x104e1e00c in pytorch_inference)
frame #13: handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0::operator()() + 212 (0x104defa68 in pytorch_inference)
frame #14: decltype(static_cast(fp)()) std::__1::__invoke(handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0&) + 24 (0x104def988 in pytorch_inference)
frame #15: std::__1::__bind_return, std::__1::tuple<>, __is_valid_bind_return, std::__1::tuple<>>::value>::type std::__1::__apply_functor, std::__1::tuple<>>(handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0&, std::__1::tuple<>&, std::__1::__tuple_indices<>, std::__1::tuple<>&&) + 32 (0x104def964 in pytorch_inference)
frame #16: std::__1::__bind_return, std::__1::tuple<>, __is_valid_bind_return, std::__1::tuple<>>::value>::type std::__1::__bind::operator()<>() + 36 (0x104def938 in pytorch_inference)
frame #17: decltype(static_cast&>(fp)()) std::__1::__invoke&>(std::__1::__bind&) + 24 (0x104def908 in pytorch_inference)
frame #18: void std::__1::__invoke_void_return_wrapper::__call&>(std::__1::__bind&) + 24 (0x104def8e4 in pytorch_inference)
frame #19: std::__1::enable_if, std::__1::tuple<>, __is_valid_bind_return, std::__1::tuple<>>::value>::type, void>::value || std::__1::integral_constant::value, void>::type std::__1::__bind_r::operator()<>() + 24 (0x104def8c0 in pytorch_inference)
frame #20: void ml::core::concurrency_detail::invokeAndWriteResultToPromise, std::__1::shared_ptr>>(std::__1::__bind_r&, std::__1::shared_ptr>&, std::__1::integral_constant const&) + 32 (0x104def808 in pytorch_inference)
frame #21: std::__1::future::type>::type> ml::core::async(ml::core::CExecutor&, handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0&&)::'lambda'()::operator()() + 36 (0x104def7dc in pytorch_inference)
frame #22: decltype(static_cast(fp)()) std::__1::__invoke::type>::type> ml::core::async(ml::core::CExecutor&, handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0&&)::'lambda'()&>(handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0&&) + 24 (0x104def7ac in pytorch_inference)
frame #23: void std::__1::__invoke_void_return_wrapper::__call::type>::type> ml::core::async(ml::core::CExecutor&, handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0&&)::'lambda'()&>(std::__1::future::type>::type> ml::core::async(ml::core::CExecutor&, handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0&&)::'lambda'()&) + 24 (0x104def764 in pytorch_inference)
frame #24: std::__1::__function::__alloc_func::type>::type> ml::core::async(ml::core::CExecutor&, handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0&&)::'lambda'(), std::__1::allocator::type>::type> ml::core::async(ml::core::CExecutor&, handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0&&)::'lambda'()>, void ()>::operator()() + 28 (0x104def740 in pytorch_inference)
frame #25: std::__1::__function::__func::type>::type> ml::core::async(ml::core::CExecutor&, handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0&&)::'lambda'(), std::__1::allocator::type>::type> ml::core::async(ml::core::CExecutor&, handleRequest(ml::torch::CCommandParser::CRequestCacheInterface&, ml::torch::CCommandParser::SRequest, torch::jit::Module&, ml::torch::CResultWriter&)::$_0&&)::'lambda'()>, void ()>::operator()() + 28 (0x104ded710 in pytorch_inference)
frame #26: std::__1::__function::__value_func::operator()() const + 68 (0x104e30ab0 in pytorch_inference)
frame #27: std::__1::function::operator()() const + 24 (0x104e30780 in pytorch_inference)
frame #28: ml::core::CStaticThreadPool::CWrappedTask::operator()() + 52 (0x1073fb9d0 in libMlCore.dylib)
frame #29: ml::core::CStaticThreadPool::worker(unsigned long) + 448 (0x1073fb510 in libMlCore.dylib)
frame #30: ml::core::CStaticThreadPool::CStaticThreadPool(unsigned long, unsigned long)::$_0::operator()() const + 32 (0x10740184c in libMlCore.dylib)
frame #31: decltype(static_cast(fp)()) std::__1::__invoke(ml::core::CStaticThreadPool::CStaticThreadPool(unsigned long, unsigned long)::$_0&&) + 24 (0x1074017f8 in libMlCore.dylib)
frame #32: void std::__1::__thread_execute>, ml::core::CStaticThreadPool::CStaticThreadPool(unsigned long, unsigned long)::$_0>(std::__1::tuple>, ml::core::CStaticThreadPool::CStaticThreadPool(unsigned long, unsigned long)::$_0>&, std::__1::__tuple_indices<>) + 28 (0x107401794 in libMlCore.dylib)
frame #33: void* std::__1::__thread_proxy>, ml::core::CStaticThreadPool::CStaticThreadPool(unsigned long, unsigned long)::$_0>>(void*) + 84 (0x107401124 in libMlCore.dylib)
frame #34: _pthread_start + 148 (0x187f3bfa8 in libsystem_pthread.dylib)
frame #35: thread_start + 8 (0x187f36da0 in libsystem_pthread.dylib)

Runtime Exceptions
TODO: runtime exceptions may not be recoverable and should fail the inference process. Investigate as a follow-up.