
onnxruntime c++ logging crash #22013

Closed
sai-fpp opened this issue Sep 6, 2024 · 8 comments
Labels
platform:mobile issues related to ONNX Runtime mobile; typically submitted using template

Comments


sai-fpp commented Sep 6, 2024

Describe the issue

Dear Sir or Madam:
When running onnxruntime on an arm64 target machine, after changing the log level of `Ort::Env env` from ORT_LOGGING_LEVEL_WARNING to ORT_LOGGING_LEVEL_VERBOSE (C++, onnxruntime 1.19.0), the program crashes. The crash log is as follows:


Build fingerprint: 'HONGQI/se1000_e007_slave_car/se1000_e007_slave:11/22.10.02.241422.243212/243212:userdebug/test-keys'
Revision: '0'
ABI: 'arm64'
Timestamp: 2024-01-01 01:19:41+0800
pid: 24441, tid: 24441, name: onnx_test >>> ./onnx_test <<<
uid: 0
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0
Cause: null pointer dereference
x0 0000000000000000 x1 0000ffffe3aab230 x2 b400f7f55ad014d8 x3 0000ffffe3aab5e8
x4 000000003b9ac9ff x5 0000f7f73f7f1000 x6 0000000000246ef6 x7 0000f7f73f7f1010
x8 00060dd82d1f16d3 x9 952c1cec50af5e2e x10 000000003b9aca00 x11 0000000d0ba1534b
x12 000000007fffffff x13 0000ffffe3aab1d8 x14 0000000000000001 x15 0000000000000000
x16 0000f7f73af0f750 x17 0000f7f73ad14ac0 x18 0000f7f73f47a000 x19 0000000000000001
x20 b400f7f53ac956d0 x21 0000000000000002 x22 b400f7f53ac95110 x23 b400f7f53ac94a90
x24 0000ffffe3aad6b8 x25 b400f7f5aacebdd0 x26 0000f7f73b4f21e0 x27 0000f7f73e68d000
x28 0000000000000000 x29 0000ffffe3aab1f0
lr 0000f7f73de21340 sp 0000ffffe3aab1d0 pc 0000f7f73de213bc pst 0000000060001000

backtrace:
#00 pc 00000000029003bc /data/local/tmp/release_26/libonnxruntime.so (onnxruntime::logging::ISink::Send(std::__ndk1::chrono::time_point<std::__ndk1::chrono::system_clock, std::__ndk1::chrono::duration<long long, std::__ndk1::ratio<1l, 1000000l> > > const&, std::__ndk1::basic_string<char, std::__ndk1::char_traits, std::__ndk1::allocator > const&, onnxruntime::logging::Capture const&)+44) (BuildId: 06bf4c9e9f46f8db141d79c577f38866c3f494d6)
#1 pc 000000000290033c /data/local/tmp/release_26/libonnxruntime.so (onnxruntime::logging::LoggingManager::Log(std::__ndk1::basic_string<char, std::__ndk1::char_traits, std::__ndk1::allocator > const&, onnxruntime::logging::Capture const&) const+92) (BuildId: 06bf4c9e9f46f8db141d79c577f38866c3f494d6)
#2 pc 00000000028ff5dc /data/local/tmp/release_26/libonnxruntime.so (onnxruntime::logging::Logger::Log(onnxruntime::logging::Capture const&) const+36) (BuildId: 06bf4c9e9f46f8db141d79c577f38866c3f494d6)
#3 pc 00000000028ff584 /data/local/tmp/release_26/libonnxruntime.so (onnxruntime::logging::Capture::~Capture()+52) (BuildId: 06bf4c9e9f46f8db141d79c577f38866c3f494d6)
#4 pc 000000000255a6c8 /data/local/tmp/release_26/libonnxruntime.so (onnxruntime::ExecuteThePlan(onnxruntime::SessionState const&, gsl::span<int const, 18446744073709551615ul>, gsl::span<OrtValue const, 18446744073709551615ul>, gsl::span<int const, 18446744073709551615ul>, std::__ndk1::vector<OrtValue, std::__ndk1::allocator >&, std::__ndk1::unordered_map<unsigned long, std::__ndk1::function<onnxruntime::common::Status (onnxruntime::TensorShape const&, OrtDevice const&, OrtValue&, bool&)>, std::__ndk1::hash, std::__ndk1::equal_to, std::__ndk1::allocator<std::__ndk1::pair<unsigned long const, std::__ndk1::function<onnxruntime::common::Status (onnxruntime::TensorShape const&, OrtDevice const&, OrtValue&, bool&)> > > > const&, onnxruntime::logging::Logger const&, onnxruntime::DeviceStreamCollection const*, bool const&, bool, bool)+308) (BuildId: 06bf4c9e9f46f8db141d79c577f38866c3f494d6)
#5 pc 00000000025bf4a4 /data/local/tmp/release_26/libonnxruntime.so (onnxruntime::utils::ExecuteGraphImpl(onnxruntime::SessionState const&, onnxruntime::FeedsFetchesManager const&, gsl::span<OrtValue const, 18446744073709551615ul>, std::__ndk1::vector<OrtValue, std::__ndk1::allocator >&, std::__ndk1::unordered_map<unsigned long, std::__ndk1::function<onnxruntime::common::Status (onnxruntime::TensorShape const&, OrtDevice const&, OrtValue&, bool&)>, std::__ndk1::hash, std::__ndk1::equal_to, std::__ndk1::allocator<std::__ndk1::pair<unsigned long const, std::__ndk1::function<onnxruntime::common::Status (onnxruntime::TensorShape const&, OrtDevice const&, OrtValue&, bool&)> > > > const&, ExecutionMode, bool const&, onnxruntime::logging::Logger const&, onnxruntime::DeviceStreamCollection*, bool, onnxruntime::Stream*)+420) (BuildId: 06bf4c9e9f46f8db141d79c577f38866c3f494d6)
#6 pc 00000000025bee10 /data/local/tmp/release_26/libonnxruntime.so (onnxruntime::utils::ExecuteGraph(onnxruntime::SessionState const&, onnxruntime::FeedsFetchesManager&, gsl::span<OrtValue const, 18446744073709551615ul>, std::__ndk1::vector<OrtValue, std::__ndk1::allocator >&, ExecutionMode, bool const&, onnxruntime::logging::Logger const&, onnxruntime::DeviceStreamCollectionHolder&, bool, onnxruntime::Stream*)+448) (BuildId: 06bf4c9e9f46f8db141d79c577f38866c3f494d6)
#7 pc 00000000025bfe60 /data/local/tmp/release_26/libonnxruntime.so (onnxruntime::utils::ExecuteGraph(onnxruntime::SessionState const&, onnxruntime::FeedsFetchesManager&, gsl::span<OrtValue const, 18446744073709551615ul>, std::__ndk1::vector<OrtValue, std::__ndk1::allocator >&, ExecutionMode, OrtRunOptions const&, onnxruntime::DeviceStreamCollectionHolder&, onnxruntime::logging::Logger const&)+140) (BuildId: 06bf4c9e9f46f8db141d79c577f38866c3f494d6)
#8 pc 000000000121e31c /data/local/tmp/release_26/libonnxruntime.so (onnxruntime::InferenceSession::Run(OrtRunOptions const&, gsl::span<std::__ndk1::basic_string<char, std::__ndk1::char_traits, std::__ndk1::allocator > const, 18446744073709551615ul>, gsl::span<OrtValue const, 18446744073709551615ul>, gsl::span<std::__ndk1::basic_string<char, std::__ndk1::char_traits, std::__ndk1::allocator > const, 18446744073709551615ul>, std::__ndk1::vector<OrtValue, std::__ndk1::allocator >, std::__ndk1::vector<OrtDevice, std::__ndk1::allocator > const)+3856) (BuildId: 06bf4c9e9f46f8db141d79c577f38866c3f494d6)
#9 pc 0000000001220ebc /data/local/tmp/release_26/libonnxruntime.so (onnxruntime::InferenceSession::Run(OrtRunOptions const&, gsl::span<char const* const, 18446744073709551615ul>, gsl::span<OrtValue const* const, 18446744073709551615ul>, gsl::span<char const* const, 18446744073709551615ul>, gsl::span<OrtValue*, 18446744073709551615ul>)+1192) (BuildId: 06bf4c9e9f46f8db141d79c577f38866c3f494d6)
#10 pc 0000000001195280 /data/local/tmp/release_26/libonnxruntime.so (OrtApis::Run(OrtSession*, OrtRunOptions const*, char const* const*, OrtValue const* const*, unsigned long, char const* const*, unsigned long, OrtValue**)+408) (BuildId: 06bf4c9e9f46f8db141d79c577f38866c3f494d6)
#11 pc 000000000026ae8c /data/local/tmp/release_26/libonnx_inference.so (Ort::Session::Run(Ort::RunOptions const&, char const* const*, Ort::Value const*, unsigned long, char const* const*, unsigned long)+172) (BuildId: f6c0205e8f8387a2457b7f95cb7198ea1db10b51)
#12 pc 000000000026a584 /data/local/tmp/release_26/libonnx_inference.so (onnx_Infer+1488) (BuildId: f6c0205e8f8387a2457b7f95cb7198ea1db10b51)
#13 pc 0000000000001804 /data/local/tmp/release_26/onnx_test (BuildId: 9d6802b5b8d960f41acd4bbeb27f83dafd34a82e)
#14 pc 000000000004988c /apex/com.android.runtime/lib64/bionic/libc.so (__libc_init+108) (BuildId: f60e9508c5bfe39f73dfa8a2be262d62)

Tombstone file:

tombstone_07.txt

The ONNX Runtime build command was as follows:
./build_tool.sh --skip_tests --config Debug --build_shared_lib --android --android_cpp_shared --android_api 30 --android_abi arm64-v8a --android_ndk_path /opt/android-ndk-r26d/ --android_sdk_path /opt/android-sdk/

I need your help solving this problem. I look forward to your reply.
Thanks!

To reproduce

The problem occurs every time.

Urgency

No response

Platform

Android

OS Version

11

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

1.19.0

ONNX Runtime API

C++

Architecture

ARM64

Execution Provider

Default CPU

Execution Provider Library Version

No response

@github-actions github-actions bot added the platform:mobile issues related to ONNX Runtime mobile; typically submitted using template label Sep 6, 2024
@RyanUnderhill
Member

@skottmckay, is this one obvious to you?

@skottmckay
Contributor

Not aware of anyone else hitting this issue. Does it fail if you build without --android_cpp_shared ?


sai-fpp commented Sep 9, 2024

> Not aware of anyone else hitting this issue. Does it fail if you build without --android_cpp_shared ?
If I build without `--android_cpp_shared`, the problem still exists.


sai-fpp commented Sep 9, 2024

./build.sh --skip_tests --config Debug --parallel --use_nnapi --build_shared_lib --android --android_api 30 --android_abi arm64-v8a --android_ndk_path /home/fangpangpang/ai/android-ndk-r26d/ --android_sdk_path /home/fangpangpang/ai/android-ndk-r26d
Uploading libonnxruntime1.zip…

@skottmckay
Contributor

Can you please provide the code being used to call ORT?

Do you create an Ort::Env instance as the first step and keep it valid until all inference sessions are closed?


sai-fpp commented Sep 10, 2024

The code that calls ORT is as follows:

struct handle {
    Ort::Session session;
    std::unique_ptr tok;
    json label_mapping;
    float threshold;
    std::vector<const char*> input_names;
    std::vector<std::vector<int64_t>> input_shapes;
    std::vector<const char*> output_names;
    std::vector<std::vector<int64_t>> output_shapes;
};

extern "C" void* onnx_Init(const char* path) {
    std::string config_path = std::string(path) + "/" + ONNX_CONFIG_FILE;
    std::ifstream config_file(config_path);
    json config = json::parse(config_file);

    std::string model_path = std::string(basePath) + "/" + config["model_path"].get<std::string>();
    std::string tokenizer_path = std::string(basePath) + "/" + config["tokenizer_path"].get<std::string>();
    std::string label_mapping_path = std::string(basePath) + "/" + config["label_mapping_path"].get<std::string>();
    float threshold = config["threshold"];

    // env init
    //Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "test");  // Runs normally
    Ort::Env env(ORT_LOGGING_LEVEL_VERBOSE, "test");    // Will crash

    Ort::SessionOptions session_options;

    // Ort::ThrowOnError(OrtSessionOptionsAppendExecutionProvider_Nnapi(session_options, 0));
    Ort::Session session(env, model_path.c_str(), session_options);

    handle* handle = new handle{std::move(session), std::move(tok), std::move(label_mapping), threshold, std::move(input_names), std::move(input_shapes), std::move(output_names), std::move(output_shapes)};
    return static_cast<void*>(handle);
}

extern "C" char* onnx_Infer(void* phandle, const char* text) {
    handle* handle = static_cast<handle*>(phandle);
    std::string text_str(text);

    std::vector<int64_t> dummy_input_np = TestTokenizer(handle->tok, text_str);

    size_t num_input_nodes = handle->input_names.size();
    const std::vector<const char*>& input_names = handle->input_names;
    const std::vector<std::vector<int64_t>>& input_shapes = handle->input_shapes;
    size_t num_output_nodes = handle->output_names.size();
    const std::vector<const char*>& output_names = handle->output_names;

    std::vector<int64_t> attention_mask_np(dummy_input_np.size(), 0);
    for (size_t i = 0; i < dummy_input_np.size(); ++i) {
        if (dummy_input_np[i] != 0) {
            attention_mask_np[i] = 1;
        }
    }

    std::vector<Ort::Value> ort_inputs;
    std::vector<int64_t> shape = {1, static_cast<int64_t>(dummy_input_np.size())};

    Ort::MemoryInfo memory_info = Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeCPU);

    ort_inputs.push_back(Ort::Value::CreateTensor<int64_t>(memory_info, dummy_input_np.data(), dummy_input_np.size(), shape.data(), shape.size()));
    ort_inputs.push_back(Ort::Value::CreateTensor<int64_t>(memory_info, attention_mask_np.data(), attention_mask_np.size(), shape.data(), shape.size()));

    auto ort_outs = handle->session.Run(Ort::RunOptions{nullptr}, input_names.data(), ort_inputs.data(), num_input_nodes, output_names.data(), num_output_nodes);

    float* logits = ort_outs[0].GetTensorMutableData<float>();
    size_t logits_size = ort_outs[0].GetTensorTypeAndShapeInfo().GetElementCount();

    int predicted_class_id = std::distance(logits, std::max_element(logits, logits + logits_size));

    float sum_exp = std::accumulate(logits, logits + logits_size, 0.0f, [](float a, float b) { return a + std::exp(b); });
    float score = std::exp(logits[predicted_class_id]) / sum_exp;

    std::string label = handle->label_mapping["id2label"][std::to_string(predicted_class_id)];
    json result = {
        {"text", text_str},
        {"label", label},
        {"score", score}
    };

    std::string result_str = result.dump();
    char* result_cstr = static_cast<char*>(malloc(result_str.size() + 1));
    std::strcpy(result_cstr, result_str.c_str());
    return result_cstr;
}

extern "C" void onnx_destroy(void* phandle) {
    handle* handle = static_cast<handle*>(phandle);
    handle->tok.reset();
    delete handle;
}

main function:

#include
#include
#include "onnx_inference_api.h"

int main() {
    char* result;

    const char* path = "./model";

    void* phandle = onnx_Init(path);

    const char* test_text = "see you again";
    result = onnx_Infer(phandle, test_text);
    free(result);

    onnx_destroy(phandle);
    return 0;
}

@skottmckay
Contributor

Hard to tell as the formatting of your reply doesn't make it clear what code is where, but I would have expected the Ort::Env needs to be in main() to guarantee its lifetime is longer than that of the session. It seems like it might be a local variable in onnx_Init which will not work. Ort::Env must be created before the session AND remain valid until the session is destroyed.


sai-fpp commented Sep 10, 2024

> Hard to tell as the formatting of your reply doesn't make it clear what code is where, but I would have expected the Ort::Env needs to be in main() to guarantee its lifetime is longer than that of the session. It seems like it might be a local variable in onnx_Init which will not work. Ort::Env must be created before the session AND remain valid until the session is destroyed.

You are right, this was the problem. Thanks for your help!

@sai-fpp sai-fpp closed this as completed Sep 10, 2024