[REPORT] Node panics with --trie-cache-size 0 #10

Open
yorickdowne opened this issue Jul 30, 2023 · 5 comments

yorickdowne commented Jul 30, 2023

Description:

During and after warp sync block history fill, the node panics. It then restarts and continues block (history) import.

This may be related to --trie-cache-size 0.

Environment:

Operating System: Ubuntu 22.04
System Specification: OVH Adv-1
Version: chainflip-engine 0.8.7, chainflip-node 0.8.7

Logs:

chainflip-node-1  | 2023-07-30 04:45:37 ⏩ Block history, #120448 (17 peers), best: #525444 (0x5d3c…2d91), finalized #525441 (0xe1c3…61ff), ⬇ 27.2MiB/s ⬆ 174.9kiB/s    
chainflip-node-1  | 
chainflip-node-1  | ====================
chainflip-node-1  | 
chainflip-node-1  | Version: 0.8.7-5991d303f7b
chainflip-node-1  | 
chainflip-node-1  |    0: sp_panic_handler::set::{{closure}}
chainflip-node-1  |    1: <alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call
chainflip-node-1  |              at /rustc/da7c50c089d5db2d3ebaf227fe075bb1346bfaec/library/alloc/src/boxed.rs:2002:9
chainflip-node-1  |       std::panicking::rust_panic_with_hook
chainflip-node-1  |              at /rustc/da7c50c089d5db2d3ebaf227fe075bb1346bfaec/library/std/src/panicking.rs:696:13
chainflip-node-1  |    2: std::panicking::begin_panic_handler::{{closure}}
chainflip-node-1  |              at /rustc/da7c50c089d5db2d3ebaf227fe075bb1346bfaec/library/std/src/panicking.rs:583:13
chainflip-node-1  |    3: std::sys_common::backtrace::__rust_end_short_backtrace
chainflip-node-1  |              at /rustc/da7c50c089d5db2d3ebaf227fe075bb1346bfaec/library/std/src/sys_common/backtrace.rs:150:18
chainflip-node-1  |    4: rust_begin_unwind
chainflip-node-1  |              at /rustc/da7c50c089d5db2d3ebaf227fe075bb1346bfaec/library/std/src/panicking.rs:579:5
chainflip-node-1  |    5: core::panicking::panic_fmt
chainflip-node-1  |              at /rustc/da7c50c089d5db2d3ebaf227fe075bb1346bfaec/library/core/src/panicking.rs:64:14
chainflip-node-1  |    6: core::result::unwrap_failed
chainflip-node-1  |              at /rustc/da7c50c089d5db2d3ebaf227fe075bb1346bfaec/library/core/src/result.rs:1750:5
chainflip-node-1  |    7: <sp_state_machine::ext::Ext<H,B> as sp_externalities::Externalities>::storage
chainflip-node-1  |    8: <&mut dyn sp_externalities::Externalities as sp_io::storage::Storage>::get_version_1
chainflip-node-1  |    9: std::thread::local::LocalKey<T>::with
chainflip-node-1  |   10: tracing::span::Span::in_scope
chainflip-node-1  |   11: sp_io::storage::get_version_1
chainflip-node-1  |   12: sp_io::storage::ExtStorageGetVersion1::call
chainflip-node-1  |   13: <sc_executor_wasmtime::imports::Registry as sp_wasm_interface::HostFunctionRegistry>::with_function_context
chainflip-node-1  |   14: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
chainflip-node-1  |   15: <F as wasmtime::func::IntoFunc<T,(wasmtime::func::Caller<T>,A1),R>>::into_func::wasm_to_host_shim
chainflip-node-1  |   16: <unknown>
chainflip-node-1  |   17: <unknown>
chainflip-node-1  |   18: <unknown>
chainflip-node-1  |   19: <unknown>
chainflip-node-1  |   20: <unknown>
chainflip-node-1  |   21: <unknown>
chainflip-node-1  |   22: wasmtime_runtime::traphandlers::catch_traps::call_closure
chainflip-node-1  |   23: wasmtime_setjmp
chainflip-node-1  |   24: wasmtime_runtime::traphandlers::<impl wasmtime_runtime::traphandlers::call_thread_state::CallThreadState>::with
chainflip-node-1  |   25: wasmtime_runtime::traphandlers::catch_traps
chainflip-node-1  |   26: wasmtime::func::invoke_wasm_and_catch_traps
chainflip-node-1  |   27: sc_executor_wasmtime::instance_wrapper::EntryPoint::call
chainflip-node-1  |   28: sc_executor_wasmtime::runtime::perform_call
chainflip-node-1  |   29: <sc_executor_wasmtime::runtime::WasmtimeInstance as sc_executor_common::wasm_runtime::WasmInstance>::call_with_allocation_stats
chainflip-node-1  |   30: sc_executor_common::wasm_runtime::WasmInstance::call_export
chainflip-node-1  |   31: std::thread::local::LocalKey<T>::with
chainflip-node-1  |   32: sc_executor::native_executor::WasmExecutor<H>::with_instance::{{closure}}
chainflip-node-1  |   33: sc_executor::wasm_runtime::RuntimeCache::with_instance
chainflip-node-1  |   34: <sc_executor::native_executor::NativeElseWasmExecutor<D> as sp_core::traits::CodeExecutor>::call
chainflip-node-1  |   35: sp_state_machine::execution::StateMachine<B,H,Exec>::execute_aux
chainflip-node-1  |   36: sp_state_machine::execution::StateMachine<B,H,Exec>::execute_using_consensus_failure_handler
chainflip-node-1  |   37: <sc_service::client::call_executor::LocalCallExecutor<Block,B,E> as sc_client_api::call_executor::CallExecutor<Block>>::contextual_call
chainflip-node-1  |   38: <sc_service::client::client::Client<B,E,Block,RA> as sp_api::CallApiAt<Block>>::call_api_at
chainflip-node-1  |   39: <state_chain_runtime::RuntimeApiImpl<__SR_API_BLOCK__,RuntimeApiImplCall> as sp_block_builder::BlockBuilder<__SR_API_BLOCK__>>::__runtime_api_internal_call_api_at
chainflip-node-1  |   40: sp_transaction_pool::runtime_api::TaggedTransactionQueue::validate_transaction
chainflip-node-1  |   41: tracing::span::Span::in_scope
chainflip-node-1  |   42: sc_transaction_pool::api::validate_transaction_blocking
chainflip-node-1  |   43: <sc_transaction_pool::api::FullChainApi<Client,Block> as sc_transaction_pool::graph::pool::ChainApi>::validate_transaction::{{closure}}::{{closure}}
chainflip-node-1  |   44: sc_transaction_pool::api::spawn_validation_pool_task::{{closure}}
chainflip-node-1  |   45: <futures_util::future::future::map::Map<Fut,F> as core::future::future::Future>::poll
chainflip-node-1  |   46: <sc_service::task_manager::prometheus_future::PrometheusFuture<T> as core::future::future::Future>::poll
chainflip-node-1  |   47: <futures_util::future::select::Select<A,B> as core::future::future::Future>::poll
chainflip-node-1  |   48: <tracing_futures::Instrumented<T> as core::future::future::Future>::poll
chainflip-node-1  |   49: tokio::runtime::park::CachedParkThread::block_on
chainflip-node-1  |   50: tokio::runtime::handle::Handle::block_on
chainflip-node-1  |   51: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll
chainflip-node-1  |   52: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
chainflip-node-1  |   53: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
chainflip-node-1  |   54: tokio::runtime::task::harness::Harness<T,S>::poll
chainflip-node-1  |   55: tokio::runtime::blocking::pool::Inner::run
chainflip-node-1  |   56: std::sys_common::backtrace::__rust_begin_short_backtrace
chainflip-node-1  |   57: core::ops::function::FnOnce::call_once{{vtable.shim}}
chainflip-node-1  |   58: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
chainflip-node-1  |              at /rustc/da7c50c089d5db2d3ebaf227fe075bb1346bfaec/library/alloc/src/boxed.rs:1988:9
chainflip-node-1  |       <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
chainflip-node-1  |              at /rustc/da7c50c089d5db2d3ebaf227fe075bb1346bfaec/library/alloc/src/boxed.rs:1988:9
chainflip-node-1  |       std::sys::unix::thread::Thread::new::thread_start
chainflip-node-1  |              at /rustc/da7c50c089d5db2d3ebaf227fe075bb1346bfaec/library/std/src/sys/unix/thread.rs:108:17
chainflip-node-1  |   59: start_thread
chainflip-node-1  |   60: clone
chainflip-node-1  | 
chainflip-node-1  | 
chainflip-node-1  | Thread 'tokio-runtime-worker' panicked at 'Externalities not allowed to fail within runtime: "Trie lookup error: Database missing expected key: 0x2392f9b973fd9ca8098f3e594c6d00856229952bab2af6d8ecf9cf74f720e336"', /github/home/.cargo/git/checkouts/substrate-a7ad12d678bd31ac/8b93ab6/primitives/state-machine/src/ext.rs:189
chainflip-node-1  | 
chainflip-node-1  | This is a bug. Please report it at:
chainflip-node-1  | 
chainflip-node-1  | 	chainflip.io
chainflip-node-1  | 
chainflip-node-1 exited with code 0
@yorickdowne
Author

After block history import completes, the node can still panic:

chainflip-node-1  | 2023-07-30 04:55:54 ✨ Imported #525545 (0x755b…299f)    
chainflip-node-1  | 2023-07-30 04:55:54 💤 Idle (12 peers), best: #525545 (0x755b…299f), finalized #525542 (0xdf18…91d7), ⬇ 73.4kiB/s ⬆ 171.6kiB/s    
chainflip-node-1  | 2023-07-30 04:55:59 💤 Idle (12 peers), best: #525545 (0x755b…299f), finalized #525543 (0xfba4…0207), ⬇ 135.7kiB/s ⬆ 205.4kiB/s    
chainflip-node-1  | 2023-07-30 04:56:00 ✨ Imported #525546 (0x067b…8eef)    
chainflip-node-1  | 2023-07-30 04:56:04 💤 Idle (12 peers), best: #525546 (0x067b…8eef), finalized #525544 (0x9857…ff50), ⬇ 179.8kiB/s ⬆ 209.3kiB/s    
chainflip-node-1  | 2023-07-30 04:56:06 ✨ Imported #525547 (0xc74c…feed)    
chainflip-node-1  | 2023-07-30 04:56:09 💤 Idle (12 peers), best: #525547 (0xc74c…feed), finalized #525545 (0x755b…299f), ⬇ 140.4kiB/s ⬆ 168.3kiB/s    
chainflip-node-1  | 2023-07-30 04:56:12 ✨ Imported #525548 (0xea71…b54b)    
chainflip-node-1  | 2023-07-30 04:56:14 💤 Idle (12 peers), best: #525548 (0xea71…b54b), finalized #525546 (0x067b…8eef), ⬇ 143.0kiB/s ⬆ 233.0kiB/s    
chainflip-node-1  | 2023-07-30 04:56:18 ✨ Imported #525549 (0xa216…0df9)    
chainflip-node-1  | 2023-07-30 04:56:19 💤 Idle (12 peers), best: #525549 (0xa216…0df9), finalized #525547 (0xc74c…feed), ⬇ 149.2kiB/s ⬆ 158.5kiB/s    
chainflip-node-1  | 2023-07-30 04:56:24 ✨ Imported #525550 (0x4b0b…45fd)    
chainflip-node-1  | 2023-07-30 04:56:24 💤 Idle (12 peers), best: #525550 (0x4b0b…45fd), finalized #525547 (0xc74c…feed), ⬇ 47.7kiB/s ⬆ 73.9kiB/s    
chainflip-node-1  | 
chainflip-node-1  | ====================
chainflip-node-1  | 
chainflip-node-1  | Version: 0.8.7-5991d303f7b
chainflip-node-1  | 
chainflip-node-1  |    0: sp_panic_handler::set::{{closure}}
chainflip-node-1  |    1: <alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call
chainflip-node-1  |              at /rustc/da7c50c089d5db2d3ebaf227fe075bb1346bfaec/library/alloc/src/boxed.rs:2002:9
chainflip-node-1  |       std::panicking::rust_panic_with_hook
chainflip-node-1  |              at /rustc/da7c50c089d5db2d3ebaf227fe075bb1346bfaec/library/std/src/panicking.rs:696:13
chainflip-node-1  |    2: std::panicking::begin_panic_handler::{{closure}}
chainflip-node-1  |              at /rustc/da7c50c089d5db2d3ebaf227fe075bb1346bfaec/library/std/src/panicking.rs:583:13
chainflip-node-1  |    3: std::sys_common::backtrace::__rust_end_short_backtrace
chainflip-node-1  |              at /rustc/da7c50c089d5db2d3ebaf227fe075bb1346bfaec/library/std/src/sys_common/backtrace.rs:150:18
chainflip-node-1  |    4: rust_begin_unwind
chainflip-node-1  |              at /rustc/da7c50c089d5db2d3ebaf227fe075bb1346bfaec/library/std/src/panicking.rs:579:5
chainflip-node-1  |    5: core::panicking::panic_fmt
chainflip-node-1  |              at /rustc/da7c50c089d5db2d3ebaf227fe075bb1346bfaec/library/core/src/panicking.rs:64:14
chainflip-node-1  |    6: core::result::unwrap_failed
chainflip-node-1  |              at /rustc/da7c50c089d5db2d3ebaf227fe075bb1346bfaec/library/core/src/result.rs:1750:5
chainflip-node-1  |    7: <sp_state_machine::ext::Ext<H,B> as sp_externalities::Externalities>::exists_storage
chainflip-node-1  |    8: std::thread::local::LocalKey<T>::with
chainflip-node-1  |    9: tracing::span::Span::in_scope
chainflip-node-1  |   10: sp_io::storage::exists_version_1
chainflip-node-1  |   11: sp_io::storage::ExtStorageExistsVersion1::call
chainflip-node-1  |   12: <sc_executor_wasmtime::imports::Registry as sp_wasm_interface::HostFunctionRegistry>::with_function_context
chainflip-node-1  |   13: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
chainflip-node-1  |   14: <F as wasmtime::func::IntoFunc<T,(wasmtime::func::Caller<T>,A1),R>>::into_func::wasm_to_host_shim
chainflip-node-1  |   15: <unknown>
chainflip-node-1  |   16: <unknown>
chainflip-node-1  |   17: <unknown>
chainflip-node-1  |   18: <unknown>
chainflip-node-1  |   19: <unknown>
chainflip-node-1  |   20: <unknown>
chainflip-node-1  |   21: wasmtime_runtime::traphandlers::catch_traps::call_closure
chainflip-node-1  |   22: wasmtime_setjmp
chainflip-node-1  |   23: wasmtime_runtime::traphandlers::<impl wasmtime_runtime::traphandlers::call_thread_state::CallThreadState>::with
chainflip-node-1  |   24: wasmtime_runtime::traphandlers::catch_traps
chainflip-node-1  |   25: wasmtime::func::invoke_wasm_and_catch_traps
chainflip-node-1  |   26: sc_executor_wasmtime::instance_wrapper::EntryPoint::call
chainflip-node-1  |   27: sc_executor_wasmtime::runtime::perform_call
chainflip-node-1  |   28: <sc_executor_wasmtime::runtime::WasmtimeInstance as sc_executor_common::wasm_runtime::WasmInstance>::call_with_allocation_stats
chainflip-node-1  |   29: sc_executor_common::wasm_runtime::WasmInstance::call_export
chainflip-node-1  |   30: std::thread::local::LocalKey<T>::with
chainflip-node-1  |   31: sc_executor::native_executor::WasmExecutor<H>::with_instance::{{closure}}
chainflip-node-1  |   32: sc_executor::wasm_runtime::RuntimeCache::with_instance
chainflip-node-1  |   33: <sc_executor::native_executor::NativeElseWasmExecutor<D> as sp_core::traits::CodeExecutor>::call
chainflip-node-1  |   34: sp_state_machine::execution::StateMachine<B,H,Exec>::execute_aux
chainflip-node-1  |   35: sp_state_machine::execution::StateMachine<B,H,Exec>::execute_using_consensus_failure_handler
chainflip-node-1  |   36: <sc_service::client::call_executor::LocalCallExecutor<Block,B,E> as sc_client_api::call_executor::CallExecutor<Block>>::contextual_call
chainflip-node-1  |   37: <sc_service::client::client::Client<B,E,Block,RA> as sp_api::CallApiAt<Block>>::call_api_at
chainflip-node-1  |   38: <state_chain_runtime::RuntimeApiImpl<__SR_API_BLOCK__,RuntimeApiImplCall> as sp_block_builder::BlockBuilder<__SR_API_BLOCK__>>::__runtime_api_internal_call_api_at
chainflip-node-1  |   39: sp_transaction_pool::runtime_api::TaggedTransactionQueue::validate_transaction
chainflip-node-1  |   40: tracing::span::Span::in_scope
chainflip-node-1  |   41: sc_transaction_pool::api::validate_transaction_blocking
chainflip-node-1  |   42: <sc_transaction_pool::api::FullChainApi<Client,Block> as sc_transaction_pool::graph::pool::ChainApi>::validate_transaction::{{closure}}::{{closure}}
chainflip-node-1  |   43: sc_transaction_pool::api::spawn_validation_pool_task::{{closure}}
chainflip-node-1  |   44: <futures_util::future::future::map::Map<Fut,F> as core::future::future::Future>::poll
chainflip-node-1  |   45: <sc_service::task_manager::prometheus_future::PrometheusFuture<T> as core::future::future::Future>::poll
chainflip-node-1  |   46: <futures_util::future::select::Select<A,B> as core::future::future::Future>::poll
chainflip-node-1  |   47: <tracing_futures::Instrumented<T> as core::future::future::Future>::poll
chainflip-node-1  |   48: tokio::runtime::park::CachedParkThread::block_on
chainflip-node-1  |   49: tokio::runtime::handle::Handle::block_on
chainflip-node-1  |   50: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll
chainflip-node-1  |   51: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
chainflip-node-1  |   52: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
chainflip-node-1  |   53: tokio::runtime::task::harness::Harness<T,S>::poll
chainflip-node-1  |   54: tokio::runtime::blocking::pool::Inner::run
chainflip-node-1  |   55: std::sys_common::backtrace::__rust_begin_short_backtrace
chainflip-node-1  |   56: core::ops::function::FnOnce::call_once{{vtable.shim}}
chainflip-node-1  |   57: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
chainflip-node-1  |              at /rustc/da7c50c089d5db2d3ebaf227fe075bb1346bfaec/library/alloc/src/boxed.rs:1988:9
chainflip-node-1  |       <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
chainflip-node-1  |              at /rustc/da7c50c089d5db2d3ebaf227fe075bb1346bfaec/library/alloc/src/boxed.rs:1988:9
chainflip-node-1  |       std::sys::unix::thread::Thread::new::thread_start
chainflip-node-1  |              at /rustc/da7c50c089d5db2d3ebaf227fe075bb1346bfaec/library/std/src/sys/unix/thread.rs:108:17
chainflip-node-1  |   58: start_thread
chainflip-node-1  |   59: clone
chainflip-node-1  | 
chainflip-node-1  | 
chainflip-node-1  | Thread 'tokio-runtime-worker' panicked at 'Externalities not allowed to fail within runtime: "Trie lookup error: Database missing expected key: 0x741c3d28aa7de2eb5339218c59692cf61f0a774ad33336da9df28109af8fa788"', /github/home/.cargo/git/checkouts/substrate-a7ad12d678bd31ac/8b93ab6/primitives/state-machine/src/ext.rs:275
chainflip-node-1  | 
chainflip-node-1  | This is a bug. Please report it at:
chainflip-node-1  | 
chainflip-node-1  | 	chainflip.io
chainflip-node-1  | 
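
To compare panics like the two above, the missing trie key and the panicking source line can be pulled out of the log text. This is a minimal sketch: `find_missing_keys` and `PANIC_RE` are hypothetical names, and the pattern assumes the panic line keeps the wording seen in these logs.

```python
import re

# Matches the panic summary line printed after the backtrace.
# Assumption: the message wording and the trailing "<path>.rs:<line>" stay as in the logs above.
PANIC_RE = re.compile(
    r"Database missing expected key: (?P<key>0x[0-9a-fA-F]+).*?"
    r"(?P<file>[\w./-]+\.rs):(?P<line>\d+)"
)


def find_missing_keys(log_text):
    """Return (key, source_file, line_number) for each missing-key panic in the log."""
    hits = []
    for line in log_text.splitlines():
        match = PANIC_RE.search(line)
        if match:
            hits.append(
                (match.group("key"), match.group("file"), int(match.group("line")))
            )
    return hits
```

Run over the logs in this issue, it should report two different keys, one raised at `ext.rs:189` (the `storage` call) and one at `ext.rs:275` (the `exists_storage` call).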

@yorickdowne
Author

The node kept panicking with --trie-cache-size 0 but has been stable without it. The panic is likely related to that setting.

yorickdowne changed the title from "[REPORT] Node panics when attempting warp sync" to "[REPORT] Node panics with --trie-cache-size 0" on Jul 30, 2023
@dandanlen

@yorickdowne are you using parity db (--database paritydb)?

I found a couple of substrate issues that might be relevant:
paritytech/substrate#13864
paritytech/cumulus#2461

Both were resolved fairly recently, so the fixes are not yet merged into our code base. For now, the best I can suggest is to use a different value for the trie cache size or leave it unset (which uses the default value).

    --trie-cache-size <Bytes>
          Specify the state cache size.
          
          Providing `0` will disable the cache.
          
          [default: 67108864]
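
As a rough illustration of the help text above, the sketch below works out the effective cache size from a list of node arguments. `effective_trie_cache` is a hypothetical helper, not part of chainflip-node, and accepting the `--trie-cache-size=<Bytes>` spelling is an assumption.

```python
# Default taken from the help text above: 67108864 bytes = 64 MiB.
DEFAULT_TRIE_CACHE_BYTES = 67108864


def effective_trie_cache(args):
    """Return (size_in_bytes, cache_enabled) for a list of node CLI arguments."""
    size = DEFAULT_TRIE_CACHE_BYTES
    for i, arg in enumerate(args):
        if arg == "--trie-cache-size" and i + 1 < len(args):
            size = int(args[i + 1])
        elif arg.startswith("--trie-cache-size="):
            # Assumption: the CLI also accepts the "flag=value" form.
            size = int(arg.split("=", 1)[1])
    # Per the help text, a value of 0 disables the cache.
    return size, size > 0
```

For example, `effective_trie_cache(["--trie-cache-size", "0"])` reports the cache as disabled, while an argument list without the flag falls back to the default.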

@yorickdowne
Author

I have not specified paritydb, so it'd be whatever the default is. Do you recommend using --database paritydb instead of leaving it unspecified?

For now I am leaving --trie-cache-size unset. It was only set because that's what the apt package does in systemd.

@dandanlen
dandanlen commented Aug 1, 2023

Yes, I'd recommend setting --database paritydb; it will soon become the default. (I believe the default in the current version is rocksdb.)
