Fail to validate chunk with chunk extra #3055

Closed
bowenwang1996 opened this issue Jul 29, 2020 · 0 comments · Fixed by #3056
Labels: C-bug (Category: This is a bug), P-critical (Priority: critical)

Comments

@bowenwang1996
Collaborator

On testnet we saw some block producers fail during catch-up because the node failed to validate a chunk against the previous chunk extra. Upon investigation, we realized that this failure is reproduced by the test `test_cross_shard_tx_drop_chunks`. An example of such a failure from the logs:

Jul 27 17:27:22.811 DEBUG chain: Failed to validate chunk extra: Error { inner:    0: backtrace::backtrace::libunwind::trace
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/backtrace-0.3.46/src/backtrace/libunwind.rs:86
      backtrace::backtrace::trace_unsynchronized
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/backtrace-0.3.46/src/backtrace/mod.rs:66
   1: backtrace::backtrace::trace
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/backtrace-0.3.46/src/backtrace/mod.rs:53
   2: backtrace::capture::Backtrace::create
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/backtrace-0.3.46/src/capture.rs:164
   3: backtrace::capture::Backtrace::new_unresolved
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/backtrace-0.3.46/src/capture.rs:158
   4: failure::backtrace::internal::InternalBacktrace::new
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/failure-0.1.7/src/backtrace/internal.rs:46
   5: failure::backtrace::Backtrace::new
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/failure-0.1.7/src/backtrace/mod.rs:121
   6: failure::context::Context<D>::new
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/failure-0.1.7/src/context.rs:84
   7: <near_chain::error::Error as core::convert::From<near_chain::error::ErrorKind>>::from
             at chain/chain/src/error.rs:290
   8: <T as core::convert::Into<U>>::into
             at /Users/bowenwang/.rustup/toolchains/nightly-2020-05-15-x86_64-apple-darwin/lib/rustlib/src/rust/src/libcore/convert/mod.rs:559
   9: near_chain::validate::validate_chunk_with_chunk_extra
             at chain/chain/src/validate.rs:105
  10: near_chain::chain::ChainUpdate::apply_chunks
             at chain/chain/src/chain.rs:2672
  11: near_chain::chain::ChainUpdate::process_block
             at /Users/bowenwang/NEAR/nearcore/chain/chain/src/chain.rs:2931
  12: near_chain::chain::Chain::process_block_single
             at /Users/bowenwang/NEAR/nearcore/chain/chain/src/chain.rs:1039
  13: near_chain::chain::Chain::check_blocks_with_missing_chunks
             at /Users/bowenwang/NEAR/nearcore/chain/chain/src/chain.rs:1192
  14: near_client::client::Client::process_blocks_with_missing_chunks
             at chain/client/src/client.rs:943
  15: near_client::client::Client::process_partial_encoded_chunk
             at chain/client/src/client.rs:681
  16: near_client::client::Client::process_partial_encoded_chunk_response
             at chain/client/src/client.rs:666
  17: <near_client::client_actor::ClientActor as actix::handler::Handler<near_network::types::NetworkClientMessages>>::handle
             at chain/client/src/client_actor.rs:470
  18: <actix::address::envelope::SyncEnvelopeProxy<A,M> as actix::address::envelope::EnvelopeProxy>::handle
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/actix-0.9.0/src/address/envelope.rs:112
  19: <actix::address::envelope::Envelope<A> as actix::address::envelope::EnvelopeProxy>::handle
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/actix-0.9.0/src/address/envelope.rs:71
  20: actix::mailbox::Mailbox<A>::poll
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/actix-0.9.0/src/mailbox.rs:101
  21: <actix::contextimpl::ContextFut<A,C> as core::future::future::Future>::poll
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/actix-0.9.0/src/contextimpl.rs:373
  22: <core::pin::Pin<P> as core::future::future::Future>::poll
             at /Users/bowenwang/.rustup/toolchains/nightly-2020-05-15-x86_64-apple-darwin/lib/rustlib/src/rust/src/libcore/future/future.rs:118
  23: tokio::runtime::task::core::Core<T,S>::poll::{{closure}}
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-0.2.18/src/runtime/task/core.rs:163
  24: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-0.2.18/src/loom/std/unsafe_cell.rs:14
  25: tokio::runtime::task::core::Core<T,S>::poll
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-0.2.18/src/runtime/task/core.rs:148
  26: tokio::runtime::task::harness::Harness<T,S>::poll::{{closure}}
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-0.2.18/src/runtime/task/harness.rs:108
  27: core::ops::function::FnOnce::call_once
             at /Users/bowenwang/.rustup/toolchains/nightly-2020-05-15-x86_64-apple-darwin/lib/rustlib/src/rust/src/libcore/ops/function.rs:232
  28: <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at /Users/bowenwang/.rustup/toolchains/nightly-2020-05-15-x86_64-apple-darwin/lib/rustlib/src/rust/src/libstd/panic.rs:318
  29: std::panicking::try::do_call
             at /Users/bowenwang/.rustup/toolchains/nightly-2020-05-15-x86_64-apple-darwin/lib/rustlib/src/rust/src/libstd/panicking.rs:297
  30: __rust_try
  31: std::panicking::try
             at /Users/bowenwang/.rustup/toolchains/nightly-2020-05-15-x86_64-apple-darwin/lib/rustlib/src/rust/src/libstd/panicking.rs:274
  32: std::panic::catch_unwind
             at /Users/bowenwang/.rustup/toolchains/nightly-2020-05-15-x86_64-apple-darwin/lib/rustlib/src/rust/src/libstd/panic.rs:394
  33: tokio::runtime::task::harness::Harness<T,S>::poll
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-0.2.18/src/runtime/task/harness.rs:84
  34: tokio::runtime::task::raw::poll
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-0.2.18/src/runtime/task/raw.rs:104
  35: tokio::runtime::task::raw::RawTask::poll
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-0.2.18/src/runtime/task/raw.rs:66
  36: tokio::runtime::task::Notified<S>::run
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-0.2.18/src/runtime/task/mod.rs:169
  37: tokio::task::local::LocalSet::tick::{{closure}}
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-0.2.18/src/task/local.rs:406
  38: tokio::coop::budget::{{closure}}
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-0.2.18/src/coop.rs:85
  39: std::thread::local::LocalKey<T>::try_with
             at /Users/bowenwang/.rustup/toolchains/nightly-2020-05-15-x86_64-apple-darwin/lib/rustlib/src/rust/src/libstd/thread/local.rs:263
  40: std::thread::local::LocalKey<T>::with
             at /Users/bowenwang/.rustup/toolchains/nightly-2020-05-15-x86_64-apple-darwin/lib/rustlib/src/rust/src/libstd/thread/local.rs:239
  41: tokio::coop::budget
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-0.2.18/src/coop.rs:79
      tokio::task::local::LocalSet::tick
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-0.2.18/src/task/local.rs:406
  42: <tokio::task::local::RunUntil<T> as core::future::future::Future>::poll::{{closure}}
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-0.2.18/src/task/local.rs:527
  43: tokio::macros::scoped_tls::ScopedKey<T>::set
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-0.2.18/src/macros/scoped_tls.rs:64
  44: tokio::task::local::LocalSet::with
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-0.2.18/src/task/local.rs:440
  45: <tokio::task::local::RunUntil<T> as core::future::future::Future>::poll
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-0.2.18/src/task/local.rs:516
  46: tokio::task::local::LocalSet::run_until::{{closure}}
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-0.2.18/src/task/local.rs:390
  47: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
             at /Users/bowenwang/.rustup/toolchains/nightly-2020-05-15-x86_64-apple-darwin/lib/rustlib/src/rust/src/libcore/future/mod.rs:69
  48: tokio::runtime::basic_scheduler::BasicScheduler<P>::block_on::{{closure}}::{{closure}}
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-0.2.18/src/runtime/basic_scheduler.rs:131
  49: tokio::coop::budget::{{closure}}
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-0.2.18/src/coop.rs:97
  50: std::thread::local::LocalKey<T>::try_with
             at /Users/bowenwang/.rustup/toolchains/nightly-2020-05-15-x86_64-apple-darwin/lib/rustlib/src/rust/src/libstd/thread/local.rs:263
  51: std::thread::local::LocalKey<T>::with
             at /Users/bowenwang/.rustup/toolchains/nightly-2020-05-15-x86_64-apple-darwin/lib/rustlib/src/rust/src/libstd/thread/local.rs:239
  52: tokio::coop::budget
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-0.2.18/src/coop.rs:79
      tokio::runtime::basic_scheduler::BasicScheduler<P>::block_on::{{closure}}
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-0.2.18/src/runtime/basic_scheduler.rs:131
  53: tokio::runtime::basic_scheduler::enter::{{closure}}
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-0.2.18/src/runtime/basic_scheduler.rs:213
  54: tokio::macros::scoped_tls::ScopedKey<T>::set
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-0.2.18/src/macros/scoped_tls.rs:64
  55: tokio::runtime::basic_scheduler::enter
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-0.2.18/src/runtime/basic_scheduler.rs:213
  56: tokio::runtime::basic_scheduler::BasicScheduler<P>::block_on
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-0.2.18/src/runtime/basic_scheduler.rs:123
  57: tokio::runtime::Runtime::block_on::{{closure}}
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-0.2.18/src/runtime/mod.rs:418
  58: tokio::runtime::context::enter
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-0.2.18/src/runtime/context.rs:72
  59: tokio::runtime::handle::Handle::enter
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-0.2.18/src/runtime/handle.rs:39
  60: tokio::runtime::Runtime::block_on
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-0.2.18/src/runtime/mod.rs:415
  61: tokio::task::local::LocalSet::block_on
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/tokio-0.2.18/src/task/local.rs:351
  62: actix_rt::runtime::Runtime::block_on
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/actix-rt-1.1.1/src/runtime.rs:89
  63: actix_rt::builder::SystemRunner::run
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/actix-rt-1.1.1/src/builder.rs:164
  64: actix_rt::builder::Builder::run
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/actix-rt-1.1.1/src/builder.rs:70
  65: actix_rt::system::System::run
             at /Users/bowenwang/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/actix-rt-1.1.1/src/system.rs:143
  66: cross_shard_tx::tests::test_cross_shard_tx_common
             at chain/client/tests/cross_shard_tx.rs:376
  67: cross_shard_tx::tests::test_cross_shard_tx_drop_chunks
             at chain/client/tests/cross_shard_tx.rs:502
  68: cross_shard_tx::tests::test_cross_shard_tx_drop_chunks::{{closure}}
             at chain/client/tests/cross_shard_tx.rs:501
  69: core::ops::function::FnOnce::call_once
             at /Users/bowenwang/.rustup/toolchains/nightly-2020-05-15-x86_64-apple-darwin/lib/rustlib/src/rust/src/libcore/ops/function.rs:232
  70: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once
             at /rustc/a74d1862d4d87a56244958416fd05976c58ca1a8/src/liballoc/boxed.rs:1034
      <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at /rustc/a74d1862d4d87a56244958416fd05976c58ca1a8/src/libstd/panic.rs:318
      std::panicking::try::do_call
             at /rustc/a74d1862d4d87a56244958416fd05976c58ca1a8/src/libstd/panicking.rs:297
      std::panicking::try
             at /rustc/a74d1862d4d87a56244958416fd05976c58ca1a8/src/libstd/panicking.rs:274
      std::panic::catch_unwind
             at /rustc/a74d1862d4d87a56244958416fd05976c58ca1a8/src/libstd/panic.rs:394
      test::run_test_in_process
             at src/libtest/lib.rs:541
      test::run_test::run_test_inner::{{closure}}
             at src/libtest/lib.rs:450
  71: std::sys_common::backtrace::__rust_begin_short_backtrace
             at /rustc/a74d1862d4d87a56244958416fd05976c58ca1a8/src/libstd/sys_common/backtrace.rs:130
  72: std::thread::Builder::spawn_unchecked::{{closure}}::{{closure}}
             at /rustc/a74d1862d4d87a56244958416fd05976c58ca1a8/src/libstd/thread/mod.rs:475
      <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at /rustc/a74d1862d4d87a56244958416fd05976c58ca1a8/src/libstd/panic.rs:318
      std::panicking::try::do_call
             at /rustc/a74d1862d4d87a56244958416fd05976c58ca1a8/src/libstd/panicking.rs:297
      std::panicking::try
             at /rustc/a74d1862d4d87a56244958416fd05976c58ca1a8/src/libstd/panicking.rs:274
      std::panic::catch_unwind
             at /rustc/a74d1862d4d87a56244958416fd05976c58ca1a8/src/libstd/panic.rs:394
      std::thread::Builder::spawn_unchecked::{{closure}}
             at /rustc/a74d1862d4d87a56244958416fd05976c58ca1a8/src/libstd/thread/mod.rs:474
      core::ops::function::FnOnce::call_once{{vtable.shim}}
             at /rustc/a74d1862d4d87a56244958416fd05976c58ca1a8/src/libcore/ops/function.rs:232
  73: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once
             at /rustc/a74d1862d4d87a56244958416fd05976c58ca1a8/src/liballoc/boxed.rs:1034
      <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once
             at /rustc/a74d1862d4d87a56244958416fd05976c58ca1a8/src/liballoc/boxed.rs:1034
      std::sys::unix::thread::Thread::new::thread_start
             at src/libstd/sys/unix/thread.rs:87
  74: AssociationsManager::_map
  75: AssociationsManager::_map


Invalid State Root Hash }

The `Invalid State Root Hash` error indicates that the chunk's prev state root does not match the post-state root of the previous chunk.
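
The failing check can be illustrated with a minimal sketch. The types and the function name `validate_state_root` below are hypothetical simplifications, not the actual nearcore definitions; the real `validate_chunk_with_chunk_extra` likely checks more than the state root.

```rust
// Hypothetical, simplified types; the real nearcore structures carry more fields.
type StateRoot = [u8; 32];

/// Result of applying the previous chunk on a shard (only the field needed here).
#[derive(Debug, Clone, PartialEq)]
struct ChunkExtra {
    state_root: StateRoot,
}

/// The part of a chunk header that is relevant to this check.
#[derive(Debug, Clone, PartialEq)]
struct ChunkHeader {
    prev_state_root: StateRoot,
}

#[derive(Debug, PartialEq)]
enum ValidationError {
    InvalidStateRootHash,
}

/// Rejects a chunk whose claimed starting state differs from the state
/// that was produced by applying the previous chunk.
fn validate_state_root(
    prev_chunk_extra: &ChunkExtra,
    chunk_header: &ChunkHeader,
) -> Result<(), ValidationError> {
    if prev_chunk_extra.state_root != chunk_header.prev_state_root {
        return Err(ValidationError::InvalidStateRootHash);
    }
    Ok(())
}
```

In this sketch, a header whose `prev_state_root` equals the stored `state_root` passes, and any other value yields `InvalidStateRootHash`, which corresponds to the error at the end of the log above.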

bowenwang1996 added the C-bug (Category: This is a bug) and P-critical (Priority: critical) labels Jul 29, 2020
bowenwang1996 self-assigned this Jul 29, 2020
bowenwang1996 added a commit that referenced this issue Jul 29, 2020
…3056)

Currently we always use the latest chunk extra when producing chunks. However, this can cause problems because the latest chunk extra is determined by the current head, which might not be the block that the new chunk is being produced on top of. When this happens, invalid chunks are produced, as described in #3055. This PR fixes the problem by always using the chunk extra from the block that the chunk builds on. Fixes #3055.

Test plan
---------
`test_validate_chunk_extra`
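
The difference between the two lookups described in the commit message can be sketched as follows. This is an illustration under stated assumptions, not the code from the PR: `Store`, `chunk_extra_at_head`, `chunk_extra_at`, and `example_store` are hypothetical names, and block hashes are modeled as plain integers.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the real hash and id types.
type BlockHash = u64;
type ShardId = u64;
type StateRoot = [u8; 32];

#[derive(Debug, Clone, PartialEq)]
struct ChunkExtra {
    state_root: StateRoot,
}

/// Toy chain store: chunk extras keyed by (block, shard), plus the current head.
struct Store {
    head: BlockHash,
    chunk_extras: HashMap<(BlockHash, ShardId), ChunkExtra>,
}

impl Store {
    /// Behaviour described as problematic: always read the chunk extra at the head.
    fn chunk_extra_at_head(&self, shard_id: ShardId) -> Option<&ChunkExtra> {
        self.chunk_extras.get(&(self.head, shard_id))
    }

    /// Behaviour described as the fix: read the chunk extra of the block
    /// that the new chunk is built on top of.
    fn chunk_extra_at(&self, prev_block_hash: BlockHash, shard_id: ShardId) -> Option<&ChunkExtra> {
        self.chunk_extras.get(&(prev_block_hash, shard_id))
    }
}

/// Builds a store in which the head (block 2) is not the block being built on (block 1).
fn example_store() -> Store {
    let mut chunk_extras = HashMap::new();
    chunk_extras.insert((1, 0), ChunkExtra { state_root: [1u8; 32] });
    chunk_extras.insert((2, 0), ChunkExtra { state_root: [2u8; 32] });
    Store { head: 2, chunk_extras }
}
```

With `example_store()`, a producer building on block 1 gets block 2's state root from `chunk_extra_at_head(0)`, so the chunk it produces would carry a prev state root that validators tracking block 1 reject. `chunk_extra_at(1, 0)` returns the state root that matches the parent block.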