Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pass instant from aptosdb for calculating latency metric #15678

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

areshand
Copy link
Contributor

@areshand areshand commented Jan 6, 2025

Description

How Has This Been Tested?

Key Areas to Review

Type of Change

  • New feature
  • Bug fix
  • Breaking change
  • Performance improvement
  • Refactoring
  • Dependency update
  • Documentation update
  • Tests

Which Components or Systems Does This Change Impact?

  • Validator Node
  • Full Node (API, Indexer, etc.)
  • Move/Aptos Virtual Machine
  • Aptos Framework
  • Aptos CLI/SDK
  • Developer Infrastructure
  • Move Compiler
  • Other (specify)

Checklist

  • I have read and followed the CONTRIBUTING doc
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I identified and added all stakeholders and component owners affected by this change as reviewers
  • I tested both happy and unhappy path of the functionality
  • I have made corresponding changes to the documentation

Copy link

trunk-io bot commented Jan 6, 2025

⏱️ 2h 10m total CI duration on this PR
Job Cumulative Duration Recent Runs
rust-targeted-unit-tests 48m 🟥
rust-move-tests 14m 🟩
rust-move-tests 14m 🟩
rust-move-tests 13m 🟩
rust-cargo-deny 10m 🟩🟩🟩🟩🟩 (+1 more)
rust-move-tests 9m
check-dynamic-deps 8m 🟩🟩🟩🟩🟩 (+1 more)
rust-move-tests 8m
general-lints 3m 🟩🟩🟩🟩🟩 (+1 more)
semgrep/ci 2m 🟩🟩🟩🟩🟩 (+1 more)
file_change_determinator 1m 🟩🟩🟩🟩🟩 (+1 more)
permission-check 17s 🟩🟩🟩🟩🟩 (+1 more)
permission-check 16s 🟩🟩🟩🟩🟩 (+1 more)

🚨 1 job on the last run was significantly faster/slower than expected

Job Duration vs 7d avg Delta
check-dynamic-deps 2m 1m +107%

settingsfeedbackdocs ⋅ learn more about trunk.io

@areshand areshand force-pushed the pass_instant_latency branch 3 times, most recently from 0988f9c to 5f9b440 Compare January 7, 2025 05:12
@@ -166,31 +166,32 @@ impl InternalIndexerDBService {

pub async fn run(&mut self, node_config: &NodeConfig) -> Result<()> {
let mut start_version = self.get_start_version(node_config).await?;
let mut target_version = self.db_indexer.main_db_reader.ensure_synced_version()?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will error out with an empty db (before genesis is put in), is that intentional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, internal indexer is supposed to start after main db bootstrapped.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay..

log_grpc_step(
SERVICE_TYPE,
IndexerGrpcStep::InternalIndexerDBProcessed,
Some(start_version as i64),
Some(next_version as i64),
None,
None,
Some(start_time.elapsed().as_secs_f64()),
Some(step_timer.elapsed().as_secs_f64()),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same issue with yesterday, that before it's caught up you already logged the latency, which is not accurate?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Each loop now is blocked on notification of write to main db. Previously, each loop is only a batch of all updates. so this should reflect the latency.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see.

@areshand areshand force-pushed the pass_instant_latency branch 2 times, most recently from 1db3ecd to cc5d099 Compare January 10, 2025 23:59
@areshand areshand requested a review from msmouse January 13, 2025 18:52
@areshand areshand force-pushed the pass_instant_latency branch 2 times, most recently from 16ad465 to 072b5ff Compare January 13, 2025 22:43
log_grpc_step(
SERVICE_TYPE,
IndexerGrpcStep::InternalIndexerDBProcessed,
Some(start_version as i64),
Some(next_version as i64),
None,
None,
Some(start_time.elapsed().as_secs_f64()),
Some(step_timer.elapsed().as_secs_f64()),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see.

@@ -166,31 +166,32 @@ impl InternalIndexerDBService {

pub async fn run(&mut self, node_config: &NodeConfig) -> Result<()> {
let mut start_version = self.get_start_version(node_config).await?;
let mut target_version = self.db_indexer.main_db_reader.ensure_synced_version()?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay..

Comment on lines +178 to +180
Err(e) => {
panic!("Failed to get update from update_receiver: {}", e);
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is how this thread knows the main db has quit, right? In that case we should return Ok instead of suicide here?

And I realized even in the previous logic, this thread doesn't have a chance to quit until hitting the target version? It can be an issue when the first time the indexer is enabled? (granted we usually quit the maindb by killing the whole process, it'd be better if we deal with this case gracefully, if not too complicated.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. the DBindexer should have Arc of main_db. I expect the err should never be caused by main_db quit. (The main db quits after db indexer being dropped). Thus, I think it should panic if this unexpected error occurs.
  2. if the process is killed while the indexer is in the middle of processing a large amount of data. the thread is terminated the same way as other components reading main_db eg: API is reading transactions using storage interface and the node is killed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ugh, in that case the indexer loop never quits? That should be an issue in tests?

Shall we implement graceful quitting (separately)? @grao1991 ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't disagree.

Copy link
Contributor Author

@areshand areshand Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is a method called "run_with_end_version" for testing

@areshand
Copy link
Contributor Author

@grao1991 please take a look as well ?

@areshand areshand enabled auto-merge (rebase) January 16, 2025 18:44

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

@areshand areshand force-pushed the pass_instant_latency branch from 4f6258c to 17b78fb Compare January 21, 2025 16:47

This comment has been minimized.

This comment has been minimized.

Copy link
Contributor

❌ Forge suite compat failure on 17540fad8e88ab5681f3a91190b9f5d37e53d2ef ==> 17b78fb4e49b2434187d200ec040c52cd4bfb1e2

Compatibility test results for 17540fad8e88ab5681f3a91190b9f5d37e53d2ef ==> 17b78fb4e49b2434187d200ec040c52cd4bfb1e2 (PR)
1. Check liveness of validators at old version: 17540fad8e88ab5681f3a91190b9f5d37e53d2ef
compatibility::simple-validator-upgrade::liveness-check : committed: 11289.05 txn/s, latency: 2802.16 ms, (p50: 2300 ms, p70: 3000, p90: 3900 ms, p99: 15700 ms), latency samples: 369500
2. Upgrading first Validator to new version: 17b78fb4e49b2434187d200ec040c52cd4bfb1e2
Test Failed: Tried executing 10 txns, request counters: "success 0, failed submit [0, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10], failed wait [10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10], by client: [(0, 170, 180): http://aptos-node-1-validator.forge-compat-pr-15678.svc:8080/v1/]"

Caused by:
    Unknown error Ledger on endpoint (http://aptos-node-1-validator.forge-compat-pr-15678.svc:8080/v1/) is more than 60s behind current time, timing out waiting for the transaction. Warning, transaction (aa717054) might still succeed.

Stack backtrace:
   0: anyhow::error::<impl core::convert::From<E> for anyhow::Error>::from
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/anyhow-1.0.89/src/error.rs:564:25
   1: <core::result::Result<T,F> as core::ops::try_trait::FromResidual<core::result::Result<core::convert::Infallible,E>>>::from_residual
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/result.rs:1964:27
   2: aptos_transaction_emitter_lib::emitter::transaction_executor::RestApiReliableTransactionSubmitter::submit_check_and_retry::{{closure}}
             at ./crates/transaction-emitter-lib/src/emitter/transaction_executor.rs:141:28
   3: <futures_util::future::maybe_done::MaybeDone<Fut> as core::future::future::Future>::poll
             at /usr/local/cargo/git/checkouts/futures-rs-2e928889f587a511/1845d7d/futures-util/src/future/maybe_done.rs:95:38
   4: <futures_util::future::join_all::JoinAll<F> as core::future::future::Future>::poll
             at /usr/local/cargo/git/checkouts/futures-rs-2e928889f587a511/1845d7d/futures-util/src/future/join_all.rs:143:24
   5: <aptos_transaction_emitter_lib::emitter::transaction_executor::RestApiReliableTransactionSubmitter as aptos_transaction_generator_lib::ReliableTransactionSubmitter>::execute_transactions_with_counter::{{closure}}
             at ./crates/transaction-emitter-lib/src/emitter/transaction_executor.rs:343:10
   6: <core::pin::Pin<P> as core::future::future::Future>::poll
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/future/future.rs:123:9
   7: aptos_transaction_emitter_lib::emitter::account_minter::AccountMinter::create_and_fund_seed_accounts::{{closure}}
             at ./crates/transaction-emitter-lib/src/emitter/account_minter.rs:396:18
   8: aptos_transaction_emitter_lib::emitter::account_minter::AccountMinter::create_and_fund_accounts::{{closure}}
             at ./crates/transaction-emitter-lib/src/emitter/account_minter.rs:298:10
   9: aptos_transaction_emitter_lib::emitter::account_minter::bulk_create_accounts::{{closure}}
             at ./crates/transaction-emitter-lib/src/emitter/account_minter.rs:686:14
  10: aptos_transaction_emitter_lib::emitter::TxnEmitter::start_job::{{closure}}
             at ./crates/transaction-emitter-lib/src/emitter/mod.rs:811:10
  11: aptos_transaction_emitter_lib::emitter::TxnEmitter::emit_txn_for_impl::{{closure}}
             at ./crates/transaction-emitter-lib/src/emitter/mod.rs:925:14
  12: aptos_transaction_emitter_lib::emitter::TxnEmitter::emit_txn_for::{{closure}}
             at ./crates/transaction-emitter-lib/src/emitter/mod.rs:957:14
  13: aptos_testcases::compatibility_test::stat_gather_task::{{closure}}
             at ./testsuite/testcases/src/compatibility_test.rs:53:14
  14: aptos_testcases::compatibility_test::upgrade_and_gather_stats::{{closure}}::{{closure}}
             at ./testsuite/testcases/src/compatibility_test.rs:112:14
  15: tokio::runtime::task::core::Core<T,S>::poll::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/core.rs:331:17
  16: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/loom/std/unsafe_cell.rs:16:9
  17: tokio::runtime::task::core::Core<T,S>::poll
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/core.rs:320:30
  18: tokio::runtime::task::harness::poll_future::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/harness.rs:500:19
  19: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/panic/unwind_safe.rs:272:9
  20: std::panicking::try::do_call
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:552:40
  21: std::panicking::try
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:516:19
  22: std::panic::catch_unwind
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panic.rs:146:14
  23: tokio::runtime::task::harness::poll_future
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/harness.rs:488:18
  24: tokio::runtime::task::harness::Harness<T,S>::poll_inner
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/harness.rs:209:27
  25: tokio::runtime::task::harness::Harness<T,S>::poll
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/harness.rs:154:15
  26: tokio::runtime::task::raw::RawTask::poll
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/raw.rs:201:18
  27: tokio::runtime::task::LocalNotified<S>::run
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/mod.rs:436:9
  28: tokio::runtime::scheduler::multi_thread::worker::Context::run_task::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/scheduler/multi_thread/worker.rs:598:18
  29: tokio::runtime::coop::with_budget
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/coop.rs:107:5
  30: tokio::runtime::coop::budget
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/coop.rs:73:5
  31: tokio::runtime::scheduler::multi_thread::worker::Context::run_task
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/scheduler/multi_thread/worker.rs:597:9
  32: tokio::runtime::scheduler::multi_thread::worker::Context::run
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/scheduler/multi_thread/worker.rs:548:24
  33: tokio::runtime::scheduler::multi_thread::worker::run::{{closure}}::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/scheduler/multi_thread/worker.rs:513:21
  34: tokio::runtime::context::scoped::Scoped<T>::set
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/context/scoped.rs:40:9
  35: tokio::runtime::context::set_scheduler::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/context.rs:180:26
  36: std::thread::local::LocalKey<T>::try_with
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/thread/local.rs:284:16
  37: std::thread::local::LocalKey<T>::with
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/thread/local.rs:260:9
  38: tokio::runtime::context::set_scheduler
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/context.rs:180:17
  39: tokio::runtime::scheduler::multi_thread::worker::run::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/scheduler/multi_thread/worker.rs:508:9
  40: tokio::runtime::context::runtime::enter_runtime
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/context/runtime.rs:65:16
  41: tokio::runtime::scheduler::multi_thread::worker::run
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/scheduler/multi_thread/worker.rs:500:5
  42: tokio::runtime::scheduler::multi_thread::worker::Launch::launch::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/scheduler/multi_thread/worker.rs:466:45
  43: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/blocking/task.rs:42:21
  44: tokio::runtime::task::core::Core<T,S>::poll::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/core.rs:331:17
  45: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/loom/std/unsafe_cell.rs:16:9
  46: tokio::runtime::task::core::Core<T,S>::poll
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/core.rs:320:30
  47: tokio::runtime::task::harness::poll_future::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/harness.rs:500:19
  48: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/panic/unwind_safe.rs:272:9
  49: std::panicking::try::do_call
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:552:40
  50: std::panicking::try
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:516:19
  51: std::panic::catch_unwind
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panic.rs:146:14
  52: tokio::runtime::task::harness::poll_future
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/harness.rs:488:18
  53: tokio::runtime::task::harness::Harness<T,S>::poll_inner
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/harness.rs:209:27
  54: tokio::runtime::task::harness::Harness<T,S>::poll
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/harness.rs:154:15
  55: tokio::runtime::task::raw::poll
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/raw.rs:271:5
  56: tokio::runtime::task::raw::RawTask::poll
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/raw.rs:201:18
  57: tokio::runtime::task::UnownedTask<S>::run
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/mod.rs:473:9
  58: tokio::runtime::blocking::pool::Task::run
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/blocking/pool.rs:160:9
  59: tokio::runtime::blocking::pool::Inner::run
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/blocking/pool.rs:518:17
  60: tokio::runtime::blocking::pool::Spawner::spawn_thread::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/blocking/pool.rs:476:13
  61: std::sys_common::backtrace::__rust_begin_short_backtrace
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/sys_common/backtrace.rs:155:18
  62: std::thread::Builder::spawn_unchecked_::{{closure}}::{{closure}}
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/thread/mod.rs:528:17
  63: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/panic/unwind_safe.rs:272:9
  64: std::panicking::try::do_call
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:552:40
  65: std::panicking::try
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:516:19
  66: std::panic::catch_unwind
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panic.rs:146:14
  67: std::thread::Builder::spawn_unchecked_::{{closure}}
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/thread/mod.rs:527:30
  68: core::ops::function::FnOnce::call_once{{vtable.shim}}
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/ops/function.rs:250:5
  69: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/alloc/src/boxed.rs:2020:9
  70: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/alloc/src/boxed.rs:2020:9
  71: std::sys::pal::unix::thread::Thread::new::thread_start
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/sys/pal/unix/thread.rs:108:17
  72: start_thread
  73: clone
Trailing Log Lines:
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/ops/function.rs:250:5
  69: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/alloc/src/boxed.rs:2020:9
  70: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/alloc/src/boxed.rs:2020:9
  71: std::sys::pal::unix::thread::Thread::new::thread_start
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/sys/pal/unix/thread.rs:108:17
  72: start_thread
  73: clone

=== BEGIN JUNIT ===
<?xml version="1.0" encoding="UTF-8"?>
<testsuites name="forge" tests="1" failures="1" errors="0" uuid="493dd068-78a7-460f-a79c-04835362d736">
    <testsuite name="compat" tests="1" disabled="0" errors="0" failures="1">
        <testcase name="compatibility::simple-validator-upgrade">
            <failure message="Tried executing 10 txns, request counters: &quot;success 0, failed submit [0, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10], failed wait [10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10], by client: [(0, 170, 180): http://aptos-node-1-validator.forge-compat-pr-15678.svc:8080/v1/]&quot;

Caused by:
    Unknown error Ledger on endpoint (http://aptos-node-1-validator.forge-compat-pr-15678.svc:8080/v1/) is more than 60s behind current time, timing out waiting for the transaction. Warning, transaction (aa717054) might still succeed.

Stack backtrace:
   0: anyhow::error::&lt;impl core::convert::From&lt;E&gt; for anyhow::Error&gt;::from
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/anyhow-1.0.89/src/error.rs:564:25
   1: &lt;core::result::Result&lt;T,F&gt; as core::ops::try_trait::FromResidual&lt;core::result::Result&lt;core::convert::Infallible,E&gt;&gt;&gt;::from_residual
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/result.rs:1964:27
   2: aptos_transaction_emitter_lib::emitter::transaction_executor::RestApiReliableTransactionSubmitter::submit_check_and_retry::{{closure}}
             at ./crates/transaction-emitter-lib/src/emitter/transaction_executor.rs:141:28
   3: &lt;futures_util::future::maybe_done::MaybeDone&lt;Fut&gt; as core::future::future::Future&gt;::poll
             at /usr/local/cargo/git/checkouts/futures-rs-2e928889f587a511/1845d7d/futures-util/src/future/maybe_done.rs:95:38
   4: &lt;futures_util::future::join_all::JoinAll&lt;F&gt; as core::future::future::Future&gt;::poll
             at /usr/local/cargo/git/checkouts/futures-rs-2e928889f587a511/1845d7d/futures-util/src/future/join_all.rs:143:24
   5: &lt;aptos_transaction_emitter_lib::emitter::transaction_executor::RestApiReliableTransactionSubmitter as aptos_transaction_generator_lib::ReliableTransactionSubmitter&gt;::execute_transactions_with_counter::{{closure}}
             at ./crates/transaction-emitter-lib/src/emitter/transaction_executor.rs:343:10
   6: &lt;core::pin::Pin&lt;P&gt; as core::future::future::Future&gt;::poll
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/future/future.rs:123:9
   7: aptos_transaction_emitter_lib::emitter::account_minter::AccountMinter::create_and_fund_seed_accounts::{{closure}}
             at ./crates/transaction-emitter-lib/src/emitter/account_minter.rs:396:18
   8: aptos_transaction_emitter_lib::emitter::account_minter::AccountMinter::create_and_fund_accounts::{{closure}}
             at ./crates/transaction-emitter-lib/src/emitter/account_minter.rs:298:10
   9: aptos_transaction_emitter_lib::emitter::account_minter::bulk_create_accounts::{{closure}}
             at ./crates/transaction-emitter-lib/src/emitter/account_minter.rs:686:14
  10: aptos_transaction_emitter_lib::emitter::TxnEmitter::start_job::{{closure}}
             at ./crates/transaction-emitter-lib/src/emitter/mod.rs:811:10
  11: aptos_transaction_emitter_lib::emitter::TxnEmitter::emit_txn_for_impl::{{closure}}
             at ./crates/transaction-emitter-lib/src/emitter/mod.rs:925:14
  12: aptos_transaction_emitter_lib::emitter::TxnEmitter::emit_txn_for::{{closure}}
             at ./crates/transaction-emitter-lib/src/emitter/mod.rs:957:14
  13: aptos_testcases::compatibility_test::stat_gather_task::{{closure}}
             at ./testsuite/testcases/src/compatibility_test.rs:53:14
  14: aptos_testcases::compatibility_test::upgrade_and_gather_stats::{{closure}}::{{closure}}
             at ./testsuite/testcases/src/compatibility_test.rs:112:14
  15: tokio::runtime::task::core::Core&lt;T,S&gt;::poll::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/core.rs:331:17
  16: tokio::loom::std::unsafe_cell::UnsafeCell&lt;T&gt;::with_mut
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/loom/std/unsafe_cell.rs:16:9
  17: tokio::runtime::task::core::Core&lt;T,S&gt;::poll
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/core.rs:320:30
  18: tokio::runtime::task::harness::poll_future::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/harness.rs:500:19
  19: &lt;core::panic::unwind_safe::AssertUnwindSafe&lt;F&gt; as core::ops::function::FnOnce&lt;()&gt;&gt;::call_once
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/panic/unwind_safe.rs:272:9
  20: std::panicking::try::do_call
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:552:40
  21: std::panicking::try
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:516:19
  22: std::panic::catch_unwind
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panic.rs:146:14
  23: tokio::runtime::task::harness::poll_future
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/harness.rs:488:18
  24: tokio::runtime::task::harness::Harness&lt;T,S&gt;::poll_inner
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/harness.rs:209:27
  25: tokio::runtime::task::harness::Harness&lt;T,S&gt;::poll
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/harness.rs:154:15
  26: tokio::runtime::task::raw::RawTask::poll
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/raw.rs:201:18
  27: tokio::runtime::task::LocalNotified&lt;S&gt;::run
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/mod.rs:436:9
  28: tokio::runtime::scheduler::multi_thread::worker::Context::run_task::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/scheduler/multi_thread/worker.rs:598:18
  29: tokio::runtime::coop::with_budget
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/coop.rs:107:5
  30: tokio::runtime::coop::budget
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/coop.rs:73:5
  31: tokio::runtime::scheduler::multi_thread::worker::Context::run_task
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/scheduler/multi_thread/worker.rs:597:9
  32: tokio::runtime::scheduler::multi_thread::worker::Context::run
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/scheduler/multi_thread/worker.rs:548:24
  33: tokio::runtime::scheduler::multi_thread::worker::run::{{closure}}::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/scheduler/multi_thread/worker.rs:513:21
  34: tokio::runtime::context::scoped::Scoped&lt;T&gt;::set
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/context/scoped.rs:40:9
  35: tokio::runtime::context::set_scheduler::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/context.rs:180:26
  36: std::thread::local::LocalKey&lt;T&gt;::try_with
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/thread/local.rs:284:16
  37: std::thread::local::LocalKey&lt;T&gt;::with
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/thread/local.rs:260:9
  38: tokio::runtime::context::set_scheduler
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/context.rs:180:17
  39: tokio::runtime::scheduler::multi_thread::worker::run::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/scheduler/multi_thread/worker.rs:508:9
  40: tokio::runtime::context::runtime::enter_runtime
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/context/runtime.rs:65:16
  41: tokio::runtime::scheduler::multi_thread::worker::run
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/scheduler/multi_thread/worker.rs:500:5
  42: tokio::runtime::scheduler::multi_thread::worker::Launch::launch::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/scheduler/multi_thread/worker.rs:466:45
  43: &lt;tokio::runtime::blocking::task::BlockingTask&lt;T&gt; as core::future::future::Future&gt;::poll
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/blocking/task.rs:42:21
  44: tokio::runtime::task::core::Core&lt;T,S&gt;::poll::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/core.rs:331:17
  45: tokio::loom::std::unsafe_cell::UnsafeCell&lt;T&gt;::with_mut
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/loom/std/unsafe_cell.rs:16:9
  46: tokio::runtime::task::core::Core&lt;T,S&gt;::poll
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/core.rs:320:30
  47: tokio::runtime::task::harness::poll_future::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/harness.rs:500:19
  48: &lt;core::panic::unwind_safe::AssertUnwindSafe&lt;F&gt; as core::ops::function::FnOnce&lt;()&gt;&gt;::call_once
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/panic/unwind_safe.rs:272:9
  49: std::panicking::try::do_call
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:552:40
  50: std::panicking::try
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:516:19
  51: std::panic::catch_unwind
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panic.rs:146:14
  52: tokio::runtime::task::harness::poll_future
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/harness.rs:488:18
  53: tokio::runtime::task::harness::Harness&lt;T,S&gt;::poll_inner
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/harness.rs:209:27
  54: tokio::runtime::task::harness::Harness&lt;T,S&gt;::poll
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/harness.rs:154:15
  55: tokio::runtime::task::raw::poll
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/raw.rs:271:5
  56: tokio::runtime::task::raw::RawTask::poll
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/raw.rs:201:18
  57: tokio::runtime::task::UnownedTask&lt;S&gt;::run
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/mod.rs:473:9
  58: tokio::runtime::blocking::pool::Task::run
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/blocking/pool.rs:160:9
  59: tokio::runtime::blocking::pool::Inner::run
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/blocking/pool.rs:518:17
  60: tokio::runtime::blocking::pool::Spawner::spawn_thread::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/blocking/pool.rs:476:13
  61: std::sys_common::backtrace::__rust_begin_short_backtrace
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/sys_common/backtrace.rs:155:18
  62: std::thread::Builder::spawn_unchecked_::{{closure}}::{{closure}}
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/thread/mod.rs:528:17
  63: &lt;core::panic::unwind_safe::AssertUnwindSafe&lt;F&gt; as core::ops::function::FnOnce&lt;()&gt;&gt;::call_once
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/panic/unwind_safe.rs:272:9
  64: std::panicking::try::do_call
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:552:40
  65: std::panicking::try
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:516:19
  66: std::panic::catch_unwind
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panic.rs:146:14
  67: std::thread::Builder::spawn_unchecked_::{{closure}}
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/thread/mod.rs:527:30
  68: core::ops::function::FnOnce::call_once{{vtable.shim}}
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/ops/function.rs:250:5
  69: &lt;alloc::boxed::Box&lt;F,A&gt; as core::ops::function::FnOnce&lt;Args&gt;&gt;::call_once
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/alloc/src/boxed.rs:2020:9
  70: &lt;alloc::boxed::Box&lt;F,A&gt; as core::ops::function::FnOnce&lt;Args&gt;&gt;::call_once
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/alloc/src/boxed.rs:2020:9
  71: std::sys::pal::unix::thread::Thread::new::thread_start
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/sys/pal/unix/thread.rs:108:17
  72: start_thread
  73: clone"/>
        </testcase>
    </testsuite>
</testsuites>
=== END JUNIT ===

Swarm logs can be found here: See fgi output for more information.
{"level":"INFO","source":{"package":"aptos_forge","file":"testsuite/forge/src/backend/k8s/cluster_helper.rs:381"},"thread_name":"main","hostname":"forge-compat-pr-15678-1737479148-17540fad8e88ab5681f3a91190b9f5","timestamp":"2025-01-21T17:13:29.126546Z","message":"Deleting namespace forge-compat-pr-15678: Some(NamespaceStatus { conditions: None, phase: Some(\"Terminating\") })"}
{"level":"INFO","source":{"package":"aptos_forge","file":"testsuite/forge/src/backend/k8s/cluster_helper.rs:398"},"thread_name":"main","hostname":"forge-compat-pr-15678-1737479148-17540fad8e88ab5681f3a91190b9f5","timestamp":"2025-01-21T17:13:29.126573Z","message":"aptos-node resources for Forge removed in namespace: forge-compat-pr-15678"}

failures:
    compatibility::simple-validator-upgrade

test result: FAILED. 0 passed; 1 failed; 0 filtered out

Failed to run tests:
Tests Failed
Error: Tests Failed

Stack backtrace:
   0: anyhow::error::<impl anyhow::Error>::msg
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/anyhow-1.0.89/src/error.rs:85:36
   1: aptos_forge::runner::Forge<F>::run
             at ./testsuite/forge/src/runner.rs:358:13
   2: forge::run_forge_with_changelog
             at ./testsuite/forge-cli/src/main.rs:428:24
   3: forge::main
             at ./testsuite/forge-cli/src/main.rs:331:21
   4: core::ops::function::FnOnce::call_once
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/ops/function.rs:250:5
   5: std::sys_common::backtrace::__rust_begin_short_backtrace
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/sys_common/backtrace.rs:155:18
   6: std::rt::lang_start::{{closure}}
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/rt.rs:166:18
   7: core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/ops/function.rs:284:13
   8: std::panicking::try::do_call
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:552:40
   9: std::panicking::try
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:516:19
  10: std::panic::catch_unwind
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panic.rs:146:14
  11: std::rt::lang_start_internal::{{closure}}
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/rt.rs:148:48
  12: std::panicking::try::do_call
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:552:40
  13: std::panicking::try
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:516:19
  14: std::panic::catch_unwind
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panic.rs:146:14
  15: std::rt::lang_start_internal
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/rt.rs:148:20
  16: main
  17: __libc_start_main
  18: _start
Debugging output:
NAME                                    READY   STATUS      RESTARTS   AGE
aptos-node-0-validator-0                1/1     Running     0          7m23s
aptos-node-1-validator-0                1/1     Running     0          4m46s
aptos-node-2-validator-0                1/1     Running     0          7m23s
aptos-node-3-validator-0                1/1     Running     0          7m23s
forge-testnet-deployer-knlv9            0/1     Completed   0          7m30s
genesis-aptos-genesis-eforge217-l8rwm   0/1     Completed   0          7m23s

Copy link
Contributor

❌ Forge suite realistic_env_max_load failure on 17b78fb4e49b2434187d200ec040c52cd4bfb1e2

Forge test runner terminated:
Trailing Log Lines:
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/runtime.rs:335:13
  16: aptos_forge::runner::Forge<F>::run
             at ./testsuite/forge/src/runner.rs:281:29
  17: forge::run_forge_with_changelog
             at ./testsuite/forge-cli/src/main.rs:428:24
  18: forge::main
             at ./testsuite/forge-cli/src/main.rs:331:21
  19: core::ops::function::FnOnce::call_once
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
Debugging output:
Failed to get debugging output: error: You must be logged in to the server (Unauthorized)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants