[comparison-testing-tool] Add more features to the comparison testing tool #11890

rahxephon89 · 2024-02-03T19:17:06Z

Description

This PR:

adds a new flag --target-account to the comparison testing tool so that it can dump only txns with a specific target account; usage: cargo run --begin-version <BEGIN_VERSION> --limit <LIMIT> dump --target-account <account-address> ...
adds a new subcommand, online, that executes transactions without dumping the pre-state data; usage: cargo run --begin-version <BEGIN_VERSION> --limit <LIMIT> online [OPTIONS] <ENDPOINT> [OUTPUT_PATH]
refactors some of the code to make the tool more efficient.

Usage:

Test Plan

Manual test

trunk-io · 2024-02-03T19:17:09Z

⏱️ 24h 1m total CI duration on this PR

Job	Cumulative Duration	Recent Runs
rust-move-unit-coverage	6h 35m	🟩 🟩 🟩 🟩 🟩 (+8 more)
rust-unit-tests	5h 44m	🟩 🟩 🟩 🟩 🟩 (+6 more)
windows-build	4h 10m	🟩 🟩 🟩 🟩 🟩 (+8 more)
rust-move-tests	3h 12m	🟩 🟩 🟩 🟩 🟩 (+7 more)
rust-lints	1h 30m	🟩 🟩 🟩 🟩 🟩 (+6 more)
run-tests-main-branch	56m	🟥 🟥 🟥 🟥 🟥 (+6 more)
check	45m	🟩 🟩 🟩 🟩 🟩 (+6 more)
general-lints	32m	🟩 🟩 🟩 🟩 🟩 (+6 more)
check-dynamic-deps	27m	🟩 🟩 🟩 🟩 🟩 (+7 more)
semgrep/ci	4m	🟩 🟩 🟩 🟩 🟩 (+7 more)
file_change_determinator	2m	🟩 🟩 🟩 🟩 🟩 (+6 more)
file_change_determinator	2m	🟩 🟩 🟩 🟩 🟩 (+6 more)
permission-check	41s	🟩 🟩 🟩 🟩 🟩 (+6 more)
permission-check	32s	🟩 🟩 🟩 🟩 🟩 (+6 more)
permission-check	30s	🟩 🟩 🟩 🟩 🟩 (+6 more)
permission-check	30s	🟩 🟩 🟩 🟩 🟩 (+6 more)

🚨 2 jobs on the last run were significantly faster/slower than expected

Job	Duration	vs 7d avg	Delta
run-tests-main-branch	6m	4m
windows-build	26m	20m


codecov · 2024-02-03T19:53:19Z

Codecov Report

Attention: Patch coverage is 0%, with 24 lines in your changes missing coverage. Please review.

Project coverage is 63.9%. Comparing base (d6c174c) to head (9cb17be).
Report is 1 commit behind head on main.

Files	Patch %	Lines
..._party/move/move-core/types/src/account_address.rs	0.0%	10 Missing ⚠️
types/src/state_store/state_key.rs	0.0%	10 Missing ⚠️
...tos-move/aptos-vm/src/move_vm_ext/warm_vm_cache.rs	0.0%	3 Missing ⚠️
aptos-move/aptos-vm/src/gas.rs	0.0%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff            @@
##             main   #11890     +/-   ##
=========================================
- Coverage    63.9%    63.9%   -0.1%     
=========================================
  Files         807      807             
  Lines      178311   178334     +23     
=========================================
  Hits       114052   114052             
- Misses      64259    64282     +23


runtian-zhou · 2024-02-08T03:48:52Z

Can you add a command example for the target-account and online features to the PR description?

runtian-zhou · 2024-02-08T03:51:25Z

aptos-move/aptos-e2e-comparison-testing/src/main.rs

+        target_account: Option<AccountAddress>,
+    },
+    /// Online execution
+    Online {


Hmm, I'm not quite sure what this command does. Can you add more description of what it would do?

Sorry for the confusion. With this command, the tool does not need to dump the state. Instead, it fetches each txn and uses DebuggerStateView to get the state, except for the code, which is obtained by compiling the source code and loading the result into the code state. The logic is defined in data_state_view.rs in this PR.

I think your explanation can go to the comment instead of /// Online execution

runtian-zhou · 2024-02-08T03:58:10Z

aptos-move/e2e-tests/src/executor.rs

        )
        .unwrap();
-        let mut session = vm.new_session(&resolver, SessionId::void());
-
+        let mut session = vm.new_session_with_flush_flag(resolver, SessionId::void(), true);


Hmm the logic indeed looks pretty weird to me. Ideally I would avoid making changes in the MoveVM but would like to understand the problem here a bit better.

Looking at the code, the MoveVM will be spawned fresh in line 1008. The only thing that was cached previously is the WarmVMCache, which would probably cache the previous version of the Aptos Framework. Is that the code you were having issues with? I.e., the older version of the Aptos Framework is being loaded and causes execution problems?

Yeah... Currently, the tool executes v1 and then v2 for each txn. After execution of v1 code, we need to reload the aptos framework compiled by v2.

My take is we should avoid making changes here in the Move VM. Instead, we should look into how to invalidate this WarmVMCache on a framework upgrade. This is the logic the real VM is doing in the node as well so shouldn't be something new.
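One way to picture the invalidation suggested above is a cache that stores the framework bytes it was built from and rebuilds whenever a different framework is requested. The sketch below is only an illustration under that assumption: `WarmCache`, `CachedVm`, and `get` are hypothetical names, not the real WarmVMCache API.

```rust
/// Hypothetical stand-in for a VM that is expensive to build because it
/// preloads the framework modules. Not the real Aptos type.
#[derive(Debug, Clone, PartialEq)]
pub struct CachedVm {
    /// The framework module bytes this VM was warmed up with.
    pub framework: Vec<Vec<u8>>,
}

/// Hypothetical single-entry cache that is invalidated on a framework change.
#[derive(Default)]
pub struct WarmCache {
    entry: Option<CachedVm>,
    /// Number of times the VM had to be rebuilt (for illustration only).
    pub rebuilds: usize,
}

impl WarmCache {
    /// Returns the cached VM if it was built from the same framework bytes;
    /// otherwise drops the stale entry and builds a fresh one.
    pub fn get(&mut self, framework: &[Vec<u8>]) -> CachedVm {
        if let Some(vm) = &self.entry {
            if vm.framework.as_slice() == framework {
                return vm.clone();
            }
        }
        // Cache miss or framework upgrade: rebuild and replace the entry.
        self.rebuilds += 1;
        let vm = CachedVm {
            framework: framework.to_vec(),
        };
        self.entry = Some(vm.clone());
        vm
    }
}
```

With a cache of this shape, executing a txn against the V1-compiled framework and then the V2-compiled one would trigger a rebuild on the switch without adding a flush flag to the session API.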

georgemitenkov · 2024-02-08T17:05:58Z

aptos-move/aptos-vm/src/move_vm_ext/vm.rs

+        self.new_session_with_flush_flag(resolver, session_id, false)
+    }
+
+    pub fn new_session_with_flush_flag<'r, S: AptosMoveResolver>(


Maybe add a comment on why this flag is useful? Otherwise we have different APIs for sessions which are created differently. Also, what you can do instead is to have new_clean_session, which has the implementation with must_flush = true. This way you can make this function debugger-only (mark it as a cfg feature?)

So having a private new_session_impl and then wrapping new_session and new_clean_session seems cleaner than having one public API call some other public API?
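The shape proposed above (one private implementation, two thin public wrappers) could look roughly like this. The `Vm` and `Session` types here are simplified, hypothetical stand-ins, and the "flush" just clears an in-memory list; the real session construction is more involved.

```rust
use std::cell::RefCell;

/// Hypothetical, simplified VM holding a cache of loaded module names.
pub struct Vm {
    loader_cache: RefCell<Vec<String>>,
}

/// Hypothetical session that records how it was created.
pub struct Session {
    pub flushed: bool,
    pub cached_modules: usize,
}

impl Vm {
    pub fn new() -> Self {
        Vm {
            loader_cache: RefCell::new(Vec::new()),
        }
    }

    /// Simulates a module being loaded and cached by an earlier execution.
    pub fn cache_module(&self, name: &str) {
        self.loader_cache.borrow_mut().push(name.to_string());
    }

    /// Regular session: reuses whatever is already cached.
    pub fn new_session(&self) -> Session {
        self.new_session_impl(false)
    }

    /// Clean session: flushes cached code first. In the real code this
    /// wrapper could be gated behind a cfg feature so only the debugger
    /// and testing tools can call it.
    pub fn new_clean_session(&self) -> Session {
        self.new_session_impl(true)
    }

    /// Single private implementation shared by both public wrappers.
    fn new_session_impl(&self, must_flush: bool) -> Session {
        if must_flush {
            self.loader_cache.borrow_mut().clear();
        }
        Session {
            flushed: must_flush,
            cached_modules: self.loader_cache.borrow().len(),
        }
    }
}
```

The boolean never appears in the public API, so callers pick behavior by name rather than by passing a flag.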

aptos-move/e2e-tests/src/executor.rs

georgemitenkov · 2024-02-08T17:11:49Z

types/src/state_store/state_key.rs

+        match self.inner() {
+            StateKeyInner::AccessPath(access_path) => {
+                !access_path.path.is_empty()
+                    && access_path.path[0] == CODE_TAG


I think you can check whether the access path is code using the is_code function?

georgemitenkov · 2024-02-08T17:12:37Z

types/src/state_store/state_key.rs

@@ -154,6 +154,20 @@ impl StateKey {
    pub fn get_shard_id(&self) -> u8 {
        CryptoHash::hash(self).nibble(0)
    }
+
+    pub fn is_aptos_path(&self) -> bool {


nit: is_aptos_code or similar, so that the function name makes clear that we are checking for modules?
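Taken together, the two suggestions on state_key.rs (reuse an is_code helper and rename the function) might look like the sketch below. Everything here is a simplified stand-in: the `CODE_TAG` value, the 32-byte address layout, and the rule that "aptos" means the reserved addresses 0x1 to 0x4 are assumptions made for illustration, not taken from the PR.

```rust
/// Assumed tag byte marking a module (code) path; the real constant is
/// defined in the Aptos/Move crates and may differ.
pub const CODE_TAG: u8 = 0;

/// Simplified stand-in for an access path: an address plus a tagged path.
pub struct AccessPath {
    pub address: [u8; 32],
    pub path: Vec<u8>,
}

impl AccessPath {
    /// True if the path is non-empty and starts with the code tag.
    pub fn is_code(&self) -> bool {
        self.path.first() == Some(&CODE_TAG)
    }
}

/// Simplified stand-in for the inner state key variants.
pub enum StateKeyInner {
    AccessPath(AccessPath),
    Raw(Vec<u8>),
}

/// Assumption: framework code lives at the small reserved addresses 0x1..=0x4.
fn is_framework_address(address: &[u8; 32]) -> bool {
    address[..31].iter().all(|b| *b == 0) && (1u8..=4).contains(&address[31])
}

/// Returns true only for module (code) keys under a framework address.
pub fn is_aptos_code(key: &StateKeyInner) -> bool {
    match key {
        StateKeyInner::AccessPath(ap) => ap.is_code() && is_framework_address(&ap.address),
        _ => false,
    }
}
```

Using `path.first()` folds the emptiness check and the tag comparison into one expression, which is the main readability gain over indexing `path[0]` by hand.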

georgemitenkov · 2024-02-08T17:16:01Z

third_party/move/move-core/types/src/account_address.rs

@@ -15,13 +15,17 @@ use std::{convert::TryFrom, fmt, str::FromStr};
 pub struct AccountAddress([u8; AccountAddress::LENGTH]);

 impl AccountAddress {
+    /// Hex address: 0x4
+    pub const FOUR: Self = Self::get_hex_address_four();


Why not order the address constants as 1, 2, 3, 4? 😄

Yeah, they are currently ordered alphabetically, so FOUR comes first. I will change that.

georgemitenkov · 2024-02-08T17:19:27Z

aptos-move/aptos-e2e-comparison-testing/src/data_stateview.rs

+    }
+
+    fn get_usage(&self) -> StateViewResult<StateStorageUsage> {
+        unimplemented!()


nit: unreachable!()

georgemitenkov · 2024-02-08T17:23:34Z

aptos-move/aptos-e2e-comparison-testing/src/data_stateview.rs

+
+pub struct DataStateView {
+    debugger_view: DebuggerStateView,
+    code_data: Option<FakeDataStore>,


Why use FakeDataStore? What does this code data do? I see we pass it into new, but how does it get populated?

Sorry for the confusion. DataStateView is currently used in two places: 1) in the dump mode, I use data_read_state_keys to store the data state values, which are then dumped to the file; 2) in the online mode, I use code_data to store the bytecode compiled by V1 and V2; whenever a txn is obtained, I use debugger_view + code_data to execute the txn. code_data is consulted in the get_state_value function whenever it is not None.
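The online-mode lookup described above (locally compiled code first, remote state otherwise) can be sketched as a small overlay view. The trait, the string keys, and the `MapView` and `OverlayStateView` types below are hypothetical simplifications; the real DataStateView wraps DebuggerStateView and a FakeDataStore.

```rust
use std::collections::HashMap;

/// Minimal, hypothetical state-view interface keyed by strings.
pub trait StateView {
    fn get_state_value(&self, key: &str) -> Option<Vec<u8>>;
}

/// Stand-in for the remote debugger view: here just an in-memory map.
pub struct MapView(pub HashMap<String, Vec<u8>>);

impl StateView for MapView {
    fn get_state_value(&self, key: &str) -> Option<Vec<u8>> {
        self.0.get(key).cloned()
    }
}

/// Overlay: optional locally compiled code shadows the base view.
pub struct OverlayStateView<V: StateView> {
    base: V,
    code_data: Option<HashMap<String, Vec<u8>>>,
}

impl<V: StateView> OverlayStateView<V> {
    pub fn new(base: V, code_data: Option<HashMap<String, Vec<u8>>>) -> Self {
        OverlayStateView { base, code_data }
    }
}

impl<V: StateView> StateView for OverlayStateView<V> {
    fn get_state_value(&self, key: &str) -> Option<Vec<u8>> {
        // Prefer the compiled bytecode when an overlay is present and has the key.
        if let Some(code) = &self.code_data {
            if let Some(bytes) = code.get(key) {
                return Some(bytes.clone());
            }
        }
        // Everything else (resources, or code without an overlay) comes from the base view.
        self.base.get_state_value(key)
    }
}
```

Passing `None` as the overlay gives plain pass-through behavior, which matches the "as long as it is not None" condition in the explanation.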

aptos-move/aptos-e2e-comparison-testing/src/online_execution.rs

georgemitenkov · 2024-02-08T17:27:51Z

aptos-move/aptos-e2e-comparison-testing/src/execution.rs

    ) -> Option<Result<(WriteSet, Vec<ContractEvent>), VMStatus>> {
        let executor = FakeExecutor::no_genesis();
        let mut executor = executor.set_not_parallel();
-        *executor.data_store_mut() = state.clone();
+        *executor.data_store_mut() = state;


So we overwrite the default data created by FakeExecutor with an empty state?

Yes, it happens in the online execution mode.

georgemitenkov · 2024-02-08T17:30:42Z

aptos-move/aptos-e2e-comparison-testing/src/execution.rs

+                    return Some(executor.try_exec_entry_with_resolver(
+                        senders,
+                        entry_function,
+                        &executor.data_store().clone().as_move_resolver(),


Why? try_exec would use executor.data_store() and modify it, so I guess the main problem here is that you want to run something multiple times from the same state (and "uncommit" any modifications)?

In the current design, each txn has its own state so the corresponding state is only run once without any modifications. In the offline execution mode, the data state is from the file (obtained in the dump mode). In the online execution mode, the data state is from DebuggerStateView.

georgemitenkov · 2024-02-08T17:33:40Z

@rahxephon89 left a few (relevant and not so relevant) comments, but for my understanding, why do we use FakeExecutor here? I feel like the main issue is that we create it and it has a data store associated with it, so it becomes non-reusable if you want to flush changes every time you run an executor?

We have an API inside the VM to run a block of txns on top of a StateView without committing the changes, but producing outputs; wouldn't that be useful here?

rahxephon89 · 2024-02-08T17:58:01Z

> @rahxephon89 left a few (relevant and not so relevant) comments, but for my understanding, why do we use FakeExecutor here? I feel like the main issue is that we create it and it has a data store associated with it, so it becomes non-reusable if you want to flush changes every time you run an executor?
>
> We have an API inside the VM to run a block of txns on top of a StateView without committing the changes, but producing outputs; wouldn't that be useful here?

Thanks for pointing this out, @georgemitenkov! There are some APIs that I don't understand well so I have not used them. We can discuss more offline.

georgemitenkov · 2024-02-11T23:29:24Z

aptos-move/aptos-e2e-comparison-testing/src/execution.rs

    ) -> Option<Result<(WriteSet, Vec<ContractEvent>), VMStatus>> {
        let executor = FakeExecutor::no_genesis();
        let mut executor = executor.set_not_parallel();
-        *executor.data_store_mut() = state.clone();


I don't think we need a FakeExecutor here, why not just do AptosVM::execute_block_no_limit(&vec![txn], &state)? Then you avoid coupling the execution with data state, and can use whatever state you want, either using V1 code or V2 code

Plus our testing is a mess, I would trust execute_block_no_limit API more than exec_... variants

The benefit of this API is that it 1) doesn't commit to state, so you can reuse the same state version for multiple executions, and 2) all things like aggregators are handled

Thanks for the advice, @georgemitenkov. One concern I have is that for early txns, VM_BINARY_FORMAT_V6 is not enabled but the compiler only generates V6 bytecode. Is there a way to solve this issue?
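The property discussed in this thread (execution that produces outputs without committing to state, so one state can be reused for both compiler versions) can be illustrated with a toy model. This is not the AptosVM::execute_block_no_limit API; the `State`, `Txn`, and `compare` names and the two arithmetic "code versions" are made up for the sketch.

```rust
use std::collections::BTreeMap;

pub type State = BTreeMap<String, u64>;
pub type WriteSet = BTreeMap<String, u64>;

/// Hypothetical transaction: combine `amount` into the value stored at `key`.
pub struct Txn {
    pub key: String,
    pub amount: u64,
}

/// Stand-in for code compiled by "V1": wraps on overflow.
pub fn v1_add(current: u64, amount: u64) -> u64 {
    current.wrapping_add(amount)
}

/// Stand-in for code compiled by "V2": saturates on overflow.
pub fn v2_add(current: u64, amount: u64) -> u64 {
    current.saturating_add(amount)
}

/// Executes against a read-only state and returns the writes; nothing is committed.
pub fn execute_no_commit(state: &State, txn: &Txn, code: fn(u64, u64) -> u64) -> WriteSet {
    let current = state.get(&txn.key).copied().unwrap_or(0);
    let mut writes = WriteSet::new();
    writes.insert(txn.key.clone(), code(current, txn.amount));
    writes
}

/// Runs the same txn with both code versions on the same state.
/// Returns None when the outputs agree, or both write sets on a mismatch.
pub fn compare(
    state: &State,
    txn: &Txn,
    v1: fn(u64, u64) -> u64,
    v2: fn(u64, u64) -> u64,
) -> Option<(WriteSet, WriteSet)> {
    let out1 = execute_no_commit(state, txn, v1);
    let out2 = execute_no_commit(state, txn, v2);
    if out1 == out2 {
        None
    } else {
        Some((out1, out2))
    }
}
```

Because the state is only borrowed immutably, no clone or "uncommit" step is needed between the V1 and V2 runs.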

github-actions · 2024-03-13T00:16:04Z

✅ Forge suite `compat` success on `aptos-node-v1.9.5` ==> `9cb17beedfab45a5d1b969f1b53c1fe59e890654`

Compatibility test results for aptos-node-v1.9.5 ==> 9cb17beedfab45a5d1b969f1b53c1fe59e890654 (PR)
1. Check liveness of validators at old version: aptos-node-v1.9.5
compatibility::simple-validator-upgrade::liveness-check : committed: 6972 txn/s, latency: 4779 ms, (p50: 4800 ms, p90: 7700 ms, p99: 8100 ms), latency samples: 244040
2. Upgrading first Validator to new version: 9cb17beedfab45a5d1b969f1b53c1fe59e890654
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 1824 txn/s, latency: 15414 ms, (p50: 17800 ms, p90: 22300 ms, p99: 23100 ms), latency samples: 94880
3. Upgrading rest of first batch to new version: 9cb17beedfab45a5d1b969f1b53c1fe59e890654
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 361 txn/s, submitted: 618 txn/s, expired: 257 txn/s, latency: 33742 ms, (p50: 32200 ms, p90: 50900 ms, p99: 56700 ms), latency samples: 27830
4. upgrading second batch to new version: 9cb17beedfab45a5d1b969f1b53c1fe59e890654
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 2866 txn/s, latency: 10703 ms, (p50: 9900 ms, p90: 17800 ms, p99: 18700 ms), latency samples: 137580
5. check swarm health
Compatibility test for aptos-node-v1.9.5 ==> 9cb17beedfab45a5d1b969f1b53c1fe59e890654 passed
Test Ok

github-actions · 2024-03-13T00:17:36Z

✅ Forge suite `realistic_env_max_load` success on `9cb17beedfab45a5d1b969f1b53c1fe59e890654`

two traffics test: inner traffic : committed: 8127 txn/s, latency: 4839 ms, (p50: 4500 ms, p90: 5700 ms, p99: 12600 ms), latency samples: 3502820
two traffics test : committed: 100 txn/s, latency: 1927 ms, (p50: 1800 ms, p90: 2100 ms, p99: 7000 ms), latency samples: 1740
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.225, avg: 0.204", "QsPosToProposal: max: 0.343, avg: 0.258", "ConsensusProposalToOrdered: max: 0.454, avg: 0.412", "ConsensusOrderedToCommit: max: 0.304, avg: 0.289", "ConsensusProposalToCommit: max: 0.715, avg: 0.701"]
Max round gap was 1 [limit 4] at version 1731523. Max no progress secs was 4.257454 [limit 15] at version 1731523.
Test Ok

rahxephon89 mentioned this pull request Feb 3, 2024

[comparison-testing-tool] Allow specifying target address when dumping the txn data #11762

Closed

rahxephon89 force-pushed the teng/comparison-testing-online-alternative branch 3 times, most recently from dc11dbd to 47e7195 Compare February 4, 2024 04:32

rahxephon89 changed the title ~~[WIP] Add more features to the comparison testing tool~~ Add more features to the comparison testing tool Feb 4, 2024

rahxephon89 changed the title ~~Add more features to the comparison testing tool~~ [comparison-testing-tool] Add more features to the comparison testing tool Feb 4, 2024

rahxephon89 marked this pull request as ready for review February 4, 2024 04:52

rahxephon89 requested review from davidiw, wrwg, zekun000, vgao1996 and georgemitenkov as code owners February 4, 2024 04:52

rahxephon89 requested review from runtian-zhou and msmouse February 4, 2024 04:52

rahxephon89 force-pushed the teng/comparison-testing-online-alternative branch 2 times, most recently from 8d22bcb to 8269ec9 Compare February 8, 2024 00:12

runtian-zhou reviewed Feb 8, 2024

georgemitenkov reviewed Feb 8, 2024

rahxephon89 force-pushed the teng/comparison-testing-online-alternative branch 3 times, most recently from 299e209 to 59e0971 Compare February 9, 2024 02:31

rahxephon89 requested review from georgemitenkov and runtian-zhou February 9, 2024 02:32

georgemitenkov reviewed Feb 11, 2024

rahxephon89 force-pushed the teng/comparison-testing-online-alternative branch from 59e0971 to 473c552 Compare February 14, 2024 04:52

rahxephon89 enabled auto-merge (squash) March 12, 2024 17:00

rahxephon89 disabled auto-merge March 12, 2024 17:03

rahxephon89 force-pushed the teng/comparison-testing-online-alternative branch from 0806120 to 14b9e82 Compare March 12, 2024 19:25

rahxephon89 added 13 commits March 12, 2024 16:16

add features

539fd32

refactor

32ed92a

handling review comments

1adb581

fix

610cab4

always get aptos framework from git

1ce39b6

simplify mismatch info

41690ba

add failed cache

61375b5

remove clean session

533d877

revert clean session

3c5ac79

handle review comments

d64674e

handle comments

98f034d

comment followup

ff47d49

add gas meter to avoid hanging in loop

5e1dd5b

rahxephon89 force-pushed the teng/comparison-testing-online-alternative branch from 14b9e82 to 5803bee Compare March 12, 2024 23:17

update package publish handling

9cb17be

rahxephon89 force-pushed the teng/comparison-testing-online-alternative branch from 5803bee to 9cb17be Compare March 12, 2024 23:18

rahxephon89 enabled auto-merge (squash) March 12, 2024 23:39

rahxephon89 merged commit e318636 into main Mar 13, 2024
82 of 83 checks passed

rahxephon89 deleted the teng/comparison-testing-online-alternative branch March 13, 2024 00:17
