[comparison-testing-tool] Add more features to the comparison testing tool #11890

Merged
merged 14 commits into from
Mar 13, 2024

Conversation

rahxephon89
Contributor

@rahxephon89 rahxephon89 commented Feb 3, 2024

Description

This PR:

  • adds a new flag --target-account to the comparison testing tool so that it can dump txns with a specific target account; usage: cargo run --begin-version <BEGIN_VERSION> --limit <LIMIT> dump --target-account <account-address> ...
  • adds a new subcommand online that executes transactions without dumping the pre-state data; usage: cargo run --begin-version <BEGIN_VERSION> --limit <LIMIT> online [OPTIONS] <ENDPOINT> [OUTPUT_PATH]
  • refactors some of the code to make the tool more efficient.
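
To illustrate the first bullet, here is a minimal sketch of filtering a batch of transactions by an optional target account. The `Address` and `Txn` types and the `filter_by_target` function are hypothetical stand-ins, not the tool's actual types, and the sketch assumes each txn carries exactly one address to compare against.

```rust
/// A 32-byte account address (hypothetical stand-in for `AccountAddress`).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Address(pub [u8; 32]);

/// Hypothetical, simplified transaction: a ledger version plus the single
/// address that the filter compares against.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Txn {
    pub version: u64,
    pub target: Address,
}

/// Keep only the transactions whose target equals `target_account`.
/// Passing `None` (flag not given) keeps every transaction.
pub fn filter_by_target(txns: Vec<Txn>, target_account: Option<Address>) -> Vec<Txn> {
    match target_account {
        None => txns,
        Some(addr) => txns.into_iter().filter(|t| t.target == addr).collect(),
    }
}
```

Modelling the flag as an `Option` mirrors the `target_account: Option<AccountAddress>` field visible in the review snippet further down: leaving the flag out simply disables the filter.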

Usage:

Test Plan

Manual test


trunk-io bot commented Feb 3, 2024

⏱️ 24h 1m total CI duration on this PR
Job                        Cumulative Duration   Recent Runs
rust-move-unit-coverage    6h 35m                🟩🟩🟩🟩🟩 (+8 more)
rust-unit-tests            5h 44m                🟩🟩🟩🟩🟩 (+6 more)
windows-build              4h 10m                🟩🟩🟩🟩🟩 (+8 more)
rust-move-tests            3h 12m                🟩🟩🟩🟩🟩 (+7 more)
rust-lints                 1h 30m                🟩🟩🟩🟩🟩 (+6 more)
run-tests-main-branch      56m                   🟥🟥🟥🟥🟥 (+6 more)
check                      45m                   🟩🟩🟩🟩🟩 (+6 more)
general-lints              32m                   🟩🟩🟩🟩🟩 (+6 more)
check-dynamic-deps         27m                   🟩🟩🟩🟩🟩 (+7 more)
semgrep/ci                 4m                    🟩🟩🟩🟩🟩 (+7 more)
file_change_determinator   2m                    🟩🟩🟩🟩🟩 (+6 more)
file_change_determinator   2m                    🟩🟩🟩🟩🟩 (+6 more)
permission-check           41s                   🟩🟩🟩🟩🟩 (+6 more)
permission-check           32s                   🟩🟩🟩🟩🟩 (+6 more)
permission-check           30s                   🟩🟩🟩🟩🟩 (+6 more)
permission-check           30s                   🟩🟩🟩🟩🟩 (+6 more)

🚨 2 jobs on the last run were significantly faster/slower than expected

Job                     Duration   vs 7d avg   Delta
run-tests-main-branch   6m         4m          +48%
windows-build           26m        20m         +28%



codecov bot commented Feb 3, 2024

Codecov Report

Attention: Patch coverage is 0%, with 24 lines in your changes missing coverage. Please review.

Project coverage is 63.9%. Comparing base (d6c174c) to head (9cb17be).
Report is 1 commit behind head on main.

Files                                                   Patch %   Lines
..._party/move/move-core/types/src/account_address.rs   0.0%      10 Missing ⚠️
types/src/state_store/state_key.rs                      0.0%      10 Missing ⚠️
...tos-move/aptos-vm/src/move_vm_ext/warm_vm_cache.rs   0.0%      3 Missing ⚠️
aptos-move/aptos-vm/src/gas.rs                          0.0%      1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff            @@
##             main   #11890     +/-   ##
=========================================
- Coverage    63.9%    63.9%   -0.1%     
=========================================
  Files         807      807             
  Lines      178311   178334     +23     
=========================================
  Hits       114052   114052             
- Misses      64259    64282     +23     


@rahxephon89 rahxephon89 force-pushed the teng/comparison-testing-online-alternative branch 3 times, most recently from dc11dbd to 47e7195 Compare February 4, 2024 04:32
@rahxephon89 rahxephon89 changed the title [WIP] Add more features to the comparison testing tool Add more features to the comparison testing tool Feb 4, 2024
@rahxephon89 rahxephon89 changed the title Add more features to the comparison testing tool [comparison-testing-tool] Add more features to the comparison testing tool Feb 4, 2024
@rahxephon89 rahxephon89 marked this pull request as ready for review February 4, 2024 04:52
@rahxephon89 rahxephon89 force-pushed the teng/comparison-testing-online-alternative branch 2 times, most recently from 8d22bcb to 8269ec9 Compare February 8, 2024 00:12
@runtian-zhou
Contributor

runtian-zhou commented Feb 8, 2024

Can you add a command example showing how to use the --target-account and online features in the PR description?

target_account: Option<AccountAddress>,
},
/// Online execution
Online {
Contributor

Hmm I wasn't quite sure what this command means. Can you add more description on what this would do?

Contributor Author

@rahxephon89 rahxephon89 Feb 8, 2024

Sorry for the confusion. With this command, the tool does not need to dump the state. Instead, it fetches each txn and uses DebuggerStateView to get the state, except for the code, which is obtained by compiling the source code and loading it into the code state. The logic is defined in data_state_view.rs in this PR.

Contributor

I think your explanation can go into the doc comment instead of /// Online execution

Contributor Author

done
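
As a rough illustration of what the suggestion in this thread could look like, the sketch below moves the explanation into the doc comment of the variant. The `Cmd` enum name, the `Online` fields (guessed from the usage line `online [OPTIONS] <ENDPOINT> [OUTPUT_PATH]`), the `is_online` helper, and the stand-in `AccountAddress` are all assumptions; the real `Dump` variant has more fields than shown.

```rust
use std::path::PathBuf;

/// Hypothetical stand-in for the real `AccountAddress` type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct AccountAddress(pub [u8; 32]);

/// Hypothetical command enum; only the parts discussed in the thread are shown.
#[derive(Debug, PartialEq, Eq)]
pub enum Cmd {
    /// Dump transactions together with the state they need, so that they
    /// can be executed later.
    Dump {
        /// When set, dump txns with this target account.
        target_account: Option<AccountAddress>,
    },
    /// Execute transactions without dumping the pre-state first.
    ///
    /// For each txn, state is read through a remote state view, except for
    /// code, which comes from compiling the source and loading the result
    /// into the code state.
    Online {
        /// Endpoint to fetch txns and state from (assumed field).
        endpoint: String,
        /// Optional location for results (assumed field).
        output_path: Option<PathBuf>,
    },
}

impl Cmd {
    /// True for the mode that reads state remotely instead of from a dump.
    pub fn is_online(&self) -> bool {
        matches!(self, Cmd::Online { .. })
    }
}
```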

)
.unwrap();
let mut session = vm.new_session(&resolver, SessionId::void());

let mut session = vm.new_session_with_flush_flag(resolver, SessionId::void(), true);
Contributor

Hmm the logic indeed looks pretty weird to me. Ideally I would avoid making changes in the MoveVM but would like to understand the problem here a bit better.

Looking at the code, the MoveVM will be spawned fresh in line 1008. The only thing that was cached previously is the WarmVMCache, which would probably cache the previous version of the Aptos Framework. Is that the code you were having issues with? I.e., is the older version of the Aptos Framework being loaded and causing execution problems?

Contributor Author

Yeah... Currently, the tool executes v1 and then v2 for each txn. After execution of v1 code, we need to reload the aptos framework compiled by v2.

Contributor

My take is that we should avoid making changes here in the Move VM. Instead, we should look into how to invalidate this WarmVMCache on a framework upgrade. This is what the real VM does in the node as well, so it shouldn't be something new.
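
The invalidation idea can be sketched with a toy single-entry cache that rebuilds its entry whenever the framework bytes it was built from change. `WarmVm`, `WarmCache`, and the byte-comparison key are hypothetical simplifications; this is not how the real WarmVMCache is implemented.

```rust
/// Hypothetical stand-in for an expensive-to-build VM that has the
/// framework preloaded.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct WarmVm {
    /// The framework bytes this VM was warmed up with.
    pub framework: Vec<u8>,
}

/// Toy single-entry cache keyed by the framework bytes.
#[derive(Default)]
pub struct WarmCache {
    entry: Option<WarmVm>,
    /// Number of times a VM had to be built from scratch.
    pub builds: usize,
}

impl WarmCache {
    /// Return the cached VM if it was built from the same framework bytes;
    /// otherwise drop the stale entry and build a fresh one.
    pub fn get(&mut self, framework: &[u8]) -> WarmVm {
        if let Some(vm) = &self.entry {
            if vm.framework == framework {
                return vm.clone();
            }
        }
        self.builds += 1;
        let vm = WarmVm { framework: framework.to_vec() };
        self.entry = Some(vm.clone());
        vm
    }
}
```

With a cache shaped like this, running the v1-compiled framework and then the v2-compiled one would trigger a rebuild on the second call without any change to session creation.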

self.new_session_with_flush_flag(resolver, session_id, false)
}

pub fn new_session_with_flush_flag<'r, S: AptosMoveResolver>(
Contributor

Maybe add a comment on why this flag is useful? Otherwise we have different APIs for sessions which are created differently. Alternatively, you could have a new_clean_session that wraps the implementation with must_flush = true. That way you can make this function debugger-only (mark it with a cfg feature?).

Contributor

So having a private new_session_impl and wrapping it with new_session and new_clean_session seems cleaner than having one public API call another public API?

Contributor Author

done
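
The shape proposed in this thread (one private implementation, two thin public wrappers) might look roughly like the sketch below. The `Vm` and `Session` types and the cache field are hypothetical, and the real methods take a resolver and a session id that are omitted here.

```rust
/// Hypothetical VM that keeps a cache of loaded modules between sessions.
pub struct Vm {
    loader_cache: Vec<String>,
}

/// Hypothetical session handle; it only records how many cached modules
/// were visible when it was created.
pub struct Session {
    pub cached_modules: usize,
}

impl Vm {
    pub fn new(cached: Vec<String>) -> Self {
        Vm { loader_cache: cached }
    }

    /// Regular entry point: reuses whatever is already cached.
    pub fn new_session(&mut self) -> Session {
        self.new_session_impl(false)
    }

    /// Debugger-oriented entry point: flushes the cache first so that code
    /// is reloaded. In a real crate this could be gated behind a cfg
    /// feature, as suggested in the review.
    pub fn new_clean_session(&mut self) -> Session {
        self.new_session_impl(true)
    }

    /// Private implementation shared by both public wrappers.
    fn new_session_impl(&mut self, must_flush: bool) -> Session {
        if must_flush {
            self.loader_cache.clear();
        }
        Session { cached_modules: self.loader_cache.len() }
    }
}
```

Keeping the boolean private means callers choose between two named behaviours instead of passing a flag whose meaning has to be looked up.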

aptos-move/e2e-tests/src/executor.rs (outdated, resolved)
match self.inner() {
StateKeyInner::AccessPath(access_path) => {
!access_path.path.is_empty()
&& access_path.path[0] == CODE_TAG
Contributor

I think you can check whether the access path is code using the is_code function?

Contributor Author

done
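
A simplified model of the suggested refactor: the tag check lives in one `is_code` helper on the access path, and the state key delegates to it. The value of `CODE_TAG`, the `Raw` variant, and the exact semantics of the real `is_code` are assumptions made for this sketch.

```rust
/// Hypothetical tag byte marking a code path; the real value of CODE_TAG
/// is not shown in this thread.
pub const CODE_TAG: u8 = 0;

/// Simplified access path: an address plus a tagged byte path.
pub struct AccessPath {
    pub address: [u8; 32],
    pub path: Vec<u8>,
}

impl AccessPath {
    /// True when the path is non-empty and starts with the code tag.
    pub fn is_code(&self) -> bool {
        self.path.first() == Some(&CODE_TAG)
    }
}

/// Simplified state key; `Raw` is a placeholder for the other variants.
pub enum StateKeyInner {
    AccessPath(AccessPath),
    Raw(Vec<u8>),
}

impl StateKeyInner {
    /// Delegates to `AccessPath::is_code` instead of repeating the
    /// emptiness and tag checks inline.
    pub fn is_code(&self) -> bool {
        match self {
            StateKeyInner::AccessPath(access_path) => access_path.is_code(),
            _ => false,
        }
    }
}
```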

@@ -154,6 +154,20 @@ impl StateKey {
pub fn get_shard_id(&self) -> u8 {
CryptoHash::hash(self).nibble(0)
}

pub fn is_aptos_path(&self) -> bool {
Contributor

nit: is_aptos_code or similar? That way the function name makes it clear that we are processing modules.

Contributor Author

done

@@ -15,13 +15,17 @@ use std::{convert::TryFrom, fmt, str::FromStr};
pub struct AccountAddress([u8; AccountAddress::LENGTH]);

impl AccountAddress {
/// Hex address: 0x4
pub const FOUR: Self = Self::get_hex_address_four();
Contributor

Why not order the address constants as 1, 2, 3, 4? 😄

Contributor Author

Yeah, they are currently ordered alphabetically, so FOUR comes first. I will change that.

Contributor Author

done
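
For reference, numerically ordered address constants could be laid out as in this sketch. It assumes a 32-byte address length and uses a single hypothetical `from_last_byte` const helper in place of per-constant functions such as `get_hex_address_four`.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct AccountAddress([u8; AccountAddress::LENGTH]);

impl AccountAddress {
    /// Address length in bytes (assumed to be 32 for this sketch).
    pub const LENGTH: usize = 32;

    /// Hex address: 0x1
    pub const ONE: Self = Self::from_last_byte(1);
    /// Hex address: 0x2
    pub const TWO: Self = Self::from_last_byte(2);
    /// Hex address: 0x3
    pub const THREE: Self = Self::from_last_byte(3);
    /// Hex address: 0x4
    pub const FOUR: Self = Self::from_last_byte(4);

    /// Hypothetical helper: all-zero address with only the last byte set.
    const fn from_last_byte(byte: u8) -> Self {
        let mut addr = [0u8; Self::LENGTH];
        addr[Self::LENGTH - 1] = byte;
        Self(addr)
    }

    pub fn as_bytes(&self) -> &[u8; Self::LENGTH] {
        &self.0
    }
}
```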

}

fn get_usage(&self) -> StateViewResult<StateStorageUsage> {
unimplemented!()
Contributor

nit: unreachable!()

Contributor Author

done


pub struct DataStateView {
debugger_view: DebuggerStateView,
code_data: Option<FakeDataStore>,
Contributor

Why use FakeDataStore? What does this code data do? I see we pass it into new, but how does it get populated?

Contributor Author

Sorry for the confusion. DataStateView is currently used in two places: 1) in dump mode, I use data_read_state_keys to store the data state values, which are then dumped to the file; 2) in online mode, I use code_data to store the bytecode compiled by V1 and V2, and whenever a txn is obtained, I use debugger_view + code_data to execute it. code_data is used in the get_state_value function whenever it is not None.
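
The online-mode lookup described here can be sketched as an overlay: consult the locally compiled code first, then fall back to the remote view. `RemoteView`, the string keys, and the exact fall-through rule are assumptions for illustration; the real `DataStateView` wraps `DebuggerStateView` and a `FakeDataStore`.

```rust
use std::collections::HashMap;

pub type Key = String;
pub type Value = Vec<u8>;

/// Hypothetical stand-in for the remote, read-only state view.
pub struct RemoteView {
    pub data: HashMap<Key, Value>,
}

impl RemoteView {
    pub fn get(&self, key: &str) -> Option<Value> {
        self.data.get(key).cloned()
    }
}

/// Simplified overlay view: optional locally compiled code on top of the
/// remote state.
pub struct DataStateView {
    debugger_view: RemoteView,
    code_data: Option<HashMap<Key, Value>>,
}

impl DataStateView {
    pub fn new(debugger_view: RemoteView, code_data: Option<HashMap<Key, Value>>) -> Self {
        DataStateView { debugger_view, code_data }
    }

    /// Look in `code_data` first when it is present; anything not found
    /// there is read from the remote view (assumed fall-through rule).
    pub fn get_state_value(&self, key: &str) -> Option<Value> {
        if let Some(code) = &self.code_data {
            if let Some(value) = code.get(key) {
                return Some(value.clone());
            }
        }
        self.debugger_view.get(key)
    }
}
```

Swapping the `code_data` map between the V1-compiled and V2-compiled bytecode while keeping the same remote view is what lets the same txn run against both compilers' output.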

) -> Option<Result<(WriteSet, Vec<ContractEvent>), VMStatus>> {
let executor = FakeExecutor::no_genesis();
let mut executor = executor.set_not_parallel();
*executor.data_store_mut() = state.clone();
*executor.data_store_mut() = state;
Contributor

So we overwrite default data created by FakeExecutor with empty state?

Contributor Author

yes, it happens in the online execution mode

return Some(executor.try_exec_entry_with_resolver(
senders,
entry_function,
&executor.data_store().clone().as_move_resolver(),
Contributor

Why? try_exec would use executor.data_store() and modify it, so I guess the main problem here is that you want to run something multiple times from the same state (and "uncommit" any modifications)?

Contributor Author

In the current design, each txn has its own state, so the corresponding state is used only once, without any modifications. In offline execution mode, the data state comes from the file (obtained in dump mode). In online execution mode, the data state comes from DebuggerStateView.

@georgemitenkov
Contributor

georgemitenkov commented Feb 8, 2024

@rahxephon89 I left a few (relevant and not so relevant) comments, but for my understanding, why do we use FakeExecutor here? I feel like the main issue is that we create it with an associated data store, so it becomes non-reusable if you want to flush changes every time you run an executor?

We have an API inside the VM to run a block of txns on top of a StateView without committing the changes while still producing outputs. Wouldn't that be useful here?

@rahxephon89
Contributor Author

Thanks for pointing this out, @georgemitenkov! There are some APIs that I don't understand well so I have not used them. We can discuss more offline.

@rahxephon89 rahxephon89 force-pushed the teng/comparison-testing-online-alternative branch 3 times, most recently from 299e209 to 59e0971 Compare February 9, 2024 02:31
) -> Option<Result<(WriteSet, Vec<ContractEvent>), VMStatus>> {
let executor = FakeExecutor::no_genesis();
let mut executor = executor.set_not_parallel();
*executor.data_store_mut() = state.clone();
Contributor

@georgemitenkov georgemitenkov Feb 11, 2024

I don't think we need a FakeExecutor here, why not just do AptosVM::execute_block_no_limit(&vec![txn], &state)? Then you avoid coupling the execution with data state, and can use whatever state you want, either using V1 code or V2 code

Contributor

Plus our testing is a mess, I would trust execute_block_no_limit API more than exec_... variants

Contributor

The benefit of this API is that it 1) doesn't commit to state, so you can reuse the same state version for multiple executions, and 2) all things like aggregators are handled
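
The first benefit can be shown with a toy model: execution borrows the state immutably and returns a write set, so the same state can back several runs. Everything here (`State`, `WriteSet`, `Txn`, `Code`, `execute_no_commit`, `outputs_match`) is hypothetical and far simpler than `AptosVM::execute_block_no_limit`.

```rust
use std::collections::BTreeMap;

pub type State = BTreeMap<String, u64>;
pub type WriteSet = BTreeMap<String, u64>;

/// Toy transaction: combine `amount` with the value stored under `key`.
pub struct Txn {
    pub key: String,
    pub amount: u64,
}

/// Toy stand-in for "code compiled by one compiler version": a function
/// from (old value, amount) to the new value.
pub type Code = fn(u64, u64) -> u64;

/// Execute against a read-only state and return the writes instead of
/// applying them, so the state is left untouched.
pub fn execute_no_commit(txn: &Txn, state: &State, code: Code) -> WriteSet {
    let old = state.get(&txn.key).copied().unwrap_or(0);
    let mut writes = WriteSet::new();
    writes.insert(txn.key.clone(), code(old, txn.amount));
    writes
}

/// Run the same txn on the same state with two code versions and compare
/// the resulting write sets.
pub fn outputs_match(txn: &Txn, state: &State, v1: Code, v2: Code) -> bool {
    execute_no_commit(txn, state, v1) == execute_no_commit(txn, state, v2)
}
```

Because nothing is committed, no clone of the state is needed between the V1 run and the V2 run.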

Contributor Author

Thanks for the advice, @georgemitenkov. One concern I have is that for early txns, VM_BINARY_FORMAT_V6 is not enabled but the compiler only generates V6 bytecode. Is there a way to solve this issue?

@rahxephon89 rahxephon89 force-pushed the teng/comparison-testing-online-alternative branch from 59e0971 to 473c552 Compare February 14, 2024 04:52
@rahxephon89 rahxephon89 enabled auto-merge (squash) March 12, 2024 17:00
@rahxephon89 rahxephon89 disabled auto-merge March 12, 2024 17:03

@rahxephon89 rahxephon89 force-pushed the teng/comparison-testing-online-alternative branch from 0806120 to 14b9e82 Compare March 12, 2024 19:25
@rahxephon89 rahxephon89 force-pushed the teng/comparison-testing-online-alternative branch from 14b9e82 to 5803bee Compare March 12, 2024 23:17
@rahxephon89 rahxephon89 force-pushed the teng/comparison-testing-online-alternative branch from 5803bee to 9cb17be Compare March 12, 2024 23:18
@rahxephon89 rahxephon89 enabled auto-merge (squash) March 12, 2024 23:39

Contributor

✅ Forge suite compat success on aptos-node-v1.9.5 ==> 9cb17beedfab45a5d1b969f1b53c1fe59e890654

Compatibility test results for aptos-node-v1.9.5 ==> 9cb17beedfab45a5d1b969f1b53c1fe59e890654 (PR)
1. Check liveness of validators at old version: aptos-node-v1.9.5
compatibility::simple-validator-upgrade::liveness-check : committed: 6972 txn/s, latency: 4779 ms, (p50: 4800 ms, p90: 7700 ms, p99: 8100 ms), latency samples: 244040
2. Upgrading first Validator to new version: 9cb17beedfab45a5d1b969f1b53c1fe59e890654
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 1824 txn/s, latency: 15414 ms, (p50: 17800 ms, p90: 22300 ms, p99: 23100 ms), latency samples: 94880
3. Upgrading rest of first batch to new version: 9cb17beedfab45a5d1b969f1b53c1fe59e890654
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 361 txn/s, submitted: 618 txn/s, expired: 257 txn/s, latency: 33742 ms, (p50: 32200 ms, p90: 50900 ms, p99: 56700 ms), latency samples: 27830
4. upgrading second batch to new version: 9cb17beedfab45a5d1b969f1b53c1fe59e890654
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 2866 txn/s, latency: 10703 ms, (p50: 9900 ms, p90: 17800 ms, p99: 18700 ms), latency samples: 137580
5. check swarm health
Compatibility test for aptos-node-v1.9.5 ==> 9cb17beedfab45a5d1b969f1b53c1fe59e890654 passed
Test Ok

Contributor

✅ Forge suite realistic_env_max_load success on 9cb17beedfab45a5d1b969f1b53c1fe59e890654

two traffics test: inner traffic : committed: 8127 txn/s, latency: 4839 ms, (p50: 4500 ms, p90: 5700 ms, p99: 12600 ms), latency samples: 3502820
two traffics test : committed: 100 txn/s, latency: 1927 ms, (p50: 1800 ms, p90: 2100 ms, p99: 7000 ms), latency samples: 1740
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.225, avg: 0.204", "QsPosToProposal: max: 0.343, avg: 0.258", "ConsensusProposalToOrdered: max: 0.454, avg: 0.412", "ConsensusOrderedToCommit: max: 0.304, avg: 0.289", "ConsensusProposalToCommit: max: 0.715, avg: 0.701"]
Max round gap was 1 [limit 4] at version 1731523. Max no progress secs was 4.257454 [limit 15] at version 1731523.
Test Ok

@rahxephon89 rahxephon89 merged commit e318636 into main Mar 13, 2024
82 of 83 checks passed
@rahxephon89 rahxephon89 deleted the teng/comparison-testing-online-alternative branch March 13, 2024 00:17
4 participants