Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

randomness #12: disable randomness-breaking governance functions #12420

Merged
merged 8 commits into from
Mar 19, 2024

Conversation

zjma
Copy link
Contributor

@zjma zjma commented Mar 7, 2024

Disable governance functions that breaks randomness

The following governance functions need to be disabled (except some genesis/test usage), because they internally call aptos_framework::reconfiguration::reconfigure() which break randomness for an epoch.

  • aptos_framework::consensus_config::set()
  • aptos_framework::execution_config::set()
  • aptos_framework::version::set()
  • aptos_framework::gas_schedule::set_gas_schedule()
  • std::features::change_feature_flags()
  • aptos_framework::jwks::upsert_oidc_provider()
  • aptos_framework::jwks::remove_oidc_provider()

The new way to update these on-chain configs is to write the change(s) into a buffer, and immediately trigger aptos_governance::reconfigure (which maybe immediate/async, depending on whether randomness is enabled): buffered change will be applied at the end of the reconfig. Typically a governance proposal script will look like:

onchain_config_x::set_for_next_epoch(&framework);
onchain_config_y::set_for_next_epoch(&framework);
// ...
aptos_governance::reconfigure(&framework);

See aptos-move/aptos-release-builder/data/example-release-with-randomness-framework for full details.

Release builder updates

aptos-release-builder is updated accordingly to use the new governance functions when translating release entries.

NOTE that a new entry DefaultGasWithOverrideOld is temporarily added to support the following release proposal sequence:

  1. update txn.max_execution_gas using the old governance function aptos_framework::gas_schedule::update_gas_schedule() (so the next txn can fit).
  2. framework upgrade (old governance functions are now disabled)
  3. Restore to default gas using the new governance function aptos_framework::gas_schedule::update_for_next_epoch()

Update some existing tests accordingly

Added release entries for randomness release

Copy link

trunk-io bot commented Mar 7, 2024

⏱️ 66h 57m total CI duration on this PR
Job Cumulative Duration Recent Runs
forge-framework-upgrade-test / forge 8h 52m 🟥🟥🟥🟥🟥 (+8 more)
rust-unit-tests 6h 57m 🟩🟩🟩 (+14 more)
execution-performance / single-node-performance 6h 18m 🟩🟩🟩🟩🟩 (+13 more)
rust-smoke-tests 5h 54m 🟩🟩🟩🟩 (+12 more)
windows-build 5h 46m 🟩🟩🟩🟩🟩 (+14 more)
rust-move-tests 4h 54m 🟥🟥🟥🟥 (+12 more)
rust-unit-coverage 4h 29m 🟩
rust-move-unit-coverage 4h 16m 🟩🟩🟩🟩 (+12 more)
rust-images / rust-all 3h 44m 🟩🟩🟩🟩 (+12 more)
forge-e2e-test / forge 3h 🟩🟩🟩🟩🟩 (+8 more)
forge-compat-test / forge 2h 44m 🟩🟩🟩🟩🟩 (+8 more)
rust-smoke-coverage 2h 13m 🟩
rust-lints 1h 47m 🟩🟩🟩🟩🟩 (+13 more)
cli-e2e-tests / run-cli-tests 1h 21m 🟩🟩🟩🟩🟩 (+8 more)
run-tests-main-branch 1h 11m 🟩🟩🟩🟩🟥 (+12 more)
check 1h 11m 🟩🟩🟩🟩🟩 (+14 more)
general-lints 38m 🟩🟩🟩🟩🟩 (+13 more)
check-dynamic-deps 36m 🟩🟩🟩🟩🟩 (+14 more)
indexer-grpc-e2e-tests / test-indexer-grpc-docker-compose 30m 🟩🟩🟩🟩🟩 (+8 more)
node-api-compatibility-tests / node-api-compatibility-tests 11m 🟩🟩🟩🟩🟩 (+8 more)
semgrep/ci 8m 🟩🟩🟩🟩🟩 (+14 more)
file_change_determinator 3m 🟩🟩🟩🟩🟩 (+12 more)
execution-performance / file_change_determinator 3m 🟩🟩🟩🟩🟩 (+13 more)
file_change_determinator 3m 🟩🟩🟩🟩🟩 (+14 more)
file_change_determinator 3m 🟩🟩🟩🟩🟩 (+13 more)
permission-check 59s 🟩🟩🟩🟩🟩 (+14 more)
permission-check 53s 🟩🟩🟩🟩🟩 (+14 more)
permission-check 52s 🟩🟩🟩🟩🟩 (+14 more)
determine-docker-build-metadata 52s 🟩🟩🟩🟩🟩 (+12 more)
permission-check 51s 🟩🟩🟩🟩🟩 (+14 more)
permission-check 49s 🟩🟩🟩🟩🟩 (+13 more)
upload-to-codecov 16s 🟩

🚨 2 jobs on the last run were significantly faster/slower than expected

Job Duration vs 7d avg Delta
rust-smoke-coverage 2h 13m 3h 24m -35%
windows-build 13m 20m -35%

settingsfeedbackdocs ⋅ learn more about trunk.io

@zjma zjma added the CICD:run-e2e-tests when this label is present github actions will run all land-blocking e2e tests from the PR label Mar 7, 2024
@zjma zjma marked this pull request as ready for review March 7, 2024 03:01
@zjma zjma requested review from davidiw, movekevin and wrwg as code owners March 7, 2024 03:01

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

Copy link

codecov bot commented Mar 7, 2024

Codecov Report

Attention: Patch coverage is 0% with 43 lines in your changes are missing coverage. Please review.

Project coverage is 64.0%. Comparing base (2141444) to head (3ccbd12).
Report is 16 commits behind head on main.

Files Patch % Lines
types/src/on_chain_config/jwk_consensus_config.rs 0.0% 15 Missing ⚠️
types/src/on_chain_config/randomness_config.rs 0.0% 15 Missing ⚠️
types/src/jwks/mod.rs 0.0% 13 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##             main   #12420    +/-   ##
========================================
  Coverage    63.9%    64.0%            
========================================
  Files         816      817     +1     
  Lines      180664   181104   +440     
========================================
+ Hits       115548   115929   +381     
- Misses      65116    65175    +59     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

proposals:
- name: step_1_increase_max_txn_gas
metadata:
title: "Increase max txn gas temporarily for framework upgrade"
description: "Increase max txn gas temporarily for framework upgrade"
execution_mode: MultiStep
update_sequence:
- DefaultGasWithOverride:
- DefaultGasWithOverrideOld: # This is translated to `gas_schedule::set_gas_schedule(...)`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if randomness is not yet enabled, does the new DefaultGasWithOverride work, or is it required to use DefaultGasWithOverrideOld ?

I understand that DefaultGasWithOverrideOld will not work after randomness is enabled, but I am asking for the opposite.

ReleaseEntry::DefaultGasWithOverrideOld(gas_overrides) => {
let gas_schedule = gas_override_default(gas_overrides)?;
if !fetch_and_equals::<GasScheduleV2>(client, &gas_schedule)? {
result.append(&mut gas::generate_gas_upgrade_proposal(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we actually need this? should we just pass the flag based on whether randomness is enabled on-chain?

though I am fine with explicit way as it is now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we want to check the on-chain framework version then. is framework version a thing?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

argh. I don't think that's possible

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

Copy link
Contributor

✅ Forge suite compat success on aptos-node-v1.9.5 ==> 3ccbd124b4c3af3f4e6cb5f32fc307c59b28c961

Compatibility test results for aptos-node-v1.9.5 ==> 3ccbd124b4c3af3f4e6cb5f32fc307c59b28c961 (PR)
1. Check liveness of validators at old version: aptos-node-v1.9.5
compatibility::simple-validator-upgrade::liveness-check : committed: 6689 txn/s, latency: 4997 ms, (p50: 4800 ms, p90: 8700 ms, p99: 9600 ms), latency samples: 234120
2. Upgrading first Validator to new version: 3ccbd124b4c3af3f4e6cb5f32fc307c59b28c961
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 1718 txn/s, latency: 16601 ms, (p50: 16600 ms, p90: 22000 ms, p99: 22500 ms), latency samples: 89360
3. Upgrading rest of first batch to new version: 3ccbd124b4c3af3f4e6cb5f32fc307c59b28c961
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 240 txn/s, submitted: 677 txn/s, expired: 436 txn/s, latency: 20342 ms, (p50: 10400 ms, p90: 55500 ms, p99: 56900 ms), latency samples: 17334
4. upgrading second batch to new version: 3ccbd124b4c3af3f4e6cb5f32fc307c59b28c961
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 1805 txn/s, latency: 16807 ms, (p50: 18400 ms, p90: 21900 ms, p99: 22800 ms), latency samples: 86660
5. check swarm health
Compatibility test for aptos-node-v1.9.5 ==> 3ccbd124b4c3af3f4e6cb5f32fc307c59b28c961 passed
Test Ok

Copy link
Contributor

✅ Forge suite realistic_env_max_load success on 3ccbd124b4c3af3f4e6cb5f32fc307c59b28c961

two traffics test: inner traffic : committed: 7260 txn/s, latency: 5384 ms, (p50: 5100 ms, p90: 6400 ms, p99: 10200 ms), latency samples: 3143860
two traffics test : committed: 100 txn/s, latency: 1871 ms, (p50: 1800 ms, p90: 2100 ms, p99: 6100 ms), latency samples: 1800
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.255, avg: 0.206", "QsPosToProposal: max: 0.336, avg: 0.306", "ConsensusProposalToOrdered: max: 0.495, avg: 0.446", "ConsensusOrderedToCommit: max: 0.316, avg: 0.302", "ConsensusProposalToCommit: max: 0.758, avg: 0.748"]
Max round gap was 1 [limit 4] at version 1479944. Max no progress secs was 4.181552 [limit 15] at version 1479944.
Test Ok

Copy link
Contributor

✅ Forge suite framework_upgrade success on aptos-node-v1.9.5 ==> 3ccbd124b4c3af3f4e6cb5f32fc307c59b28c961

Compatibility test results for aptos-node-v1.9.5 ==> 3ccbd124b4c3af3f4e6cb5f32fc307c59b28c961 (PR)
Upgrade the nodes to version: 3ccbd124b4c3af3f4e6cb5f32fc307c59b28c961
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 6922 txn/s, latency: 4805 ms, (p50: 4800 ms, p90: 7800 ms, p99: 8300 ms), latency samples: 242300
5. check swarm health
Compatibility test for aptos-node-v1.9.5 ==> 3ccbd124b4c3af3f4e6cb5f32fc307c59b28c961 passed
Test Ok

@zjma zjma merged commit bf9a0d5 into main Mar 19, 2024
46 checks passed
@zjma zjma deleted the zjma/disable-config-functions-by-randomness branch March 19, 2024 04:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CICD:run-e2e-tests when this label is present github actions will run all land-blocking e2e tests from the PR CICD:run-framework-upgrade-test
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants