feat: Add initial version prover_autoscaler #2993

Merged
merged 33 commits into from
Oct 9, 2024

@yorik yorik commented Oct 1, 2024

What ❔

Add zksync_prover_autoscaler, which collects data but, for now, only reports metrics instead of performing any actual scaling.

Why ❔

First step in creating a fast global prover autoscaler.

Checklist

  • PR title corresponds to the body of PR (we generate changelog entries from PRs).
  • Tests for the changes have been added / updated.
  • Documentation comments have been added / updated.
  • Code has been formatted via zk fmt and zk lint.

@yorik yorik requested a review from EmilLuta October 1, 2024 21:58

@Artemka374 Artemka374 left a comment

Mostly code style suggestions. The PR also has a lot of TODOs. While I understand that this is not the final version of the code, I'd suggest removing as many of them as possible at the moment.

prover/crates/bin/prover_autoscaler/src/global/queuer.rs (outdated, resolved)
prover/crates/bin/prover_autoscaler/src/global/scaler.rs (outdated, resolved)
prover/crates/bin/prover_autoscaler/src/global/scaler.rs (outdated, resolved)
prover/crates/bin/prover_autoscaler/src/global/scaler.rs (outdated, resolved)
prover/crates/bin/prover_autoscaler/src/global/scaler.rs (outdated, resolved)
prover/crates/bin/prover_autoscaler/src/global/scaler.rs (outdated, resolved)
prover/crates/bin/prover_autoscaler/src/global/scaler.rs (outdated, resolved)
prover/crates/bin/prover_autoscaler/src/global/scaler.rs (outdated, resolved)
prover/crates/bin/prover_autoscaler/src/global/scaler.rs (outdated, resolved)
prover/crates/bin/prover_autoscaler/src/k8s/scaler.rs (outdated, resolved)

Artemka374 commented Oct 2, 2024

I might've missed some of the issues, but I can sum them up like this for now:

  • Use `if let Some` instead of a direct `if -> else { continue }`
  • Try to avoid `println!` and `dbg!`; use `tracing` instead
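
To illustrate the first point, here is a minimal sketch of the two styles side by side. It is not code from this PR: the function names, the `queues` map, and the `kinds` list are hypothetical and exist only to show the control flow. The second point is not shown, because it depends on the external `tracing` crate.

```rust
use std::collections::HashMap;

// Hypothetical helpers: both sum the queue lengths for the requested job kinds
// and return the same result; only the control flow differs.

// Style the review discourages: check, `else { continue }`, then unwrap.
fn total_queue_verbose(queues: &HashMap<String, u64>, kinds: &[&str]) -> u64 {
    let mut total = 0;
    for kind in kinds {
        let entry = queues.get(*kind);
        if entry.is_some() {
            total += *entry.unwrap();
        } else {
            continue;
        }
    }
    total
}

// Suggested style: `if let Some` binds the value and skips missing keys
// without an explicit `continue` or `unwrap`.
fn total_queue_if_let(queues: &HashMap<String, u64>, kinds: &[&str]) -> u64 {
    let mut total = 0;
    for kind in kinds {
        if let Some(len) = queues.get(*kind) {
            total += *len;
        }
    }
    total
}
```

The `if let` version removes the `unwrap`, so there is no path that can panic if the check and the access ever drift apart.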


@Artemka374 Artemka374 left a comment

Overall it looks good for an initial implementation. There are a few other nits, but I believe things will get polished over time. Please add new dependencies to the workspace deps instead of adding them directly, and I'm happy to approve.

core/lib/config/src/configs/prover_autoscaler.rs (outdated, resolved)
core/lib/config/Cargo.toml (resolved)
prover/crates/bin/prover_autoscaler/src/cluster_types.rs (outdated, resolved)
prover/crates/bin/prover_autoscaler/src/metrics.rs (outdated, resolved)
prover/crates/bin/prover_autoscaler/src/global/scaler.rs (outdated, resolved)

@EmilLuta EmilLuta left a comment

Left a quick review. Nits and general questions don't need to be addressed. I'm concerned that this PR is getting bigger and bigger (and already is rather big).

Artemka374 previously approved these changes Oct 8, 2024

@Artemka374 Artemka374 left a comment

Left one nit. Otherwise, the PR is already big and hard to review, so I'd vote for merging it ASAP.

core/lib/config/Cargo.toml (outdated, resolved)
Artemka374 previously approved these changes Oct 9, 2024
@yorik yorik added this pull request to the merge queue Oct 9, 2024
Merged via the queue into main with commit ebf9604 Oct 9, 2024
41 checks passed
@yorik yorik deleted the ya-zkd-1855-implement-poc-of-quick-prover-autoscaler branch October 9, 2024 11:36
github-merge-queue bot pushed a commit that referenced this pull request Oct 15, 2024
🤖 I have created a release *beep* *boop*
---


##
[24.29.0](core-v24.28.0...core-v24.29.0)
(2024-10-14)


### Features

* Add initial version prover_autoscaler
([#2993](#2993))
([ebf9604](ebf9604))
* add metric to track current cbt ratio
([#3020](#3020))
([3fd2fb1](3fd2fb1))
* **configs:** Add port parameter to ConsensusConfig
([#2986](#2986))
([25112df](25112df))
* **configs:** Add port parameter to ConsensusConfig
([#3051](#3051))
([038c397](038c397))
* **consensus:** smooth transition to p2p syncing (BFT-515)
([#3075](#3075))
([5d339b4](5d339b4))
* **consensus:** Support for syncing blocks before consensus genesis
over p2p network
([#3040](#3040))
([d3edc3d](d3edc3d))
* **en:** periodically fetch bridge addresses
([#2949](#2949))
([e984bfb](e984bfb))
* **eth-sender:** add time_in_mempool_cap config
([#3018](#3018))
([f6d86bd](f6d86bd))
* **eth-watch:** catch another reth error
([#3026](#3026))
([4640c42](4640c42))
* Handle new yul compilation flow
([#3038](#3038))
([4035361](4035361))
* **state-keeper:** pre-insert unsealed L1 batches
([#2846](#2846))
([e5b5a3b](e5b5a3b))
* **vm:** EVM emulator support – base
([#2979](#2979))
([deafa46](deafa46))
* **zk_toolbox:** added support for setting attester committee defined
in a separate file
([#2992](#2992))
([6105514](6105514))
* **zk_toolbox:** Redesign zk_toolbox commands
([#3003](#3003))
([114834f](114834f))
* **zktoolbox:** added checking the contract owner in
set-attester-committee command
([#3061](#3061))
([9b0a606](9b0a606))


### Bug Fixes

* **api:** Accept integer block count in `eth_feeHistory`
([#3077](#3077))
([4d527d4](4d527d4))
* **api:** Adapt `eth_getCode` to EVM emulator
([#3073](#3073))
([15fe5a6](15fe5a6))
* bincode deserialization for VM run data
([#3044](#3044))
([b0ec79f](b0ec79f))
* bincode deserialize for WitnessInputData
([#3055](#3055))
([91d0595](91d0595))
* **external-node:** make fetcher rely on unsealed batches
([#3088](#3088))
([bb5d147](bb5d147))
* **state-keeper:** ensure unsealed batch is present during IO init
([#3071](#3071))
([bdeb411](bdeb411))
* **vm:** Check protocol version for fast VM
([#3080](#3080))
([a089f3f](a089f3f))
* **vm:** Prepare new VM for use in API server and fix divergences
([#2994](#2994))
([741b77e](741b77e))


### Reverts

* **configs:** Add port parameter to ConsensusConfig
([#2986](#2986))
([#3046](#3046))
([abe35bf](abe35bf))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: zksync-era-bot <[email protected]>
github-merge-queue bot pushed a commit that referenced this pull request Oct 31, 2024
🤖 I have created a release *beep* *boop*
---


##
[16.6.0](prover-v16.5.0...prover-v16.6.0)
(2024-10-31)


### Features

* (DB migration) Rename recursion_scheduler_level_vk_hash to
snark_wrapper_vk_hash
([#2809](#2809))
([64f9551](64f9551))
* Add initial version prover_autoscaler
([#2993](#2993))
([ebf9604](ebf9604))
* added seed_peers to consensus global config
([#2920](#2920))
([e9d1d90](e9d1d90))
* attester committees data extractor (BFT-434)
([#2684](#2684))
([92dde03](92dde03))
* Bump crypto and protocol deps
([#2825](#2825))
([a5ffaf1](a5ffaf1))
* **circuit_prover:** Add circuit prover
([#2908](#2908))
([48317e6](48317e6))
* **consensus:** Support for syncing blocks before consensus genesis
over p2p network
([#3040](#3040))
([d3edc3d](d3edc3d))
* **da-clients:** add secrets
([#2954](#2954))
([f4631e4](f4631e4))
* gateway preparation
([#3006](#3006))
([16f2757](16f2757))
* Integrate tracers and implement circuits tracer in vm2
([#2653](#2653))
([87b02e3](87b02e3))
* Move prover data to
/home/popzxc/workspace/current/zksync-era/prover/data
([#2778](#2778))
([62e4d46](62e4d46))
* Prover e2e test
([#2975](#2975))
([0edd796](0edd796))
* **prover:** add CLI option to run prover with max allocation
([#2794](#2794))
([35e4cae](35e4cae))
* **prover:** Add endpoint to PJM to get queue reports
([#2918](#2918))
([2cec83f](2cec83f))
* **prover:** Add error to panic message of prover
([#2807](#2807))
([6e057eb](6e057eb))
* **prover:** Add min_provers and dry_run features. Improve metrics and
test. ([#3129](#3129))
([7c28964](7c28964))
* **prover:** Add scale failure events watching and pods eviction.
([#3175](#3175))
([dd166f8](dd166f8))
* **prover:** Add sending scale requests for Scaler targets
([#3194](#3194))
([767c5bc](767c5bc))
* **prover:** Add support for scaling WGs and compressor
([#3179](#3179))
([c41db9e](c41db9e))
* **prover:** Autoscaler sends scale request to appropriate agents.
([#3150](#3150))
([bfedac0](bfedac0))
* **prover:** Extract keystore into a separate crate
([#2797](#2797))
([e239260](e239260))
* **prover:** Optimize setup keys loading
([#2847](#2847))
([19887ef](19887ef))
* **prover:** Refactor WitnessGenerator
([#2845](#2845))
([934634b](934634b))
* **prover:** Update witness generator to zkevm_test_harness 0.150.6
([#3029](#3029))
([2151c28](2151c28))
* **prover:** Use query macro instead string literals for queries
([#2930](#2930))
([1cf959d](1cf959d))
* **prover:** WG refactoring
[#3](#3)
([#2942](#2942))
([df68762](df68762))
* **prover:** WitnessGenerator refactoring
[#2](#2)
([#2899](#2899))
([36e5340](36e5340))
* Refactor metrics/make API use binaries
([#2735](#2735))
([8ed086a](8ed086a))
* Remove prover db from house keeper
([#2795](#2795))
([85b7346](85b7346))
* **tee:** use hex serialization for RPC responses
([#2887](#2887))
([abe0440](abe0440))
* **utils:** Rework locate_workspace, introduce Workspace type
([#2830](#2830))
([d256092](d256092))
* vm2 tracers can access storage
([#3114](#3114))
([e466b52](e466b52))
* **vm:** Do not panic on VM divergence
([#2705](#2705))
([7aa5721](7aa5721))
* **vm:** EVM emulator support – base
([#2979](#2979))
([deafa46](deafa46))
* **vm:** Extract batch executor to separate crate
([#2702](#2702))
([b82dfa4](b82dfa4))
* **zk_toolbox:** `zk_supervisor prover` subcommand
([#2820](#2820))
([3506731](3506731))
* **zk_toolbox:** Add external_node consensus support
([#2821](#2821))
([4a10d7d](4a10d7d))
* **zk_toolbox:** Add SQL format for zk supervisor
([#2950](#2950))
([540e5d7](540e5d7))
* **zk_toolbox:** deploy legacy bridge
([#2837](#2837))
([93b4e08](93b4e08))
* **zk_toolbox:** Redesign zk_toolbox commands
([#3003](#3003))
([114834f](114834f))
* **zkstack_cli:** Build dependencies at zkstack build time
([#3157](#3157))
([724d9a9](724d9a9))


### Bug Fixes

* allow compilation under current toolchain
([#3176](#3176))
([89eadd3](89eadd3))
* **api:** Return correct flat call tracer
([#2917](#2917))
([218646a](218646a))
* count SECP256 precompile to account validation gas limit as well
([#2859](#2859))
([fee0c2a](fee0c2a))
* Fix Doc lint.
([#3158](#3158))
([c79949b](c79949b))
* ignore unknown fields in rpc json response
([#2962](#2962))
([692ea73](692ea73))
* **prover:** Do not exit on missing watcher data.
([#3119](#3119))
([76ed6d9](76ed6d9))
* **prover:** fix setup_metadata_to_setup_data_key
([#2875](#2875))
([4ae5a93](4ae5a93))
* **prover:** Run for zero queue to allow scaling down to 0
([#3115](#3115))
([bbe1919](bbe1919))
* **tee_verifier:** correctly initialize storage for re-execution
([#3017](#3017))
([9d88373](9d88373))
* **vm:** Prepare new VM for use in API server and fix divergences
([#2994](#2994))
([741b77e](741b77e))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).