feat(prover): Add ProverJobMonitor #2666

Merged
merged 13 commits into from
Aug 19, 2024

@EmilLuta EmilLuta commented Aug 15, 2024

ProverJobMonitor (PJM) will be house keeper's (HK's) counterpart in the prover subsystem. TL;DR: it's a singleton component that monitors prover subsystem jobs.

The motivation is that prover and core will no longer share any databases. This enables:

  • core deployments without affecting prover
  • removing prover infrastructure (DB) in proverless envs

The release plan is as follows:

  • release a component (PJM) that runs in parallel with HK
  • migrate all jobs/metrics/dashboards to PJM
  • delete their counterparts in HK
  • remove redundant infrastructure

This PR contains:

  • a new component (PJM)
  • fixes for bugs/issues with old metrics (backported to HK)
  • refactoring of metrics (PJM metrics cover the same ground as the HK ones, but they are defined differently, so that fewer metrics cover more cases)
  • various other small nits

P.S. The name is up for discussion; feel free to suggest a better one.
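
To make the "singleton component monitoring jobs" idea concrete, here is a minimal sketch in Rust. It is not the code from this PR: the `Task` trait, the `Job` and `JobStatus` types, the two example tasks, and the metric names are all hypothetical, and the in-memory slice stands in for the prover database. It only illustrates one monitor instance running a list of periodic checks over the job table and collecting one number per check.

```rust
use std::collections::HashMap;

/// Hypothetical job states; the real prover schema is not shown in this PR text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum JobStatus {
    Queued,
    InProgress,
}

/// Hypothetical in-memory stand-in for a row in a prover jobs table.
struct Job {
    status: JobStatus,
    attempts: u32,
    seconds_in_progress: u64,
}

/// One periodic check owned by the monitor (assumed interface).
trait Task {
    fn name(&self) -> &'static str;
    /// Runs the check and returns a single number to report as a metric.
    fn run(&self, jobs: &mut [Job]) -> usize;
}

/// Illustrative task: puts jobs that have been in progress for too long back in the queue.
struct StuckJobRequeuer {
    timeout_secs: u64,
    max_attempts: u32,
}

impl Task for StuckJobRequeuer {
    fn name(&self) -> &'static str {
        "requeue_stuck_jobs"
    }

    fn run(&self, jobs: &mut [Job]) -> usize {
        let mut requeued = 0;
        for job in jobs.iter_mut() {
            let stuck = job.status == JobStatus::InProgress
                && job.seconds_in_progress > self.timeout_secs;
            if stuck && job.attempts < self.max_attempts {
                job.status = JobStatus::Queued;
                job.seconds_in_progress = 0;
                requeued += 1;
            }
        }
        requeued
    }
}

/// Illustrative task: reports how many jobs are waiting to be picked up.
struct QueueReporter;

impl Task for QueueReporter {
    fn name(&self) -> &'static str {
        "queued_jobs"
    }

    fn run(&self, jobs: &mut [Job]) -> usize {
        jobs.iter().filter(|j| j.status == JobStatus::Queued).count()
    }
}

/// The singleton: exactly one instance owns all tasks and runs them in order.
struct Monitor {
    tasks: Vec<Box<dyn Task>>,
}

impl Monitor {
    /// Runs every task once and returns `task name -> reported value`.
    fn tick(&self, jobs: &mut [Job]) -> HashMap<&'static str, usize> {
        let mut report = HashMap::new();
        for task in &self.tasks {
            report.insert(task.name(), task.run(jobs));
        }
        report
    }
}
```

In a real deployment `tick` would be driven by a timer and each task would query the prover database; running a single instance avoids two monitors requeueing the same job at once.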

@EmilLuta EmilLuta mentioned this pull request Aug 15, 2024
@EmilLuta EmilLuta added this pull request to the merge queue Aug 19, 2024
Merged via the queue into main with commit e22cfb6 Aug 19, 2024
54 checks passed
@EmilLuta EmilLuta deleted the evl-prover-job-monitor branch August 19, 2024 12:27
github-merge-queue bot pushed a commit that referenced this pull request Aug 21, 2024
🤖 I have created a release *beep* *boop*
---


## [24.19.0](core-v24.18.0...core-v24.19.0) (2024-08-21)


### Features

* **db:** Allow creating owned Postgres connections
([#2654](#2654))
([47a082b](47a082b))
* **eth-sender:** add option to pause aggregator for gateway migration
([#2644](#2644))
([56d8ee8](56d8ee8))
* **eth-sender:** added chain_id column to eth_txs + support for gateway
in tx_aggregator
([#2685](#2685))
([97aa6fb](97aa6fb))
* **eth-sender:** gateway support for eth tx manager
([#2593](#2593))
([25aff59](25aff59))
* **prover_cli:** Add test for status, l1 and config commands.
([#2263](#2263))
([6a2e3b0](6a2e3b0))
* **prover_cli:** Stuck status
([#2441](#2441))
([232a817](232a817))
* **prover:** Add ProverJobMonitor
([#2666](#2666))
([e22cfb6](e22cfb6))
* **prover:** parallelized memory queues simulation in BWG
([#2652](#2652))
([b4ffcd2](b4ffcd2))
* update base token rate on L1
([#2589](#2589))
([f84aaaf](f84aaaf))
* **zk_toolbox:** Add zk_supervisor run unit tests command
([#2610](#2610))
([fa866cd](fa866cd))
* **zk_toolbox:** Run formatters and linterrs
([#2675](#2675))
([caedd1c](caedd1c))


### Bug Fixes

* **contract-verifier:** Check for 0x in zkvyper output
([#2693](#2693))
([0d77588](0d77588))
* make set token multiplier optional
([#2696](#2696))
([16dff4f](16dff4f))
* **prover:** change bucket for RAM permutation witnesses
([#2672](#2672))
([8b4cbf4](8b4cbf4))
* use lower fair l2 gas price for cbt
([#2690](#2690))
([e1146fc](e1146fc))


### Performance Improvements

* **logs-bloom:** do not run heavy query if migration was completed
([#2680](#2680))
([f9ef00e](f9ef00e))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: zksync-era-bot <[email protected]>
Comment on lines +1 to +9
# Seeds for failure cases proptest has generated in the past. It is
# automatically read and these particular cases re-run before any
# novel cases are generated.
#
# It is recommended to check this file in to source control so that
# everyone who runs the test benefits from these saved cases.
cc ca181a7669a6e07b68bce71c8c723efcb8fd2a4e895fc962ca1d33ce5f8188f7 # shrinks to circuit_id = 1
cc ce71957c410fa7af30e04b3e85423555a8e1bbd26b4682b748fa67162bc5687f # shrinks to circuit_id = 1
cc 6d3b0c60d8a5e7d7dc3bb4a2a21cce97461827583ae01b2414345175a02a1221 # shrinks to key = ProverServiceDataKey { circuit_id: 1, round: BasicCircuits }
Member

Was this an intended change?

Contributor Author

Not at all. Unclear how it happened.

github-merge-queue bot pushed a commit that referenced this pull request Aug 28, 2024
🤖 I have created a release *beep* *boop*
---


## [16.5.0](prover-v16.4.0...prover-v16.5.0) (2024-08-28)


### Features

* **prover_cli:** Add test for status, l1 and config commands.
([#2263](#2263))
([6a2e3b0](6a2e3b0))
* **prover_cli:** Stuck status
([#2441](#2441))
([232a817](232a817))
* **prover:** Add ProverJobMonitor
([#2666](#2666))
([e22cfb6](e22cfb6))
* **prover:** parallelized memory queues simulation in BWG
([#2652](#2652))
([b4ffcd2](b4ffcd2))
* Provide easy prover setup
([#2683](#2683))
([30edda4](30edda4))


### Bug Fixes

* **prover_cli:** Remove congif file check
([#2695](#2695))
([2f456f0](2f456f0))
* **prover_cli:** Update prover cli README
([#2700](#2700))
([5a9bbb3](5a9bbb3))
* **prover:** change bucket for RAM permutation witnesses
([#2672](#2672))
([8b4cbf4](8b4cbf4))
* **prover:** fail when fri prover job is not found
([#2711](#2711))
([8776875](8776875))
* **prover:** Revert use of spawn_blocking in LWG/NWG
([#2682](#2682))
([edfcc7d](edfcc7d))
* **prover:** speed up LWG and NWG
([#2661](#2661))
([6243399](6243399))
* **vm:** Fix used bytecodes divergence
([#2741](#2741))
([923e33e](923e33e))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).
EmilLuta added a commit that referenced this pull request Sep 3, 2024
This PR is a follow-up to
#2666: it removes the
prover side from house keeper.

This PR contains:
- remove all prover jobs from house keeper (now in PJM)
- move core metrics from prover jobs to l1 batch metrics reporter
- remove old configuration

With these changes, core and prover are fully decoupled. This makes it
possible to remove unnecessary databases from all envs that don't run
provers, and it makes core and prover deployments independent of each other.
github-merge-queue bot pushed a commit that referenced this pull request Sep 4, 2024
6 participants