Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: improved base node monitoring #5390

Merged

Conversation

stringhandler
Copy link
Collaborator

@stringhandler stringhandler commented May 17, 2023

Add the ability to run tokio_console on the console_wallet.

I found one case where the UI would use a lot of CPU, but I couldn't find the cause.

Reduced the UI caching time to be more responsive.

Renamed base_node_monitor_refresh_interval to base_node_monitor_max_refresh_interval and implemented a backoff strategy.
If a new block is found when querying the base node, immediately query it again, backing off each time it has the same block. This allows you to update quickly if the node is syncing. Otherwise, backoff to the max interval. The default is 90 seconds.
(was previously 3 seconds)

Changed output and transaction manager service to rely on the base node service to send them an event when a new block is encountered. Both were doing this check themselves in addition to the check the base node service was doing.

The base node service also only checked on the height, now the hash is also checked.

@ghpbot-tari-project ghpbot-tari-project added the CR-insufficient_context Your PRs commit messages don't provide enough context to justify accepting the change. label May 17, 2023
@stringhandler stringhandler marked this pull request as draft May 17, 2023 16:06
@github-actions
Copy link

github-actions bot commented May 17, 2023

Test Results (CI)

1 147 tests  ±0   1 147 ✔️ ±0   9m 58s ⏱️ - 5m 31s
     37 suites ±0          0 💤 ±0 
       1 files   ±0          0 ±0 

Results for commit dd3ee20. ± Comparison against base commit 8631bc2.

♻️ This comment has been updated with latest results.

@ghpbot-tari-project ghpbot-tari-project added P-acks_required Process - Requires more ACKs or utACKs P-reviews_required Process - Requires a review from a lead maintainer to be merged labels May 17, 2023
@github-actions
Copy link

github-actions bot commented May 17, 2023

Test Results (Integration tests)

  2 files  ±0  12 suites  ±0   16m 11s ⏱️ - 9m 4s
29 tests ±0  29 ✔️ +1  0 💤 ±0  0  - 1 
29 runs   - 1  29 ✔️ ±0  0 💤 ±0  0  - 1 

Results for commit dd3ee20. ± Comparison against base commit 8631bc2.

♻️ This comment has been updated with latest results.

@stringhandler stringhandler changed the title WIP test: better ffi testing test: better ffi testing May 23, 2023
@stringhandler stringhandler marked this pull request as ready for review May 23, 2023 14:50
Copy link
Collaborator

@SWvheerden SWvheerden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we fine with not triggering on node sync and only when it received a block via propagation?
But apart from that it looks good

});
}
self.last_seen_tip_height = state.chain_metadata.map(|cm| cm.height_of_longest_chain());
BaseNodeEvent::BaseNodeStateChanged(_state) => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will stop the wallet from refreshing on node sync done, which might not be ideal

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This state is published when only the latency changes, so it was triggering much more frequently than needed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See related comment ^^. I think we should add a ConnectedBaseNodeChanged event so we can refresh on that specifically.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved with comment below

(None, _) => true,
_ => false,
};
BaseNodeEvent::BaseNodeStateChanged(_state) => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will trigger the first time it gets a block from the node, so I think that makes sense

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See related comment ^^.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved with comment below

Copy link
Contributor

@hansieodendaal hansieodendaal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this is a good change. Please see the comments noted below, and maybe the PR title should be feat: improved base node monitoring.

base_layer/wallet/src/base_node_service/handle.rs Outdated Show resolved Hide resolved
Comment on lines +219 to +220
// returns true if a new block, otherwise false
async fn update_state(&self, new_state: BaseNodeState) -> bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should monitor changes to the base node_id here as well (not all the rest of BaseNodeState) - if it changed then this function should return true as well. That way, if a base node is swapped, we can trigger validate_outputs to run upon that change,

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved with comment below

});
}
self.last_seen_tip_height = state.chain_metadata.map(|cm| cm.height_of_longest_chain());
BaseNodeEvent::BaseNodeStateChanged(_state) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See related comment ^^. I think we should add a ConnectedBaseNodeChanged event so we can refresh on that specifically.

(None, _) => true,
_ => false,
};
BaseNodeEvent::BaseNodeStateChanged(_state) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See related comment ^^.

let mut lock = self.state.write().await;
let (new_block_detected, height) = match (new_state.chain_metadata.clone(), lock.chain_metadata.clone()) {
let (new_block_detected, height, hash) = match (new_state.chain_metadata.clone(), lock.chain_metadata.clone()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let (new_block_detected, height, hash) = match (new_state.chain_metadata.clone(), lock.chain_metadata.clone()) {
let (new_block_detected, connected_base_node_change_detected , height, hash) = match (new_state.chain_metadata.clone(), lock.chain_metadata.clone()) {

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better to open a ticket for this, so that we add the handling of this event in the correct places.

I don't think it's necessary, because I believe the logic of the code is to send a Default state when the base node changes. This will not publish the NewBlockEvent, but when we do get a chain metadata from the new base node, if it has the same block hash, there is no need to update, but if it changes, it will trigger the appropriate jobs

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, evaluating the block hash will sort this out

};

if new_block_detected {
self.publish_event(BaseNodeEvent::NewBlockDetected(height));
self.publish_event(BaseNodeEvent::NewBlockDetected(hash, height));
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if connected_base_node_change_detected {
self.publish_event(BaseNodeEvent::ConnectedBaseNodeChanged)
}

@stringhandler stringhandler changed the title test: better ffi testing feat: improved base node monitoring May 24, 2023
hansieodendaal
hansieodendaal previously approved these changes May 24, 2023
Copy link
Contributor

@hansieodendaal hansieodendaal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK

@ghpbot-tari-project ghpbot-tari-project removed the P-reviews_required Process - Requires a review from a lead maintainer to be merged label May 24, 2023
@ghpbot-tari-project ghpbot-tari-project added the P-reviews_required Process - Requires a review from a lead maintainer to be merged label May 24, 2023
@stringhandler stringhandler merged commit c704890 into tari-project:development May 24, 2023
@stringhandler stringhandler deleted the st-better-ffi-tests branch May 24, 2023 14:03
SWvheerden pushed a commit that referenced this pull request May 25, 2023
Add the ability to run tokio_console on the console_wallet. 

I found one case where the UI would use a lot of CPU, but I couldn't
find the cause.

Reduced the UI caching time to be more responsive.

Renamed base_node_monitor_refresh_interval to
base_node_monitor_max_refresh_interval and implemented a backoff
strategy.
If a new block is found when querying the base node, immediately query
it again, backing off each time it has the same block. This allows you
to update quickly if the node is syncing. Otherwise, backoff to the max
interval. The default is 90 seconds.
(was previously 3 seconds)

Changed output and transaction manager service to rely on the base node
service to send them an event when a new block is encountered. Both were
doing this check themselves in addition to the check the base node
service was doing.

The base node service also only checked on the height, now the hash is
also checked.

---------

Co-authored-by: Hansie Odendaal <[email protected]>
SWvheerden added a commit that referenced this pull request May 29, 2023
##
[0.50.0-pre.2](v0.50.0-pre.1...v0.50.0-pre.2)
(2023-05-29)

### ⚠ BREAKING CHANGES

* add optional range proof types (5372)

### Features

* add metadata signature check
([5411](#5411))
([9c2bf41](9c2bf41))
* add optional range proof types
([5372](#5372))
([f24784f](f24784f))
* added burn feature to the console wallet
([5322](#5322))
([45685b9](45685b9))
* improved base node monitoring
([5390](#5390))
([c704890](c704890))


### Bug Fixes

* **comms:** only set final forward address if configured to port 0
([5406](#5406))
([ff7fb6d](ff7fb6d))
* deeplink to rfc spec
([5342](#5342))
([806d3b8](806d3b8))
* don't use in memory datastores for chat client dht in integration
tests ([#5399](#5399))
([cbdca6f](cbdca6f))
* fix panic when no public addresses
([5367](#5367))
([49be2a2](49be2a2))
* loop on mismatched passphrase entry
([5396](#5396))
([ed120b2](ed120b2))
* use domain separation for wallet message signing
([5400](#5400))
([7d71f8b](7d71f8b))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CR-insufficient_context Your PRs commit messages don't provide enough context to justify accepting the change. P-acks_required Process - Requires more ACKs or utACKs P-reviews_required Process - Requires a review from a lead maintainer to be merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants