Add TxObserver for Prometheus manager #1448
Conversation
Force-pushed from 7bc2a8a to 96c71c9
Codecov Report
@@ Coverage Diff @@
## main #1448 +/- ##
==========================================
+ Coverage 55.56% 55.89% +0.33%
==========================================
Files 103 105 +2
Lines 4451 4507 +56
Branches 676 688 +12
==========================================
+ Hits 2473 2519 +46
Misses 1435 1435
- Partials 543 553 +10
Flags with carried forward coverage won't be shown.
Good start. Some comments about naming, and I would also like to see an option for periodic bulk messaging.
packages/caliper-core/lib/common/messages/prometheusUpdateMessage.js
@@ -0,0 +1,97 @@
/*
This isn't Prometheus-specific at this point, so I think we could just call this ManagerTxObserver and the file manager-tx-observer.js
I have named it this way so that it is in line with the transaction monitor module names in the config file. In the existing codebase, module: prometheus-push is bound to a PrometheusPushTxObserver, module: prometheus is bound to a PrometheusTxObserver, etc., so PrometheusManagerTxObserver would be the equivalent for the planned module: prometheus-manager config property: https://github.com/hyperledger/caliper/blob/4aa36ea91e21966d4aac65d04a36720ee93ca36d/packages/caliper-core/lib/worker/tx-observers/tx-observer-dispatch.js#L21-L25
Should I perform the refactor to ManagerTxObserver?
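For context, here is a rough sketch of the module-name-to-observer mapping this naming convention follows. It is illustrative only; the real mapping lives in the tx-observer-dispatch.js file linked above, and its exact structure may differ:

```js
// Illustrative only: how a configured "module" name could map to a
// TxObserver implementation. The real mapping lives in tx-observer-dispatch.js.
const builtInObserverModules = new Map([
    ['prometheus-push', './prometheus-push-tx-observer.js'],      // PrometheusPushTxObserver
    ['prometheus', './prometheus-tx-observer.js'],                 // PrometheusTxObserver
    ['prometheus-manager', './prometheus-manager-tx-observer.js']  // planned PrometheusManagerTxObserver
]);

/**
 * Resolve the observer implementation for a configured module name.
 * @param {string} moduleName The "module" value from the benchmark configuration.
 * @return {string} The path of the observer implementation to load.
 */
function resolveObserverPath(moduleName) {
    if (!builtInObserverModules.has(moduleName)) {
        throw new Error(`Unknown transaction observer module "${moduleName}"`);
    }
    return builtInObserverModules.get(moduleName);
}

module.exports = { resolveObserverPath };
```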
As discussed, let's leave the names as they currently are.
roundIndex: this.currentRound,
roundLabel: this.roundLabel,
status: results.GetStatus(),
latency: (results.GetTimeFinal() - results.GetTimeCreate()) / 1000
Here you divide latency by 1000, but above you don't. I assume they should both be the same
I had noticed this, but I left it as such so that it is in line with the implementations in the other Prometheus observers:
I would like to know whether this is a bug in the codebase or whether it is intended. If it is a bug, I could fix it for the other observers in a separate PR.
That looks like a bug to me in the other observers; both entries should divide by 1000, as the latency histogram appears to be specified in seconds.
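For illustration, a minimal sketch of the consistent conversion, assuming GetTimeCreate() and GetTimeFinal() return epoch milliseconds as in the snippet above and the histogram buckets are defined in seconds (the helper name is hypothetical, not existing Caliper API):

```js
/**
 * Convert a TX result's timestamps into a latency in seconds.
 * Assumes GetTimeCreate() and GetTimeFinal() return epoch milliseconds,
 * and that the Prometheus latency histogram is bucketed in seconds.
 * @param {object} result A TxStatus-like result object.
 * @return {number} The TX latency in seconds.
 */
function latencyInSeconds(result) {
    return (result.GetTimeFinal() - result.GetTimeCreate()) / 1000;
}

// Both places that record latency would then use the same unit, e.g.:
// this.latencyHistogram.observe({ status: result.GetStatus() }, latencyInSeconds(result));
```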
* Called when TXs are submitted.
* @param {number} count The number of submitted TXs. Can be greater than one for a batch of TXs.
*/
txSubmitted(count) {
I see that we send messages for every submitted and finished transaction; this could result in a large number of messages being sent on high-TPS runs. This may be fine, but it's also not tunable if there is an issue. It would be good to allow for periodic sending as well, with the workers collating the information so that we can do bulk sending.
I have a few doubts regarding how this could be implemented.
I'm thinking of the following approach:
- For the periodic-sending feature: a timer is started when the observer is instantiated, which regularly sends the messages held in an array of pending messages.
- For the collation feature: the maximum collation size is read from the config and an array of pending messages is maintained. When a new message arrives and appending it would exceed the collation size, the pending messages are sent in bulk.
However, I'm not sure how we could handle the sending of messages when Caliper terminates after a successful benchmark. With both features there might be some loss of data:
- With periodic sending, the last batch of messages might not be sent, as the timer might fire after the scrape server has been shut down during Caliper's termination.
- With collation, the last batch of messages might contain fewer messages than the size defined in the config, in which case it would not be sent before Caliper terminates.
I would like to know whether these approaches are feasible and whether there is any way we could handle these cases.
As discussed, the deactivate method should help in handling the worker-termination case.
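As a rough sketch of how the buffering, the periodic flush, and a deactivate-time flush could fit together (the class, method, and option names below are illustrative assumptions, not the final implementation):

```js
// Illustrative sketch only; option and method names ("updateInterval",
// "collationSize", "_sendUpdate") are assumptions, not the final Caliper API.
class BufferedManagerTxObserver {
    constructor(options = {}) {
        this.pendingMessages = [];
        this.collationSize = options.collationSize || 10;     // max buffered messages
        this.updateInterval = options.updateInterval || 1000;  // flush period in ms
        this.intervalObject = undefined;
    }

    /** Start the periodic flush timer when the observer is activated. */
    async activate() {
        this.intervalObject = setInterval(() => this._flush(), this.updateInterval);
    }

    /** Buffer a message, flushing early once the collation size is reached. */
    _enqueue(message) {
        this.pendingMessages.push(message);
        if (this.pendingMessages.length >= this.collationSize) {
            this._flush();
        }
    }

    /** Send everything currently buffered as one bulk update. */
    _flush() {
        if (this.pendingMessages.length === 0) {
            return;
        }
        const batch = this.pendingMessages;
        this.pendingMessages = [];
        this._sendUpdate(batch); // the real observer would use its messenger here
    }

    /** Stop the timer and flush the tail so no data is lost at termination. */
    async deactivate() {
        clearInterval(this.intervalObject);
        this._flush();
    }

    _sendUpdate(batch) {
        // Placeholder for the actual messenger call.
        console.log(`sending ${batch.length} collated updates`);
    }
}

module.exports = BufferedManagerTxObserver;
```

The key point is that deactivate() clears the timer and performs one final flush, so neither the periodic path nor the collation path loses the tail of pending messages at termination.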
/**
 *
 * @return {boolean} the fake path
The comment doesn't match the code.
Note to self: This needs to be fixed in https://github.com/hyperledger/caliper/blob/c648fa9510fdb856d9cfec455dc8eb42cee2600b/packages/caliper-core/test/worker/tx-observers/prometheus-push-tx-observer.js#L39 and https://github.com/hyperledger/caliper/blob/04a1154fcbc530131b3c03fa97a7ecb833660f56/packages/caliper-core/test/worker/tx-observers/prometheus-tx-observer.js#L39 as well in a later PR
warnOnReplace: false,
warnOnUnregistered: false
});
mockery.registerMock('../../common/utils/caliper-utils', Utils);
Is this actually needed? The class under test doesn't seem to use the Utils package.
This is needed because the base interface (TxObserverInterface) depends on the Utils class. Without the mock, this error would be thrown when running the tests:
Error: Util.resolvePath: Parameter is undefined
at Function.resolvePath (lib/common/utils/caliper-utils.js:247:19)
at new TxObserverInterface (lib/worker/tx-observers/tx-observer-interface.js:26:61)
at new PrometheusManagerTxObserver (lib/worker/tx-observers/prometheus-manager-tx-observer.js:21:105)
at new createTxObserver (lib/worker/tx-observers/prometheus-manager-tx-observer.js:36:123)
at Context.<anonymous> (test/worker/tx-observers/prometheus-manager-tx-observer.js:90:26)
at processImmediate (node:internal/timers:466:21)
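For reference, a minimal sketch of the kind of test bootstrap being discussed. The stubbed surface of caliper-utils (resolvePath, getLogger) and the require paths below are assumptions for illustration, not the exact test code:

```js
// Minimal sketch of a Mocha test that mocks caliper-utils before the observer
// (and its TxObserverInterface base class) is first required.
const mockery = require('mockery');

describe('PrometheusManagerTxObserver', () => {
    let createTxObserver;

    before(() => {
        mockery.enable({
            warnOnReplace: false,
            warnOnUnregistered: false
        });
        // The base class calls Utils.resolvePath, so caliper-utils has to be
        // replaced before the observer module is loaded. The stub's shape
        // below is illustrative, not the real Utils surface.
        mockery.registerMock('../../common/utils/caliper-utils', {
            resolvePath: (path) => path,
            getLogger: () => ({ debug: () => {}, warn: () => {}, error: () => {} })
        });
        createTxObserver = require('../../lib/worker/tx-observers/prometheus-manager-tx-observer').createTxObserver;
    });

    after(() => {
        mockery.deregisterAll();
        mockery.disable();
    });

    it('exports a factory function once the utils dependency is mocked', () => {
        if (typeof createTxObserver !== 'function') {
            throw new Error('createTxObserver should be a function');
        }
    });
});
```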
Force-pushed from c2688ae to b6506f6
Left details on Discord.
Force-pushed from 0d30390 to a7f17a5
Signed-off-by: CaptainIRS <[email protected]>
* Rename prometheusUpdateMessage to workerMetricsMessage
* Fix doc comment in test
* Add collate and periodic update methods
* Add logs for config property errors
* Fix histogram calculation

Signed-off-by: CaptainIRS <[email protected]>
Co-authored-by: Dave Kelsey <[email protected]> Signed-off-by: CaptainIRS <[email protected]>
Force-pushed from a7f17a5 to 4fc70d1
LGTM
In this PR:
PR 1/n for #1353