Multi-Round Benchmark gets randomly stuck in one of the rounds #1068
Comments
@nimaafraz Looks like Worker#4 didn't finish the round for some reason. Did you see any error logs on the worker side? Could you enable debug logs so we can see what happens exactly?
@aklenik The problem is resolved when I dedicate more CPU. Is there a way to see the individual worker logs? How can I enable the debug logs?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I'm facing the same problem, especially when the network is under heavy load. Like the OP, I'm testing the simple benchmark. I reached a fairly high TPS when using the caliper npm package [v0.4.0] a couple of months ago. Recently I decided to test the caliper docker image [v0.4.2] in a Kubernetes environment. I've tested both the local and the distributed [MQTT] methods. When I increase the load, there is a high probability that the test will get stuck.
These are the last lines of the caliper.log file [debug is enabled]:
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I'm facing the same issue too.
My guess here is that everyone is using Hyperledger Fabric. This is probably a combination of two things.
You can change the timeout by passing a timeout override on the command line when you launch Caliper, which will change the timeout to 60 seconds. The problem still remains regarding whether the transaction was committed or whether the node-sdk missed the event.
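For illustration, a minimal launch sketch with a timeout override for the v0.4.x Fabric connector. The setting name `caliper-fabric-timeout-invokeorquery` and its unit (seconds) are assumptions to verify against the Caliper documentation for your binding, and the workspace and config paths are placeholders:

```
# Hypothetical example: override the invoke/query timeout to 60 seconds.
# Flag name and unit are assumptions; check the docs for your Caliper version.
npx caliper launch manager \
    --caliper-workspace . \
    --caliper-benchconfig benchmarks/simple/config.yaml \
    --caliper-networkconfig networks/fabric/network-config.yaml \
    --caliper-fabric-timeout-invokeorquery 60
```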
Even with a timeout change, a build failed with the same problem. This was using Fabric SDK 2.1, which may have contributed to the hang as this version is not supported; it's possible that timeouts are not handled properly. We really need to get rid of all the unsupported stuff that is still within Caliper.
I am facing the same issue when benchmarking Besu using Caliper 0.4.2 with the simple benchmark.
I believe I have now tracked down why this can happen: it's related to timing and to not handling errors of a submitted transaction and marking it complete. For example, there are situations in the Fabric connector that can throw an error before a transaction is submitted, and the connector doesn't try to handle these errors because it throws them itself. Caliper itself should have generic handling for when submitting a transaction fails.
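A minimal sketch of what such generic handling could look like, assuming a hypothetical wrapper around the connector call; `submitWithGenericErrorHandling` and `txUpdater` are illustrative names rather than Caliper's actual internals, though `TxStatus` and `sendRequests` are part of the caliper-core connector API:

```
// Hypothetical sketch: wrap connector submission so a thrown error still
// produces a finished (failed) TxStatus instead of leaving the round
// waiting on a transaction that will never be recorded.
const { TxStatus } = require('@hyperledger/caliper-core');

async function submitWithGenericErrorHandling(connector, request, txUpdater) {
    try {
        // Normal path: the connector resolves with one or more TxStatus results.
        return await connector.sendRequests(request);
    } catch (err) {
        // Error path: the connector threw before (or while) submitting, so no
        // result was recorded. Create a status, mark it failed, and report it
        // so the round's unfinished counter can still reach zero.
        const status = new TxStatus();
        status.SetStatusFail();
        status.SetResult(err.message);
        txUpdater(status);
        return status;
    }
}
```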
A scenario was discovered where, if you send a single txn and the connector throws an error, that single txn never finishes and the worker loops forever waiting for that transaction to finish. E.g. in the workload:

```
invokerIdentity: 'unknownuser'
```

Example benchmark:

```
test:
  name: fixed-asset-test
  description: >-
    This is a test yaml for the existing fixed-asset benchmarks
  workers:
    type: local
    number: 1
  rounds:
    - label: empty-contract-evaluate
      chaincodeID: fixed-asset
      txNumber: 1
      rateControl:
        type: fixed-rate
        opts:
          tps: 2
      workload:
        module: benchmarks/api/fabric/workloads/empty-contract.js
        arguments:
          chaincodeID: fixed-asset
          consensus: false
```

In the wider support for connectors, though, the Caliper framework should ensure that connectors that either don't handle errors or throw errors still register the submission as a failure (which is what a connector would do if it did catch an error). This fix ensures that any error received will mark a transaction as finished.

closes hyperledger-caliper#1068
Signed-off-by: D <[email protected]>
Nope, I have not tracked this down after all, as it is also happening with the new Fabric 2.4 connector. It doesn't appear to be specific to the Fabric connectors. See #1340.
This addresses a long-standing issue in Caliper where a round will hang waiting for unfinished transactions that have actually finished but have not been recorded. It also addresses an issue where the Fabric connectors try to change the time_create value of a TxStatus but do not actually change the creation time.

closes hyperledger-caliper#1068
closes hyperledger-caliper#1340
Signed-off-by: D <[email protected]>
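On the time_create point, a small sketch under the assumption (taken from the commit message) that the creation timestamp is stamped when the TxStatus is constructed, so the reliable pattern is to construct the status at the moment of submission rather than mutating the field afterwards; this is my illustration, not the actual patch:

```
// Sketch: if TxStatus records time_create in its constructor, construct it
// at the actual submit call so GetTimeCreate() reflects the real send time,
// instead of trying to overwrite time_create on an earlier-created object.
const { TxStatus } = require('@hyperledger/caliper-core');

function createStatusAtSubmitTime(txId) {
    const status = new TxStatus(txId); // creation time recorded here
    // ...hand `status` to whatever tracks the in-flight transaction...
    return status;
}
```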
I'm running a multi-round benchmark (the simple benchmark) with various TPS rates. The benchmarks get randomly stuck with one unfinished transaction and never return.
Context
I checked the container logs; everything looks normal in the Peers/Orderers/CAs. The chaincode container logs are frozen, with no output after the first instance of Caliper repeating:
Submitted: 20 Succ: 19 Fail: 0 Unfinished: 1
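For context, that progress line reflects a simple accounting: a transaction counts as unfinished until it is recorded as either a success or a failure, and the worker keeps the round open until that count reaches zero. An illustrative sketch of that behaviour, not Caliper's actual code:

```
// Illustrative accounting behind "Submitted: 20 Succ: 19 Fail: 0 Unfinished: 1".
// A wait loop like this never exits if one result is never recorded.
function unfinished(stats) {
    return stats.submitted - (stats.succeeded + stats.failed);
}

async function waitForRoundEnd(stats, pollMs = 1000) {
    while (unfinished(stats) > 0) {
        console.log(`Submitted: ${stats.submitted} Succ: ${stats.succeeded} ` +
            `Fail: ${stats.failed} Unfinished: ${unfinished(stats)}`);
        await new Promise(resolve => setTimeout(resolve, pollMs));
    }
}
```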
Expected Behavior
For the benchmark, the rounds should execute one after another.
Actual Behavior
Here are the Peer logs:
Possible Fix
Steps to Reproduce
Existing issues
Context
Your Environment