Unusually high network activity from CKB Light Client. #168
Are there any differences in the scripts settings for these 10 light clients? Could you help by calling the …
I do not see any difference. They all return an empty result.
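(For reference, a minimal sketch of how the scripts settings could be compared across the ten clients over JSON-RPC. The method name in the comment above is cut off; `get_scripts` and the port range 9001–9010 below are assumptions for illustration only.)

```python
# Minimal sketch: query each light client's JSON-RPC endpoint and print its
# configured scripts. The method name ("get_scripts") and the port range are
# assumptions for illustration; adjust them to match your actual setup.
import json
import urllib.request

PORTS = range(9001, 9011)  # hypothetical RPC ports for the 10 light clients

def rpc(port, method, params=None):
    payload = json.dumps({
        "id": 1,
        "jsonrpc": "2.0",
        "method": method,
        "params": params or [],
    }).encode()
    req = urllib.request.Request(
        f"http://127.0.0.1:{port}",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["result"]

for port in PORTS:
    print(port, rpc(port, "get_scripts"))
```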
Could you upgrade to v0.3.3? I suspect it might be related to this bug fix: #163
I upgraded to v0.3.3 by replacing the binary without deleting the configuration or data files. I no longer see high bandwidth issues. However, one node now has significantly higher CPU usage than the others. This is node 9, whereas before it was node 1 with high CPU and high bandwidth. Output from the console is below.
Are there any differences between …
The only differences are the ports used (see testnet8-config.txt). Here is the console output from client 8, which is running normally.
Was the tip header of node 9 correct? (RPC method: …)
I left all clients running overnight. The elevated CPU activity has stopped now. The current block on client 9 from … Below is the most recent console output.

My router reports a normal amount of internet activity during the night. However, I noticed something interesting in the virtual machine dashboard. There was high network activity for several hours, but this time it was only local. I suspect that this is because I have a local testnet full node (v0.110.0) which is configured as the sole boot node in the light client configuration.

The clients started in an off state, and you can see above when I turned the clients back on based on the memory usage. Looking at the network traffic, you can see approximately 18 hours of high network activity before it stopped by itself. It was receiving 1.6 MBps continuously during this period. The chart is in MBps (megabytes per second), not Mbps (megabits per second). That is approximately 100 GB for the 18-hour period.

Here is the output of …
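(As a quick sanity check on that figure: a sustained 1.6 MB/s over roughly 18 hours works out to a little over 100 GB, consistent with the report above.)

```python
# Back-of-the-envelope check of the reported traffic volume.
rate_mb_per_s = 1.6           # observed sustained rate (megabytes per second)
duration_s = 18 * 3600        # ~18 hours of high network activity
total_gb = rate_mb_per_s * duration_s / 1000
print(f"{total_gb:.0f} GB")   # ~104 GB, consistent with "approximately 100 GB"
```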
In the log file "testnet9-log.txt", it prints the following log message every 8 seconds:
So, those log messages mean that the remote CKB node (…
Could you provide the results of:
{
"jsonrpc":"2.0",
"result":[
{
"addresses":[
{
"address":"/ip4/192.168.0.73/tcp/18115/p2p/QmQnnZeQnpsABhQEz5hGMoV4jkuDR3cRr6STriTEmBzKFu",
"score":"0x64"
}
],
"connected_duration":"0x325293e",
"node_id":"QmQnnZeQnpsABhQEz5hGMoV4jkuDR3cRr6STriTEmBzKFu",
"protocols":[
{
"id":"0x64",
"version":"2"
},
{
"id":"0x0",
"version":"2"
},
{
"id":"0x2",
"version":"2"
},
{
"id":"0x78",
"version":"2"
},
{
"id":"0x1",
"version":"2.1"
},
{
"id":"0x79",
"version":"2"
},
{
"id":"0x4",
"version":"2"
}
],
"sync_state":{
"proved_best_known_header":null,
"requested_best_known_header":null
},
"version":"0.110.0 (0679b11 2023-05-16)"
}
],
"id":1
}
All the other clients have long lists of clients for …
{
"jsonrpc": "2.0",
"result": {
"best_known_block_number": "0xb33386",
"best_known_block_timestamp": "0x18ca97dadf9",
"fast_time": "0x3e8",
"ibd": false,
"inflight_blocks_count": "0x0",
"low_time": "0x5dc",
"normal_time": "0x4e2",
"orphan_blocks_count": "0x0"
},
"id": 42
}
It looks like your local bootnode is not broadcasting other full node addresses to client 9. Could you upgrade your local bootnode to the latest version, then reboot client 9 and try again?
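(That diagnosis can also be checked mechanically against a `get_peers`-style response like the one pasted above. A small illustrative sketch, assuming the response JSON has been saved to a hypothetical file `client9_peers.json`:)

```python
# Sketch: inspect a get_peers-style response (like the one pasted above) and
# report how many peers the client sees and which of them have not proved a
# best-known header yet. Purely illustrative; reads the JSON from a file.
import json

with open("client9_peers.json") as f:   # hypothetical file holding the pasted response
    peers = json.load(f)["result"]

print(f"{len(peers)} peer(s) connected")
for peer in peers:
    proved = peer["sync_state"]["proved_best_known_header"]
    status = "no proved best-known header" if proved is None else "ok"
    print(peer["node_id"], status)
```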
Does any of this explain why something used 100 GB of bandwidth? I will try that next.
The full node and CKB light client 9 are now at the newest available versions. Client 9 is using more CPU cycles and high bandwidth again. Network activity on the virtual machine is 3.5 MB/s. Internet traffic is just over 1.5 MB/s. It is probable that it is sustaining 1.5 MB/s to the local full node and 1.5 MB/s to the external full node.

Here is the current screen output from client 9. Here is the result of … Here is the result of …

{
"jsonrpc": "2.0",
"result": {
"compact_target": "0x1d0892c8",
"dao": "0x9655c8eedbc64f4a20fbaa85c6c62700b165db4ca063d50500c18d82bf7ece08",
"epoch": "0x70803f7001ec7",
"extra_hash": "0x9ed419567e868465abbe943ef8a1859764092a89f1f62646569ce69af17b18b4",
"hash": "0x8e9ef98d1de9eee0fb2a2d8271f29766d5380929898b94ae93c749d372f23983",
"nonce": "0x33aeecb665c83f6cdb9627e306eb7b49",
"number": "0xb35ce0",
"parent_hash": "0x9c16173c7f423767b930ef9d2574272a791a95bd93a7ceccf7c27933d78e7579",
"proposals_hash": "0x70bb2879cf3a8f386e6d892b2f14b54a9337b9f8418b31ff59a44fccc7f5ae3d",
"timestamp": "0x18cae80427a",
"transactions_root": "0x8225a143c7b4939296009344a4c5c7bb4f79e0f66bef6e30dfb2af2dad24041e",
"version": "0x0"
},
"id": 42
}

After stopping client 9, network activity ceases.
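(One quick way to judge whether a tip header like the one above is current is to decode its timestamp, which is hex-encoded milliseconds since the Unix epoch. A small sketch, using the value from the header pasted above:)

```python
# Convert the tip header's hex timestamp (milliseconds since the Unix epoch)
# to a readable UTC datetime to see how far behind the tip is.
from datetime import datetime, timezone

timestamp_hex = "0x18cae80427a"  # taken from the header pasted above
timestamp_ms = int(timestamp_hex, 16)
tip_time = datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)
print(tip_time)  # prints a time in late December 2023 for this header

lag = datetime.now(timezone.utc) - tip_time
print(f"tip is {lag} behind current time")
```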
Could you zip the client 9 data folder and upload it? I'd like to try to reproduce it in my local environment.
We have reproduced this abnormal network usage and are working on a bug fix; we will update later. Thanks.
There are 3 sub-issues for this issue:
@jordanmack Could you update your light clients' config bootnodes section to the public nodes: … I think it may resolve this abnormal CPU usage as a temporary solution.
I added some rules to ban a peer: a light client cannot distinguish whether a peer merely has a stale state or is malicious, so it simply bans the peer and finds a new one to replace it. P.S. After a few minutes, the ban status will be removed.
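(As a rough illustration of that behaviour only, not the actual Rust implementation in ckb-light-client, a time-limited ban list might look like the following sketch; all names and the five-minute duration are illustrative assumptions.)

```python
# Illustrative sketch: a time-limited peer ban list, loosely mirroring the
# behaviour described above (ban a peer that looks stale or malicious, let the
# ban expire after a few minutes, and pick a different peer in the meantime).
import time

BAN_DURATION_S = 5 * 60  # "after a few minutes, the ban status will be removed"

class PeerBans:
    def __init__(self):
        self._banned_until = {}  # peer_id -> unix time when the ban expires

    def ban(self, peer_id):
        self._banned_until[peer_id] = time.time() + BAN_DURATION_S

    def is_banned(self, peer_id):
        expires = self._banned_until.get(peer_id)
        if expires is None:
            return False
        if time.time() >= expires:   # ban has expired, forget it
            del self._banned_until[peer_id]
            return False
        return True

def pick_peer(peers, bans):
    """Return the first peer that is not currently banned, or None."""
    return next((p for p in peers if not bans.is_banned(p)), None)
```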
You may upgrade the light client to v0.3.4 now; it will resolve this abnormal CPU and network usage issue:
So far everything is running fine; nothing to report except for a few long fork errors from two clients on startup. If I shut down a client and delete the …
One other question: do light clients create connections with other light clients for any reason, or do they only connect to full nodes?
Yes, they only connect to full nodes.
I run ten testnet CKB light clients on a virtual machine for testing purposes. Today I received a notification from my ISP that I was approaching my usage limit. My router reports several hundred GB of transmission in the last few days, which is very unusual. I traced the increase in network activity to the virtual machine with the light clients. The light clients are currently running v0.3.0, which was installed on November 1, 2023.
Below is a screenshot of the host node activity charts for the month. Starting on December 13th, network activity began rising and continued to increase until hitting nearly 3 MB/s on December 19th. This network activity is persistent and amounts to hundreds of GB over a few days.
Here are a few minutes of internet network activity for my network.
Looking at a ps of the clients on the host machine, one node has significantly more CPU time than the others. However, there is nothing unusual about the screen output. (This was done after restarting the virtual machine and all the light clients when high network activity was noticed again.) After killing the client with high CPU time, the bandwidth dropped significantly. The other 9 light client nodes are still running.
I have turned off this virtual machine for the time being. If you would like me to debug further, please let me know how I should do so.