I have found a problem that occurs when GridGain throws a networkTimeout exception. It can be reproduced with the following steps:
Sample client:
I built a sample GridGain client that puts and gets random K/V objects on a grid cache. The example stores an object in the cache and queries it 100 milliseconds later.
While my client is running, I enable packet loss simulation using this command:
comcast --device=bond1 --packet-loss=40%
I know that 40% packet loss is probably high, but that isn't the problem. When you enable the packet loss, you can see the client getting slower, and if you wait a few minutes you get this exception:
GET --> KEY: f2fe30a5-efcf-4247-ae73-defaef89c587 VALUE: 6d1f465a-7e88-4842-a85e-36f1d066ae2e
PUT --> KEY: 7125fb63-85c0-4ddc-bb0a-0bd6e1b03b5b VALUE: 639ebe67-ac2c-4880-a533-682d7e84066f
GET --> KEY: 7125fb63-85c0-4ddc-bb0a-0bd6e1b03b5b VALUE: 639ebe67-ac2c-4880-a533-682d7e84066f
PUT --> KEY: a837db75-b571-44cd-bd10-92061e4ed4e7 VALUE: 0133d8cb-6085-44a5-ba27-4033020c03c6
GET --> KEY: a837db75-b571-44cd-bd10-92061e4ed4e7 VALUE: 0133d8cb-6085-44a5-ba27-4033020c03c6
Exception in thread "main" class org.gridgain.grid.cache.GridCacheAtomicUpdateTimeoutException: Cache update timeout out (consider increasing networkTimeout configuration property).
For more information see:
Troubleshooting: http://bit.ly/GridGain-Troubleshooting
Documentation Center: http://bit.ly/GridGain-Documentation
at org.gridgain.grid.kernal.processors.cache.distributed.dht.atomic.GridNearAtomicUpdateFuture.checkTimeout(GridNearAtomicUpdateFuture.java:301)
at org.gridgain.grid.kernal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$19.onTimeout(GridDhtAtomicCache.java:1847)
at org.gridgain.grid.kernal.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:138)
at org.gridgain.grid.util.worker.GridWorker.run(GridWorker.java:151)
at java.lang.Thread.run(Unknown Source)
When this happens, my Java example client hangs. I then disable packet loss using this command:
comcast --mode stop --device=bond1
After that, my GridGain node works fine and I can check my K/V objects using ggvisorcmd.sh. If I stop my node, I can see that my GridGain client detects it, like this:
PUT --> KEY: 7125fb63-85c0-4ddc-bb0a-0bd6e1b03b5b VALUE: 639ebe67-ac2c-4880-a533-682d7e84066f
GET --> KEY: 7125fb63-85c0-4ddc-bb0a-0bd6e1b03b5b VALUE: 639ebe67-ac2c-4880-a533-682d7e84066f
PUT --> KEY: a837db75-b571-44cd-bd10-92061e4ed4e7 VALUE: 0133d8cb-6085-44a5-ba27-4033020c03c6
GET --> KEY: a837db75-b571-44cd-bd10-92061e4ed4e7 VALUE: 0133d8cb-6085-44a5-ba27-4033020c03c6
Exception in thread "main" class org.gridgain.grid.cache.GridCacheAtomicUpdateTimeoutException: Cache update timeout out (consider increasing networkTimeout configuration property).
For more information see:
Troubleshooting: http://bit.ly/GridGain-Troubleshooting
Documentation Center: http://bit.ly/GridGain-Documentation
at org.gridgain.grid.kernal.processors.cache.distributed.dht.atomic.GridNearAtomicUpdateFuture.checkTimeout(GridNearAtomicUpdateFuture.java:301)
at org.gridgain.grid.kernal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$19.onTimeout(GridDhtAtomicCache.java:1847)
at org.gridgain.grid.kernal.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:138)
at org.gridgain.grid.util.worker.GridWorker.run(GridWorker.java:151)
at java.lang.Thread.run(Unknown Source)
[11:57:27] Topology snapshot [ver=77, nodes=1, CPUs=4, heap=3.5GB]
[11:57:48] Topology snapshot [ver=78, nodes=2, CPUs=8, heap=10.0GB]
But my GridGain client still can't write or query K/V objects; it stays hung.
I think that when GridGain throws org.gridgain.grid.cache.GridCacheAtomicUpdateTimeoutException, the client should return null, as if the specific key had not been found, and continue working normally.
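To make the requested behavior concrete, here is a minimal client-side sketch of "treat a timeout as a miss and keep going". It does not use the GridGain API: `CacheTimeoutException` and `getOrNull` are hypothetical names, and the exception is only a stand-in for `GridCacheAtomicUpdateTimeoutException` so that the example compiles on its own. It also would not fix the hang described above; it only shows the fallback I would expect.

```java
import java.util.concurrent.Callable;

public class TimeoutFallback {
    /** Hypothetical stand-in for GridCacheAtomicUpdateTimeoutException (not the real class). */
    static class CacheTimeoutException extends RuntimeException {
        CacheTimeoutException(String msg) { super(msg); }
    }

    /** Runs a cache operation; on timeout returns null, as if the key had not been found. */
    static <V> V getOrNull(Callable<V> op) {
        try {
            return op.call();
        } catch (CacheTimeoutException e) {
            // Log and fall through so the caller can continue with the next operation.
            System.err.println("Cache operation timed out, treating as a miss: " + e.getMessage());
            return null;
        } catch (Exception e) {
            // Any other failure is still reported to the caller.
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String hit = getOrNull(() -> "value");
        String miss = getOrNull(() -> { throw new CacheTimeoutException("simulated timeout"); });
        System.out.println("hit=" + hit + ", miss=" + miss); // prints "hit=value, miss=null"
    }
}
```

In a real client the lambda would wrap the cache get or put call, and the catch clause would name the actual GridGain exception type.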
Can you try increasing network timeout as suggested by the exception? Default is 4000ms, so I would recommend setting it to 10000ms to give it enough time to deal with 40% packet loss.
If that does not help, we will need to take a look at the thread dumps from each node.
The sample client's source is available in this gist:
https://gist.github.com/andresgomez92/f3bf78682acaecc8cde6