Stats enhancements #714

Merged

AlexandraRoatis merged 9 commits into master-pre-merge from stats-update

Nov 21, 2018

Contributor

AlexandraRoatis commented Nov 20, 2018 (edited)

Description

Fixes some issues related to the gathered sync statistics, as follows:

improved access to resources by using different locks;
~~replaces stream with parallelStream;~~
specified element types for lists;
calculate average response times without producing incorrect negative values (by ignoring data that looks inconsistent like response time < request time);
correct formatting and license for stats test;
correctly tracking requests to peers; updated unit test;
correctly tracking received blocks instead of imported blocks;
using System.nanoTime instead of System.currentTimeMillis for computing the average response times.
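
The last item in the list swaps the clock used for durations. As a minimal sketch of the idea (the `ElapsedTimer` class is hypothetical and not part of this PR): `System.nanoTime` is intended for measuring elapsed time and is unrelated to wall-clock time, whereas `System.currentTimeMillis` follows the system clock and can jump when that clock is adjusted.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper, not part of the PR: measures an elapsed interval with System.nanoTime. */
final class ElapsedTimer {
    private final long startNanos = System.nanoTime();

    /** Elapsed time since construction, in nanoseconds. */
    long elapsedNanos() {
        // Only differences are meaningful: nanoTime has an arbitrary origin.
        return System.nanoTime() - startNanos;
    }

    /** Converts a nanosecond interval to whole milliseconds for display. */
    static long toMillis(long nanos) {
        return TimeUnit.NANOSECONDS.toMillis(nanos);
    }
}
```

Values from `System.nanoTime` should only be subtracted from each other, never compared with timestamps from another clock or another JVM.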

Continues Issue #661 .

Type of change

Insert x into the following checkboxes to confirm (eg. [x]):

Testing

Please describe the tests you used to validate this pull request. Provide any relevant details for test configurations as well as any instructions to reproduce these results.

existing test suite + 1 new unit test

Verification

Insert x into the following checkboxes to confirm (eg. [x]):

I have self-reviewed my own code and conformed to the style guidelines of this project.
New and existing tests pass locally with my changes.
I have added tests for my fix or feature.
I have made appropriate changes to the corresponding documentation.
My code generates no new warnings.
Any dependent changes have been made.

AlexandraRoatis added bug enhancement labels

AlexandraRoatis added this to the 0.3.2 milestone

AlexandraRoatis requested review from arajasek, AionJayT and aion-kelvin

November 20, 2018 18:31

aion-kelvin suggested changes

Contributor

aion-kelvin left a comment

Hey Alexandra, thanks for working on this. Have some suggestions and questions for you.

Also; general question -- in the description, you mentioned:

calculate average response times without producing incorrect negative values (by ignoring data that looks inconsistent like response time < request time);

Do we know why this inconsistent data happens?

modAionImpl/src/org/aion/zero/impl/sync/SyncStats.java

-                  private double avgBlocksPerSec;
+                  /** @implNote Access to this resource is managed by the {@link #requestsLock}. */
+                  private final Map<String, RequestCounter> requestsToPeers = new HashMap<>();

Contributor

aion-kelvin Nov 20, 2018

Is there a benefit to manually doing locking as opposed to just using Collections.synchronizedMap to wrap the requestsToPeers map (and same question for all the other maps that have their own lock)?

Contributor Author

AlexandraRoatis Nov 20, 2018

In the initial code, we had both synchronized collections and synchronized methods, which was overkill. I opted to remove both in favor of separate locks. The disadvantage of the concurrent maps is that some methods filter the map multiple times, and the map can change between those passes. With concurrent maps only, we might get inconsistent data, like the overall average for the peer times not matching the rest of the shown data.
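
The trade-off in this reply can be sketched as follows. This is an illustration with hypothetical names (`PeerResponseTimes`, `averages`), not the code in `SyncStats`. One explicit lock is held across two passes over the map, so the overall row and the per-peer rows are derived from the same state. Computing the overall row as the mean of the per-peer averages is an assumption made for this sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch, not the SyncStats implementation. */
final class PeerResponseTimes {
    private final Lock responsesLock = new ReentrantLock();
    // Guarded by responsesLock; every list holds at least one sample.
    private final Map<String, List<Long>> timesByPeer = new HashMap<>();

    void record(String peerId, long responseMillis) {
        responsesLock.lock();
        try {
            timesByPeer.computeIfAbsent(peerId, k -> new ArrayList<>()).add(responseMillis);
        } finally {
            responsesLock.unlock();
        }
    }

    private static double average(List<Long> values) {
        long sum = 0;
        for (long v : values) {
            sum += v;
        }
        return (double) sum / values.size();
    }

    /** Returns the "overall" row first, then one average per peer. */
    Map<String, Double> averages() {
        responsesLock.lock();
        try {
            Map<String, Double> result = new LinkedHashMap<>();
            if (timesByPeer.isEmpty()) {
                return result;
            }
            // First pass over the map: the overall row (assumed: mean of per-peer averages).
            double sumOfAverages = 0;
            for (List<Long> times : timesByPeer.values()) {
                sumOfAverages += average(times);
            }
            result.put("overall", sumOfAverages / timesByPeer.size());
            // Second pass over the same, unchanged map: one row per peer.
            for (Map.Entry<String, List<Long>> e : timesByPeer.entrySet()) {
                result.put(e.getKey(), average(e.getValue()));
            }
            return result;
        } finally {
            responsesLock.unlock();
        }
    }
}
```

With `Collections.synchronizedMap`, each individual call would be atomic, but a `record` from another thread could land between the two passes, and iterating over the wrapped map would still need manual synchronization.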

modAionImpl/src/org/aion/zero/impl/sync/SyncStats.java Outdated

+                                          .sum();
+                          requestsToPeers
+                                  .entrySet()
+                                  .parallelStream()

Contributor

aion-kelvin Nov 20, 2018

Hmm I'm not sure if parallelStream() is what we want here... since the stream is modifying percentageReq, can't that cause a ConcurrentModificationException on it?

Collaborator

AionJayT Nov 20, 2018 (edited)

A for loop would be better. If the calculation is not really heavy, using a stream will slow down processing because of thread context switching.
https://blog.oio.de/2016/01/22/parallel-stream-processing-in-java-8-performance-of-sequential-vs-parallel-stream-processing/
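
The for-loop alternative suggested in these two comments might look like the sketch below. The class and method names are hypothetical, and it assumes `percentageReq` is a non-concurrent map such as `HashMap`. Writing to such a map from a parallel stream is a data race; whether that shows up as an exception or as silently lost entries is not guaranteed, so a plain loop on one thread avoids the question.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a sequential percentage computation; not the PR's code. */
final class RequestPercentages {
    /** Returns each peer's share of all requests, in percent. */
    static Map<String, Float> compute(Map<String, Long> requestsToPeers) {
        // Sum the request counts over all peers.
        long total = 0L;
        for (long count : requestsToPeers.values()) {
            total += count;
        }

        Map<String, Float> percentageReq = new HashMap<>();
        if (total == 0L) {
            return percentageReq; // nothing requested yet, avoid dividing by zero
        }
        // Single-threaded writes: no concurrent modification of percentageReq.
        for (Map.Entry<String, Long> entry : requestsToPeers.entrySet()) {
            percentageReq.put(entry.getKey(), 100f * entry.getValue() / total);
        }
        return percentageReq;
    }
}
```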

Contributor Author

AlexandraRoatis commented Nov 20, 2018

@aion-kelvin The stats make some assumptions on the correspondence of a response to a request. They are matched positionally because there are no identifiers to match the request to its response, especially for status requests, which is what is tracked in this case.

Since different threads are at work here and the request is first sent out and then added to statistics, there can be scenarios where the response is received and added to statistics before the request.

I'd be tempted to log the stats information before sending the request, but I'm worried that it might interfere with application logic, and perhaps delegating all the stats to low priority threads is the next necessary enhancement step.
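
The positional matching described in this comment can be sketched roughly as below. This is an illustration under assumptions, not the code in the PR: the class and method names are hypothetical, timestamps are taken to be `System.nanoTime` readings stored in two lists, and a pair is skipped when the response was recorded before its request.

```java
import java.util.List;

/** Hypothetical sketch of positional request/response matching; not the PR's code. */
final class ResponseTimeAverager {
    /**
     * Pairs the i-th request with the i-th response and averages the differences.
     * Returns 0 when no consistent pair exists.
     */
    static double averageNanos(List<Long> requestTimes, List<Long> responseTimes) {
        // Only positions present in both lists can be paired.
        int pairs = Math.min(requestTimes.size(), responseTimes.size());
        long sum = 0;
        int counted = 0;
        for (int i = 0; i < pairs; i++) {
            long delta = responseTimes.get(i) - requestTimes.get(i);
            // A response recorded before its request is inconsistent and skipped.
            if (delta >= 0) {
                sum += delta;
                counted++;
            }
        }
        return counted == 0 ? 0d : (double) sum / counted;
    }
}
```

Skipping inconsistent pairs keeps the average non-negative, but it does not repair the pairing itself, so the result is better suited to comparing peers than to reading as an exact latency.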

AionJayT reviewed

Collaborator

AionJayT left a comment

LGTM except for the parallelStream.

Contributor

aion-kelvin commented Nov 20, 2018

Cool, thanks for the explanation

AlexandraRoatis force-pushed the stats-update branch 2 times, most recently from f1fd792 to 967c298 Compare

November 21, 2018 14:44

aion-kelvin approved these changes

Contributor Author

AlexandraRoatis commented Nov 21, 2018

I can confirm that a kernel running overnight with this update worked correctly. As you can see below it also fixed the negative values bug.

Old version output:

====== sync-responses-by-peer ======
        peer        avg. response
------------------------------------
   «overall»            765617 ms
   id:3f066f           -738521 ms
   id:8629a3             44610 ms
   id:4be9f8             56727 ms
   id:a30d20             67141 ms
   id:a30d30             67230 ms
   id:a30d10             89453 ms
   id:acda45             97917 ms
   id:526445            109275 ms
   id:0f9d39            124429 ms
   id:1fe402           7737905 ms

Output for kernel with this PR:

====== sync-responses-by-peer ======
        peer        avg. response
------------------------------------
   «overall»            784311 ms
   id:3f066f                 9 ms
   id:4be9f8             17748 ms
   id:acda45             35856 ms
   id:a30d30             45848 ms
   id:8629a3             50529 ms
   id:a30d20             66535 ms
   id:a30d10             78114 ms
   id:526445             95776 ms
   id:0f9d39            140577 ms
   id:1fe402           7312124 ms

Both versions were run in parallel for the same duration.

Contributor

aion-kelvin commented Nov 21, 2018 via email

Is 9 ms correct? Why is it so fast?

Contributor Author

AlexandraRoatis commented Nov 21, 2018

I suspect it's another node running on the office network.

Collaborator

AionJayT commented Nov 21, 2018

How is the latency calculated?

Collaborator

AionJayT commented Nov 21, 2018

LGTM, but sync-responses-by-peer probably needs a closer look.

AionJayT approved these changes

Contributor Author

AlexandraRoatis commented Nov 21, 2018

@AionJayT: yes, the response time definitely needs more refinement. As can be deduced from the answer to Kelvin's question above, there are many sources of inaccuracy in the measurement. It should only be used for comparing different peers, not for exact values.

AlexandraRoatis added 9 commits

November 21, 2018 11:53


improved access to resources by using different locks; replaces stream with parallelStream; specified element types for lists

999cac8

calculate average response times without producing incorrect negative value

4bfe10b

correct formatting and license for stats test

45a3345

correctly tracking requests to peers; updated unit test

4f2fddb

correctly tracking received blocks instead of imported blocks

c43a7a6

using nano time for the average response time computation

283f3b0

bugfix: using the minimum size of the two lists

feb645d

minor updates requested in reviews

ba44525

replaced streams used for computations with for loops; updated back to stream for sorting the maps

03007e4

AlexandraRoatis force-pushed the stats-update branch from 967c298 to 03007e4 Compare

November 21, 2018 16:53

arajasek approved these changes

AlexandraRoatis merged commit 0ae32fb into master-pre-merge

AlexandraRoatis deleted the stats-update branch

November 21, 2018 18:37
