Ties and data in JSON #280

Closed

beached opened this issue Oct 27, 2020 · 6 comments
@beached
Contributor

beached commented Oct 27, 2020

Would adding a third digit to the timings make sense for the JSON tests, where the results are coming pretty close now? Or adding more data? But that has the effect of taking longer and making it harder to compare with old results.

@nuald
Collaborator

nuald commented Oct 27, 2020

Adding a third digit is absolutely doable (just a few changes in the runner scripts); however, as we're moving into the milliseconds scale, additional factors could affect the results:

  • printing within the benchmarks: it's not guaranteed to be async/sync across different platforms and may introduce fluctuations in the results at the milliseconds scale. It's easy to fix for the JSON tests, though, as we can just move the printing operation out of the benchmarked code (see the sketch below); however, it won't be so easy for other tests;
  • network communications: TCP is a synchronous protocol and affects the results too. I'm not sure why I didn't use UDP (as I recall, not all platforms support it). Granted, it's localhost communication and the Linux kernel optimizes it, but I guess it could affect the results nonetheless. I'm not sure what to do about it; I may look into UDP again (I think if a platform doesn't support it natively, we could try to bind to a native library);
  • non-deterministic GC: I guess there's nothing we can do here. Disabling GC is a no-go as it's a micro-optimization, and moreover, GC-ed languages are usually quite slow, so the milliseconds scale won't matter for them anyway.

Please give your feedback; I could be overestimating the effects of printing and network operations, in which case changes to the runner scripts alone could be enough.
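A minimal sketch of the first point, assuming a Python-style harness (the function names are illustrative, not the actual runner API):

```python
import json
import time

def benchmark(payload: str) -> float:
    """Time only the parsing; do the printing outside the measured region."""
    start = time.monotonic()
    result = json.loads(payload)  # stand-in for the benchmarked parser
    elapsed = time.monotonic() - start
    # Printing moved out of the timed section, so platform-dependent
    # (possibly synchronous) I/O no longer skews millisecond-scale results.
    print(len(result))
    return elapsed
```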

@beached
Contributor Author

beached commented Oct 27, 2020

I think increasing the data size will result in measuring more of the JSON parsing itself, and the other factors become less significant. So if that is the goal, the price is longer test times. At GB/s JSON parsing speeds, 100 MB isn't a lot. I noticed it uses the max time; is the worst case the desired measure? Or did I read that wrong?

@nuald
Collaborator

nuald commented Oct 27, 2020

The results are given as arithmetic means with standard deviations, and the original timing is measured between network requests to the runner (start/stop measuring). While the arithmetic mean may not be the best statistic to rely on, I think it's good enough for relative comparisons.

I think increasing the sample JSON is acceptable. I had to decrease it back then because my home server took forever to run the tests, but now I have newer hardware and could use a bigger fixture file.

@beached
Contributor Author

beached commented Oct 27, 2020

Another quick one, regarding the issues and memory usage. Would adding another column for usage before parsing (mostly the JSON string) be useful? I don't think subtracting it is doable (the Haskell example is streaming), but knowing the change from before to after might help library users see whether the cost/impact on memory is worth it.
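A sketch of what a before/after measurement could look like, assuming in-process sampling with psutil (purely illustrative; the actual benchmarks may track memory externally):

```python
import json
import psutil

def parse_with_memory(payload: str) -> tuple[int, int]:
    proc = psutil.Process()
    before = proc.memory_info().rss   # resident set size before parsing
    data = json.loads(payload)        # benchmarked operation
    after = proc.memory_info().rss    # sampled while `data` is still alive
    del data
    # Report the baseline and the increase separately instead of
    # subtracting: a single subtracted number would be meaningless
    # for streaming parsers that never hold the whole input.
    return before, after - before
```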

@nuald
Collaborator

nuald commented Oct 28, 2020

First, let me note that I did some research (please see additional details below) and found out that the current math gives slightly misleading results. For example, in my tests I had the values (0.78, 0.78, ..., 0.78, 0.82), and that last fluctuation worsened the final result. Therefore, I've decided to switch to medians. Unfortunately, GitHub doesn't natively support rendering math formulas (hence there's no pleasant way to show quartiles), so I think I'm going to use the format MEDIAN±MAD (e.g. 0.78±0.01), where MAD is the median absolute deviation.

Second, regarding memory usage: not only Haskell has streaming parsing; some other tests do too. I think it's worth showing the memory increase within the benchmark; however, the table is already quite wide, so maybe it's better to change the format of the existing column, i.e. to use BASE_MEDIAN±BASE_MAD + MEM_DIFF_MEDIAN±MEM_DIFF_MAD, where BASE is the memory before the benchmark and MEM_DIFF is the memory increase during the benchmark. For example, the value 122.91 ± 05.96 would become something like 22.91±3.94 + 100.00±2.02.
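For reference, a minimal sketch of the median/MAD computation (my illustration of the statistic, not the runner's actual code):

```python
import statistics

def median_mad(samples: list[float]) -> tuple[float, float]:
    # The median is robust to a single outlier such as the 0.82 above;
    # MAD (median absolute deviation) is the matching spread measure.
    med = statistics.median(samples)
    mad = statistics.median(abs(x - med) for x in samples)
    return med, mad

# The outlier barely moves the reported value:
# median_mad([0.78] * 9 + [0.82])  ->  (0.78, 0.0)
```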

Back to the research I did:

  • increasing the JSON file is a no-go: some tests require way too much memory to parse it, and I'm getting OOM exceptions;
  • the printing operation is indeed affecting the results. As it's quite a minor change (at least for the JSON tests), I'm going to implement it;
  • however, I'm still getting fluctuations even on native hardware (outside of Docker) and without network requests (verified with time-elapsed output in native code), and those fluctuations nullify the increased precision of 3-digit output (as the deviations are within centiseconds). I hope that moving the prints and using medians will provide better results, so no further work will be required, but if not, I'm going to dig deeper.

@nuald
Collaborator

nuald commented Oct 31, 2020

It's all addressed in #281. In addition to the changes mentioned above, I've also changed the time measurements to use nanoseconds instead of seconds in float format.
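In Python terms, the difference is roughly the following (a sketch of the idea; the actual runner may use different primitives):

```python
import time

# Before: float seconds -- precision degrades for sub-millisecond
# intervals, and float arithmetic introduces rounding error.
start = time.monotonic()
elapsed_seconds = time.monotonic() - start

# After: integer nanoseconds -- exact arithmetic, no rounding.
start_ns = time.monotonic_ns()
elapsed_ns = time.monotonic_ns() - start_ns
```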

nuald closed this as completed Oct 31, 2020