Ties and data in JSON #280

Closed

beached opened this issue Oct 27, 2020 · 6 comments
@beached
Contributor

beached commented Oct 27, 2020

Would adding a third digit to the timings make sense for the JSON tests, where the results are coming pretty close now? Or adding more data? But that has the effect of taking longer and making it harder to compare with old results.

@nuald
Collaborator

nuald commented Oct 27, 2020

Adding a third digit is absolutely doable (just a few changes in the runner scripts); however, as we're moving into the milliseconds scale, additional factors could affect the results:

  • printing within the benchmarks: it's not guaranteed to be async/sync across different platforms and may introduce fluctuations in the results at the milliseconds scale. It's easy to fix for the JSON tests, though, as we can just move the printing operation out of the benchmarked code (see the sketch below); however, it won't be so easy for other tests;
  • network communications: TCP is a synchronous protocol and affects the results too. I'm not sure why I didn't use UDP (as I recall, not all platforms support it). Granted, it's localhost communication and the Linux kernel optimizes it, but I guess it could affect the results nonetheless. I'm not sure what to do about it; I may look into UDP again (I think if a platform doesn't support it natively, we could try to bind to a native library);
  • non-deterministic GC: I guess there's nothing we can do here. Disabling GC is a no-go as it's a micro-optimization, and moreover, GC-ed languages are usually quite slow, so the milliseconds scale won't matter for them anyway.

Please give your feedback; I could be overestimating the effects of printing and network operations, in which case changes to the runner scripts alone could be enough.
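A minimal sketch of the first point, assuming a Python-style harness (the function names are illustrative, not the actual runner API):

```python
import json
import time

def benchmark(payload: str) -> float:
    """Time only the parsing; do the printing outside the measured region."""
    start = time.monotonic()
    result = json.loads(payload)  # stand-in for the benchmarked parser
    elapsed = time.monotonic() - start
    # Printing moved out of the timed section, so platform-dependent
    # (possibly synchronous) I/O no longer skews millisecond-scale results.
    print(len(result))
    return elapsed
```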

@beached
Contributor Author

beached commented Oct 27, 2020

I think increasing the data size will result in measuring more of the JSON parsing itself, and the other factors become less significant. So if that is the goal, the price is longer test times. At GB/s JSON parsing speeds, 100 MB isn't a lot. I noticed it uses the max time; is the worst case the desired measure? Or did I read that wrong?

@nuald
Collaborator

nuald commented Oct 27, 2020

The results are given as arithmetic means with standard deviations, and the original timing is measured between network requests to the runner (start/stop measuring). While the arithmetic mean may not be the best statistic to rely on, I think it's good enough for relative comparisons.

I think increasing the sample JSON is acceptable. I had to decrease it back then because my home server took forever to run the tests, but now I have newer hardware and could use a bigger fixture file.

@beached
Contributor Author

beached commented Oct 27, 2020

Another quick one, regarding the issues and memory usage. Would adding another column for usage before parsing (mostly the JSON string) be useful? I don't think subtracting it is doable (the Haskell example is streaming), but knowing the change from before to after might help library users see whether the cost/impact on memory is worth it.
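A sketch of what a before/after measurement could look like, assuming in-process sampling with psutil (purely illustrative; the actual benchmarks may track memory externally):

```python
import json
import psutil

def parse_with_memory(payload: str) -> tuple[int, int]:
    proc = psutil.Process()
    before = proc.memory_info().rss   # resident set size before parsing
    data = json.loads(payload)        # benchmarked operation
    after = proc.memory_info().rss    # sampled while `data` is still alive
    del data
    # Report the baseline and the increase separately instead of
    # subtracting: a single subtracted number would be meaningless
    # for streaming parsers that never hold the whole input.
    return before, after - before
```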

@nuald
Collaborator

nuald commented Oct 28, 2020

First, let me note that I did some research (please see additional details below) and found out that the current math gives slightly misleading results. For example, in my tests I had the values (0.78, 0.78, ..., 0.78, 0.82), and that last fluctuation worsened the final result. Therefore, I've decided to switch to medians. Unfortunately, GitHub doesn't natively support rendering math formulas (hence there's no pleasant way to show quartiles), so I think I'm going to use the format MEDIAN±MAD (e.g. 0.78±0.01), where MAD is the median absolute deviation.

Second, regarding memory usage: not only Haskell has streaming parsing; some other tests do too. I think it's worth showing the memory increase within the benchmark; however, the table is already quite wide, so maybe it's better to change the format of the existing column, i.e. to use BASE_MEDIAN±BASE_MAD + MEM_DIFF_MEDIAN±MEM_DIFF_MAD, where BASE is the memory before the benchmark and MEM_DIFF is the memory increase during the benchmark. For example, the value 122.91 ± 05.96 would become something like 22.91±3.94 + 100.00±2.02.
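For reference, a minimal sketch of the median/MAD computation (my illustration of the statistic, not the runner's actual code):

```python
import statistics

def median_mad(samples: list[float]) -> tuple[float, float]:
    # The median is robust to a single outlier such as the 0.82 above;
    # MAD (median absolute deviation) is the matching spread measure.
    med = statistics.median(samples)
    mad = statistics.median(abs(x - med) for x in samples)
    return med, mad

# The outlier barely moves the reported value:
# median_mad([0.78] * 9 + [0.82])  ->  (0.78, 0.0)
```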

Back to the research I did:

  • increasing the JSON file is a no-go: some tests require way too much memory to parse it, and I'm getting OOM exceptions;
  • the printing operation is indeed affecting the results. As it's quite a minor change (at least for the JSON tests), I'm going to implement it;
  • however, I'm still getting fluctuations even on native hardware (outside of Docker) and without network requests (verified with time-elapsed output in native code), and those fluctuations nullify the increased precision of 3-digit output (as the deviations are within centiseconds). I hope that moving the prints and using medians will provide better results, so no further work will be required, but if not, I'm going to dig deeper.

@nuald
Collaborator

nuald commented Oct 31, 2020

It's all addressed in #281. In addition to the changes mentioned above, I've also changed the time measurements to use nanoseconds instead of seconds in float format.
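In Python terms, the difference is roughly the following (a sketch of the idea; the actual runner may use different primitives):

```python
import time

# Before: float seconds -- precision degrades for sub-millisecond
# intervals, and float arithmetic introduces rounding error.
start = time.monotonic()
elapsed_seconds = time.monotonic() - start

# After: integer nanoseconds -- exact arithmetic, no rounding.
start_ns = time.monotonic_ns()
elapsed_ns = time.monotonic_ns() - start_ns
```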

nuald closed this as completed Oct 31, 2020