
Investigate temperature impact factors during blockchain sync #432

Closed

divan opened this issue Oct 24, 2017 · 13 comments

@divan
Contributor

divan commented Oct 24, 2017

Problem

Initial blockchain synchronization is known to be CPU expensive and causes phone (over)heating. This is very undesirable, as it immediately turns users away from using the app and drains the battery.

We need to find a way to trade off this high CPU usage during initial synchronization for something else, such as longer sync time or slower disk I/O. As a minimum, we want to reduce CPU usage to sane defaults; as a maximum, we want to be able to reason about exact numbers and configure this tradeoff dynamically (a user with a latest-model phone on a charger would prefer a "fast and CPU expensive sync", whilst a user with an older phone on battery would prefer a "slower and CPU friendly sync").

Implementation

As the blockchain synchronization code resides in go-ethereum, we need to profile the sync process and figure out which CPU-bound parts can safely be traded for longer sync time, and what the possible outcomes/drawbacks of doing so are.
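
For reference, a minimal sketch of hooking a CPU profile around the sync phase with runtime/pprof (the wrapper function and file name are illustrative; the actual integration point in status-go/go-ethereum is exactly what this issue needs to find):

```go
package main

import (
	"log"
	"os"
	"runtime/pprof"
)

// startCPUProfile writes a CPU profile to path until the returned stop
// function is called. Wrap it around the sync phase and inspect the
// result with `go tool pprof cpu.prof`.
func startCPUProfile(path string) (func(), error) {
	f, err := os.Create(path)
	if err != nil {
		return nil, err
	}
	if err := pprof.StartCPUProfile(f); err != nil {
		f.Close()
		return nil, err
	}
	return func() {
		pprof.StopCPUProfile()
		f.Close()
	}, nil
}

func main() {
	stop, err := startCPUProfile("cpu.prof")
	if err != nil {
		log.Fatal(err)
	}
	defer stop()
	// ... start the node and run the initial blockchain sync here ...
}
```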

Acceptance Criteria

  • The main reason for the phone temperature increase during sync is identified

Random notes

Temperature is believed to be strongly linked with CPU usage, but as the final acceptance criterion is still a low temperature during sync, it makes sense to measure it directly (using the same phone between two app releases). Instead of manual temperature testing by hand, one can use a cheap infrared camera like the Flir One to measure the exact temperature :)
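
As a software-only complement (not a replacement for measuring case temperature from the outside), many Linux-based phones expose on-die sensors under /sys/class/thermal. A rough sketch for reading them, assuming the device exposes readable thermal zones (for example when run on-device or over adb with a cross-compiled binary):

```go
package main

import (
	"fmt"
	"io/ioutil"
	"path/filepath"
	"strconv"
	"strings"
)

// Dumps every readable thermal zone and its temperature. Sysfs reports
// values in millidegrees Celsius on most devices.
func main() {
	zones, _ := filepath.Glob("/sys/class/thermal/thermal_zone*")
	for _, zone := range zones {
		typ, _ := ioutil.ReadFile(filepath.Join(zone, "type"))
		raw, err := ioutil.ReadFile(filepath.Join(zone, "temp"))
		if err != nil {
			continue // some zones need elevated permissions to read
		}
		milli, err := strconv.Atoi(strings.TrimSpace(string(raw)))
		if err != nil {
			continue
		}
		fmt.Printf("%s (%s): %.1f°C\n",
			filepath.Base(zone), strings.TrimSpace(string(typ)), float64(milli)/1000)
	}
}
```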

@tiabc
Contributor

tiabc commented Oct 25, 2017

I think this is not an easily achievable thing, and the scope of this issue is really huge.

Can we reduce it down to just figuring out which specific sections of go-ethereum code most impact CPU utilization and thus phone heating? It may not only be CPU, by the way: go-ethereum stores everything on disk, and disk operations may also have a significant impact.

The idea is to identify such places and understand where to move next, creating new issues and parallelising the work. Does that make sense?

@adambabik
Contributor

LES/2 might help reduce CPU usage. The go-ethereum team merged the first PR related to it just a few days ago: ethereum/go-ethereum#14970

@divan divan changed the title Decrease CPU usage on blockchain sync Investigate temperature impact factors during blockchain sync Oct 26, 2017
@divan
Contributor Author

divan commented Oct 26, 2017

@tiabc yes, let's narrow this issue to investigation only. Renamed the issue.

@3esmit
Member

3esmit commented Oct 27, 2017

I also suffer from this issue on a Galaxy Note 4.

@divan
Contributor Author

divan commented Nov 1, 2017

I started with the naïve assumption of a CPU–temperature link and did some profiling on my laptop. It quickly showed that most of the CPU time during blockchain sync is spent writing blocks into leveldb in the InsertHeaderChain() function.

I introduced a small delay in this call, effectively making this function sleep before executing the expensive write:

+    time.Sleep(100 * time.Microsecond)
     if err := writeHeader(header); err != nil {
         return i, err
     }
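
As an aside: for follow-up measurements it might be handy to make this delay tunable at runtime instead of rebuilding for every value. A rough sketch of what that could look like (the throttle package and its names are purely illustrative, not go-ethereum code):

```go
package throttle

import (
	"sync/atomic"
	"time"
)

// delayNS holds the current write throttle in nanoseconds.
var delayNS int64

// Set updates the delay applied before each expensive write.
func Set(d time.Duration) { atomic.StoreInt64(&delayNS, int64(d)) }

// Wait sleeps for the configured delay, if any. It would be called right
// before the expensive write (e.g. before writeHeader in InsertHeaderChain).
func Wait() {
	if d := time.Duration(atomic.LoadInt64(&delayNS)); d > 0 {
		time.Sleep(d)
	}
}
```

The call site above would then become throttle.Wait() followed by the existing writeHeader(header) call, and a CLI flag or config value could feed throttle.Set().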

Then I measured CPU usage and the total time of synchronizing from scratch.

Here are some preliminary results from my laptop (it has 4 cores, so 400% CPU usage is max):

| Delay | Mean CPU | Time to sync |
|-------|----------|--------------|
| 0     | 229%     | 88.52s       |
| 50μs  | 176%     | 153.98s      |
| 100μs | 153%     | 163.13s      |
| 150μs | 125%     | 222.04s      |
| 200μs | 129%     | 354.77s      |
| 250μs | 134%     | 461.80s      |
| 300μs | 122%     | 533.14s      |
| 350μs | 110%     | 570.09s      |
| 400μs | 93%      | 526.46s      |
| 450μs | 83%      | 503.66s      |
| 500μs | 83%      | 570.26s      |
| 550μs | 78%      | 589.08s      |
| 600μs | 77%      | 616.33s      |
| 1ms   | 58%      | 629.36s      |

[3D chart: delay vs. mean CPU vs. time to sync]

Looks like the inverse relationship is pretty clear here, but I'm going to collect more data to find the exact function.

Here are detailed graphs of CPU usage for three cases: no delay, 100μs, and 1ms:

[CPU usage graphs: no delay, 100μs, 1ms]


Next steps

  • build a custom version of the app with the delay introduced, test it on a real device, and actually measure the temperature impact
  • investigate whether other parameters can also be tuned (the number of blocks written to disk per batch, for example)
  • collect more data to better model the link between CPU usage, sync time, delay, and other parameters
  • play with different models of implementing the delay: presets, smart presets (if the phone is charging: fast sync/high CPU; on battery: slow sync/low CPU), auto-tunable (for example CPU priority with an upper bound), user-tunable (let the user choose), etc.; see the sketch after this list
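
A minimal sketch of the "smart presets" idea from the last bullet (the package, the Profile type, and the charging/battery inputs are all hypothetical; on a real phone they would have to come from the mobile side over the binding):

```go
package syncpreset

import "time"

// Profile describes how aggressively to sync; purely illustrative.
type Profile struct {
	Name       string
	WriteDelay time.Duration // throttle between header writes
}

var (
	// Fast favours sync time over CPU usage and temperature.
	Fast = Profile{Name: "fast", WriteDelay: 0}
	// Balanced trades some sync time for lower CPU usage.
	Balanced = Profile{Name: "balanced", WriteDelay: 100 * time.Microsecond}
	// Gentle keeps CPU usage low at the cost of a much longer sync.
	Gentle = Profile{Name: "gentle", WriteDelay: 500 * time.Microsecond}
)

// Choose picks a preset from device state reported by the mobile side.
func Choose(charging bool, batteryPercent int) Profile {
	switch {
	case charging:
		return Fast
	case batteryPercent < 20:
		return Gentle
	default:
		return Balanced
	}
}
```

The delay values are taken from the measurements above and would need to be re-tuned on real devices.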

Code used for collecting data and making the graphs: https://gist.github.com/divan/55a55240ea553ae47517733f9fc24a48

@themue
Contributor

themue commented Nov 1, 2017

Great work. I'm currently following your pprof label hint from yesterday.

@adambabik
Contributor

Graphs! I love them, great job!

The first thing I noticed is that it can consume more than 100% of CPU, which indicates it uses multiple cores. Could you try changing runtime.GOMAXPROCS(runtime.NumCPU()) to runtime.GOMAXPROCS(1) in cmd/statusd/main.go and rerunning your measurements? Hopefully, we could set it to 1 during syncing and switch back to runtime.NumCPU() afterwards to decrease CPU usage. I hope it won't negatively impact app performance...
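
A minimal sketch of that switch (the syncDone channel is hypothetical; in the real code it would hook into whatever signals the end of the initial sync):

```go
package main

import (
	"runtime"
	"time"
)

// limitCPUDuringSync restricts Go to one simultaneously executing OS thread
// while the initial sync runs, then restores the previous setting once
// syncDone is closed.
func limitCPUDuringSync(syncDone <-chan struct{}) {
	prev := runtime.GOMAXPROCS(1) // returns the previous value
	go func() {
		<-syncDone
		runtime.GOMAXPROCS(prev)
	}()
}

func main() {
	syncDone := make(chan struct{})
	limitCPUDuringSync(syncDone)
	// ... start the node and the initial sync here ...
	time.Sleep(100 * time.Millisecond) // stand-in for the sync work
	close(syncDone)                    // restore full parallelism afterwards
}
```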

Also, have you tried with the newest develop? We have a pretty new CHT root, and it should be much quicker than 88.52s.

@divan
Contributor Author

divan commented Nov 2, 2017

@adambabik yes, it uses 4 cores on my machine, so the maximum CPU value is 400%.

Interesting suggestion, I'll play with it as well. Still, I guess that "processing and writing 2048 block headers to disk as fast as possible" is the major CPU-burning factor here.

I ran these tests on the feature/cli-api branch, so it lacks your CHT update, but that's actually better, because this longer timeframe works well for statistically significant measurements of the delay impact.

I ran more measurements, inserting delays in the range from 50 to 650 microseconds in 50μs steps, updated the original comment, and here are two 2D charts (the same data as the 3D chart above):

[2D charts: delay vs. time to sync, delay vs. mean CPU]

@0xc1c4da
Contributor

0xc1c4da commented Nov 18, 2017

Nice work @divan. I'm curious whether we know what CPU usage looks like under high-traffic Whisper usage? But perhaps this is better suited for another issue.

@divan
Contributor Author

divan commented Nov 18, 2017

@jarradh no, we don't. I don't think we have even simulated a high-traffic scenario yet. But I do have some ideas on how to simulate it, and yes, that'd be a separate issue.

@adambabik
Contributor

To simulate this, we need an account that sends a lot of Whisper messages one after another. It would be great to collect some data with different PoW values, because we currently use 0.001, and messages with such a low PoW can only be passed around by our geth nodes run with custom settings. I believe we'd like to bump it to at least 0.2 (which is the default value) so that most geth nodes could pass our messages around, right? It would be interesting to see how that affects the CPU usage.

@ghost

ghost commented Sep 24, 2018

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@ghost ghost added the stale label Sep 24, 2018
@ghost

ghost commented Oct 1, 2018

This issue has been automatically closed. Please re-open if this issue is important to you.

@ghost ghost closed this as completed Oct 1, 2018
This issue was closed.