High CPU consumption in probes #1454

Closed
2opremio opened this issue May 9, 2016 · 5 comments

2opremio commented May 9, 2016

On the three worker machines of the service, the CPU consumption of the 0.15 candidate is through the roof:

[Screenshot: CPU usage graph, 2016-05-09 15:17]

Also, reports are being dropped, probably for the same reason:

<probe> WARN: 2016/05/09 14:22:16.781555 Docker reporter took longer than 1s
<probe> ERRO: 2016/05/09 14:22:23.174719 Dropping report to 10.0.26.13:4040
<probe> WARN: 2016/05/09 14:22:25.812799 Docker reporter took longer than 1s
<probe> ERRO: 2016/05/09 14:22:27.388793 Dropping report to 10.0.26.13:4040
<probe> ERRO: 2016/05/09 14:22:29.333923 Dropping report to 10.0.26.13:4040
<probe> ERRO: 2016/05/09 14:22:32.135056 Dropping report to 10.0.26.13:4040
<probe> ERRO: 2016/05/09 14:22:38.283660 Dropping report to 10.0.26.13:4040
<probe> ERRO: 2016/05/09 14:22:41.262928 Dropping report to 10.0.26.13:4040
<probe> ERRO: 2016/05/09 14:22:44.862768 Dropping report to 10.0.26.13:4040
<probe> ERRO: 2016/05/09 14:22:50.425897 Dropping report to 10.0.26.13:4040
<probe> ERRO: 2016/05/09 14:22:53.383612 Dropping report to 10.0.26.13:4040
<probe> WARN: 2016/05/09 14:22:55.870504 Docker reporter took longer than 1s
<probe> ERRO: 2016/05/09 14:22:57.934171 Dropping report to 10.0.26.13:4040
<probe> ERRO: 2016/05/09 14:22:59.411752 Dropping report to 10.0.26.13:4040
<probe> WARN: 2016/05/09 14:23:01.011165 Docker reporter took longer than 1s
<probe> WARN: 2016/05/09 14:23:02.742918 Docker reporter took longer than 1s
<probe> ERRO: 2016/05/09 14:23:03.517608 Dropping report to 10.0.26.13:4040
<probe> ERRO: 2016/05/09 14:23:06.190370 Dropping report to 10.0.26.13:4040
<probe> ERRO: 2016/05/09 14:23:08.418448 Dropping report to 10.0.26.13:4040
<probe> ERRO: 2016/05/09 14:23:11.232937 Dropping report to 10.0.26.13:4040
<probe> ERRO: 2016/05/09 14:23:14.856745 Dropping report to 10.0.26.13:4040

Resulting in an incomplete visualization of the service:

[Screenshot: Scope topology view, 2016-05-09 15:44]

Note how an app-mapper, a frontend, and the ui-servers are missing.

Profile:
pprof.localhost:4041.samples.cpu.001.pb.gz
[CPU profile graph: probe_cpu]

2opremio added this to the 0.15.0 milestone May 9, 2016

2opremio commented May 9, 2016

There are ~200 containers per machine, of which fewer than 100 are running.

Reducing that number would require adjusting kubelet's garbage-collection arguments (http://kubernetes.io/docs/admin/garbage-collection/), so it seems we should be able to support that number of containers.
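
For reference, a rough sketch of the kubelet container garbage-collection flags described on that page (the values here are only illustrative, not a recommendation):

```
kubelet ... \
  --minimum-container-ttl-duration=1m \
  --maximum-dead-containers-per-container=2 \
  --maximum-dead-containers=100
```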

2opremio commented May 9, 2016

Here's the output of `go tool pprof -focus 'GetNode' -png pprof.localhost\:4041.samples.cpu.001.pb.gz`:

[pprof call graph focused on GetNode]

2opremio commented May 9, 2016

It seems we rebuild the report nodes on every reporter iteration, which, for containers that didn't change, is wasted CPU.

I will try caching the nodes and only regenerating them when they are affected by a Docker event.
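
A minimal sketch of that caching approach (the type and method names here are hypothetical, not Scope's actual code):

```go
// Hypothetical sketch of the caching idea: build a container's report node
// once and reuse it until a Docker event touches that container.
package docker

import "sync"

// Node stands in for Scope's report node type.
type Node map[string]string

type nodeCache struct {
	mtx   sync.Mutex
	nodes map[string]Node // keyed by container ID
}

func newNodeCache() *nodeCache {
	return &nodeCache{nodes: map[string]Node{}}
}

// Get returns the cached node for a container, building it only on a miss.
func (c *nodeCache) Get(containerID string, build func() Node) Node {
	c.mtx.Lock()
	defer c.mtx.Unlock()
	if n, ok := c.nodes[containerID]; ok {
		return n // unchanged container: skip the rebuild
	}
	n := build()
	c.nodes[containerID] = n
	return n
}

// Invalidate drops a container's cached node when a Docker event (start,
// stop, destroy, ...) affects it, so the next report rebuilds it.
func (c *nodeCache) Invalidate(containerID string) {
	c.mtx.Lock()
	defer c.mtx.Unlock()
	delete(c.nodes, containerID)
}
```

The reporter would call Get per container when building a report, and the Docker event loop would call Invalidate for affected containers.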

tomwilkie commented:

The "Dropping report" is a new logging line added this release; it will be dropping reports due to the app being slow, not the probe.

2opremio commented:

The "Dropping report" is a new logging line added this release; it will be dropping reports due to the app being slow, not the probe.

Good to know; then that's #1457.

This was referenced May 10, 2016
2opremio self-assigned this May 11, 2016
2opremio added the performance label May 13, 2016