Probe is using 40% cpu #284

Closed
tomwilkie opened this issue Jun 24, 2015 · 13 comments
Labels
tech-debt Unpleasantness that does (or may in future) affect development

Comments

@tomwilkie
Contributor

top - 14:48:42 up  3:50,  4 users,  load average: 1.08, 1.21, 0.93
Tasks: 120 total,   2 running, 118 sleeping,   0 stopped,   0 zombie
%Cpu(s):  7.0 us, 22.2 sy,  0.0 ni, 70.6 id,  0.0 wa,  0.2 hi,  0.0 si,  0.0 st
KiB Mem:   1017796 total,   565664 used,   452132 free,    11924 buffers
KiB Swap:  3063804 total,   176724 used,  2887080 free.   138760 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                         
 6850 root      20   0   59044  54084   2800 S  38.0  5.3  64:07.44 probe                                                                                                           
 1217 root      20   0  375036   6956   1748 S   4.0  0.7   7:23.28 dragent                                                                                                         
 1262 root      20   0  748996  29240   4464 S   4.0  2.9   3:54.85 ruxitagent                                                                                                      
 1218 root      20   0 1707540  65120   7056 S   2.6  6.4   7:28.29 java                                                                                                            
  655 root      20   0 1200600  13084   3144 S   2.0  1.3  12:48.40 docker                                                                                                          
 2250 root      20   0  256652  37376  20596 S   1.7  3.7   3:57.39 weaver                                                                                                          
 2485 root      20   0   13328   2840   1540 S   1.3  0.3   2:42.29 weavedns                                                                                                        
 5644 ruxitusr  20   0  215408  70100  64848 S   1.0  6.9   0:37.53 ruxitagentnetwo                                                                                                 
    7 root      20   0       0      0      0 S   0.7  0.0   1:34.30 rcu_sched             
@tomwilkie
Contributor Author

I think the CPU usage stems from walking the proc tree 3 times: once for procspy, once for the process topology, and once for the docker tagger.

@tomwilkie
Contributor Author

[Image: "prof copy"]

@tomwilkie
Contributor Author

70% of our CPU activity is syscalls (read, listdir, stat, etc.).

We could reduce this by two thirds if we walk the proc tree once, gathering the required info, then let procspy, the process reporter and the docker tagger loose on it.

The process reporter + docker tagger are easy, so I'll start with them.
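For illustration, here is a minimal sketch of the "walk once, share the result" idea. It is not the code that went into scope: the `Process` type, `walkProc`, `parseStat` and the two consumers are hypothetical names, and it assumes the only data needed is the PID, parent PID and command name from each `stat` file.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// Process holds the per-process data gathered in a single pass over /proc.
// (Hypothetical type; a real walker would carry more fields.)
type Process struct {
	PID, PPID int
	Comm      string
}

// parseStat extracts the command name and parent PID from the contents of a
// stat file, whose layout is "pid (comm) state ppid ...". The command name
// may itself contain spaces or parentheses, so we look for the last ')'.
func parseStat(pid int, stat string) (Process, bool) {
	start := strings.IndexByte(stat, '(')
	end := strings.LastIndexByte(stat, ')')
	if start < 0 || end < start {
		return Process{}, false
	}
	fields := strings.Fields(stat[end+1:]) // fields[0]=state, fields[1]=ppid
	if len(fields) < 2 {
		return Process{}, false
	}
	ppid, err := strconv.Atoi(fields[1])
	if err != nil {
		return Process{}, false
	}
	return Process{PID: pid, PPID: ppid, Comm: stat[start+1 : end]}, true
}

// walkProc lists procRoot once and reads one stat file per PID directory.
func walkProc(procRoot string) ([]Process, error) {
	dir, err := os.Open(procRoot)
	if err != nil {
		return nil, err
	}
	defer dir.Close()

	names, err := dir.Readdirnames(-1)
	if err != nil {
		return nil, err
	}

	var procs []Process
	for _, name := range names {
		pid, err := strconv.Atoi(name)
		if err != nil {
			continue // not a PID directory
		}
		stat, err := os.ReadFile(filepath.Join(procRoot, name, "stat"))
		if err != nil {
			continue // the process probably exited during the walk
		}
		if p, ok := parseStat(pid, string(stat)); ok {
			procs = append(procs, p)
		}
	}
	return procs, nil
}

func main() {
	procs, err := walkProc("/proc")
	if err != nil {
		fmt.Println("walk failed:", err)
		return
	}
	// Hypothetical stand-ins for the process reporter and the docker tagger:
	// both receive the same slice, so /proc is only read once per cycle.
	consumers := []func([]Process){
		func(ps []Process) { fmt.Println("reporter saw", len(ps), "processes") },
		func(ps []Process) { fmt.Println("tagger saw", len(ps), "processes") },
	}
	for _, consume := range consumers {
		consume(procs)
	}
}
```

Each extra consumer then costs a loop over an in-memory slice rather than another round of directory listings and file reads.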

@tomwilkie
Contributor Author

Next step: make procspy use the cached process walk. Will need to extend the process walk to read fds & /proc//net/tcp entries.
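As a rough sketch of what reading those two sources involves: each row of the kernel's TCP table carries a socket inode, and a process's fd entries that are sockets are symlinks whose target looks like `socket:[inode]`, so the inode is the join key between a connection and a process. The code below is not procspy's implementation; `Conn`, `parseTCPLine`, `parseAddr` and `socketInode` are hypothetical names, and it assumes IPv4 rows on a little-endian host.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"net"
	"strconv"
	"strings"
)

// Conn is one row of the TCP table (hypothetical type, IPv4 only).
type Conn struct {
	LocalIP    net.IP
	LocalPort  uint16
	RemoteIP   net.IP
	RemotePort uint16
	Inode      uint64
}

// parseAddr decodes "0100007F:1F90" into 127.0.0.1 and 8080.
// Assumption: little-endian host, so the four address bytes appear reversed.
func parseAddr(s string) (net.IP, uint16, error) {
	parts := strings.SplitN(s, ":", 2)
	if len(parts) != 2 || len(parts[0]) != 8 {
		return nil, 0, fmt.Errorf("unexpected address %q", s)
	}
	raw, err := hex.DecodeString(parts[0])
	if err != nil {
		return nil, 0, err
	}
	port, err := strconv.ParseUint(parts[1], 16, 16)
	if err != nil {
		return nil, 0, err
	}
	return net.IPv4(raw[3], raw[2], raw[1], raw[0]), uint16(port), nil
}

// parseTCPLine parses one data row (not the header) of the TCP table.
// Whitespace-separated columns: 1 = local address, 2 = remote address,
// 9 = socket inode.
func parseTCPLine(line string) (Conn, error) {
	fields := strings.Fields(line)
	if len(fields) < 10 {
		return Conn{}, fmt.Errorf("short line: %q", line)
	}
	var c Conn
	var err error
	if c.LocalIP, c.LocalPort, err = parseAddr(fields[1]); err != nil {
		return Conn{}, err
	}
	if c.RemoteIP, c.RemotePort, err = parseAddr(fields[2]); err != nil {
		return Conn{}, err
	}
	if c.Inode, err = strconv.ParseUint(fields[9], 10, 64); err != nil {
		return Conn{}, err
	}
	return c, nil
}

// socketInode extracts the inode from an fd symlink target such as
// "socket:[12345]"; it returns false for files, pipes and so on.
func socketInode(link string) (uint64, bool) {
	const prefix = "socket:["
	if !strings.HasPrefix(link, prefix) || !strings.HasSuffix(link, "]") {
		return 0, false
	}
	n, err := strconv.ParseUint(link[len(prefix):len(link)-1], 10, 64)
	return n, err == nil
}

func main() {
	// Made-up sample row, for illustration only.
	row := "   0: 0100007F:1F90 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 12345 1"
	c, err := parseTCPLine(row)
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	inode, _ := socketInode("socket:[12345]")
	fmt.Println(c.LocalIP, c.LocalPort, c.Inode == inode)
}
```

If the cached walk records the socket inodes per process while it is already visiting each PID directory, the connection-to-process mapping becomes a map lookup instead of another pass over /proc.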

@tomwilkie tomwilkie removed their assignment Jul 8, 2015
@inercia
Contributor

inercia commented Sep 1, 2015

@tomwilkie what do you mean when you say "make procspy use the cached process walk"? Do you mean moving the code from #287 to procspy?

@tomwilkie
Contributor Author

It would need extending too; procspy reads more files than the existing proc walker.

I would probably move the bits of procspy we use into the scope codebase (as opposed to moving this out into procspy); this would introduce a lot of coupling.

@tomwilkie tomwilkie added the tech-debt Unpleasantness that does (or may in future) affect development label Sep 1, 2015
@inercia
Contributor

inercia commented Sep 1, 2015

Do you mean that we should drop scope's dependency on procspy altogether?

@tomwilkie
Contributor Author

By moving the bits of procspy we use into scope; yes.

@inercia
Contributor

inercia commented Sep 1, 2015

Ok, I'll take a look at this...

@inercia inercia self-assigned this Sep 1, 2015
@tomwilkie
Contributor Author

Have a quick chat with @peterbourgon before you start.

Also, try to get your other outstanding PRs finished first; we don't like having too many outstanding.

@inercia
Contributor

inercia commented Sep 1, 2015

Sure, I'll discuss the details with @peterbourgon. I'll try to finish #404 first, but I usually try to have several tasks in the pipeline so I never run out of work...

@tomwilkie
Contributor Author

Seems like CPU usage has gone up again. Need to do some more profiling.

@inercia
Contributor

inercia commented Sep 10, 2015

@tomwilkie #450 reduces the number of syscalls by avoiding many stat()s, open()s, close()s

[Image: profile]

Note: I'm using Go 1.5.
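One general way to cut stat() calls when listing /proc in Go, shown here only as an illustration and not as a description of what #450 does: `(*os.File).Readdir` returns `os.FileInfo` values and so has to stat each entry, whereas `Readdirnames` returns only the names. Since PID directories are exactly the entries with numeric names, the name alone is enough to filter them. `parsePID` and `listPIDs` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// parsePID decides from the name alone whether a /proc entry is a PID
// directory, so no stat() is needed to tell it apart from "net", "self", etc.
func parsePID(name string) (int, bool) {
	if name == "" || name[0] < '0' || name[0] > '9' {
		return 0, false
	}
	pid, err := strconv.Atoi(name)
	return pid, err == nil
}

// listPIDs opens procRoot once and reads the entry names without statting them.
func listPIDs(procRoot string) ([]int, error) {
	dir, err := os.Open(procRoot)
	if err != nil {
		return nil, err
	}
	defer dir.Close()

	names, err := dir.Readdirnames(-1)
	if err != nil {
		return nil, err
	}

	pids := make([]int, 0, len(names))
	for _, name := range names {
		if pid, ok := parsePID(name); ok {
			pids = append(pids, pid)
		}
	}
	return pids, nil
}

func main() {
	pids, err := listPIDs("/proc")
	if err != nil {
		fmt.Println("listing failed:", err)
		return
	}
	fmt.Println("found", len(pids), "PID directories")
}
```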
