Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chrome renderer and worker processes killed before leak scan or summary #544

Closed
derekbruening opened this issue Nov 28, 2014 · 9 comments
Closed

Comments

@derekbruening
Copy link
Contributor

From [email protected] on August 05, 2011 16:35:53

Running WorkerFileSystemTest.Temporary we have the main chrome.exe process
and two children, the renderer (not there with --single-process) and the
worker (test-specific). The renderer and worker die before their error
summary is printed out (presumably killed by the parent chrome process).

We need to either change how they're killed, or add an annotation ( issue #283 ) to send a nudge requesting a summary right before they are killed: simplest to have the nudge also kill to avoid needing bi-directional communication.

% ls -lt logs
drwxr-xr-x+ 1 bruening Domain Users 0 Aug 5 16:23 DrMemory-chrome.exe.3608.000
drwxr-xr-x+ 1 bruening Domain Users 0 Aug 5 16:23 DrMemory-chrome.exe.5900.000
drwxr-xr-x+ 1 bruening Domain Users 0 Aug 5 16:22 DrMemory-chrome.exe.2232.000
% grep FINAL logs/DrMemory-/r
logs/DrMemory-chrome.exe.2232.000/results.txt:FINAL SUMMARY:

% head -3 logs/D_/r_
==> logs/DrMemory-chrome.exe.2232.000/results.txt <==
Dr. Memory version 1.4.451 build 1 built on Aug 5 2011 16:14:33
Dr. Memory results for pid 2232: "chrome.exe"
Application cmdline: ""E:\src\chromium\src\build\Debug\chrome.exe" --disable-web-resources --disable-preconnect --no-first-run --no-default-browser-check --enable-logging --log-level=1 --safebrowsing-disable-auto-update --no-default-browser-check --test-type=ui --user-data-dir="C:\Users\bruening\AppData\Local\Temp\scoped_dir3606" --testing-channel=ChromeTestingInterface:5244.1 --noerrdialogs --metrics-recording-only --enable-logging --allow-file-access-from-files --disable-tab-closeable-state-watcher --allow-file-access --unlimited-quota-for-files --use-gl=osmesa --disable-accelerated-compositing --enable-file-cookies --homepage=about:blank --test-name=WorkerFileSystemTest.Temporary"

==> logs/DrMemory-chrome.exe.5900.000/results.txt <==
Dr. Memory version 1.4.451 build 1 built on Aug 5 2011 16:14:33
Dr. Memory results for pid 5900: "chrome.exe"
Application cmdline: ""E:\src\chromium\src\build\Debug\chrome.exe" --type=renderer --noerrdialogs --disable-accelerated-compositing --enable-logging --log-level=1 --use-gl=osmesa --lang=en-US --force-fieldtest=ConnCountImpact/conn_count_6/ConnnectBackupJobs/ConnectBackupJobsEnabled/DnsImpact/default_enabled_prefetch/DnsParallelism/parallel_10/GlobalSdch/global_enable_sdch/IdleSktToImpact/idle_timeout_60/Prefetch/ContentPrefetchPrerender2/ProxyConnectionImpact/proxy_connections_32/SSLFalseStart/FalseStart_disabled/SpdyCwnd/cwnd16/SpdyImpact/npn_with_spdy/WebSocketExperiment/default/ --user-data-dir="C:\Users\bruening\AppData\Local\Temp\scoped_dir3606" --disable-client-side-phishing-detection --channel=2232.26FAB4B8.537668041 /prefetch:3"

==> logs/DrMemory-chrome.exe.3608.000/results.txt <==
Dr. Memory version 1.4.451 build 1 built on Aug 5 2011 16:14:33
Dr. Memory results for pid 3608: "chrome.exe"
Application cmdline: ""E:\src\chromium\src\build\Debug\chrome.exe" --type=worker --channel=2232.270674D0.1368925514 --enable-logging --log-level=1 --ignored=" --type=renderer " /prefetch:5"

Original issue: http://code.google.com/p/drmemory/issues/detail?id=544

@derekbruening
Copy link
Contributor Author

From [email protected] on August 05, 2011 15:24:22

Note that this is not just a missing summary: it's a missing leak scan as
well.

On Linux SIGTERM is used instead of SIGKILL, giving Memcheck a chance to do
its leak scan and print its summary.

On Windows there is no simple "soft" kill from another process. Could use BroadcastSystemMessage to inform target to exit.

Dr. Memory's nudges are meant for this type of situation, though, and have been used for exactly this on Linux in the past. Re: annotation-triggers-nudge: the current API doesn't support a client nudging another process: only its own. No clean way to launch drconfig: though private loader should support that.

Summary: chrome renderer and worker processes killed before leak scan or summary

@derekbruening
Copy link
Contributor Author

From [email protected] on August 08, 2011 08:25:34

I'm leaning toward adding dr_nudge_process() to API, could be useful in other situations as well (we plan to add other IPC to API in the future).

I'm not a fan of just having a guessed-at timeout before a hard kill that seems long enough for the leak scan: but OTOH then you know for sure the process was shut down.

Status: Accepted
Owner: [email protected]

@derekbruening
Copy link
Contributor Author

From [email protected] on December 01, 2011 09:53:43

the plan for now is to use --single-process

however it seems that there are still chrome.exe processes that are killed abrubtly, preventing a clean exit and leak scan. I have not yet studied in detail but I think they're other tabs or sthg and the ui_test scripting kills them.

@derekbruening
Copy link
Contributor Author

From [email protected] on December 02, 2011 07:36:26

to clarify: what we miss here is the list of named suppressions that were used, the leak scan (if leaks were enabled), and least important if running drmem through other scripts, the error summary

@derekbruening
Copy link
Contributor Author

From [email protected] on December 05, 2011 10:27:46

the plan for this quarter is:

  1. add --single-process to chrome scripts
  2. runtime op to print supp used incrementally
    exponential backoff in freq printed: some ui_tests runs have 6Kx
    can later add script to scrape

Owner: [email protected]

@derekbruening
Copy link
Contributor Author

From [email protected] on April 19, 2012 12:50:42

this has been a real pain point for debugging where asserts don't stay up long enough to attach debuggers. it has also caused problems with pcache generation where child processes don't get to generate everything b/c they're terminated too soon. I may go fix it.

** TODO until have annotations, could have drmem look for NtTerminateProcess targeting another process and turn it into a nudge

nudging checks drmarker, so should fail when target is not under DR (and we
could assume that corresponds to being under drmem), and if it fails drmem
can let the NtTerminateProcess go through: else, turn into a nop

Owner: [email protected]

@derekbruening
Copy link
Contributor Author

From [email protected] on April 23, 2012 12:48:40

summary:

  • under new option -soft_kills which is Windows-only for now
  • on seeing NtTerminateProcess of another process, sends a
    nudge to that process using dr_nudge_client_ex() from DRi#742
  • nudge handler now has two types: leak scan, or terminate
  • terminate nudge uses dr_exit_process() from DRi#743 to do
    a clean exit that includes a leak scan, error and suppression
    summary, and symbol and code cache generation
  • NtTerminateProcess syscall is skipped if the nudge is sent
    successfully, which indicates that the target process
    is under DR control
  • added a new test procterm that creates a child whose leak will
    only be caught if the parent's termination of the child is
    turned into a clean exit

doesn't seem to be much of a perf hit as measured on ui_tests
NPAPITesterBase.NPObjectProxy, and solves multiple problems of
child processes being terminated

-no_soft_kills:
% for ((i=0; i<3; i++)); do BROWSER_WRAPPER="e:\src\drmemory\git\build_drmem_rel\bin\drmemory.exe -no_soft_kills -pause_at_assert -light -logdir C:/users/bruening/AppData/LocalLow/drmemory -suppress e:\src\chromium\src\tools\valgrind\DrMemory\suppressions.txt -quiet -batch -no_results_to_stderr -callstack_max_frames 40 -dr e:/src/dr/git/exports --" /usr/bin/time ./ui_tests --ui-test-action-timeout=1200000 --ui-test-action-max-timeout=2800000 --ui-test-terminate-timeout=1200000 --gtest_filter="NPAPITesterBase.NPObjectProxy"; done 2>&1 | grep elapsed | sed 's/elapsed/ elapsed/;s/:/ /' | awk '{s+=$4;s+=60*$3;l = l ", " $3 ":" $4;n++}END{print "ave=",s/n,l}'
ave= 59.3267 , 1:02.29, 0:55.75, 0:59.94

-soft_kills:
% for ((i=0; i<3; i++)); do BROWSER_WRAPPER="e:\src\drmemory\git\build_drmem_rel\bin\drmemory.exe -pause_at_assert -light -logdir C:/users/bruening/AppData/LocalLow/drmemory -suppress e:\src\chromium\src\tools\valgrind\DrMemory\suppressions.txt -quiet -batch -no_results_to_stderr -callstack_max_frames 40 -dr e:/src/dr/git/exports --" /usr/bin/time ./ui_tests --ui-test-action-timeout=1200000 --ui-test-action-max-timeout=2800000 --ui-test-terminate-timeout=1200000 --gtest_filter="NPAPITesterBase.NPObjectProxy"; done 2>&1 | grep elapsed | sed 's/elapsed/ elapsed/;s/:/ /' | awk '{s+=$4;s+=60*$3;l = l ", " $3 ":" $4;n++}END{print "ave=",s/n,l}'
ave= 58.7467 , 0:55.56, 1:02.60, 0:58.08

so no perf diff at all, surprisingly

a manual run I did shows a hit but see loops above:
-no_soft_kills:
[ OK ] NPAPITesterBase.NPObjectProxy (59967 ms)
0.01user 0.04system 1:00.10elapsed 0%CPU (0avgtext+0avgdata 237056maxresident)k
-soft_kills:
[ OK ] NPAPITesterBase.NPObjectProxy (63126 ms)
0.00user 0.01system 1:03.21elapsed 0%CPU (0avgtext+0avgdata 236544maxresident)k

@derekbruening
Copy link
Contributor Author

From [email protected] on April 23, 2012 12:54:40

confirming for -soft_kills:
% grep -c 'SUPPRESSIONS USED' /c/users/bruening/AppData/LocalLow/drmemory/D_/g_
/c/users/bruening/AppData/LocalLow/drmemory/DrMemory-chrome.exe.2668.000/global.2668.log:1
/c/users/bruening/AppData/LocalLow/drmemory/DrMemory-chrome.exe.2828.000/global.2828.log:1
/c/users/bruening/AppData/LocalLow/drmemory/DrMemory-chrome.exe.3164.000/global.3164.log:1
/c/users/bruening/AppData/LocalLow/drmemory/DrMemory-chrome.exe.3412.000/global.3412.log:1
/c/users/bruening/AppData/LocalLow/drmemory/DrMemory-chrome.exe.404.000/global.404.log:1
/c/users/bruening/AppData/LocalLow/drmemory/DrMemory-chrome.exe.4188.000/global.4188.log:1
/c/users/bruening/AppData/LocalLow/drmemory/DrMemory-chrome.exe.4472.000/global.4472.log:1
/c/users/bruening/AppData/LocalLow/drmemory/DrMemory-chrome.exe.4524.000/global.4524.log:1
/c/users/bruening/AppData/LocalLow/drmemory/DrMemory-chrome.exe.4912.000/global.4912.log:1
/c/users/bruening/AppData/LocalLow/drmemory/DrMemory-chrome.exe.5168.000/global.5168.log:1
/c/users/bruening/AppData/LocalLow/drmemory/DrMemory-chrome.exe.5260.000/global.5260.log:1
/c/users/bruening/AppData/LocalLow/drmemory/DrMemory-chrome.exe.6068.000/global.6068.log:1

while -no_soft_kills only has final stuff for 2 of the 4 processes (only 2
runs here, threw away other accidentally):
% grep -c 'SUPPRESSIONS USED' /c/users/bruening/AppData/LocalLow/drmemory/D_/g_
/c/users/bruening/AppData/LocalLow/drmemory/DrMemory-chrome.exe.4320.000/global.4320.log:0
/c/users/bruening/AppData/LocalLow/drmemory/DrMemory-chrome.exe.5532.000/global.5532.log:1
/c/users/bruening/AppData/LocalLow/drmemory/DrMemory-chrome.exe.5872.000/global.5872.log:0
/c/users/bruening/AppData/LocalLow/drmemory/DrMemory-chrome.exe.5944.000/global.5944.log:1
/c/users/bruening/AppData/LocalLow/drmemory/DrMemory-chrome.exe.4476.000/global.4476.log:0
/c/users/bruening/AppData/LocalLow/drmemory/DrMemory-chrome.exe.5504.000/global.5504.log:1
/c/users/bruening/AppData/LocalLow/drmemory/DrMemory-chrome.exe.5768.000/global.5768.log:0
/c/users/bruening/AppData/LocalLow/drmemory/DrMemory-chrome.exe.6132.000/global.6132.log:1

@derekbruening
Copy link
Contributor Author

From [email protected] on April 26, 2012 18:52:52

This issue was closed by revision r831 .

Status: Fixed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant