Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"Killed: 9"? #48

Closed
spencerwilson-optimizely opened this issue Sep 27, 2018 · 20 comments
Closed

"Killed: 9"? #48

spencerwilson-optimizely opened this issue Sep 27, 2018 · 20 comments

Comments

@spencerwilson-optimizely
Copy link

spencerwilson-optimizely commented Sep 27, 2018

Hello there,

Thank you for your work on the tool! I'm very excited to use it, but seem to have hit a snag:

17:12 ~/code/optimizely/src/www
$ pstree 98465
--- 98465 swilson /Users/swilson/code/optimizely/.virtualenv/www/test/bin/python /Users/swilson/code/optimizely/.virtualenv/www/test/bin/pytest --pdb services_test/client_tes [truncated]

17:13 ~/code/optimizely/src/www
$ sudo py-spy --pid 98465 -f hottt
Killed: 9

I've ensured that the PID of pytest isn't moving around; i.e., it's a long-running process with no child processes. Yet while it's running, I get the output Killed: 9, exit status 137.

Python 2.7.12
py-spy 0.1.8
macOS 12.13.6

Let me know if I can provide any other details!

@benfred
Copy link
Owner

benfred commented Sep 27, 2018

So - it sort of looks like something killed py-spy =(by going 'kill -9' or similar) =(.

Can you figure out what process issued the SIGKILL? There are probably a bunch of ways of getting this, but an easy first stab might be to use dtrace. Running these commands:

curl  http://www.brendangregg.com/DTrace/kill.d > kill.d
sudo /usr/sbin/dtrace -qs kill.d 

will print out a series of lines with the sending process info and what signal is being sent every time a signal happens on your machine. If you filter down to '9' signals sent at the same time py-spy is killed, you should get the process that is killing it.

@spencerwilson-optimizely
Copy link
Author

Good idea. I found that each attempt to run sudo py-spy -p $PID -f hott resulted in a line like

 FROM      COMMAND   SIG TO     RESULT
 9719         sudo     9 9719   0

sudo is killing itself.

I added some debug logging to sudo via /etc/sudo.conf and obtained these logs: gist.

I'm confident that the security policy in effect is permitting the command to go through:

Sep 26 22:20:10 sudo[9924] policy plugin returns 1

cross-referencing w/

The check_policy() function returns 1 if the command is allowed, 0 if not allowed

from man sudo_plugin. Also note that in the ultra debug mode logs its attempting to exec the py-spy command.

I suspect what's most relevant is

Sep 26 22:20:10 sudo[9924] failed to read child status: EOF

in part because that's the single line that gets logged if I decrease the log level to warn.

Probably worth mentioning at this point, too:

$ sudo --version
Sudo version 1.8.17p1
Sudoers policy plugin version 1.8.17p1
Sudoers file grammar version 45
Sudoers I/O plugin version 1.8.17p1

@spencerwilson-optimizely
Copy link
Author

spencerwilson-optimizely commented Sep 27, 2018

For reference, here's the line of code that's coming from.

@spencerwilson-optimizely
Copy link
Author

spencerwilson-optimizely commented Sep 27, 2018

More data, demoing the 'direct invocation' style + showing I'm not trying to use /usr/bin/python:

09:05 ~/code/optimizely-integrations
$ cat foo.py
while True:
  1 + 1

09:05 ~/code/optimizely-integrations
$ sudo py-spy -- /Users/swilson/.pyenv/versions/2.7.12/bin/python foo.py
Killed: 9

@spencerwilson-optimizely
Copy link
Author

Ah, looks like the process gets created, but as a child of /sbin/launchd:

09:27 ~/code/optimizely-integrations
# via https://askubuntu.com/q/56266
$ logpid() { while sleep 1; do  ps -p $1 -o pcpu= -o pmem= ; done; }

09:27 ~/code/optimizely-integrations
$ sudo py-spy -- /Users/swilson/.pyenv/versions/2.7.12/bin/python foo.py
Killed: 9

09:27 ~/code/optimizely-integrations
$ pstree -p 21178
-+= 00001 root /sbin/launchd
 \--- 21178 root /Users/swilson/.pyenv/versions/2.7.12/bin/python foo.py

09:27 ~/code/optimizely-integrations
$ logpid 21178
 99.0  0.0
100.0  0.0
100.0  0.0
 99.7  0.0
100.0  0.0
100.0  0.0
100.0  0.0
^C

@benfred
Copy link
Owner

benfred commented Sep 27, 2018

Hmm, weird.

Instead of launching py-spy like sudo py-spy .. can you go like sudo su to get a shell session as root, and then run py-spy from that ? I'm interested to see if sudo still kills itself in this case.

Can you also enable logging in py-spy? (RUST_LOG=info py-spy --pid 123456).

I think the error message from using the system python would be different. (also the system python on high sierra is 2.7.10 instead).

@spencerwilson-optimizely
Copy link
Author

spencerwilson-optimizely commented Sep 27, 2018

13:12 ~
$ whoami
root

13:12 ~
$ py-spy -- /Users/swilson/.pyenv/versions/2.7.12/bin/python foo.py
Killed: 9

13:13 ~
$ echo $?
137

13:13 ~
$ RUST_LOG=info py-spy -- /Users/swilson/.pyenv/versions/2.7.12/bin/python foo.py
 INFO 2018-09-27T20:15:24Z: py_spy: Command line args: ArgMatches { args: {"rate": MatchedArg { occurs: 0, indices: [3], vals: ["200"] }, "python_program": MatchedArg { occurs: 2, indices: [1, 2], vals: ["/Users/swilson/.pyenv/versions/2.7.12/bin/python", "foo.py"] }, "duration": MatchedArg { occurs: 0, indices: [4], vals: ["2"] }}, subcommand: None, usage: Some("USAGE:\n    py-spy [FLAGS] [OPTIONS] --pid <pid> [python_program]...") }
 INFO 2018-09-27T20:15:24Z: py_spy::python_spy: Found process binary @ '/Users/swilson/.pyenv/versions/2.7.12/bin/python2.7'
Killed: 9

13:15 ~
$ pstree -p 27545
-+= 00001 root /sbin/launchd
 \--- 27545 root /Users/swilson/.pyenv/versions/2.7.12/bin/python foo.py

Strangely, Brendan Gregg's dtrace script isn't listing any SIGKILLs, unlike before.

When I run the looping script w/o py-spy and then manually kill it, the signal is observed by dtrace:

$ python foo.py &
[1] 27609

13:21 ~
$ kill -9 27609

# in another shell...
dtrace: system integrity protection is on, some features will not be available

 FROM      COMMAND   SIG TO     RESULT
27436           sh     9 27609  0

I'm starting to wonder if system integrity protection might be involved. I attempted to trace system calls to see if I could find what was sending or printing the "Killed: 9", but no luck yet

13:35 ~
$ csrutil status
System Integrity Protection status: enabled.

13:35 ~
$ dtruss py-spy -- /Users/swilson/.pyenv/versions/2.7.12/bin/python foo.py
dtrace: system integrity protection is on, some features will not be available

dtrace: failed to execute py-spy: dtrace cannot control executables signed with restricted entitlements

@spencerwilson-optimizely
Copy link
Author

hm, I have another machine, macOS 10.13.4 w/ SIP enabled, and sudo py-spy -- $(pyenv which python) foo.py works fine there.

@benfred
Copy link
Owner

benfred commented Sep 28, 2018

I don't think this problem is related to SIP - iirc the error messages when hitting SIP are a permissions denied error (also it's not getting that far =).

Based off the logs, py-spy is getting killed when reading the virtual memory maps - which really only calls task_for_pid followed by a couple of mach_vm_region calls: https://github.com/rbspy/proc-maps/blob/96b274b826d7f9246856986af632b33672e9a7d9/src/mac_maps/mod.rs#L93-L95

You should be able to see these calls running with something like

sudo  dtrace -n 'pid$target::mach_vm_region:entry'  \
             -n 'pid$target::task_for_pid:entry' \
             -c 'py-spy -f o.svg -- python foo.py'

which should list one task_for_pid call followed by a bunch of mach_vm_region (to get the memory maps). if this works there will also be an extra task_for_pid call at the end (to get the handle to read the process memory).

I'm really not sure why this would cause a sigkill though =( Do you have any antivirus software running? Or is there any other major differences in the environment between your machines?

@benfred
Copy link
Owner

benfred commented Sep 28, 2018

hmm - actually you should be able to use dtruss, and if you can't you probably can't use dtrace either 🤦‍♂️ .

Can you use dtruss/dtrace on your python (without py-spy) ? like does dtruss python foo.py bring up the same error?

I'll try out re-enabling SIP tomorrow to test out.

@spencerwilson-optimizely
Copy link
Author

I identified the issue with dtruss:

21:55 ~
$ which py-spy
/Users/swilson/.pyenv/shims/py-spy

21:55 ~
$ head -n1 $(which py-spy)
#!/usr/bin/env bash

sudo dtruss py-spy ... attempts to trace /usr/bin/env, which is a no-go under SIP (just like attempting to py-spy on /usr/bin/python, I suspect). Substituting with sudo dtruss $(pyenv which py-spy) ... works! So I think that's a red herring. Sorry for that distraction >_>

The output of the command you suggested above:

$ sudo  dtrace -n 'pid$target::mach_vm_region:entry'               -n 'pid$target::task_for_pid:entry'              -c "$(pyenv which py-spy) -f o.svg -- python foo.py"
dtrace: system integrity protection is on, some features will not be available

dtrace: description 'pid$target::mach_vm_region:entry' matched 1 probe
dtrace: description 'pid$target::task_for_pid:entry' matched 2 probes
dtrace: pid 14131 has exited
CPU     ID                    FUNCTION:NAME
  5   4734               task_for_pid:entry

@spencerwilson-optimizely
Copy link
Author

Do you have any antivirus software running? Or is there any other major differences in the environment between your machines?

Good idea. I upgraded the other to the same version of macOS and ensured same version of Python and py-spy, and still the one works, one doesn't. There is an antivirus running, Cb Defense. Unfortunately, I can't turn it off without a trip over to our IT team (and even no guarantees then ;) ). I'll pitch it to them to try to test that hypothesis, though. I don't know anything about how antivirus software works, but that's kinda freaky if it can kill things without being observed by dtrace.

@benfred
Copy link
Owner

benfred commented Sep 28, 2018

hmm , it looks like it's getting killed in task_for_pid then? I guess we'd have to also log the task_for_pid return too to be sure (instead of just the entry)

sudo  dtrace -n 'pid$target::task_for_pid:' \
             -c 'py-spy -f o.svg -- python foo.py'

which should dump out something like

CPU     ID                    FUNCTION:NAME
  2  28269               task_for_pid:entry 
  2  28270                   task_for_pid:0 
  2  28271                   task_for_pid:3 
  2  28272                   task_for_pid:8 
  2  28273                   task_for_pid:a 
  2  28268              task_for_pid:return 

@spencerwilson-optimizely
Copy link
Author

spencerwilson-optimizely commented Sep 28, 2018

dtrace: system integrity protection is on, some features will not be available

dtrace: description 'pid$target::task_for_pid:' matched 14 probes
dtrace: pid 28285 has exited
CPU     ID                    FUNCTION:NAME
  2   4991               task_for_pid:entry
  2   4992                   task_for_pid:0
  2   4993                   task_for_pid:3
  2   4994                   task_for_pid:8

Reran it multiple times and the output seemed consistently that ^

@crepererum
Copy link

I think I've seen this problem before 🤔
hishamhm/htop#682

@benfred
Copy link
Owner

benfred commented Sep 28, 2018

@crepererum I think that htop problem might be the same as this : https://jvns.ca/blog/2018/01/28/mac-freeze/ - which we should handle (with the sleep call here: https://github.com/rbspy/proc-maps/blob/96b274b826d7f9246856986af632b33672e9a7d9/src/mac_maps/mod.rs#L144 and I think apple has fixed the kernel bug in the macos version he's using). The htop issue is mentioned on this twitter thread about that post anyways: https://twitter.com/b0rk/status/957498366606368768

@spencerwilson-optimizely I think this program should reproduce the problem (without all the other rust/py-spy code getting in the way and confusing the issue):

// file: task_for_pid_test.cc
#include <mach/mach_init.h>
#include <mach/port.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, const char * argv[]) {
    if (argc <= 1) {
        printf("usage %s <pid>\n", argv[0]);
        return 1;
    }

    int pid = atoi(argv[1]);

    mach_port_name_t task = MACH_PORT_NULL;
    kern_return_t kr = task_for_pid(mach_task_self(), pid, &task);
    if (kr != KERN_SUCCESS) {
        printf("failed to call task_for_pid on pid %i: %x\n", pid, kr);
        return kr;
    }
    printf("success!\n");
    return 0;
}

compile with g++ task_for_pid_test.cc -o task_for_pid_test run with task_for_pid_test 12345 where 12345 is your pid.

@spencerwilson-optimizely
Copy link
Author

spencerwilson-optimizely commented Sep 28, 2018

@benfred : In one shell I kicked off python foo.py, and in another:

$ ./task_for_pid_test 30533
Killed: 9

Really nice work! You called it 🔮

Again, nothing appears in kill.d's output with each run of ./task_for_pid_test.

@benfred
Copy link
Owner

benfred commented Sep 28, 2018

It's great that we figured out where it's happening - but I'm still not sure of why =(.

It might be worth checking to see if you can disable your antivirus and try it out. I'm leaning towards it being a 3rd party program killing the process rather than something with os restrictions like SIP or taskgated (because I'd expect the task_for_pid call to fail rather than cause the process to get killed if this call is restricted by the OS). It still might be worth turning off SIP and verifying though.

Is there anything in the system log when this happens ? (Like open Console.app -> and checking system.log or /var/log). Or is there anything in the antivirus logs?

@spencerwilson-optimizely
Copy link
Author

spencerwilson-optimizely commented Sep 28, 2018

11:27 ~
$ tail -f /var/log/system.log | rg task_for_pid
# I run ./task_for_pid_test in another shell...
Sep 28 11:27:05 us-swilson CbDefense_Svc[8660]: The application "/Users/swilson/task_for_pid_test" attempted to open the process "python3.6" by calling the function "task_for_pid". The operation was blocked and the application terminated by CbDefense

Filed a ticket with our IT team to see if an exception can safely be made.

@benfred
Copy link
Owner

benfred commented Sep 28, 2018

awesome! glad we got to the bottom of this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants