KernelDebugging
- summary How to debug with a two-machine setup
You need two machines, both running exactly the same version of OSX (e.g. 10.5.8 and 10.5.8, or 10.6.2 and 10.6.2). It helps if they're the same architecture, but this isn't strictly necessary.
You'll need to download the OSX Kernel Debug SDK for the OS version you're currently using/targeting. When mounted, it will show up as `/Volumes/KernelDebugKit` for whatever version you're using.
You need a mechanism for transferring files between the two computers. One way of doing this is to export a common directory over NFS; another is to use `rsync`. For the purposes of this document, I'll assume that you have a directory `/target` and that the machine is called `target` as well, so that `rsync -cav . target:/target` will copy your files across.
You need to configure your target box to drop into the debugger upon a kernel panic, or when the power button (usually used to put the machine to sleep) is pressed. This only needs to be done once per machine, since the setting is stored in NVRAM.
* `sudo nvram boot-args="debug=0x14e"`
Alternatively, you can put it in `/Library/Preferences/SystemConfiguration/com.apple.Boot.plist` under the `Kernel Flags` entry, which also works for virtual machines.
What do the debug flags mean? The value is a bitmask built from the flags below; they're listed in Apple's Kernel Programming documentation. Here is a succinct summary:
* `0x01` - Halt at boot time and wait for debugger attach
* `0x02` - Send kernel debugging printf output to console
* `0x04` - Drop into debugger on non-maskable interrupt
* `0x08` - Send kernel debugging kprintf to serial port
* `0x10` - Make ddb(kdb) the default debugger
* `0x20` - Output diagnostics to system log
* `0x40` - Allow debugger to ARP and route
* `0x80` - Support old versions of gdb on newer systems
* `0x100` - Disable graphical panic dialog
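These can be combined by adding (bitwise OR-ing) the values, so `0x14e` is `0x100` (disable graphical panic dialog) + `0x40` (allow ARP) + `0x08` (kprintf to serial port) + `0x04` (drop into debugger on NMI) + `0x02` (printf output to console). `0x144` is another common variant; it drops the two logging flags (`0x08` and `0x02`), so less output is printed.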
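To check a mask before committing it to `boot-args`, the flag table can be turned into a small decoder. This is a minimal sketch in POSIX shell; the function name `decode_debug_flags` is made up for illustration, and the descriptions are copied from the summary above rather than from any Apple tool.

```shell
#!/bin/sh
# Hypothetical helper: print the flags that are set in a debug= mask.
# The table below mirrors the summary list in this document.
decode_debug_flags() {
    mask=$(( $1 ))   # accepts hex input such as 0x14e
    while read -r bit desc; do
        # $bit is expanded textually, so the hex literal is parsed by $(( ))
        if [ $(( mask & $bit )) -ne 0 ]; then
            printf '%s %s\n' "$bit" "$desc"
        fi
    done <<'EOF'
0x01 Halt at boot time and wait for debugger attach
0x02 Send kernel debugging printf output to console
0x04 Drop into debugger on non-maskable interrupt
0x08 Send kernel debugging kprintf to serial port
0x10 Make ddb(kdb) the default debugger
0x20 Output diagnostics to system log
0x40 Allow debugger to ARP and route
0x80 Support old versions of gdb on newer systems
0x100 Disable graphical panic dialog
EOF
}

decode_debug_flags 0x14e
```

Running it with `0x144` instead should list only the NMI, ARP, and panic-dialog flags.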
The non-maskable interrupt, also known as the programmer's button, is invoked by pressing the power switch on most Macs. Note that by enabling the NMI the power switch loses its touch-to-sleep or touch-to-shutdown property. If you are using a Mac where there is no power switch, you can also send Command+PowerKey (e.g. if your keyboard has a power button) or Command+Option+Control+Shift+Escape.
Pressing the NMI gives you access to the debugger immediately, which is useful when you have booted a system normally and then want to attach a remote debugger.
# `cd /path/to/maczfs`
# `rm -rf build`
# `xcodebuild -configuration Debug`
# `cd build/Debug`
# `ssh target sudo rm -rf /target/*`
# `rsync -cav . target:/target`
# `ssh target sudo chown -R root:wheel /target`
# `ssh target sudo kextload -s /target /target/zfs.kext`
# `rsync -cav target:/target/*.sym .`
# `gdb -arch i386 /Volumes/KernelDebugKit/mach_kernel`
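Since the copy-and-load steps are repeated on every rebuild, they can be wrapped in a function. The following is only a sketch: `TARGET_HOST`, `TARGET_DIR`, `DRY_RUN`, `run`, and `deploy_kext` are names invented here, and it covers just the `ssh`/`rsync` steps, not the `xcodebuild` or `gdb` invocations. It defaults to a dry run that prints the commands instead of executing them, and it quotes the remote globs so they are expanded on the target rather than locally.

```shell
#!/bin/sh
# Hypothetical settings; the defaults match the names assumed in this document.
TARGET_HOST=${TARGET_HOST:-target}
TARGET_DIR=${TARGET_DIR:-/target}

# Print the command when DRY_RUN is 1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Run from build/Debug: clear the target directory, copy the build across,
# fix ownership, load the kext, and fetch the generated symbol files.
deploy_kext() {
    run ssh "$TARGET_HOST" sudo rm -rf "$TARGET_DIR/*" &&
    run rsync -cav . "$TARGET_HOST:$TARGET_DIR" &&
    run ssh "$TARGET_HOST" sudo chown -R root:wheel "$TARGET_DIR" &&
    run ssh "$TARGET_HOST" sudo kextload -s "$TARGET_DIR" "$TARGET_DIR/zfs.kext" &&
    run rsync -cav "$TARGET_HOST:$TARGET_DIR/*.sym" .
}

deploy_kext
```

With `DRY_RUN=0` set in the environment, the same function executes the steps and stops at the first one that fails.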
The debugger can be set up as follows:
# `source /Volumes/KernelDebugKit/kgmacros`
# `target remote-kdp`
# `add-kext zfs.kext`
You're now good to go with debugging. You can attach to the remote target, print out the frame, and then debug to your heart's content.
# `attach target`
# ...
# `kdp-reboot`
If you want to cycle through running test cases, you'll need to add `/target` to your `PATH` and `DYLD_LIBRARY_PATH` variables. This is easy to do with `env` when running cases:
# `ssh target env PATH=/target DYLD_LIBRARY_PATH=/target zpool scrub`
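Note that `PATH=/target` replaces the remote `PATH` rather than adding to it. If the test case also needs system tools, one option is to prepend instead. The sketch below builds the remote command line as a string; `remote_test_cmd` is a hypothetical helper, and the single-quoted format string keeps `$PATH` unexpanded locally so that the target's shell expands it.

```shell
#!/bin/sh
# Hypothetical helper: build a remote command that prepends a directory
# to PATH and sets DYLD_LIBRARY_PATH, leaving $PATH for the remote shell.
remote_test_cmd() {
    dir=$1
    shift
    printf 'env PATH=%s:$PATH DYLD_LIBRARY_PATH=%s %s\n' "$dir" "$dir" "$*"
}

remote_test_cmd /target zpool status
```

The result can then be passed to `ssh` as a single argument, for example `ssh target "$(remote_test_cmd /target zpool status)"`.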
* [http://developer.apple.com/mac/library/DOCUMENTATION/Darwin/Conceptual/KEXTConcept/KEXTConceptDebugger/hello_debugger.html Kext debugging] from developer.apple.com
* [http://developer.apple.com/mac/library/technotes/tn2002/tn2063.html TN2063] understanding kernel panic logs
* [http://developer.apple.com/sdk/ Kernel Debug SDKs] for your operating system
* [http://blob.inf.ed.ac.uk/sxw/2010/01/17/debugging-a-mac-os-x-kernel-panic/ Debugging a kernel panic] by Simon, which suggests that an EDX containing 0x0 is symptomatic of a null dereference.