I mentioned previously that on my Mac Mini I am using a cellular connection for my Internet link (instead of dial-up). However, from time to time, the connection would get stuck (after dropping) in the “Disconnecting…” state in the graphical tools. There didn’t seem to be anything I could do to stop it. The system doesn’t have what I usually consider essential tools – ptrace, strace, ltrace. In any case, there is a good chance that all three could be Linux-specific commands, and this system is running Mac OS X 10.4 (Tiger).
Then I remembered gdb. Looking up the processes for pppd I found this:
$ ps auwx | grep ppp[.]*d
root 21475 0.0 0.1 28040 1204 cu. Ss+ 10:24AM 0:00.57 pppd serviceid F31F5F28-9986-489D-88F3-CFA56FF89443 controlled
$
$ sudo gdb -p 21475
Password:
GNU gdb 6.1-20040303 (Apple version gdb-384) (Mon Mar 21 00:05:26 GMT 2005)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type “show copying” to see the conditions.
There is absolutely no warranty for GDB. Type “show warranty” for details.
This GDB was configured as “powerpc-apple-darwin”.
/private/var/log/21475: No such file or directory.
Attaching to process 21475.
Reading symbols for shared libraries . done
Reading symbols for shared libraries …………………………………………….. done
0x90032084 in wait4 ()
(gdb) step
Single stepping until exit from function wait4,
which has no line number information.
^C
Program received signal SIGINT, Interrupt.
0x90032088 in wait4 ()
(gdb) q
The program is running. Quit anyway (and detach it)? (y or n) y
Detaching from process 21475 thread 0xd03.
$
This tells me that the pppd daemon is inside a wait4() function (described in the wait(2) man page). This function is waiting for a child process to complete. So then, the next step is: what is this child process that pppd is waiting on?
$ ps alwwx | grep ppp[.]*d
0 21475 42 0 31 0 552328 1228 – Ss+ cu. 0:00.57 pppd serviceid F31F5F28-9986-489D-88F3-CFA56FF89443 controlled
$ ps alwwx | grep 21475
501 25310 25201 0 31 0 8780 8 – R+ p3 0:00.00 grep 21475
0 21475 42 0 31 0 552328 1228 – Ss+ cu. 0:00.57 pppd serviceid F31F5F28-9986-489D-88F3-CFA56FF89443 controlled
0 25131 21475 0 31 0 27688 740 – S+ cu. 0:00.02 /usr/libexec/CCLEngine -m 1 -l F31F5F28-9986-489D-88F3-CFA56FF89443 -f /Library/Modem Scripts/Nokia 3G Packet RB 460 -v -E -S 5 -L 120 -I Internet Connect -i file://localhost/System/Library/Extensions/PPPSerial.ppp/Contents/Resources/NetworkConnect.icns -C Cancel
$ sudo gdb -p 25131
GNU gdb 6.1-20040303 (Apple version gdb-384) (Mon Mar 21 00:05:26 GMT 2005)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type “show copying” to see the conditions.
There is absolutely no warranty for GDB. Type “show warranty” for details.
This GDB was configured as “powerpc-apple-darwin”.
/private/var/log/25131: No such file or directory.
Attaching to process 25131.
Reading symbols for shared libraries . done
Reading symbols for shared libraries …….. done
0x90001b04 in ioctl ()
(gdb) s
Single stepping until exit from function ioctl,
which has no line number information.
^C
Program received signal SIGINT, Interrupt.
0x90001b04 in ioctl ()
(gdb) q
The program is running. Quit anyway (and detach it)? (y or n) y
Detaching from process 25131 thread 0x20b.
$
So…. the script is stuck in ioctl (described in ioctl(2)). A kill was not sufficient, but a kill -9 stopped it. After this, the graphical tools stopped reporting “Disconnecting…” and a reconnect was possible – and went cleanly.
One aside: note the first line:
grep ppp[.]*d
This matches “nothing” (as well as multiple characters) but does not match the grep command itself (which it would if the nonsense pattern were not included). Small thing, but can help especially in scripts that grep through the ps command output. Other patterns are usable here; the key is that the pattern will not match itself, will match nothing (empty string), and will not match anything which is present in the output.
Howdy,
Thanks for the interesting article. Are you aware of the “ktrace” command? It’s essentially the BSD equivalent of strace — you can get a system call trace of a running program. I find the output somewhat less useful than strace, but it’s still good to know about.
Thanks for this article, I’m struggeling with this problem myself. I can’t seem to get it to work under 10.5 Leopard, though. When I run “gdb -p ” gdb freezes right after it attaches to the program. ps alwwx | grep doesn’t show any other processes than the pppd program either. Any ideas? Thanks.
Don’t have any great ideas. However, if you find that gdb is stopped like that, you may want to try a ^C – which I think will give you a gdb prompt and pause the application. Sending a signal to the application with a kill command from another shell session should do it too.
However, if it is a zombie process – or in an uninterruptable state – then not even gdb can get in. In fact, if it is a zombie process, there’s nothing for it to get (not even memory).
Leopard uses dtrace, not ktrace anymore. Which it adopted from Solaris 10. Solaris used to use a command called truss to do the same as ktrace and strace. Dtrace is far more advanced.
A script has been provided called dtruss that now replaces ktrace.
sudo dtruss -p
Mark