Archive for December 2007

More circuits

My weather karma seems to have gone the wrong way. During the summer, when it was pouring with rain and nobody was flying due to the weather, I didn’t have a single cancellation. Now the winter’s arrived, which I was told often produces some wonderfully clear (albeit cold) days, I’ve had such a long string of cancellations that it’s now been six weeks since I last went flying!

That means that in twelve weeks I’ve had just one booking go ahead: the double lesson in which I started in the circuit. It was inevitable that I was going to be somewhat rusty this time out in the air, and somewhat “behind the aircraft”.

There was another curve-ball as well: a pretty strong wind was blowing across the main runway today, so we’d be using the much shorter cross-wind Runway 23. That meant I would be flying a different circuit from last time, in a different direction, onto a shorter runway.

I think it’s fair to say that almost all of my landings weren’t exactly greasers, except for the one that for some reason I got almost exactly right. A couple of silly mistakes along the way didn’t help either, but in general I got most things right and just had some issues with the approach (difficult on this runway anyway) and the flare. Practice will sort both of those out.

In the time since the last flight, I’d bought a simple GPS unit (Locosys GT-11) so I could record the track of where I’ve flown. Here’s the track of today’s flying:

20071227-circuit.png

P/UT Hours Today 0:45, Total 11:35

Web 2.0 Service Pack 1

Why do I need to tell each and every web service who my friends are? Why can’t last.fm, flickr and twitter just get this information from Facebook? Likewise, why do I have to tell them all where I live, how old I am, what my website is, etc.?

How to (and why) supervise forking processes

Yesterday’s celebratory blog post demonstrated that Upstart is now able to supervise processes that fork into the background, as most daemons do. Now that the code has undergone a little more testing, and been pushed into the archive, it’s worth explaining a little bit more of the background as to the how, and why, we do this.

The why is easiest to answer first. Daemons are normally written to fork, usually twice; this detaches them from the terminal, process group and session that they were spawned from so that they remain running after the user logs out. The fork isn’t just mechanism though; over time a convention has emerged whereby daemons don’t go into the background until their initialisation is complete and they’re ready to receive connections, if that’s their bag.

Simply adding an option to remain in the foreground might appear to eliminate the need to deal with the problem, but this also takes away the notification that the daemon is ready for use. Over time this signal can be replaced with other notifications: registering a known D-Bus name, or simply raising SIGSTOP; but these require code changes that need to be agreed with upstream first. Making code changes also assumes that we have the code. Whether we like it or not, sysadmins will often have the need to run proprietary daemons — or even simply older versions of software where the patch is too invasive.
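To make the SIGSTOP idea concrete, here is a minimal sketch of what both sides of that handshake could look like. It is an illustration under my own assumptions, not Upstart's code: spawn_and_wait_ready and example_service are names invented for the example, and a real supervisor would not block in waitpid like this.

```c
#include <sys/types.h>
#include <sys/wait.h>

#include <signal.h>
#include <unistd.h>

/* Stand-in for a daemon that stays in the foreground (hypothetical). */
void
example_service (void)
{
        /* ... initialisation would happen here ... */

        raise (SIGSTOP);        /* tell the supervisor we are ready */
        pause ();               /* pretend to serve requests */
}

/* Hypothetical helper: run service() in a child and block until it
 * stops itself.  Returns the child's pid, or -1 if it died or stopped
 * for some other reason. */
pid_t
spawn_and_wait_ready (void (*service) (void))
{
        pid_t pid;
        int   status;

        pid = fork ();
        if (pid < 0)
                return -1;

        if (pid == 0) {
                service ();
                _exit (0);
        }

        /* WUNTRACED makes waitpid report stopped children too */
        if (waitpid (pid, &status, WUNTRACED) != pid)
                return -1;
        if (! WIFSTOPPED (status) || WSTOPSIG (status) != SIGSTOP)
                return -1;

        /* The service is ready; let it carry on */
        if (kill (pid, SIGCONT) < 0)
                return -1;

        return pid;
}
```

The catch is exactly the one described above: the raise (SIGSTOP) line has to be added to the daemon itself.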

So that’s why we have to do it, now how do we?

This is one of the reasons that building the service supervisor into init, rather than having it as a separate process, makes sense. Init has a few special kernel-provided buffs, one of which is that orphaned processes are reparented to it. When you run a daemon from the command-line, the process is initially your child; it forks once and the parent dies, so the new child is orphaned and thus reparented to init. (Most daemons now run setsid and fork a second time. This is to ensure that if they open a tty device, they don’t unexpectedly become its owner.) Init, like any other process, receives notification about its children through wait, so it will know when daemons terminate; the “must have” of supervision.
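The reparenting is easy to observe from an ordinary program. The sketch below (show_reparenting is a name made up for this example) forks a child that forks a grandchild and then exits; the grandchild waits for its parent pid to change and reports both values back over a pipe. The new parent is normally init (pid 1), though some setups hand orphans to a different process, so I wouldn't rely on the exact number.

```c
#include <sys/types.h>
#include <sys/wait.h>

#include <unistd.h>

/* Hypothetical demo: returns 0 and fills in the grandchild's parent pid
 * before and after it was orphaned, or -1 on error. */
int
show_reparenting (pid_t *old_parent,
                  pid_t *new_parent)
{
        int     fds[2];
        pid_t   child, pids[2];
        ssize_t n;

        if (pipe (fds) < 0)
                return -1;

        child = fork ();
        if (child < 0)
                return -1;

        if (child == 0) {
                pid_t grandchild;

                close (fds[0]);
                pids[0] = getpid ();

                /* The intermediate process dies straight away,
                 * orphaning the grandchild */
                grandchild = fork ();
                if (grandchild != 0)
                        _exit (grandchild < 0 ? 1 : 0);

                /* Grandchild: wait until we have been reparented */
                while (getppid () == pids[0])
                        usleep (1000);
                pids[1] = getppid ();

                if (write (fds[1], pids, sizeof pids) != sizeof pids)
                        _exit (1);
                _exit (0);
        }

        close (fds[1]);
        waitpid (child, NULL, 0);

        n = read (fds[0], pids, sizeof pids);
        close (fds[0]);
        if (n != (ssize_t) sizeof pids)
                return -1;

        *old_parent = pids[0];
        *new_parent = pids[1];
        return 0;
}
```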

So if all daemons are our children we are notified when they terminate and why; we can compare their exit status or signal against a list of known good ones, and choose whether we need to respawn the dead job or mark it as stopped normally.
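As a rough sketch of that decision, assuming a job carries two small lists of "normal" exit codes and signals (the lists, judge_status and run_and_judge below are all invented for illustration, not taken from Upstart):

```c
#include <sys/types.h>
#include <sys/wait.h>

#include <signal.h>
#include <stddef.h>
#include <unistd.h>

enum verdict { STOPPED_NORMALLY, NEEDS_RESPAWN };

/* Assumed per-job configuration: what counts as a normal way to end */
static const int normal_exits[]   = { 0 };
static const int normal_signals[] = { SIGTERM };

static int
in_list (int value, const int *list, size_t len)
{
        size_t i;

        for (i = 0; i < len; i++)
                if (list[i] == value)
                        return 1;
        return 0;
}

/* Decide what to do from the status that wait handed back */
enum verdict
judge_status (int status)
{
        if (WIFEXITED (status)
            && in_list (WEXITSTATUS (status), normal_exits,
                        sizeof normal_exits / sizeof normal_exits[0]))
                return STOPPED_NORMALLY;

        if (WIFSIGNALED (status)
            && in_list (WTERMSIG (status), normal_signals,
                        sizeof normal_signals / sizeof normal_signals[0]))
                return STOPPED_NORMALLY;

        return NEEDS_RESPAWN;
}

/* Demo: fork a child that exits with code (or kills itself with sig
 * if that is non-zero), reap it and judge the result; -1 on error. */
int
run_and_judge (int code, int sig)
{
        pid_t pid;
        int   status;

        pid = fork ();
        if (pid < 0)
                return -1;

        if (pid == 0) {
                if (sig)
                        raise (sig);
                _exit (code);
        }

        if (waitpid (pid, &status, 0) != pid)
                return -1;

        return judge_status (status);
}
```

With these lists, a child that exits 0 or is killed by SIGTERM is treated as stopped, while an exit code of 1 or a SIGKILL asks for a respawn.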

This isn’t enough though, all we get is the process id of the dead child. We still need to relate that back to a job somehow. One way to do that is to use waitid with the WNOWAIT flag, leaving the process on the table so we can examine /proc to find out more about it. This seems like quite a reasonable approach, we can then match a process to a job by details such as what binary it was actually running. Unfortunately this only works for singleton processes where we’re guaranteed that only one of them exists, both at the job level and at the process-level itself; should the process fork, even to run another child, we could accidentally consider it to have died. Daemons need to be able to run their own children, or even have pools of them to use; and we also need to be able to run multiple copies of daemons where we can support it.

So we really do need to know the process id of the actual daemon process we should be supervising. Unfortunately no method of passing this back to init, not even a relatively common one like writing it to a pid file, is sufficiently standard or reliable to do this kind of work.

Ideally the kernel would just tell init when a process was reparented to it, providing both the child process id and that of its previous parent. Such a notification doesn’t exist today, though it would be a nice project to try to get one into the kernel mainline; difficult if there’s only one implementation using it.

If we can’t have that, a syscall that would allow us to watch a process and find out when it forks would be the second-best thing. We’d have the previous process id since we were watching it, and we’d hopefully be able to obtain the new child process id from this.

Happily that syscall exists, and I suspect you use it all the time if you’re a developer; it’s a bit of a mad leap to using it inside init, but as you can see, it works rather nicely. All we need do is watch the process, and follow it each time it spawns a new child. We stop watching as soon as we have followed twice (once if a different option is used), or if the process runs a different binary by itself. And thus we can know the process id of daemons we spawned, even if they attempt to detach from their parent process; they’ll just be reparented to us anyway.

What’s the syscall? Oh, hmm, is that the time? Got to go! Alright, it’s ptrace.
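Here is a minimal, Linux-only sketch of following forks with ptrace, under my own assumptions rather than a copy of what Upstart does. spawn_and_follow and example_daemon are invented names; the child calls a function instead of exec'ing a binary, the tracer blocks in waitpid, and the "runs a different binary" case is not handled at all.

```c
#include <sys/types.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

#include <signal.h>
#include <stdlib.h>
#include <unistd.h>

/* Stand-in daemon (hypothetical): the usual double fork, then idle */
void
example_daemon (void)
{
        if (fork () > 0)
                _exit (0);
        setsid ();
        if (fork () > 0)
                _exit (0);
        pause ();
}

/* Hypothetical helper: run start() in a traced child and follow it
 * through n_forks forks.  Returns the pid of the last process followed,
 * or -1 on error.  Assumes the original process exits soon after it
 * forks, as daemons do. */
pid_t
spawn_and_follow (void (*start) (void), int n_forks)
{
        pid_t first, pid;
        int   status;

        first = pid = fork ();
        if (pid < 0)
                return -1;

        if (pid == 0) {
                /* Ask to be traced, then stop so options can be set */
                ptrace (PTRACE_TRACEME, 0, NULL, NULL);
                raise (SIGSTOP);
                start ();
                _exit (0);
        }

        if (waitpid (pid, &status, 0) != pid || ! WIFSTOPPED (status))
                return -1;
        if (ptrace (PTRACE_SETOPTIONS, pid, NULL,
                    (void *) (long) PTRACE_O_TRACEFORK) < 0)
                return -1;

        while (n_forks-- > 0) {
                unsigned long child;
                int           sig = 0;

                /* Run the process until it reports a fork, passing on
                 * any ordinary signals it receives in the meantime */
                for (;;) {
                        if (ptrace (PTRACE_CONT, pid, NULL,
                                    (void *) (long) sig) < 0)
                                return -1;
                        if (waitpid (pid, &status, 0) != pid
                            || ! WIFSTOPPED (status))
                                return -1;  /* it died before forking */
                        if ((status >> 8)
                            == (SIGTRAP | (PTRACE_EVENT_FORK << 8)))
                                break;
                        sig = WSTOPSIG (status);
                }

                if (ptrace (PTRACE_GETEVENTMSG, pid, NULL, &child) < 0)
                        return -1;

                /* Let go of the old process and follow the new one,
                 * which is traced automatically and starts stopped */
                ptrace (PTRACE_DETACH, pid, NULL, NULL);
                pid = (pid_t) child;
                if (waitpid (pid, &status, 0) != pid
                    || ! WIFSTOPPED (status))
                        return -1;
        }

        /* Stop watching; the process carries on by itself */
        ptrace (PTRACE_DETACH, pid, NULL, NULL);
        if (pid != first)
                waitpid (first, NULL, 0);   /* reap the original child */

        return pid;
}
```

Calling spawn_and_follow (example_daemon, 2) should hand back the pid of the process left sitting in pause, which is the one a supervisor would want to keep an eye on.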

Supervising forking processes


quest /tmp# cat test.c
#include <sys/types.h>

#include <stdlib.h>
#include <unistd.h>

int
main (int   argc,
      char *argv[])
{
        pid_t pid;

        pid = fork ();
        if (pid > 0)
                exit (0);

        pid = fork ();
        if (pid > 0)
                exit (0);

        pause ();
        exit (0);
}
quest /tmp# gcc -Wall -g -O0 -o test test.c

quest /tmp# cat /etc/event.d/test
wait for daemon
exec /tmp/test

quest /tmp# start test
test (#0) goal changed from stop to start
test (#0) state changed from waiting to starting
event_new: Pending starting event
Handling starting event
event_finished: Finished starting event
test (#0) state changed from starting to pre-start
test (#0) state changed from pre-start to spawned
process_spawn: Spawned main process 6380 for test (#0)
Active test (#0) main process (6380)
test (#0) main process (6380) forked new child 6381
test (#0) main process (6381) forked new child 6382
test (#0) state changed from spawned to post-start
test (#0) state changed from post-start to running
event_new: Pending started event
Handling started event
event_finished: Finished started event