Whatever you do, don’t fix the kernel!

As you may have read in LWN (subscription required, and strongly recommended anyway), there’s been some argument on the linux-hotplug mailing list, the historically named home of udev development, about device naming.

The key threads are “default udev rules” and “Patches for device names”.

It all started when Kay reminded everybody that distributions should attempt to drop their own udev rules in favour of those supplied by upstream.  For those not familiar with udev, the rules are a language that creates device nodes and performs other actions based on the information the kernel provides about each device.  A typical rule to put all devices from the “sound” subsystem into the “audio” group looks like:

SUBSYSTEM=="sound", GROUP="audio"

Sometimes these rules also change the names of the devices.  For example rules such as the following are automatically generated to keep the name of your ethernet devices the same between reboots:

SUBSYSTEM=="net", ATTRS{address}=="00:11:22:33:44:55", ATTR{type}=="1", NAME="eth0"
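To make the shape of these rules concrete, here is a toy sketch in Python of how a rule line splits into match keys (`==`) and assignment keys (`=`). It is not how udev is implemented: the function names are made up for illustration, the device is assumed to be a plain dictionary keyed by the same strings the rule uses, and values containing commas are not handled.

```python
import fnmatch
import re

# One key/operator/value triple, e.g. SUBSYSTEM=="sound" or NAME="eth0".
_PAIR = re.compile(r'\s*([A-Za-z_]+(?:\{[^}]*\})?)\s*(==|=)\s*"([^"]*)"\s*')


def parse_rule(rule):
    """Split a rule line into (matches, assignments) dictionaries."""
    matches, assignments = {}, {}
    # Toy parser: assumes no commas inside quoted values.
    for part in rule.split(","):
        m = _PAIR.fullmatch(part)
        if m is None:
            raise ValueError(f"cannot parse {part!r}")
        key, op, value = m.groups()
        (matches if op == "==" else assignments)[key] = value
    return matches, assignments


def apply_rule(rule, device):
    """Return the rule's assignments if every match key matches, else {}."""
    matches, assignments = parse_rule(rule)
    for key, pattern in matches.items():
        # Match values are treated as shell-style glob patterns.
        if not fnmatch.fnmatchcase(device.get(key, ""), pattern):
            return {}
    return assignments


if __name__ == "__main__":
    print(apply_rule('SUBSYSTEM=="sound", GROUP="audio"', {"SUBSYSTEM": "sound"}))
```

A device that fails any match key gets no assignments at all, which is why a rename rule only touches the device it was written for.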

Ironically, perhaps, none of the argument is about the names of the devices, the permissions assigned or the groups they’re placed in.  We’re all pretty much in agreement about that.

Every major distribution pretty much follows the plan laid out in the devices.txt file found in the kernel’s Documentation sub-directory.  This is maintained by the Linux Assigned Names and Numbers Authority and, up until 2.2, was included by reference in the Filesystem Hierarchy Standard (FHS).  Nobody really knows why the reference was removed; I guess the LSB didn’t like having standards everybody agreed on ;-)

So what is the argument about?  Marco d’Itri, the Debian udev maintainer, is arguing because he’s spent a lot of time and effort making their rules readable and elegant in their operation.  The upstream rules are, in his opinion, somewhat scraggy.  I don’t really see this as a problem; we can fix the upstream rules to be more elegant easily enough.

My argument is different, and is a little more fundamental.

While most of the rules do udev-specific things like permissions, groups, run callouts to gather more information and perhaps run programs after device creation, we have many rules such as this:

KERNEL=="hw_random", NAME="hwrng"

What that says is:

Rename the kernel device “hw_random” to “hwrng”.

This makes the device name correct according to devices.txt.  What irritates me about this is that this rule should be entirely unnecessary!  It would be a one line patch to the kernel to cause it to name the device properly in the first place.  Then we wouldn’t need to spend the resource and CPU time changing the name every single time every Linux machine around the world boots.

There’s another set of rules that annoys me:

KERNEL=="device-mapper", NAME="mapper/control"

The kernel object for the device mapper control node is /sys/class/misc/device-mapper, but the device name according to devices.txt should be /dev/mapper/control - in a sub-directory. The kernel and udev have a mechanism to deal with this, the kernel object could be named /sys/class/misc/mapper!control and the right thing will happen.
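The “!” mechanism amounts to a simple character substitution when the node path is built. A minimal Python sketch of the idea follows; the function name and the /dev default are assumptions for illustration, not udev’s actual code:

```python
def node_path(kernel_name, dev_root="/dev"):
    """Map a kernel object name to a device node path, treating '!' as '/'."""
    return dev_root + "/" + kernel_name.replace("!", "/")


if __name__ == "__main__":
    print(node_path("mapper!control"))
```

With a kernel object named mapper!control, the node lands at /dev/mapper/control with no rename rule involved.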

Another similar class of devices needs udev to rename them:

KERNEL=="mice", NAME="input/mice"
KERNEL=="mouse[0-9]*", NAME="input/mouse%n"

The first one seems straightforward, but the kernel object is named /sys/class/input/mice so if we used the ! trick, it would become /sys/class/input/input!mice. I can appreciate that it’s ugly. Similarly for the mouse case.

I’ve suggested a fix for this though, and this fix also alleviates any concerns about backwards compatibility with sysfs names. The uevent from the kernel for the “mice” device looks like this:

ACTION=add
DEVPATH=/devices/virtual/input/mice
SUBSYSTEM=input
MAJOR=13
MINOR=63

For devices that end up in a sub-directory, I’ve suggested adding an extra field to this:

DEVNAME=input/mice

When present, udev would use this instead of the last part of the sysfs path as the kernel name. The extra cost to the kernel is a single %s in an existing sprintf() call; the result is a vast saving in userspace time.
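The userspace side of the proposal can be sketched in a few lines of Python. This is only an illustration of the fallback logic described above, assuming the uevent arrives as KEY=value lines; the function names are hypothetical and this is not udev’s code:

```python
def parse_uevent(text):
    """Parse KEY=value lines from a uevent into a dictionary."""
    env = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip blank or malformed lines
            env[key.strip()] = value.strip()
    return env


def default_node_name(env):
    """Prefer the proposed DEVNAME field, else the last part of DEVPATH."""
    if "DEVNAME" in env:
        return env["DEVNAME"]
    return env["DEVPATH"].rsplit("/", 1)[-1]


if __name__ == "__main__":
    uevent = """ACTION=add
DEVPATH=/devices/virtual/input/mice
SUBSYSTEM=input
MAJOR=13
MINOR=63
DEVNAME=input/mice"""
    print(default_node_name(parse_uevent(uevent)))
```

Anything that ignores DEVNAME still sees the same DEVPATH as before, which is where the backwards compatibility comes from.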

This fix would also let us deal with the raw USB devices, and other things like the DVB devices, where we have to construct the device names. For example, the following rule is used to name DVB devices:

SUBSYSTEM=="dvb", PROGRAM="/bin/sh -c 'K=%k; K=$${K#dvb}; printf dvb/adapter%%i/%%s $${K%%%%.*} $${K#*.}'", NAME="%c"

That means that for every DVB device, on every computer, every time Linux boots, we have to fork and exec a shell, do some string pattern matching, fork and exec printf and apply more string pattern matching to the format string to name the device.

This could be avoided by doing that printf in the kernel, and setting DEVNAME for that device.
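To show how little work that shell callout actually does, here is the same string manipulation as a short Python sketch. It assumes kernel names of the form dvb0.frontend0 (a “dvb” prefix, an adapter number, a dot, then the device type); the function name is made up for illustration:

```python
def dvb_node_name(kernel_name):
    """Build the node name the shell callout prints for a DVB kernel name."""
    # K=${K#dvb}: strip the leading "dvb" if present.
    rest = kernel_name[3:] if kernel_name.startswith("dvb") else kernel_name
    # ${K%%.*} is everything before the first dot, ${K#*.} everything after it.
    adapter, _, kind = rest.partition(".")
    # printf dvb/adapter%i/%s: the adapter number is formatted as an integer.
    return "dvb/adapter%i/%s" % (int(adapter), kind)


if __name__ == "__main__":
    print(dvb_node_name("dvb0.frontend0"))
```

The whole transformation is one strip, one split and one format call, which is the kind of thing a single format string in the kernel could produce directly.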

Device names are set down in a standard. That standard is shipped inside the kernel’s own source tree. Most distributions are already following that standard. The udev default rules follow that standard. Most distributions are likely to adopt the default udev rules. This is, for all intents and purposes, as official a naming policy as you can get.

For those devices where the name is static, or constructed entirely from information from the kernel (i.e. not persistent storage, input, network, etc.), why do we waste resources and CPU time every single boot changing the name that the kernel exports to match the standard?

To me this is obvious: fix the kernel to export the right name in the first place.

To kernel developers, such as Greg K-H, this is not so obvious:

“Wait, why do this at all?”

and

“Can’t you live with input devices having a few rules in udev? Is it really that hard to maintain? :)”

While patches were apparently welcome in the first thread, by the second thread when it was clear that patches were going to be done, they didn’t seem quite so welcome after all.

This isn’t the first time that I’ve seen kernel developers claim that it’s better to work around the kernel in userspace than it is to fix it. I could understand this if we didn’t have the source code to our own kernel, but we do.

The kernel isn’t sacred and it isn’t a separate part of the system. It needs to be seen as just one component of a fully integrated system, especially by its developers.

That 12ft-high wall between “kernel space” and “user space” needs to come down.

As LWN notes, we have a lot to talk about at the LPC in September.

9 Comments

  1. FACORAT Fabrice:

    IMHO kernel dev won’t fix this because it could be seen as an API exposed to userspace, and so kernel dev will not change it.
    Please note that not all users are using udev ( think embedded devices ), so IMHO they end up using the default kernel names. Changing the name in the kernel will break these devices.

  2. Scott James Remnant:

    @Fabrice: which is why my DEVNAME proposal *explicitly* preserves backwards compatibility - anything not using it would have rules to rename the kernel name anyway

  3. davidz:

    Scott, here’s why you’re wrong. It’s very simple and comes down to two points

    - you obviously agree we can’t break huge amounts of userspace by changing DEVPATH

    - having two names emitted from the kernel (_just_ because lots of user space is broken) is just wrong and confusing
    => much better to fix up things in user space

    Besides, what’s in a freaking name _anyway_? Apps should be using stable symlinks or, gosh, a device enumeration framework like HAL or the upcoming DeviceKit.

  4. knipknap:

    Well, Linux does have a strong commitment to backwards compatibility, so renaming user space API (and device names are considered that) will forever leave a trail. In your proposal the kernel will have to support the new DEVNAME forever and future changes can only be made by adding new ones; input/mice is forever. I expect breaking this API would be much less of a pain in Hotplug. If there is a performance problem, why not come up with a user-space way of changing the device name that does not require spawning a shell instead?

  5. Arpad Borsos:

    I hope at least Ubuntu’s kernel is patched this way so that this unnecessary work doesn’t slow my boot time down.

  6. Susan:

    As a simple user, does all this stuff you’re talking about have to do with why hot-plugging my camera into Fedora no longer works when I upgraded to Fedora 9?

    -Susan

  7. Blackpaw:

    Expose a raft of backward compatibility issues and any unknown current ones just to shave a few seconds (if that) off boot time?

    Don’t fix stuff that isn’t broken.

  8. nona:

    A couple of points to consider:

    1) Aside from udev and hal, what other userspace apps rely on the kernel device name (as it appears in sysfs too)?

    2) If there aren’t any, do we already have to upgrade udev/hal in lock-step with the kernel for other reasons?

    3) In what way would old userspace (old udev) break with a new kernel, or new userspace break with an old kernel?

    4) How much extra boot time is needed for all the extra rules/shell callouts/etc.

    Generally I believe you’re right. But remembering all the past breakage with udev vs kernel across up/downgrades, I can understand why some people might be apprehensive about doing this.

    Of course, if we have to up/downgrade the kernel in lock-step with udev/hal for other reasons, then this API break doesn’t really matter - we might as well sneak in the cleanup.

    If the added advantages (cleanup, speed, etc) outweigh the added risks then it’s a no-brainer of course.

  9. Scott James Remnant » Blog Archive » Calling things by the same name:

    [...] response to my blog post “Whatever you do, don’t fix the kernel!”, David Zeuthen (prominent plumber, the maintainer of HAL and author of DeviceKit) wrote: [...]
