Linux Kernel sync() Bug

There is a bug in the Linux kernel (up to 2.6.37) that can cause a sync() call to take many minutes instead of a few seconds.

The exact nature of the bug is unclear, but it shows up when using dpkg, since dpkg now calls sync() instead of fsync(). As a result, updating an Ubuntu or Debian system can hang when dpkg goes to synchronize the file system.

The symptom also shows up by using the sync command directly.
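
A quick way to see whether a machine is affected is to time the sync command itself. The following is a minimal sketch; the function name and the one-second resolution are choices made for this illustration.

```shell
#!/bin/sh
# Time one sync and report the elapsed wall-clock time in whole seconds.
# Relies on "date +%s", which GNU and BSD date both provide.
time_sync() {
    start=$(date +%s)
    sync
    end=$(date +%s)
    echo $((end - start))
}

elapsed=$(time_sync)
echo "sync took ${elapsed} second(s)"
```

On a healthy system this should report a few seconds at most; on an affected system under heavy write load it may report minutes.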

The dpkg command was updated in 1.16.0 to include a force-unsafe-io option (--force-unsafe-io on the command line), which re-enables the previous behavior and so bypasses this bug. This version is not yet available in Ubuntu Lucid Lynx (10.04 LTS) but should show up there sooner or later. The option can be added to /etc/dpkg/dpkg.cfg or to a file in /etc/dpkg/dpkg.cfg.d to make it a default setting.
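
As a sketch of the configuration-file route: dpkg configuration files take one option per line, written without the leading dashes. The helper name and the file name unsafe-io below are hypothetical; only the directory and the option come from the text above.

```shell
#!/bin/sh
# Write a dpkg configuration fragment that turns on force-unsafe-io.
# $1 is the target directory; on a real system this would be
# /etc/dpkg/dpkg.cfg.d (and the script would need to run as root).
# The file name "unsafe-io" is an arbitrary choice for this example.
write_unsafe_io_cfg() {
    dir=$1
    printf '%s\n' \
        '# Do not perform safe I/O operations when unpacking' \
        'force-unsafe-io' > "$dir/unsafe-io"
}

# Example (as root):
#   write_unsafe_io_cfg /etc/dpkg/dpkg.cfg.d
```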

There was also a suggestion by Theodore Ts’o that upgrading to dpkg 1.16.0 would be sufficient to work around this kernel bug even without the force-unsafe-io option.

It is possible to upgrade to dpkg 1.16.0 on Lucid, but it requires pulling in the Natty Narwhal repositories, setting apt to prefer the Lucid repositories, and then pulling in the Natty version of dpkg specifically.
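
One possible shape for that apt setup is sketched below. It assumes a Natty line has already been added to the apt sources, and it pins Natty below apt's default priority of 500 so that Lucid packages keep winning unless a Natty version is requested by name. The helper name, the priority value 100, and the file location are assumptions for illustration, not a tested recipe.

```shell
#!/bin/sh
# Print an apt preferences stanza that gives every Natty package a low
# priority, so Lucid versions stay preferred by default.
print_natty_pin() {
    cat <<'EOF'
Package: *
Pin: release a=natty
Pin-Priority: 100
EOF
}

# Example (as root), after adding a natty entry to the apt sources:
#   print_natty_pin >> /etc/apt/preferences
#   apt-get update
#   apt-get install dpkg/natty
print_natty_pin
```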

Suggestions on how to fix the problem (aside from replacing the kernel) now include:

None of these answers are definitive and the search continues. Ubuntu bug #624877 is on this topic, as is Linux kernel bug #15426.

Correction: bug #15426 is a Linux kernel bug report, not a Debian bug report. We apologize for the error.

The kconfig utility (HP-UX)

The kconfig utility allows you to save a complete set of kernel tunables, ready for use in configuring other systems or in returning to an older configuration. These kernel configurations can be saved, copied, deleted, and restored using the kconfig utility.

For example, consider an HP-UX virtual machine host that was pressed into service early as a general host. How do you return to the original installation kernel configuration? Use the original configuration automatically created during installation, “last_install”.

For another example, consider a host configured for the applications you use. Save the configuration and it can be replicated elsewhere with a single command and perhaps a reboot.

For a current list, use the kconfig command by itself:

# kconfig
Configuration  Title
backup         Automatic Backup
ivm            Virtual Machine Configuration
last_install   Created by last OS install

The kernel configuration can be exported to a file:

# kconfig -e ivm ivm.txt

…and later imported (possibly on a new machine):

# kconfig -i ivm.txt

The current configuration can be saved to a particular name (such as ivm):

# kconfig -s ivm

All of the usual manipulations are possible, as mentioned before: copy, delete, rename, save, load, and so forth. The manual page is kconfig(1m) and should be available on your HP-UX 11i v2 or v3 system.
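
Putting the save and export steps together, a small wrapper might look like the following sketch. It uses only the -s and -e options shown above; the function name and the output file naming are hypothetical.

```shell
#!/bin/sh
# Save the running kernel configuration under a name, then export it to
# <name>.txt in the current directory for copying to another host.
# The export is skipped if the save fails.
save_and_export() {
    name=$1
    kconfig -s "$name" && kconfig -e "$name" "$name.txt"
}

# Example (as root on HP-UX):
#   save_and_export ivm
```

The resulting file can then be carried to another machine and loaded there with kconfig -i.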

Bug: synergyc freezes

If you are using Synergy in your daily work, you may have noticed that the Linux client is not working as it should. A bug (Bug #194029) reported to the Ubuntu development team provides extensive reports about the problem and possible resolutions. The biggest problem is sifting through all of them, as well as the realization that they don’t seem to have it fixed yet.

Admittedly, my problems are with Fedora 9 and Windowmaker (which shows that it’s not Ubuntu-specific, nor is it specific to GNOME or KDE). However, the resolutions seem to work under Fedora just as well as under Ubuntu.

The resolutions recommended are:

  • Run synergy as root: sudo synergyc. This resolution seems to be the one least likely to work.
  • Run synergy with the highest priority possible using chrt. Because chrt -p takes the process ID of an already running process, start the client first and then raise its priority. This method can be incorporated into a startup script like so: /usr/bin/synergyc myserver; pgrep synergyc | sudo xargs chrt -p 99
  • Recompiling the Linux kernel with a different scheduler: instead of configuring with CONFIG_FAIR_USER_SCHED use CONFIG_FAIR_CGROUP_SCHED.
  • Patching synergyc to fix the problem.
  • Enabling (for Ubuntu only) the hardy-proposed repository and updating the kernel to 2.6.24-16 or 2.6.24-17-generic seemed to work (although there were complaints that the desktop became sluggish).
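
The chrt approach can be wrapped up as follows. This is a sketch only: the function name is made up for the example, and it assumes pgrep, chrt, and sudo are available.

```shell
#!/bin/sh
# Give every running process with the exact name $1 real-time priority $2
# (default 99). chrt -p changes an already running process, so each PID
# found by pgrep is handled in its own chrt call.
boost_priority() {
    name=$1
    prio=${2:-99}
    for pid in $(pgrep -x "$name"); do
        sudo chrt -p "$prio" "$pid"
    done
}

# Example: start the client, then boost it.
#   /usr/bin/synergyc myserver
#   boost_priority synergyc
```

Looping over the PIDs one at a time avoids handing chrt several process IDs in a single call if more than one synergyc happens to be running.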

In the case of Fedora 9 at least, this bug remains present even though it is almost a year old. I don’t use synergyc on an Ubuntu client; I use my Kubuntu host almost entirely at the console directly.

So what is the answer? I’d try using chrt first (for me that lessened the problem dramatically) and try upgrading to a new kernel configuration.

Is that HP-UX Kernel 32bit or 64bit?

In today’s world of HP-UX on Itanium, this may seem archaic, since Itanium is 64-bit all the way. However, you might find yourself on a PA-RISC machine anyway. Since PA-RISC processors come in 32-bit and 64-bit varieties, it can be important to verify whether the current kernel is 32-bit or 64-bit.

It is also important to recognize the various ways of checking for 32-bit versus 64-bit kernels, and to know the limitations of each.

One way is to use the command getconf to get the number of bits used by the currently running kernel:

$ getconf KERNEL_BITS

You can also get the number of bits supported by the hardware, as well as find out if the hardware supports a 32-bit as well as 64-bit mode:

$ getconf HW_CPU_SUPP_BITS
$ getconf HW_32_64_CAPABLE
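
For use in scripts, the getconf check can be wrapped so that an unexpected or missing answer is handled explicitly. A minimal sketch; the function names are invented for the example, and on systems where getconf does not know KERNEL_BITS the result is simply reported as unknown.

```shell
#!/bin/sh
# Ask getconf for the running kernel's word size; prints nothing on failure.
kernel_bits() {
    getconf KERNEL_BITS 2>/dev/null
}

# Turn the raw value into a short description.
describe_bits() {
    case "$1" in
        32|64) echo "$1-bit kernel" ;;
        *)     echo "unknown kernel width" ;;
    esac
}

describe_bits "$(kernel_bits)"
```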

Another way to find this information is to use the print_manifest command, though you need to be root to run this command:

# print_manifest | grep -i "os mode"
    OS mode:            64 bit

However, since the print_manifest command documents the entire system, it takes a lot longer than the getconf command does.

Another way, albeit much less reliable, is to use the command file on the standard kernel.

Here’s an example from a PA-RISC system (an rp7420 running 11i v3):

# file /stand/vmunix
/stand/vmunix:  ELF-64 executable object file - PA-RISC 2.0 (LP64)

Here’s an example from an Itanium system (an rx6600 running 11i v2):

# file /stand/vmunix
/stand/vmunix:  ELF-64 executable object file - IA64

This command is only useful if you are using the standard kernel; if the system booted with another kernel instead, this will not give you a valid answer.

You could use the file command on whatever kernel you actually booted to get the right answer, but since getconf is so easy and fast, why bother?

The Boot Process (FreeBSD)

When the system starts, it will sometimes fail to load the kernel, or there may be other adjustments that must be made. It pays, therefore, to know exactly how the system gets to the point of loading the kernel, before init or swapper ever runs.

It can also be a benefit in trying to get the system to use a serial console or to provide splash screens throughout.

In a FreeBSD/x86 system, the process goes something like this:

  1. The system turns on, and the BIOS begins processing and preparing hardware for the boot up. Most Intel machines do not provide a serial port view of the BIOS boot process.
  2. The system loads the first block (the master boot record, or MBR) from disk.
  3. The disk loader then loads the rest of the loader. It is at this point that the first screen appears. This loader may be GRUB or the FreeBSD loader.
  4. The next step is normally to load the FreeBSD boot loader (which is different from the FreeBSD MBR loader). The boot loader provides a FORTH-based environment for modifying the boot sequence, and so on.
  5. The final step is to load the kernel and to start processing. GRUB could load the kernel directly, but using the boot loader provides access to its prompt and the ability to modify the boot process.

GRUB can be configured for a serial port, as can the FreeBSD kernel and loader. Likewise, the splash screens can be set here as well.
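
For the FreeBSD loader, the serial console is usually selected with a console setting in /boot/loader.conf. The sketch below appends that one line; the helper name is hypothetical, and other settings (such as the port speed) are left at their defaults.

```shell
#!/bin/sh
# Append the loader setting that selects the serial console.
# $1 is the loader configuration file, normally /boot/loader.conf.
enable_serial_console() {
    conf=$1
    echo 'console="comconsole"' >> "$conf"
}

# Example (as root on FreeBSD):
#   enable_serial_console /boot/loader.conf
```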

It is also possible that any of these items may fail; knowledge of the others will provide for methods of recovery.

If GRUB fails to load the loader (or kernel) properly, it can be adjusted interactively to load the right loader with the right parameters.

If the FreeBSD boot loader does not load the kernel properly (or is misconfigured), it can be adjusted by pressing a key and using the FORTH prompt to manipulate the loaded kernel – including loading a different kernel, changing parameters, loading modules, and so forth.

SystemTap (and DTrace)

SystemTap is one amazing piece of work – it is a programmer-friendly and admin-friendly interface to KProbes (which are included in the Linux 2.6 kernel). When you compare its capabilities to what has gone before, it is truly amazing. Here are some of the things you can do:

  • Quantify disk accesses per disk per process (or per user)
  • Quantify the number of context switches that are a result of time outs
  • List all accesses to a particular file and the process that accessed it

This is only the tip of the iceberg. There is a wiki with more details, including “war stories.”  There is a language reference there as well.

There was an excellent article in Red Hat Magazine, “Instrumenting the Linux Kernel with SystemTap” by William Cohen.

One controversy that came up was that the initial impetus for creating SystemTap was to implement something like Sun’s DTrace for Solaris but under the GNU General Public License. Solaris and DTrace are licensed with Sun’s Common Development and Distribution License (CDDL), which many feel makes DTrace incompatible with the GPL-licensed Linux kernel.

Apparently, the CDDL is also incompatible with the BSD-licensed FreeBSD, as FreeBSD 7.0 will not have DTrace either. There appear to be some licensing issues.

According to the Wikipedia entry on the CDDL, it was designed to be both GPL-incompatible and BSD-incompatible.  With regard to the GPL, the entry suggests that Sun never clarified why; as to the BSD, Sun did not want Solaris to wind up in proprietary products – which the BSD license allows.

On a brighter note, Eugene Teo was able to get the SystemTap tool to work on the Nokia N800. The article seems to be behind a wall at LiveJournal, though it is still in Google’s cache. However, it does require some amazing convolutions:

  • A kprobes-enabled kernel must be installed on the N800
  • The SystemTap programs (like stap) must be installed on the N800
  • Any traces must be cross-compiled on another host
  • The kernel module thus created must be moved to the N800
  • Once the kernel module is in place, then the trace can be done.

So every desired trace has to be cross-compiled on a desktop first (sigh)… Oh, well.

There is even a GUI for SystemTap in the works.

The Linux Kernel Weather Forecast

If you are tracking Linux kernel development (which is not a bad idea for Linux system administrators), then check out the Linux Kernel Weather Forecast from the Linux Foundation. There is also a blog which summarizes the main changes to the Linux kernel, as well as an RSS feed of the same.

The Linux Kernel Weather Forecast tracks changes that are coming up, and patches that may be added to the kernel soon. It tracks the current release versions, as well as important related information. The blog contains updates, but the main page lists the current status of everything. There is also an RSS feed of the changes made to the main page.

What’s not to like? One-stop information about what’s happening with the Linux kernel – woohoo!

PS: OpenSolaris and the BSDs should have something like this!

Abusing chroot() for security

It is often suggested that people lock programs into a chrooted environment. A heated discussion about using chroot() for security purposes came up this week on the Linux Kernel mailing list (as reported on KernelTrap), with a quote from Alan Cox summarizing the backlash against using chroot() in this way:

chroot is not and never has been a security tool. People have built things based upon the properties of chroot but extended (BSD jails, Linux vserver) but they are quite different.

Adrian Bunk (current Linux 2.6 maintainer) even went so far as to say:

incompetent people implementing security solutions are a real problem.

Alan’s suggestions are worthy of consideration for security. BSD jails should always be used wherever they are available, as they are designed for this purpose. However, BSD jails are not normally available on Linux, though there are alternatives like the Linux vserver.

There was discussion about how easy it was for the root user to escape a chroot environment. It comes to a total of three steps:

  1. Create a subdirectory within the environment.
  2. Do a chroot to that subdirectory (while remaining outside of that directory).
  3. Change directories at will: because the working directory is now outside the new root, moving upward (cd ..) is no longer stopped at the original environment.

The basic premise is that the kernel maintains a single directory as the root (“/”) for a process, and that chroot only prevents moving from inside that directory to outside it. If the process is already outside of that environment, then the containment does not happen. If the chroot call is made a second time, it overwrites the original value of “/” with the new one for the calling process (at least until that process exits), so the earlier restriction is simply forgotten.

So for serious security work, perhaps one should reconsider the use of chroot as Alan suggests.