Linux Kernel sync() Bug

There is a bug in the Linux kernel (up to 2.6.37) that can make a sync() call take many minutes instead of several seconds.

The exact nature of the bug is unclear, but it shows up when using dpkg, since dpkg now calls sync() instead of fsync(). As a result, updating an Ubuntu or Debian system can hang when dpkg goes to synchronize the file system.
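To make the difference between the two calls concrete, here is a minimal Python sketch (Python 3.3 or later on Linux). It is only an illustration and not how dpkg itself is written: fsync() flushes a single open file, while sync() asks the kernel to flush every dirty buffer on the system, which is where the long stalls come from. The function names and the file name are made up for the example.

```python
import os
import tempfile


def write_durably(path, data):
    """Write data and flush only this one file to disk with fsync()."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()              # push Python's own buffer to the kernel
        os.fsync(f.fileno())   # ask the kernel to flush just this file


def flush_everything():
    """Ask the kernel to flush all dirty buffers, system-wide, with sync()."""
    os.sync()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "example.dat")  # hypothetical file name
        write_durably(target, b"payload")
        print("fsync() finished for", target)
```

On an affected kernel, the per-file fsync() path would not be expected to wait for unrelated writes elsewhere on the system, whereas flush_everything() would.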

The symptom also shows up when running the sync command directly.
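One rough way to check whether a machine is affected is to time a sync() call. The sketch below assumes Python 3.3 or later on Linux; timed_sync is a name invented for this example. A result of many seconds on a mostly idle system would be consistent with the symptom described here, though it does not prove this particular bug is the cause.

```python
import os
import time


def timed_sync():
    """Call sync() once and return how long it took, in seconds."""
    start = time.monotonic()
    os.sync()  # blocks until the kernel has flushed dirty buffers
    return time.monotonic() - start


if __name__ == "__main__":
    print(f"sync() took {timed_sync():.2f} seconds")
```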

The dpkg command was updated in 1.16.0 to include a force-unsafe-io option, which re-enables the previous behavior and so bypasses this bug. This version is not yet available in Ubuntu Lucid Lynx (10.04 LTS) but should show up there sooner or later. The option can be added to /etc/dpkg/dpkg.cfg or to a file in /etc/dpkg/dpkg.cfg.d to make it a default setting.
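As a sketch of the second approach, the following Python function writes a drop-in file containing the option. Options in dpkg configuration files are written without the leading dashes. The drop-in file name used here is an arbitrary choice for the example, and writing to the real directory requires root and a dpkg version that understands the option.

```python
import os

OPTION = "force-unsafe-io"  # same as --force-unsafe-io, minus the dashes


def write_dpkg_dropin(directory="/etc/dpkg/dpkg.cfg.d", name="force-unsafe-io"):
    """Write a dpkg drop-in config file that enables OPTION by default.

    The file name is a hypothetical choice; it avoids dots because dpkg
    only reads drop-in files whose names use letters, digits, '_' and '-'.
    """
    path = os.path.join(directory, name)
    with open(path, "w") as f:
        f.write("# Make force-unsafe-io the default for dpkg runs.\n")
        f.write(OPTION + "\n")
    return path


if __name__ == "__main__":
    # Must be run as root to write under /etc.
    print("wrote", write_dpkg_dropin())
```

Deleting the file restores the default behavior.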

There was also a suggestion by Theodore Ts’o that upgrading to dpkg 1.15.8.7 would be sufficient to work around this kernel bug even without the force-unsafe-io option.

It is possible to upgrade to dpkg 1.16.0 on Lucid, but it requires pulling in the Natty Narwhal repositories and setting apt to prefer the Lucid repositories, then pulling in the Natty version of dpkg specifically.
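The "prefer Lucid" part is normally done with apt pinning. The sketch below only renders the text of a preferences file; it does not install anything. The priority values (900 and 200) and the use of the archive names "lucid" and "natty" are assumptions for illustration. A real setup would likely need extra stanzas for pockets such as lucid-updates and lucid-security, and should be checked against apt_preferences(5) before use.

```python
def render_preferences(preferred="lucid", fallback="natty",
                       preferred_priority=900, fallback_priority=200):
    """Return apt preferences text that favors one release over another.

    All default values are illustrative assumptions, not tested settings.
    """
    stanzas = [
        ("*", preferred, preferred_priority),  # normal source of packages
        ("*", fallback, fallback_priority),    # used only when asked for
    ]
    blocks = []
    for package, release, priority in stanzas:
        blocks.append(
            f"Package: {package}\n"
            f"Pin: release a={release}\n"
            f"Pin-Priority: {priority}\n"
        )
    # Stanzas in a preferences file are separated by a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    print(render_preferences())
```

With pins along these lines in place, the Natty dpkg would still have to be requested explicitly, since the lower priority keeps apt from choosing Natty packages on its own.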

Suggestions on how to fix the problem (aside from replacing the kernel) now include:

None of these answers are definitive and the search continues. Ubuntu bug #624877 is on this topic, as is Linux kernel bug #15426.

Correction: bug #15426 is a Linux kernel bug report, not a Debian bug report. We apologize for the error.
