Why doesn’t my /bin/sh script run under Ubuntu?

This is a very interesting question – and the resolution is simple. In Ubuntu 6.10 (Edgy Eft), the decision was made to replace the Bourne Again Shell (bash) with the Debian Almquist Shell (dash) as /bin/sh. There was considerable uproar on Ubuntu Brainstorm (community ideas) and in Ubuntu bug reports, because using dash instead of bash caused numerous scripts to break.

The reasoning given for this change was efficiency: dash is faster than bash. According to the explanatory document created by the Ubuntu developer team, Debian has required scripts to work on POSIX-compliant shells for some time (even pre-dating the Ubuntu project). Thus, any scripts that broke were, in essence, not “following directions” and deserved what they got.

To undo this change by the Ubuntu team, one can do this:

sudo dpkg-reconfigure dash

When this command executes, specify that you do not want dash to act as /bin/sh. This will make every script that runs /bin/sh run bash as has traditionally been the case.

You can also make your scripts run /bin/bash instead of /bin/sh; this provides all of the bash capabilities without any concern as to whether /bin/sh will change again.
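If you would rather leave /bin/sh alone, the other route is to replace bash-only constructs with POSIX ones so the script runs under either shell. Here is a minimal sketch, assuming a script that relied on the bash test [[ $string == prefix* ]]; the function name starts_with is made up for illustration.

```shell
#!/bin/sh
# Runs the same under dash and bash: no [[ ]], no arrays, no "function" keyword.

# starts_with STRING PREFIX
# POSIX replacement for the bash-only test [[ $string == prefix* ]].
starts_with() {
    case "$1" in
        "$2"*) return 0 ;;
        *)     return 1 ;;
    esac
}

if starts_with "dnsutils" "dns"; then
    echo "match"
fi
```

To see which shell /bin/sh currently points to, ls -l /bin/sh shows the symlink target.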

Making the boot process faster is a laudable goal, but like the removal of OSS from the kernel, it caused a lot of problems for users.

In both cases, it appears that the Ubuntu team is more focused on doing the technologically “right” thing rather than providing a stable and reliable platform. Unfortunately, this means that you cannot rely on Ubuntu to stay reliable – at least from one version to the next. The response of Ubuntu to such system failures has always been that they are doing the “right” thing and the problem must be fixed by someone else (i.e., it’s not Ubuntu’s problem).

Users – many of them system administrators – take the brunt of this: they don’t care whose fault it is, nor do they care whether the boot process is faster or whether the Linux sound environment is “cleaner”; they care about the stability of their systems. A system that boots faster doesn’t matter if it crashes during the boot process because of a broken script.

If the focus of Ubuntu were to provide a stable and unchanging environment, then their decisions would be different – and would result in an improved customer experience.

Putting Debian packages on hold

When administering a Debian (or Ubuntu) system, putting packages on hold can be very useful. For example, if a critical part of the system is used by developers, and is continually updated, the developers will want to be aware of updates and will want to check their code in the new environment. Programs like Tomcat, Cocoon, and MySQL fall into this category.

Similarly, if a critical portion of the system is to be updated, you wouldn’t want it to be part of the automatic updates – though you really shouldn’t automatically update, since you don’t know what can break until you test it.

To hold a package or packages, you should use dpkg --set-selections. If you run the command dpkg --get-selections you can see what is set already:

# dpkg --get-selections | head
acct                                            install
adduser                                         install
apparmor                                        install
apparmor-utils                                  install
apt                                             install
apt-transport-https                             install
apt-utils                                       install
aptitude                                        install
at                                              install
auditd                                          install

As an example, let’s consider the package dnsutils. Let’s see what would happen before we do anything:

# apt-get upgrade
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages will be upgraded:
  bind9-host dnsutils libbind9-60 libdns64 libisc60 libisccc60 libisccfg60 liblwres60
8 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
Need to get 1,257kB of archives.
After this operation, 0B of additional disk space will be used.
Do you want to continue [Y/n]? n
Abort.

Now let’s change this. We’ll put the package dnsutils on hold using dpkg --set-selections:

# echo dnsutils hold | dpkg --set-selections

Let us check the results:

# dpkg --get-selections | grep dnsutils
dnsutils                                        hold

Now, when we try the system update again, things have changed:

# apt-get upgrade
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages have been kept back:
  bind9-host dnsutils libbind9-60 libdns64 libisc60 libisccfg60 liblwres60
The following packages will be upgraded:
  libisccc60
1 upgraded, 0 newly installed, 0 to remove and 7 not upgraded.
Need to get 29.9kB of archives.
After this operation, 0B of additional disk space will be used.
Do you want to continue [Y/n]? n
Abort.

Now, dnsutils – and its related packages – are being held back, just as we wanted. The other packages are being held back because they are only required by dnsutils; without upgrading dnsutils, they won’t be upgraded either.
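A hold is just another selection state, so it can be listed and reversed the same way it was set. The following is a small sketch; the helper name list_held is hypothetical, and it only filters text in the dpkg --get-selections format shown earlier.

```shell
#!/bin/sh
# list_held: read "package<whitespace>state" lines on stdin and
# print the names of the packages whose state is "hold".
list_held() {
    awk '$2 == "hold" { print $1 }'
}

# Typical use on a real system (read-only, no root needed):
#   dpkg --get-selections | list_held
#
# To release the hold later, set the state back to "install" (as root):
#   echo dnsutils install | dpkg --set-selections
```

Newer apt releases also provide apt-mark hold and apt-mark unhold, which set the same state.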

Disabling compcache in Ubuntu Jaunty (and Related Swap Errors)

If you have installed Ubuntu recently, you may have compcache enabled. This is a memory-based swap cache and its presence is unnecessary and unexpected in a permanent installation (it was designed for LiveCD operations). There is a bug report about compcache being enabled, along with directions on how to remove it.

This bug can also be seen if you are seeing errors like these:

Mar 6 17:27:29 server kernel: [14438.135859] compcache: Error allocating memory for compressed page: 60594, size=4096
Mar 6 17:27:29 server kernel: [14438.135871] Write-error on swap-device (254:0:484752)
Mar 6 17:27:29 server kernel: [14438.136813] allocation failed: out of vmalloc space - use vmalloc= to increase size.
Mar 6 17:27:29 server kernel: [14438.136824] compcache: Error allocating memory for compressed page: 60595, size=2093
Mar 6 17:27:29 server kernel: [14438.136835] Write-error on swap-device (254:0:484760)
Mar 6 17:27:29 server kernel: [14438.137088] allocation failed: out of vmalloc space - use vmalloc= to increase size.
Mar 6 17:27:29 server kernel: [14438.137098] compcache: Error allocating memory for compressed page: 60596, size=4079
Mar 6 17:27:29 server kernel: [14438.137108] Write-error on swap-device (254:0:484768)

You can also see it when you print swap information with the command swapon -s – if compcache is enabled, one of the swap entries will be “ramzswap”.
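That check is easy to script. This is a minimal sketch, assuming only that the device name in the swap listing contains the string ramzswap as described above; the function name has_compcache is made up for illustration.

```shell
#!/bin/sh
# has_compcache: succeed if the swap listing on stdin names a ramzswap device.
has_compcache() {
    grep -q 'ramzswap'
}

# Typical use:
#   if swapon -s | has_compcache; then
#       echo "compcache is active"
#   fi
```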

To disable compcache completely, do this:

rm -f /usr/share/initramfs-tools/conf.d/compcache && update-initramfs -u

The file compcache contains this line – which is what enables (and sizes) compcache:

COMPCACHE_SIZE="25%"

This was summarized nicely in this email on the ubuntu-users mailing list in February of this year.

Bug: synergyc freezes

If you are using Synergy in your daily work, you may have noticed that the Linux client is not working as it should. A bug (Bug #194029) reported to the Ubuntu development team provides extensive reports about the problem and possible resolutions. The biggest problem is sifting through all of them, as well as the realization that they don’t seem to have it fixed yet.

Admittedly, my problems are with Fedora 9 and Windowmaker (which shows that it’s not Ubuntu-specific, nor is it specific to GNOME or KDE). However, the resolutions seem to work under Fedora just as well as under Ubuntu.

The resolutions recommended are:

  • Run synergy as root: sudo synergyc. This resolution seems to be the one least likely to work.
  • Run synergy with the highest priority possible using chrt, which takes the process ID of the running client: chrt -p 99 PID. This method can be incorporated into a startup script like so: /usr/bin/synergyc myserver; pgrep synergyc | sudo xargs chrt -p 99
  • Recompiling the Linux kernel with a different scheduler: instead of configuring with CONFIG_FAIR_USER_SCHED use CONFIG_FAIR_CGROUP_SCHED.
  • Patching synergyc to fix the problem.
  • Enabling (for Ubuntu only) the hardy-proposed repository and updating the kernel to 2.6.24-16 or 2.6.24-17-generic seemed to work (although there were complaints that the desktop became sluggish).
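The chrt approach can be wrapped in a small helper so that it copes with more than one synergyc process. This is only a sketch: the function name boost_pids and the DRY_RUN switch are inventions for illustration, and it assumes sudo is set up for the user running it.

```shell
#!/bin/sh
# boost_pids: read one process ID per line on stdin and give each
# process real-time priority 99 with chrt.
# If DRY_RUN is set to a non-empty value, print the commands instead
# of running them.
boost_pids() {
    while read -r pid; do
        if [ -n "${DRY_RUN:-}" ]; then
            echo "chrt -p 99 $pid"
        else
            sudo chrt -p 99 "$pid"
        fi
    done
}

# Typical use, after synergyc has been started:
#   pgrep synergyc | boost_pids
```

Handling one PID per call avoids handing chrt several process IDs at once.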

In the case of Fedora 9 at least, this bug remains present even though it is almost a year old. I don’t use synergyc on an Ubuntu client – I use my Kubuntu host almost entirely at the console directly.

So what is the answer? I’d try using chrt first (for me that lessened the problem dramatically) and try upgrading to a new kernel configuration.

Overview: how to install UNIX/Linux to a machine with no bootable disk

Installing operating systems to the HP nc4010 ultralight notebook has been an exercise in how to accomplish the seemingly impossible: installing an operating system to a laptop with no removable disk and no bootable disk.

Generally, there are three different ways to do this:

  • Boot from the network using PXE.
  • Boot from an external add-on device such as USB CDROM or USB memory device.
  • Create a bootable disk in another system and install the disk afterwards.

Booting from the network requires several servers to be set up, including a TFTP server, an NFS server, and a DHCP server. Though they could all be on the same machine, it does represent a significant amount of setup and configuration just to install, including the need to copy all installation parts to the NFS server to be served up to clients. In addition, there are special configurations needed for DHCP to get this started.

Booting from an external device is much easier; it can be done on the nc4010 and probably on most laptops from the last 10 years or so. This method is probably the easiest and involves the least fuss.

Alternately, it is possible to install the operating system normally in another system and then transfer the disk over to the new system. The major problem is that the disk locations all change: what had been the second disk (hd1) in the build system is now the first disk (hd0) in the new one. All of this will need to be changed in order to have the new system boot properly.

The boot loader may also need to be changed to recognize the new location of the disk.

Linux has a kernel parameter, “root=/dev/zzzz”, which tells the boot process where the system root disk is. After that, /etc/fstab will have to be changed to match (which is standard everywhere).
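The /etc/fstab change is mechanical enough to script. This sketch assumes device names of the /dev/sdX form and a disk that moved from /dev/sdb to /dev/sda; both names, the /mnt mount point in the usage comment, and the helper name remap_device are illustrative, not taken from the nc4010 install.

```shell
#!/bin/sh
# remap_device OLD NEW
# Read fstab-style text on stdin and rewrite lines whose device field
# starts with OLD (for example /dev/sdb) so that it starts with NEW
# (for example /dev/sda). Only text at the start of a line is changed,
# so mount points and comment lines are left alone.
remap_device() {
    sed "s|^$1|$2|"
}

# Typical use, with the moved disk's root filesystem mounted on /mnt:
#   remap_device /dev/sdb /dev/sda < /mnt/etc/fstab > /tmp/fstab.new
# Review /tmp/fstab.new before copying it into place.
```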

Solaris has UFS and ZFS, and UFS can be modified to reflect a new source disk location. ZFS is more troublesome, as the filesystem is newer and has been usable as a boot drive for hardly any time at all. I still do not have an understanding of how to convert ZFS from using one boot disk to another (in name only) – once that happens, I’ll have OpenSolaris on an nc4010.

Tethering a Nokia 6165i to a Compaq nc4010 using Bluetooth (Kubuntu)

This was not that hard to do for a sysadmin and geek, but it would be for a general user. One of the things I like about Kubuntu is its ease of use and its preconfigured and debugged desktop, but here it fails. The support for setting up a dialup networking connection is sorely inadequate and completely unintuitive. Most of the difficulty stems from several problems:

  • The menu structure in KNetworkManager is very obscure and non-obvious.
  • Configurations created by kppp and by KNetworkManager (running kppp) result in two different configurations in two different locations.
  • Configuration of the ppp0: network device is not available anywhere related to networking.

A lot of this problem stems from the way KNetworkManager is set up; however, the network configuration interface should also allow for the creation of a ppp0: device.

There also seems to be no way to create a tethered link to a bluetooth device without using the command line. Not a problem for me nor for many of you, but for a general user it would be a real stumbling block. (I guess I shouldn’t be surprised; from the project page comes this bit about dial-up support: “The rudimentary dial-up support gives the user the ability to connect dial-up connections configured using YaST. There is much room for improvement. As of 16/11/2007 in Opensuse 10.3, this is broken.”)

Most of the directions I took from this article titled Bluetooth DUN tethering – Linux and KDE (Might work with Gnome) from PinStack.com (a support site for Blackberry users).

In this case, I will assume that you know how to pair a cell phone device with your Linux laptop; it’s fairly easy (though not completely intuitive). I’ll also assume that all of the requisite bluetooth support is available and already installed.

You’ll need to get the MAC address for the bluetooth device. This could be done any number of ways; from the command line this will do:

$ hcitool scan
Scanning ...
00:12:D1:C9:DF:5E Nokia 6165i
$

Once the MAC address is found, the next step is to find the appropriate bluetooth channel for dial-up networking. This can be done using the sdptool command:

$ sdptool browse
Inquiring ...
Browsing 00:12:D1:C9:DF:5E ...
Service Name: Dial-up networking
Service RecHandle: 0x10000
Service Class ID List:
"Dialup Networking" (0x1103)
"Generic Networking" (0x1201)
Protocol Descriptor List:
"L2CAP" (0x0100)
"RFCOMM" (0x0003)
Channel: 1
Language Base Attr List:
code_ISO639: 0x656e
encoding: 0x6a
base_offset: 0x100
Profile Descriptor List:
"Dialup Networking" (0x1103)
Version: 0x0100

A lot of information spits out, but the relevant entry for me was this one: dial-up networking is on RFCOMM channel 1. Now create the serial link that connects the “modem” to the operating system – in particular, a device /dev/rfcomm0 is created for communicating with the bluetooth modem (which is what the phone becomes). This is done thusly:

# rfcomm bind /dev/rfcomm0 00:12:D1:C9:DF:5E 1
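Picking the channel out of the sdptool listing by eye is error-prone when a phone advertises many services. The sketch below pulls it out automatically; it assumes the service is named “Dial-up networking” as on this Nokia (other phones may use a different name), and the function name dun_channel is made up for illustration.

```shell
#!/bin/sh
# dun_channel: read `sdptool browse` output on stdin and print the RFCOMM
# channel listed under the "Dial-up networking" service record.
dun_channel() {
    awk '
        # Each record starts with a Service Name line; remember whether
        # the current record is the dial-up networking one.
        /Service Name:/ { in_dun = ($0 ~ /Dial-up networking/) }
        # Within that record, the first Channel line is the one we want.
        in_dun && /Channel:/ { print $2; exit }
    '
}

# Typical use:
#   sdptool browse 00:12:D1:C9:DF:5E | dun_channel
```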

Once this is done, assuming the right channel was chosen, then kppp can be configured. Because of the conflict between kppp configuration and the KNetworkManager configuration, a setup step may be worth doing (as root):

# cd ~/.kde/share/config/
# rm -f kppprc
# ln -s ~me/.kde/share/config/kppprc

What this does, then, is make it so that your personal configuration for kppp is also used by root. Now kppp can be run from the command line (since it’s already open, right?) and you won’t have to go searching for the configuration in the menu tree:

$ kppp

After kppp is configured, you should be able to use your phone as a cellular modem. The configuration I used (for U.S. Cellular) worked, and has been described in these pages in the past.

Running Kubuntu Intrepid Ibex Alpha 6 on a Compaq nc4010

e1000e driver in most recent Linux kernel causes corruption!

Before you try using Intrepid Ibex Alpha 6: there seems to be a problem with the e1000e driver that causes the hardware to be corrupted and could render the e1000e card useless – and even unrepairable. Even if you are using a Linux system that currently uses an e1000 driver, the new Linux kernel shifted some of the e1000 support to the e1000e driver. If you are using the Compaq nc4010 as this article describes, you should be fine: the nc4010 uses the tigon3 driver. There is a bug entry in the Ubuntu bug lists, and the linux-net mailing list has a thread on the bug. It also would appear that a fix went into the -mm kernel tree (a recognized spinoff of Linus’s tree) as of 2.6.27-rc5-mm1. If you are hardy enough to run Ubuntu Intrepid, perhaps you could tangle with the experimental -mm kernel as well – and sleep better at night knowing your hardware won’t be wasted.

This turned out to be quite a challenge. Firstly, there was no way to install it directly – the previously mentioned Billix didn’t accommodate Intrepid and I didn’t have any large enough USB sticks to put a USB bootable image onto – assuming there is one for Intrepid.

However, remaining undaunted, I was able to install Intrepid without too much trouble – though any nontechnical user would have been stopped right up front. How’d I do it?

First, I installed VirtualBox onto another available system – using VirtualBox 2.0 – and then downloaded the Kubuntu install CD to that system. Installing Kubuntu Intrepid Ibex to VirtualBox was not a problem; everything went well. It was, however, quite slow! It turned out I had much less memory in the system than I thought – so between the 512 MB in the system and the 300+ MB that was allocated to the VirtualBox instance, I did a lot of waiting (sigh).

I made sure that the hard disk that was created in the virtual environment was smaller than the actual disk used by the system I was going to put Intrepid onto.

Once the virtual environment was complete and the install was finished, I stopped the environment and reconfigured for less memory. I then restarted the environment with a DSL disk (so as to not use the created virtual disk in any form).

I extracted the hard disk from the Compaq nc4010 that was to have Intrepid installed onto it, and removed it from the hard disk cage that notebooks like to use. I then connected the hard disk to a USB port using a cable adapter.

Now – with the virtual environment running DSL and an unused disk configured with Kubuntu Intrepid, and the host system running OpenSUSE 10.3 with the target disk attached via USB – the only thing left was a disk copy over the network. Using nc (or netcat) permitted the copy to go directly from the virtual guest to the host.

The networking had to be set up, and required the bridged mode. It appeared that the host had to already be configured for networking and active in order for the guest to be able to talk to the host (using NAT), but perhaps that was just me.

Once networking was set, the only thing that was needed was to copy from the guest (HOST stands for the address of the receiving host, where the listener must already be running):

# nc HOST 4117 < /dev/hda

And copy to the host:

# netcat -l -p 4117 > /dev/sda

Note that this copy will copy the entire disk, including partition tables. This was by design and worked fine (apparently).

The biggest problem (aside from speed and memory overcommitment) was the fact that nc did not stop after copying all of the data from the hard drive – and there was no visible progress report anywhere. The destination disk had no activity light; nc had no progress report available; and so forth.
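With no progress report and an nc that never exits, one way to confirm the copy really finished is to checksum the same number of bytes on both sides and compare the results. This is a sketch of that check, not something I ran at the time; the function name prefix_cksum is made up, and it assumes head -c and cksum are available on both systems.

```shell
#!/bin/sh
# prefix_cksum FILE BYTES
# Print the CRC (as computed by cksum) of the first BYTES bytes of FILE.
# FILE may be a disk device such as /dev/hda or /dev/sda.
prefix_cksum() {
    head -c "$2" "$1" | cksum | awk '{ print $1 }'
}

# Typical use: take the size of the (smaller) source disk in bytes, then
# run the same command on each side and compare the two numbers.
#   guest# prefix_cksum /dev/hda SIZE_IN_BYTES
#   host#  prefix_cksum /dev/sda SIZE_IN_BYTES
```

Only the first SIZE_IN_BYTES bytes are compared because the destination disk is larger than the source, so anything beyond that point is not part of the copy.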

Another “problem” was the fact that the command nc did not exist on OpenSUSE but did in DSL: OpenSUSE used the full name for the command, netcat.

Once everything was copied, I could shut down the virtual guest environment, and put the hard drive back into the original host (using the cage as mentioned before).

This did work and worked beautifully. The biggest problems came not from the installation, but from the fact that the installation is Alpha 6. Some of the programs on the task bar don’t have a proper background, several things crashed, icons are missing for some programs in the menu (including some programs that have icons shown elsewhere, like Amarok). I don’t know whether to be aghast or just patient – it is alpha software, after all. I’d just expected the most obvious bugs to be gone, but whatever.

Lastly, every time I hear that name…. am I the only one who thinks of the man called Intrepid?

Installing OpenSUSE 10.3 onto a HP nc4010

I installed OpenSUSE 10.3 onto an HP nc4010, and it went smoothly. I am still working out the problems (here and there) as well as creating a number of new problems as I keep piling on the software (I always do that….)

This time I downloaded Billix and expanded it to cover installs of OpenSUSE 10.3, OpenSUSE 11.0, and Fedora 9. Billix is not actually a Linux distribution; it is a collection of distributions that are installable from the USB stick or CDROM (as well as utilities such as chntpw, memtest+, and Darik’s Boot and Nuke).

The biggest problem with installation seems to be that the grub installers I’ve seen so far cannot cope with the situation that Billix presents:

  • The boot disk (install disk) is the first in the chain (i.e., hd0).
  • The operating system is being installed on the second disk (i.e., hd1).
  • The startup disk (after installation) will be what is hd1 during installation, but will become hd0 on startup.

The end result is that the operating system install does almost everything just right but then installs grub onto the USB stick, and configures it to boot the hard disk. Thus, if you boot normally, the process halts mysteriously with no message; if you boot with the USB stick in place (booting from USB), then the USB stick will boot the operating system located on the hard drive.

Recovering both Billix and the native operating system is easy enough. To recover Billix, just redo the master boot record initialization process:

  • Install the MBR: install-mbr -p1 /dev/usbstick
  • Reactivate and reinstall syslinux: syslinux -s /dev/usbstick1

Note that install-mbr requires the disk device (such as /dev/sdb) whereas syslinux requires the relevant partition (such as /dev/sdb1).
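Because mixing up the disk and the partition is the easy mistake here, it can help to generate both commands from a single device name and read them over before running anything. A minimal sketch follows; the helper name billix_recover_cmds is made up, and it assumes the Billix partition is the first one on the stick.

```shell
#!/bin/sh
# billix_recover_cmds DISK
# Print (do not run) the two recovery commands for a USB stick whose
# whole-disk device is DISK, for example /dev/sdb. The first partition
# is assumed to be DISK followed by "1".
billix_recover_cmds() {
    disk=$1
    part="${disk}1"
    echo "install-mbr -p1 $disk"
    echo "syslinux -s $part"
}

# Typical use:
#   billix_recover_cmds /dev/sdb
```

Printing rather than executing is deliberate: both commands overwrite boot sectors, so the device name deserves a second look first.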

Once the disk is properly configured, it is just a matter of finishing the install process. The install process reboots to finish, and it is all quite straightforward. Installing OpenSUSE is a breeze, and the amount of work that has gone into making a very easy-to-use desktop Linux is obvious from start to finish.

Even the bluetooth daemon, which caused problems after hibernation under Kubuntu, had absolutely no problems in OpenSUSE. Even turning the device on and off using the button on the laptop worked beautifully.

One thing that stood out was that there is no way to pair a bluetooth device ahead of time. Let me explain…. If you go looking for a way to pair a device, there isn’t one. If you try to use your bluetooth device (copy files to it, etc.) then the system will ask you to pair the device at that time. I would have preferred both options, but oh well.

The experience of using OpenSUSE has been a delight; everything has been designed to present you with the best possible Linux experience. The choice of task bar applications on startup, the configuration of the desktop, the entire experience shows an attention to detail that many distributions do not have.

As a system administrator, you should try Billix. As a user, you should try OpenSUSE. Simple, eh?