Linux Filesystem Comparison Test

Recently, Google announced it would move from ext2 to ext4 and also announced that it had hired Theodore Ts’o, one of the principal developers of ext4.

Now the technology site Phoronix has benchmarked ext3, ext4, XFS, ReiserFS, and Btrfs (no word on why JFS was not included).

The interesting thing about these benchmarks is how mixed the results are: almost every filesystem came in first place at least once.

Another interesting thing is that the tests were done on a solid state SATA drive, the OCZ Agility EX 60GB SSD. One wonders what the tests would show with an IDE hard drive.

I’ve always been partial to XFS, given its support for online expansion, its large capacity, and its reasonable performance. Of course, the target keeps moving, but XFS has always been respectable. Unfortunately, these benchmarks don’t show it as the best.

The article recommends using ext3 over ext4 for now, given the loss in filesystem performance with ext4. This is particularly interesting given Google’s choice to migrate from ext2 to ext4; however, their choice was reportedly based on data-loss concerns with their existing filesystem and a need for better protection.

Now if only someone would benchmark OpenVMS ODS-5.

Google Moves to ext4 Filesystem

Michael Rubin announced that internally Google had decided to move from ext2 to ext4 after careful consideration of ext4, IBM’s JFS, and SGI’s XFS.

Along with this, Google hired Theodore Ts’o, a longtime developer of the ext2, ext3, and ext4 filesystems, to help with the migration.

The decision came down to XFS versus ext4, and the easier migration to ext4 was the deciding factor for Google. I am partial to XFS (it’s older and is perhaps more stable), but ext4 should be good as well.

I switched to OpenSUSE at one time because they offered XFS and Red Hat did not – and converted a Red Hat 7.1 install to XFS as well. Never had any problems with either installation at all.

Sparse files and Virtual Machines

Sparse files are files that take up less space on disk than their apparent size. How is this possible? Blocks that consist entirely of zeros are not stored; they are silently skipped. When the system reads one of these blocks later, it returns a block of zeros, and if data is written into the block, it is allocated and saved to the physical disk as usual.
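A quick way to see this on a Linux system is to create a file with a large apparent size and compare that with the space actually allocated. This is only a sketch, assuming GNU coreutils (truncate, stat, du) and a filesystem that supports sparse files; the file name sparse-demo.img is made up for the illustration.

```shell
#!/bin/sh
# Create a 100 MiB file without writing any data; on a filesystem with
# sparse-file support this allocates few or no blocks.
demo=sparse-demo.img        # hypothetical file name
truncate -s 100M "$demo"

# Apparent size in bytes (what ls -l reports).
apparent=$(stat -c %s "$demo")

# Space actually allocated, in bytes: block count times the block unit size.
blocks=$(stat -c %b "$demo")
unit=$(stat -c %B "$demo")
allocated=$((blocks * unit))

echo "apparent:  $apparent bytes"
echo "allocated: $allocated bytes"

# du shows the same two views, in KiB.
du --apparent-size -k "$demo"
du -k "$demo"
```

On a filesystem without sparse-file support, the two numbers would be expected to come out roughly equal.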

Sparse files require filesystem support; if the filesystem you use doesn’t support sparse files, then you have no recourse but to store every file in full. Notable filesystems without support for sparse files are any FAT filesystem (MS-DOS), HPFS (the High Performance File System used by OS/2), HFS+ (the Macintosh filesystem), and OCFS v1 (the Oracle Cluster File System). Modern filesystems such as NTFS (Windows NT Filesystem), VxFS (Veritas File System), GFS, XFS, JFS, ext2, ext3, Reiser3, Reiser4, and many others all support sparse files. No word on whether ODS-5 (OpenVMS) supports sparse files.

Sparse file support is valuable for virtual machines: a virtual hard drive is necessarily large, but much of it is typically empty space.

However, over time, data gets written to the virtual machine’s hard drive and is later deallocated by the guest. The stale data remains in those blocks, and the blocks remain allocated on the host’s disk. These blocks accumulate, and the file grows, filled with data that is no longer used.

The straightforward way to free these unused blocks is to zero them out and then copy the file as a sparse file. You might also want to defragment the virtual disk first if that is relevant to your virtual machine’s operating system.

First, zero the unused blocks. Typically, this is done with an erasing program suited to the operating system in the virtual machine. Make sure that the program (a) erases only unused space and (b) writes zeros, not random data or other wipe patterns.
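For a Linux guest, one common approach is to fill the free space with a file of zeros and then delete it. The function below is a sketch under that assumption, not a replacement for a proper tool: the name zero_free_space, the scratch file name, and the optional size cap are all made up here, and GNU dd is assumed. Filling a filesystem completely can disrupt running services, so this is best done on a quiet guest.

```shell
#!/bin/sh
# zero_free_space DIR [MAX_MIB]
# Hypothetical helper: writes zeros into a temporary file under DIR until the
# filesystem is full (or until MAX_MIB mebibytes have been written), then
# deletes the file so the space it occupied is free again but holds zeros.
zero_free_space() {
    dir=$1
    max_mib=${2:-}
    zero_file="$dir/zero-fill.tmp"      # hypothetical scratch file name

    if [ -n "$max_mib" ]; then
        dd if=/dev/zero of="$zero_file" bs=1M count="$max_mib" 2>/dev/null
    else
        # dd exits non-zero when the disk fills up; that is expected here.
        dd if=/dev/zero of="$zero_file" bs=1M 2>/dev/null || true
    fi

    sync                    # make sure the zeros reach the virtual disk
    rm -f "$zero_file"
}
```

Inside the guest, something like `zero_free_space /` (as root) would cover the root filesystem; space reserved by the filesystem itself may not be reached this way.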

Once the program is done wiping, shut down the virtual machine. (Yes, this process means downtime.) Copy the original file to the new file using the appropriate flags to make the new file sparse. For Linux (using GNU cp) one could do this:

# Replace the original only if the sparse copy succeeded.
cp --sparse=always oldvm newvm &&
mv -f newvm oldvm

These steps will replace the old VM file with a new sparse VM file that uses much less space.
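Because the original file is replaced, it may be worth confirming that the copy is byte-for-byte identical first. A minimal sketch, assuming GNU cp, cmp, and stat, and enough free space on the host to hold the copy; the function name compact_image and the temporary file suffix are hypothetical:

```shell
#!/bin/sh
# compact_image FILE
# Hypothetical helper: re-copies FILE as a sparse file, verifies the copy,
# and only then replaces the original.
compact_image() {
    src=$1
    tmp="$src.sparse.tmp"               # hypothetical temporary name

    before=$(stat -c %b "$src")         # allocated blocks before

    cp --sparse=always "$src" "$tmp" || return 1

    # Holes read back as zeros, so a faithful copy compares equal.
    if ! cmp -s "$src" "$tmp"; then
        rm -f "$tmp"
        echo "copy differs from original; $src left untouched" >&2
        return 1
    fi

    mv -f "$tmp" "$src" || return 1
    after=$(stat -c %b "$src")          # allocated blocks after
    echo "allocated blocks: $before -> $after"
}
```

With the virtual machine shut down, `compact_image oldvm` would then stand in for the copy-and-replace commands above.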