Sparse files and Virtual Machines

Sparse files are files that occupy less space on disk than their apparent size. How is this possible? Any blocks that contain nothing but zeros are not stored; they are silently skipped. When the system reads such a block later, it returns a block of zeros, and if data is written to the block, the block is allocated and saved to the physical disk at that point.
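To see the difference between the two sizes, here is a minimal sketch assuming a Linux host with GNU coreutils (truncate and stat). The scratch file is made up for the illustration; nothing here is specific to virtual machines.

```shell
# Create a 10 MiB file without writing any data into it.
img=$(mktemp)                    # hypothetical scratch file
truncate -s 10M "$img"

apparent=$(stat -c %s "$img")    # apparent size in bytes
blocks=$(stat -c %b "$img")      # number of blocks actually allocated
blksize=$(stat -c %B "$img")     # size in bytes of each of those blocks
allocated=$((blocks * blksize))

echo "apparent size: $apparent bytes"
echo "allocated:     $allocated bytes"
rm -f "$img"
```

On a file system with sparse file support, the allocated figure should be zero or close to it, while the apparent size is the full 10 MiB.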

Sparse files require file system support; if the file system you use doesn’t support sparse files, then every file is stored in full, zeros and all. Notable file systems without support for sparse files are any FAT file system (MS-DOS), HPFS (the High Performance File System used by OS/2), HFS+ (the Macintosh file system), and OCFS v1 (the Oracle Cluster File System). Modern file systems such as NTFS (Windows NT File System), VxFS (Veritas File System), GFS, XFS, JFS, ext2, ext3, Reiser3, Reiser4, and many others all support sparse files. No word on whether ODS-5 (OpenVMS) supports sparse files.

Sparse file support is valuable for virtual machines: a virtual hard drive is necessarily large, but by its nature it also contains a lot of empty space.

However, over time, data gets written to the virtual machine’s hard drive and is then deleted. The guest only marks those blocks as free; the data remains in them, and the blocks remain allocated on the host’s disk. These blocks accumulate, and the file grows, filled with data that is no longer used.

The way to free these unused blocks is to zero them out and then copy the file as a sparse file. If defragmenting is relevant to your virtual machine’s operating system, you might also want to do that before zeroing, since moving data around leaves stale copies behind in the freed blocks.

First, zero the unused blocks. Typically, this is done with an erasing program suited to the operating system in the virtual machine. Make sure that the program (a) erases only unused space, never live data, and (b) writes zeros rather than random data or other wipe patterns.
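If the guest runs Linux and no dedicated erasing program is at hand, a common substitute is to fill the free space with a file of zeros and then delete it. The sketch below assumes GNU dd; the function name zero_fill and the scratch file name zero.fill are made up for this example, and the optional size cap exists only so the function can be tried out safely.

```shell
# zero_fill DIR [MAX_MIB]: write zeros into a scratch file in DIR, then delete it.
zero_fill() {
    dir=$1
    max_mib=${2:-}          # optional cap in MiB; empty means "until the disk is full"
    tmp="$dir/zero.fill"    # hypothetical scratch file name
    if [ -n "$max_mib" ]; then
        dd if=/dev/zero of="$tmp" bs=1M count="$max_mib" 2>/dev/null
    else
        # dd exits with an error once the disk is full; that is expected here
        dd if=/dev/zero of="$tmp" bs=1M 2>/dev/null || true
    fi
    sync            # flush the zeros out to the virtual disk
    rm -f "$tmp"    # free the blocks again; they now hold zeros
}
```

Inside the guest, something like zero_fill / (as root) would cover the root file system. The disk is briefly full while this runs, so stop any services that need to write first.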

Once the program is done wiping, shut down the virtual machine. (Yes, this process means downtime.) Copy the original file to a new file, using the appropriate flags to make the new file sparse; the host needs enough free space to hold the copy alongside the original. On Linux (using GNU cp) one could do this:

# Copy, turning runs of zeros into holes; replace the original only if the copy succeeded
cp --sparse=always oldvm newvm && mv newvm oldvm

These steps will replace the old VM file with a new sparse VM file that uses much less space.
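For a more cautious version of the same steps, the sketch below checks that the copy is byte-for-byte identical before it replaces the original. It assumes GNU cp plus the standard cmp utility; the function name sparsify and the temporary file suffix are made up for this example.

```shell
# sparsify FILE: rewrite FILE as a sparse file, keeping the original on any failure.
sparsify() {
    src=$1
    tmp="$src.sparse.tmp"   # hypothetical temporary name
    cp --sparse=always "$src" "$tmp" || { rm -f "$tmp"; return 1; }
    # Only replace the original if the copy has identical contents
    if cmp -s "$src" "$tmp"; then
        mv "$tmp" "$src"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Running du -h on the VM file before and after sparsify shows how much space was reclaimed, while ls -l should report the same apparent size both times.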
