More tips on using find

My entry on using find turned out to be popular; I thought I’d throw out some tidbits on using find for various things. So sit back and enjoy the list!

find . -type f | xargs ls -ld

List all regular files in the hierarchy starting at the working directory.
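One catch with piping into xargs is that it splits its input on whitespace, so a filename containing a space gets broken into pieces. Here is a minimal sketch of one way around that, letting find hand the paths to ls itself. The scratch directory and filenames are made up for the demonstration, and mktemp is assumed to be available.

```shell
#!/bin/sh
# Scratch directory with one awkward filename (hypothetical names).
demo=$(mktemp -d)
touch "$demo/plain.txt" "$demo/two words.txt"

# find passes each path to ls as a single argument, so the space is harmless.
listing=$(find "$demo" -type f -exec ls -d {} +)
printf '%s\n' "$listing"

# ls prints one line per file when its output is not a terminal.
safe_count=$(printf '%s\n' "$listing" | wc -l | tr -d ' ')

rm -rf "$demo"
```

If your find and xargs both support them, -print0 and -0 are another common way to get the same safety.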

find . -type d | xargs ls -ld

Get a list of directories, showing the tree structure starting at the working directory.

find . -mtime +2

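Show all files last modified more than two days ago. find counts age in whole 24-hour periods and drops the fraction, so +2 really matches files that are at least three days old.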
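To see the -mtime test in action without waiting a few days, you can backdate a file with touch -t. This is only a sketch: the directory and the two log names are invented for the example.

```shell
#!/bin/sh
demo=$(mktemp -d)

# One file stamped right now, one backdated to 1 Jan 2000 (CCYYMMDDhhmm).
touch "$demo/new.log"
touch -t 200001010000 "$demo/old.log"

# +2: modified more than two full days ago.
old_files=$(find "$demo" -type f -mtime +2)

# -1: modified within the last 24 hours.
recent_files=$(find "$demo" -type f -mtime -1)

printf 'old:    %s\nrecent: %s\n' "$old_files" "$recent_files"
rm -rf "$demo"
```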

find . -size +10000

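Show all large files (possibly for freeing up some space?). With no unit suffix the size is counted in 512-byte blocks, so +10000 means larger than about 5 MB.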
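The units for -size are easy to get wrong, so here is a small sketch that builds one file of a known size and checks it two ways: in 512-byte blocks (the default) and in bytes (the c suffix). The filenames are placeholders.

```shell
#!/bin/sh
demo=$(mktemp -d)

# 6000 KiB = 6,144,000 bytes = 12000 blocks of 512 bytes.
dd if=/dev/zero of="$demo/big.img" bs=1024 count=6000 2>/dev/null
printf 'tiny\n' > "$demo/small.txt"

# Default unit: 512-byte blocks. 12000 > 10000, so big.img matches.
by_blocks=$(find "$demo" -type f -size +10000)

# Same threshold spelled in bytes: 10000 * 512 = 5120000.
by_bytes=$(find "$demo" -type f -size +5120000c)

printf '%s\n%s\n' "$by_blocks" "$by_bytes"
rm -rf "$demo"
```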

find /*bin /usr/*bin /usr/local/*bin -name "somebin"

Find a file by the given name in any of the usual locations for executables (including both bin and sbin).

find . -cpio dev

Find files (as specified) and write them to a cpio archive on the device named in the -cpio parameter. This option is not available everywhere (GNU find, for one, does not have it); check if your find has it first.

find / -nogroup -o -nouser

Find all files on the system whose group or owner ID has no matching entry in /etc/group or /etc/passwd; files such as these are a security risk and should be assigned to a valid group and owner. Remember to run a find command that starts at / at a time when users will not be inconvenienced by the massive search.

find / -perm -4000 -o -perm -2000

Find all files that are setuid or setgid; again, these may be a security risk. You should know which binaries are (and need to be) setuid and setgid.
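If you want to try this kind of search somewhere safer than /, here is a sketch that uses the octal mode bits (4000 for setuid, 2000 for setgid) on a scratch directory. The two filenames are invented, and the example assumes you are allowed to set the setuid bit on your own files.

```shell
#!/bin/sh
demo=$(mktemp -d)
touch "$demo/plain" "$demo/suid-tool"
chmod 755  "$demo/plain"
chmod 4755 "$demo/suid-tool"    # leading 4 sets the setuid bit

# -perm -4000: setuid bit set; -perm -2000: setgid bit set.
# The parentheses keep the -o grouped under -type f.
flagged=$(find "$demo" -type f \( -perm -4000 -o -perm -2000 \))

printf '%s\n' "$flagged"
rm -rf "$demo"
```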

On using the -exec parameter:
If you use the “old school” form, with {} and \; (such as find . -exec rm -f {} \;) then the command will be executed once for each file found. If you use the “new school” form with {} followed by + (such as find . -exec rm -f {} +) then find hands the command as many files as will fit on one command line, so it typically runs only once or a handful of times. Note the space after {} in both forms, and that the + needs no backslash.
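You can count the runs yourself. In this sketch the command being executed is a tiny inline shell that prints one line each time it starts, so the line count equals the number of invocations. The three filenames are arbitrary.

```shell
#!/bin/sh
demo=$(mktemp -d)
touch "$demo/a" "$demo/b" "$demo/c"

# \; form: the inner sh starts once per file, printing one line each time.
per_file=$(find "$demo" -type f -exec sh -c 'echo started' sh {} \; | wc -l | tr -d ' ')

# + form: all three paths fit on one command line, so sh starts once.
batched=$(find "$demo" -type f -exec sh -c 'echo started' sh {} + | wc -l | tr -d ' ')

printf 'per file: %s run(s), batched: %s run(s)\n' "$per_file" "$batched"
rm -rf "$demo"
```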

The + form does limit the number of parameters: find builds each command line to stay within the system's length limit and starts another run of the command when one fills up. The xargs command is the traditional way to do this. Its biggest advantage over running a command once per file is that the command is started once per batch instead, and the program's code usually stays cached in memory between runs. It also lets you specify a limit on the command line size (-s), or a limit on the number of arguments per run (-n) for that matter.
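As a quick illustration of the argument limit, here is a sketch that feeds five made-up words to xargs and caps each run at two arguments with -n. echo stands in for whatever command you would really run.

```shell
#!/bin/sh
# Five items, at most two per invocation: echo runs three times.
batches=$(printf '%s\n' one two three four five | xargs -n 2 echo)
printf '%s\n' "$batches"

# Each echo run produced one line, so the line count is the run count.
batch_count=$(printf '%s\n' "$batches" | wc -l | tr -d ' ')
```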

Newer versions of xargs appear to work with shell scripts as well; traditionally, xargs required binaries, and shell scripts (or other scripts) would not work.

The UNIX find command and efficiency

The find command is a very demanding command, and can slow a system way down – and can take a long time as well. There are some ways to avoid these drawbacks.

First, when the urge strikes you to use find / -name "somename" – just don’t do it. This will take a very long time and will notably slow the system down. If you are using the GNU findutils, you may have the program locate on your system, which is much faster because it searches a prebuilt database (kept current by updatedb) instead of walking the disk. Otherwise, you can save the output from a find / command (run during off hours) to a file and search that file with grep whenever you need to.
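The save-and-grep approach might look something like the sketch below. It indexes a small made-up tree rather than /, and the index filename find-index.txt is just a placeholder; on a real system you would run the first find from cron and keep the file somewhere sensible.

```shell
#!/bin/sh
demo=$(mktemp -d)
mkdir -p "$demo/tree/etc" "$demo/tree/home"
touch "$demo/tree/etc/somename.conf" "$demo/tree/home/notes.txt"

# Off-hours job: walk the tree once and save every path.
find "$demo/tree" -print > "$demo/find-index.txt"

# Any time later: search the saved list instead of the disk.
hits=$(grep 'somename' "$demo/find-index.txt")

printf '%s\n' "$hits"
rm -rf "$demo"
```

The obvious trade-off is that the list is only as fresh as the last run.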

Secondly, you may wish to avoid using the -exec parameter in its \; form. This parameter will run a command on each file that is found. Each time a file is found, the find command starts a new process for the specified command, gives it the filename, and waits for it to finish, which is very inefficient. GNU find has the ability to stack filenames together (the + form of -exec), which closes most of the gap, but xargs gives you more control over the batching.

The most efficient way is to use xargs instead:

find . -mtime -1 | xargs ls -ld

This will combine all of the files together (as much as possible) and will run the command as few times as necessary to handle them, rather than once per file. Of course, if you have GNU findutils, the builtin -ls option may be even faster, since no external command is run at all!

find . -mtime -1 -ls

You can also control where each filename goes in the command line (with the -I option), or repeat an item more than once, and so on. The operation of xargs is simple once you understand it, but the power is tremendous, and it is on all UNIX/Linux platforms. Check out the xargs(1) man page from OpenBSD.
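For a taste of placing and repeating items, here is a sketch using the -I option of xargs, which substitutes each input line wherever the placeholder appears. The filenames are invented, and echo is used so the commands are only printed, not run.

```shell
#!/bin/sh
# -I {} runs the command once per input line and replaces every {} with it,
# so the same name can appear twice in one command line.
plan=$(printf '%s\n' report.txt notes.txt | xargs -I {} echo cp {} {}.bak)
printf '%s\n' "$plan"
```

Drop the echo once the printed commands look right.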