Unix: Delete all but N most recent files in a directory

Here’s a handy little command to delete every file in a directory except for the N most recent files. It is useful in a log rotation or database backup script. As written, the example keeps the 4 most recent files.

find /path/to/files/ -maxdepth 1 -type f -name '*' -print0 | xargs -r0 ls -t | tail -n +5 | tr '\n' '\0' | xargs -r0 rm

Breaking down and explaining each section of the command, we have:

find /path/to/files/ -maxdepth 1 -type f -name '*' -print0

This lists all files in the specified directory. We use ‘find’ rather than ‘ls’ because we need the full path of each file when the list is later passed to ‘rm’. We specify ‘-maxdepth 1’ so that we only search the specified directory itself and not its subdirectories. We also specify a ‘-type’ of ‘f’ so that we only find regular files and not other items like directories, sockets, or symbolic links. It’s probably best not to use ‘*’ as your ‘-name’ pattern, as it could be dangerous if you accidentally point the command at the wrong directory. Use something like ‘*.sql.gz’ or ‘*.log’ or whatever suits you. Also note that we use ‘-print0’ so that the subsequent commands can handle spaces and other special characters in filenames (see http://en.wikipedia.org/wiki/Xargs#The_separator_problem).

xargs -r0 ls -t

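This sorts the list returned by the ‘find’ command in descending order by timestamp (newest to oldest). The ‘-r’ parameter (a GNU extension) instructs xargs not to run ‘ls’ at all if no files are found by the ‘find’ command. The ‘-0’ parameter gets around the separator problem described in the previous step. Note that with a very large number of files, xargs may split the list across several ‘ls’ invocations, in which case the sort order only holds within each batch.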
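To see what ‘-r’ guards against, here is a small sketch that assumes GNU xargs. The ‘echo ran’ command is only a stand-in for ‘ls -t’. Without ‘-r’, GNU xargs runs the command once even on empty input, and a bare ‘ls -t’ would then list the current working directory instead of nothing.

```shell
# Assumes GNU xargs; 'echo ran' is a stand-in for 'ls -t'.

# With -r, empty input means the command is never run: prints nothing.
printf '' | xargs -r0 echo ran

# Without -r, GNU xargs still runs the command once on empty input.
printf '' | xargs -0 echo ran
```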

tail -n +5

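This filters the list to show only the 5th line onward, which skips the 4 newest files so that they are kept. In general, to keep the N most recent files, use +(N+1) as your argument. For example, to keep the 10 most recent files, use +11.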
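Because ‘tail -n +K’ starts printing at line K, keeping N files means starting at line N+1. A minimal sketch of that arithmetic follows; the helper name ‘skip_newest’ and the sample names f1 to f6 are made up for illustration.

```shell
# Hypothetical helper: drop the first "$1" lines of stdin and print the rest.
skip_newest() {
    tail -n "+$(($1 + 1))"
}

# Six names, already sorted newest first. Keeping 4 leaves f5 and f6
# as the candidates for deletion.
printf '%s\n' f1 f2 f3 f4 f5 f6 | skip_newest 4
```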

tr '\n' '\0'

This is similar to using ‘-print0’ with the ‘find’ command above. We need to clean up the output so that the subsequent xargs command can handle potential spaces in the filenames. So we use the ‘tr’ command to translate newlines to the null character. Filenames that themselves contain a newline are still not handled, since ‘ls’ separates its output with newlines.
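A quick way to see the difference is to feed xargs a name containing a space, with and without the ‘tr’ step. This is only a sketch; the ‘names’ function and the two sample filenames are invented for the demonstration.

```shell
# Hypothetical sample input: two names, one of them containing a space.
names() {
    printf '%s\n' 'my file.log' 'other.log'
}

# Default xargs splitting: whitespace breaks 'my file.log' into two arguments.
names | xargs -n1 printf '[%s]\n'

# After tr, each line arrives NUL-terminated and stays a single argument.
names | tr '\n' '\0' | xargs -0 -n1 printf '[%s]\n'
```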

xargs -r0 rm

Finally, this command will delete the files returned by the combination of the previous commands.
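If you want to reuse the pipeline in a script, one option is to wrap it in a shell function. The following is a sketch rather than a finished tool: the function name ‘prune_old_files’ and its argument order are assumptions of mine, it relies on GNU find and xargs, and it still does not cope with filenames containing newlines.

```shell
# prune_old_files DIR PATTERN KEEP
# Hypothetical wrapper around the pipeline: deletes regular files in DIR
# that match PATTERN, except for the KEEP most recently modified ones.
# Assumes GNU find/xargs (-maxdepth, -print0, -r, -0).
prune_old_files() {
    dir=$1
    pattern=$2
    keep=$3

    find "$dir" -maxdepth 1 -type f -name "$pattern" -print0 |
        xargs -r0 ls -t |               # newest first
        tail -n "+$((keep + 1))" |      # skip the KEEP newest
        tr '\n' '\0' |                  # back to NUL-separated names
        xargs -r0 rm
}

# Example call (commented out so that nothing is deleted by accident):
# prune_old_files /path/to/files '*.log' 4
```

Before trusting it with real data, it is worth replacing ‘rm’ with ‘echo’ once to check which files would be removed.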

2 responses to “Unix: Delete all but N most recent files in a directory”

  1. Ole Tange says:

    Your use of xargs is dangerous because of the separator problem http://nd.gd/0t Consider using GNU Parallel instead http://nd.gd/0s

  2. Thanks for your feedback Ole. I have modified the post to reflect improved commands based on your feedback. Your GNU Parallel project looks interesting, but I am trying to solve my problem without having to rely on 3rd party tools or libraries.