Shell Person Help me keep the shell people alive.

29 Apr 2010

Using ls to Show Directory Size – Updated & Explained

Note - I've written about this before (here). In my opinion, this is a vast improvement over that earlier version.

The title of this post is not entirely accurate. The post is actually about a script that lists files and directories in the same format as ls -l (the "long listing" format of the ls command), except that it reports the total size of each directory, including all of the files within it. Strictly speaking, the output follows the format of ls -lhL.

This is functionally very different from the output of ls. By default, ls reports the size of a directory without its contents. That produces a very small number: the disk space taken up by the directory's own metadata (i.e. the names of the files within it). Here is an example:

ls -lhL
total 116K
drwxr-x--- 1 james users  44K 2010-04-22 17:21 movies
drwxr-x--- 1 james users  48K 2010-04-03 22:22 music
drwxr-x--- 1 james users    0 2009-12-11 16:20 photos
drwxr-x--- 1 james users 4.0K 2009-12-13 22:36 data
drwxr-x--- 1 james users  20K 2010-04-27 00:56 tv
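
The same contrast can be checked by hand for a single directory. This is only a minimal sketch: own_size and total_size are hypothetical helper names, not part of the script in this post, and the awk field number assumes the usual ls -l column layout.

```shell
# own_size DIR: the size ls reports for the directory entry itself (field 5 of ls -ldh)
own_size() { ls -ldh -- "$1" | awk '{print $5}'; }

# total_size DIR: disk usage of DIR plus everything beneath it, as reported by du -sh
total_size() { du -sh -- "$1" | cut -f 1; }   # cut splits on TAB by default

# Example (assumes a directory named "movies" exists in the current directory):
# own_size movies
# total_size movies
```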


And here is an example of the output of the script (note the size of the directories):

drwxr-x--- 1 james users 178G 2010-04-22 17:21 movies
drwxr-x--- 1 james users  26G 2010-04-03 22:22 music
drwxr-x--- 1 james users 9.9G 2009-12-11 16:20 photos
drwxr-x--- 1 james users  27G 2009-12-13 22:36 data
drwxr-x--- 1 james users 135G 2010-04-27 00:56 tv

That's a pretty big difference. Note that if all you're interested in is the permissions, modification time, and filename, using ls -lh is a much better choice - it's extremely fast and gives you all the information you need. However, if you want to know how much space the contents of each directory are using, you should use the following script:

#!/bin/bash
# script to display the sizes of files and directories in the format of ls -lhL
## note that large space on line 9 (the '	' that follows "cut -d") is a tab, created in a terminal with CTRL+v then TAB
for x in *; 
  do 
    y="$(echo "$x" | sed -e 's/\[/\\[/g' -e 's/]/\\]/g')"
    echo -e \
      "$(ls -lL | grep "[0-9]\{2\}:[0-9]\{2\} $y$" | sed 's/[ ][ ]*/ /g' | cut -d ' ' -f 1-4) \
        $(du -sh "$x" | cut -d '	' -f 1)" \
      "$(ls -lL | grep "[0-9]\{2\}:[0-9]\{2\} $y$" | sed 's/[ ][ ]*/ /g' | cut -d ' ' -f 6-20 )"; 
  done \
| column -t \
| sed -e 's/[ ]\([ ][ ]*\)/\1/g' \
| sed -e 's/[ ][ ]*/ /8g' \
| sed 's/\([ ][ ]*\)\([^ ]*\)\([ ]\)\([ ]*\)\([^ ]*[ ]*[^ ]*[ ]*\)\([^ ]*\)\([ ]\)\([ ]*\)/\1\4\2\3\5\8\6\7/'

NOTE - The script may not display entirely correctly in this post; on line 9, the whitespace in single quotes (following cut -d) should be a single TAB character. Although it's displayed incorrectly on the page, if you use either the "view source" or "copy to clipboard" buttons (top-right of the script when you hover over it), you will get the correct character. You can also produce this character in a terminal with CTRL+v followed by TAB.

This script should be named something convenient and saved somewhere in your $PATH. I call the script lsd and save it in "/home/james/bin".
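
One way to do that is sketched below. The helper names install_lsd and on_path are hypothetical, and "$HOME/bin" is only an assumed default; any directory listed in $PATH works just as well.

```shell
# install_lsd SRC [DEST_DIR]: copy the script to DEST_DIR/lsd and make it executable.
# DEST_DIR defaults to "$HOME/bin", an assumed location; any directory in $PATH works.
install_lsd() {
  src=$1
  dest_dir=${2:-"$HOME/bin"}
  mkdir -p "$dest_dir" &&
    cp -- "$src" "$dest_dir/lsd" &&
    chmod +x "$dest_dir/lsd"
}

# on_path DIR: succeed if DIR is already listed in $PATH
on_path() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example (assumes the script was saved as ./lsd):
# install_lsd ./lsd
# on_path "$HOME/bin" || echo "$HOME/bin is not in PATH yet"
```

If on_path fails, the directory still has to be added to PATH in your shell's startup file before lsd can be run by name.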

How the script works

If you're the kind of person who likes to know what a script does before copying it from the internet and running it, here's what's going on in this script. Paragraphs are labeled by the script's line numbers.

Lines 4-11. This is a for loop that is run on every (non-hidden) file in your current directory (*). $x is the variable that holds one of the directory's filenames per iteration.

Line 6. $y is a variable that equals $x, except brackets ("[" and "]") are replaced by escaped brackets ("\[" and "\]"), in order to avoid conflict with grep's regular expressions matching.
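
The effect of that escaping can be seen in isolation. In this sketch, escape_brackets is a hypothetical wrapper around the same two sed expressions, and the filenames are made up for illustration.

```shell
# escape_brackets NAME: print NAME with [ and ] backslash-escaped,
# using the same sed expressions as line 6 of the script
escape_brackets() {
  printf '%s\n' "$1" | sed -e 's/\[/\\[/g' -e 's/]/\\]/g'
}

escape_brackets 'notes [2010].txt'   # prints: notes \[2010\].txt

# Unescaped, "a[1]" is a bracket expression and matches the line "a1";
# escaped, it matches only the literal name "a[1]".
printf '%s\n' 'a[1]' 'a1' | grep -x "$(escape_brackets 'a[1]')"   # prints: a[1]
```

Note that only brackets are handled; other regular expression characters in a filename (such as "." or "*") are still interpreted by grep.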

Lines 7-10. These lines are all part of a long echo command which arranges the data in the same format as ls -lhL. It echoes the results of three separate command substitutions, separated by spaces.

Line 8. The first of three command substitutions. Gets a long directory listing, greps out the single line whose filename matches $x (using the escaped copy, $y), squeezes each run of spaces between fields down to a single space, and then keeps only the first four fields (permissions, link count, owner, and group), cutting the line off at the point where the directory size will be displayed.

Line 9. The second of three command substitutions. Gets the size of $x using du -sh and then chops off the end of the line, leaving only the size.
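
Since the literal TAB is easy to lose when copying the script, here is a sketch of the same step with the tab generated by printf instead of typed. The names tab and size_field are hypothetical, and the sample input imitates the "SIZE<TAB>NAME" line that du -sh prints.

```shell
# Build a TAB character without having to type one
tab=$(printf '\t')

# size_field: read "SIZE<TAB>NAME" lines on stdin and keep only SIZE
size_field() { cut -d "$tab" -f 1; }

printf '178G\tmovies\n' | size_field   # prints: 178G

# In the script's loop this would be used as:
# du -sh "$x" | size_field
```

Because TAB is already cut's default delimiter, plain cut -f 1 should behave the same way.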

Line 10. The third of three command substitutions. Gets a long directory listing, greps out the single line whose filename matches $x (using the escaped copy, $y), squeezes each run of spaces between fields down to a single space, and then keeps fields 6 through 20 (the date, the time, and the filename), dropping everything up to and including the original size field. Script lines 8, 9, and 10 will form a single line of output for each iteration of the for loop.
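
To make the splicing concrete, the sketch below runs the same sed and cut steps on one line taken from the ls output earlier in the post. The variable names are hypothetical, and the size is hard-coded as a stand-in for what du -sh would return.

```shell
# One line of ls -lhL output, copied from the example near the top of the post
line='drwxr-x--- 1 james users  44K 2010-04-22 17:21 movies'

# Squeeze runs of spaces so that cut sees exactly one delimiter per field
squeezed=$(printf '%s\n' "$line" | sed 's/[ ][ ]*/ /g')

head_part=$(printf '%s\n' "$squeezed" | cut -d ' ' -f 1-4)    # permissions, links, owner, group
tail_part=$(printf '%s\n' "$squeezed" | cut -d ' ' -f 6-20)   # date, time, filename
size='178G'                                                   # placeholder for the du -sh result

printf '%s %s %s\n' "$head_part" "$size" "$tail_part"
# prints: drwxr-x--- 1 james users 178G 2010-04-22 17:21 movies
```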

Lines 12-15. Each of these lines formats the script's output to match the style of ls -lhL.

Line 12. Aligns the output into columns using column -t.

Line 13. Removes extra spaces according to the following rule: for every group of 2 or more consecutive spaces, subtract one space from that group.
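
A small sketch of that rule, using made-up input that imitates the padded columns column -t produces (the two-space gap between columns is an assumption about its default behaviour). The function name drop_one_space is hypothetical.

```shell
# drop_one_space: for every run of 2 or more spaces, remove exactly one space
drop_one_space() { sed -e 's/[ ]\([ ][ ]*\)/\1/g'; }

printf '%s\n' 'a    bbb' 'ccc  d' | drop_one_space
# prints:
# a   bbb
# ccc d
```

Single spaces are left alone, because the pattern needs at least two consecutive spaces to match.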

Line 14. Removes extra spaces from the 8th group of spaces onward, which is the filename part of each line, ensuring that filenames containing spaces do not end up with multiple consecutive spaces.
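
Here is that step on its own, as a sketch. Combining a number with the g flag (8g) is, as far as I know, a GNU sed extension, so this may not work with other sed implementations. The function name and the sample line are made up for illustration.

```shell
# squeeze_name_spaces: collapse the 8th and every later run of spaces to one space,
# leaving the first seven runs (the gaps between the ls fields) untouched
squeeze_name_spaces() { sed -e 's/[ ][ ]*/ /8g'; }

printf '%s\n' 'drwxr-x--- 1 james users 26G  2010-04-03 22:22 my     holiday  pics' |
  squeeze_name_spaces
# prints: drwxr-x--- 1 james users 26G  2010-04-03 22:22 my holiday pics
```

The double space after 26G survives because it is only the 5th run of spaces on the line.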

Line 15. Swaps certain groups of spaces around in order to match the justification style of ls -lhL. Specifically, the padding that follows the link count and the file size is moved in front of them, so those two columns are justified right, while user and group names stay justified left.
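
The swap is easier to follow on a single line. This sketch wraps the same sed expression in a hypothetical function named justify; the input imitates a line as it looks after lines 12-14, with the size still left-justified.

```shell
# justify: move the padding after the link count and after the size
# to the front of those fields, so both end up right-justified
justify() {
  sed 's/\([ ][ ]*\)\([^ ]*\)\([ ]\)\([ ]*\)\([^ ]*[ ]*[^ ]*[ ]*\)\([^ ]*\)\([ ]\)\([ ]*\)/\1\4\2\3\5\8\6\7/'
}

printf '%s\n' 'drwxr-x--- 1 james users 26G  2010-04-03 22:22 music' | justify
# prints: drwxr-x--- 1 james users  26G 2010-04-03 22:22 music
```

The extra space that followed 26G now precedes it, matching the right-justified size column in the script output shown earlier.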

Please leave a comment if you've tried this script and found any bugs in it.

Comments (7) Trackbacks (1)
  1. Why not just simplify the whole mess and do the following?

    find -maxdepth 1 -mindepth 1 -type d -exec du -sh {} \;

    That should find all directories in your current directory, not including itself, and report to you the total sizes.

  2. Thanks michaelkuech, that’s an extremely efficient command. The above script is more of an exercise in formatting. While that is a bit superficial, I wanted a drop-in replacement for “ls -lh” (which I have aliased to “ll”), that gives me all of the same information, in addition to directory size.

    If you want the size of regular files as well as directories, and are not interested in the additional data that “ls -lh” gives, you can simplify things further with this:

    du -sh *

    (Note however, the find command does pick up hidden directories, whereas using “du -sh *” does not).

    My script is a fraction of a second slower, but for me the extra ls data and formatting is worth it.

  3. I will post again here in the next few days. I have made changes to the ls source to add an option to show the directory size correctly. If anyone wants to test, drop me an email: lidder86 [at] gmail.com. So far I have it working on FreeBSD.

  4. @lidder – I’d love to see your changes. It would be helpful to have a drop-in replacement for ls with this functionality.

  5. Every time I try to do something with ls, I end up using find. I was trying to back up my home directory and it was 22 gigs. What gives? I modded the line posted here to give me the offenders at the bottom:
    find -maxdepth 1 -mindepth 1 -type d -exec du -s {} \; | sort -n
    Notice that I dropped the h option from du and sort numerically. It turns out that I was using VirtualBox, which creates a 10 gig volume for your virtual hard drive. Now I have to use rsync with the --exclude option. I don’t want to back up VirtualBox.

  6. Would like a version for FreeBSD please.

  7. Although there is no built-in way to print a directory listing

