The ls command
Part One: What is ls?
ls stands for LiSt files. ls is the basic command used to display files at a Linux (or Unix) shell. The ls command included with GNU/Linux systems is part of the GNU coreutils package. It differs in various ways from the ls command included with the various Unix distributions. The ls command used with GNU/Linux can, however, be used with Unix systems as well. It is important to note that ls is actually an executable program, rather than a command built into the shell. This is why it is possible to replace the ls command in a Unix system with the GNU version.
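One way to see for yourself that ls is a separate program is to ask the shell how it resolves the name. The following is a minimal sketch using the standard command -v; the variable names are only illustrative, and the exact path printed will vary from system to system. In an interactive shell where ls has been aliased, command -v reports the alias instead of the path.

```shell
# command -v reports how the shell would resolve a command name.
ls_location=$(command -v ls)   # an absolute path, because ls is a program on disk
cd_location=$(command -v cd)   # just the name, because cd is built into the shell

echo "ls resolves to: $ls_location"
echo "cd resolves to: $cd_location"
```

Because ls is found by searching the directories in PATH, whichever ls executable comes first there is the one that runs.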
Part Two: Using the ls command as an equivalent of the simple dir command of DOS/Windows fame
The default output of the ls command is similar to the output of dir /w. Notable differences include smart column sizing and colorized output. Here's an example of the output of the ls command:
The file path following the ls command in the screenshot tells ls which directory's contents you wish to display. The colors are used to make the different kinds of files easy to tell apart. This is similar to the syntax highlighting commonly used for source files in editors used by programmers. In the image, you can see the following:
| green | executable files |
| red | compressed archive files |
| blue | directories |
| off-white | miscellaneous files |
- Compressed archive files means files such as .zip files, .jar (Java archive) files, gzip compressed files, bzip2 compressed files, etc.
- Miscellaneous files are uncategorized files, including plain text files and files of unknown type.
Other colors used, though not shown in the example, include:
| purple | image, sound and video files |
| yellow | device special files |
| light blue | soft links |
| red | broken symbolic (soft) links |
| off-white on a red background | files set SUID |
| black on a brown background | files set SGID |
- Device special files are virtual files used for, e.g., communicating with device drivers, IPC, shared memory, etc.
- Soft links are virtual "files" which are symbolic pointers to real files, useful for a wide variety of purposes.
- Files set SUID are executables that run with the permissions of the file's owner, rather than those of the person running them.
- Files set SGID are executables that run with the permissions of the group to which the file belongs.
The colorized output is the same no matter how you choose to list the files. If you don't see colorized output, try typing ls --color=auto. If you see colorized output after typing that command, your Linux distro doesn't set up an alias for colorized output. To create one for the current shell session, type:
alias ls='ls --color=auto'
To list files in a way that more closely resembles the default output of the MS-DOS/Windows dir command, try typing:
ls -l
To see the file sizes in a more human-friendly format, use:
ls -lh
Part Three: Globbing
Globbing is the use of special characters (wildcards), interpreted by the shell, to specify patterns that are matched against file names. These patterns resemble regular expressions but are simpler and follow their own rules. Unlike with MS-DOS/Windows systems, the globbing capabilities of the shells used in GNU/Linux (primarily Bash) are more sophisticated than simply * and ?. However, we'll start with the basics for the benefit of those unfamiliar with globbing.
The * character stands for any string of characters, including an empty one. Thus, to simply list all files (apart from hidden files, whose names begin with a dot), type:
ls *
The output is similar to that of ls by itself, but not the same. Try it. If the output looks the same as ls by itself, try it in a directory which contains subdirectories.
You will notice the contents of the subdirectories are displayed as well when you use ls * rather than ls by itself. This is because the shell, rather than the ls command, expands the globs: ls receives each matching name as a separate argument, and when an argument is a directory, ls lists its contents. This is significant to remember, as it means globbing can be used with any command you execute from the shell. As an example, let's use the echo command to show this. Those familiar with MS-DOS/Windows will recognize this command. It is used to display output. Try this:
echo *
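To make the effect predictable, the same experiment can be run in a scratch directory. This is only a sketch: the directory comes from mktemp, and the file names (notes.txt, todo.txt, photos) are made up for illustration.

```shell
# Build a throwaway directory so the result is predictable.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch notes.txt todo.txt
mkdir photos
touch photos/cat.jpg

# The shell replaces * with the matching names before echo ever runs.
expanded=$(echo *)
echo "$expanded"        # prints: notes.txt photos todo.txt

# set -- stores the expanded words as positional parameters, which makes
# it easy to count how many arguments a command would receive.
set -- *
arg_count=$#
echo "a command given * receives $arg_count arguments"

# Clean up the scratch directory.
cd / && rm -rf "$demo"
```

Note that echo never sees the * itself. It is handed three separate words, exactly as ls would be.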
To see the kind of output from ls * you would expect to see, add the -d option:
ls * -d
The -d option can be specified before or after the *. The -d option tells ls to list directories themselves rather than their contents, and not to follow symbolic links. As with most single-letter options to GNU/Linux command-line utilities, the options can be strung together:
ls * -ld
Okay, but let's see something useful...
Oh, I see how it is, you don't appreciate using globbing just to see every file anyway, eh? Of course you don't. Why waste time?
It was important to show you that * stands for "match anything" before showing you how to put it to use. Now that you know that, we can make better use of it.
Say you want to only list the files in a directory which have the .txt extension. We can do this easily the same way it's done under MS-DOS/Windows:
- ls *.txt
Okay, well the same except you use ls in place of dir. Now say we only want to see the .txt files which start with the letter d. Here's how we accomplish this:
- ls d*.txt
Simple enough, right? A little limiting though, perhaps. Say we instead wish to see all the .txt files which start with the letters d through f. Here's how:
- ls [d-f]*.txt
How's this work? The letters contained within the square brackets [ and ] are giving the shell a range indicator. That is, they tell the shell to match any single letter in the inclusive range d through f. An easy way to remember to use square brackets for this is to think about how you specified ranges in math. If you don't remember, in mathematical notation, parentheses () indicate exclusive bounds, while square brackets [] indicate inclusive ones. In the bash shell, you can only use inclusive ranges, so don't try anything like (5,8]. It won't work. Unlike math, it doesn't work for a multidigit number either. A bracket expression always matches exactly one character.
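The range behavior, including the single-character limitation, can be checked in a scratch directory. The sketch below uses invented file names, and it sets LC_ALL=C as an assumption so that ranges and sort order follow plain ASCII regardless of your locale.

```shell
# Assumption: force the C locale so ranges and sorting follow ASCII order.
LC_ALL=C
export LC_ALL

demo=$(mktemp -d)
cd "$demo" || exit 1
touch apple.txt dog.txt egg.txt fig.txt goat.txt dog.log
touch 0.dat 1.dat 2.dat 11.dat

# [d-f] matches exactly one character: d, e or f.
set -- [d-f]*.txt
range_matches="$*"
echo "$range_matches"   # prints: dog.txt egg.txt fig.txt

# [10-12] is NOT the numbers 10 through 12. It is the character 1,
# the range 0-1 and the character 2, so it matches one digit only.
set -- [10-12].dat
digit_matches="$*"
echo "$digit_matches"   # prints: 0.dat 1.dat 2.dat

# Clean up the scratch directory.
cd / && rm -rf "$demo"
```

The file 11.dat is left out of the second result because the bracket expression consumes only a single character.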
If a simple range isn't enough, however, you can add more than just ranges inside the square brackets. I somewhat implied the square brackets to be range specifiers in the preceding paragraph in the interest of giving you a memory association to math. The square brackets don't actually specify a range, however, but rather the dash between two characters inside the brackets does. In fact, you don't even need to specify a range at all within the square brackets. You could have accomplished the previous feat thusly:
- ls [def]*.txt
Since there is no dash, the shell globbing interprets this as any one of the letters d, e, or f. This can be mixed and matched with ranges:
- ls [abg-i]*.txt
Will match any file starting with a, b, g, h or i and ending with .txt. You may also negate this effect:
- ls [^abg-i]*.txt
Will match any file not starting with a, b, g, h, or i! A caret ^ specified as the first character within the brackets tells the shell to match any single character except those listed within the square brackets. If the caret appears anywhere other than the first position, however, it is interpreted as just another character to match against. Those of you who have tinkered with Unix and Linux systems and somehow created files that you could not delete (because you couldn't figure out how to type in their names due to the funny characters they contained) should be paying special attention right now. To list all files whose names begin with something other than a normal character (in the English language), type:
- ls [^a-zA-Z0-9_-]*
The reason a-z and A-Z are both specified is that Unix and Linux filesystems are case sensitive. You must always remember that. This means you can have a file named linux.txt and a file named Linux.txt in the same directory without conflict. If you hadn't already guessed, ranges can be concatenated (combined) by specifying them back-to-back. The underscore is included because it is commonly used in file names on Linux systems, as on any system used by programmers. The dash at the end actually matches a dash rather than specifying a range. This is because the shell treats a dash as a literal character when it is the last character within the square brackets, which is how you match against a dash.
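The negated pattern and the literal dash can be tried out safely in a scratch directory. This is a sketch with made-up file names, and it assumes LC_ALL=C so that a-z and A-Z cover only ASCII letters. It also shows the ! form of negation, which is the one POSIX specifies; Bash accepts either ^ or !.

```shell
# Assumption: force the C locale so a-z and A-Z mean ASCII letters only.
LC_ALL=C
export LC_ALL

demo=$(mktemp -d)
cd "$demo" || exit 1
touch linux.txt Linux.txt _draft.txt 9lives.txt
touch ./-dash.txt './#odd' './~backup'   # ./ keeps touch from reading -dash.txt as options

# Caret form from the text (Bash): names whose first character is unusual.
set -- [^a-zA-Z0-9_-]*
caret_matches="$*"

# Equivalent POSIX form using ! for negation.
set -- [!a-zA-Z0-9_-]*
unusual="$*"
printf '%s\n' "$unusual"              # prints: #odd ~backup

# A dash in the last position is literal: names starting with _ or -.
set -- [_-]*
dash_or_underscore="$*"
printf '%s\n' "$dash_or_underscore"   # prints: -dash.txt _draft.txt

# linux.txt and Linux.txt are distinct files on a case-sensitive filesystem.
set -- [lL]inux.txt
case_count=$#

# Clean up the scratch directory.
cd / && rm -rf "$demo"
```

When you pass such names to a command, prefix them with ./ (for example ./-dash.txt) so that a leading dash is not mistaken for an option.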