IC221: Systems Programming (SP14)



Lecture 02: UNIX File System and Command Line Tools


1 File System Preliminaries

The file system is a way to organize data into files and directories for easy access. A file system is hierarchical, which means that everything is arranged in order of rank, like your chain of command. We describe hierarchical structures like this as trees.

fs_tree.png

Consider the tree above. All trees are composed of three components: the root, sub-roots, and leaves. The root of the tree is the top-ranked item; in the example above, this is foo. Sub-roots are roots of smaller trees that are not at the top; for example, bar is a sub-root. Leaves exist at the bottom and do not have any sub-ranked items.

We also describe items as having parents and children. What makes an item a root is that it has no parent, like foo. A sub-root can be defined as an item that has both a parent and children, like bar, whose parent is foo and whose children are xyzzy and baz. The leaves are items that have a parent but no children, like garply and baz.

In a file system, we have terms for roots and sub-roots: they are called folders or directories. A folder can contain other folders and files. A file is always a leaf in the tree, and a folder may also be a leaf if it contains no sub-folders or files.

A path in the file system hierarchy for a given file or folder describes the parents all the way up to the root. For example, the path of baz is /foo/bar/baz. We designate the parent-child relationship using the forward slash.
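
The parent chain that a path encodes can be made visible with the standard dirname utility, which strips the last component from a path. The sketch below is only an illustration: it reuses the example path /foo/bar/baz from above, which does not need to exist on your machine, because dirname works purely on the text of the path.

```shell
# Walk from an item up to the root, recording each ancestor along the way.
# /foo/bar/baz is the example path from the text; it need not exist.
path=/foo/bar/baz
chain=$path
while [ "$path" != "/" ]; do
    path=$(dirname "$path")      # drop the last component to get the parent
    chain="$chain -> $path"
done
echo "$chain"
```

Each arrow in the printed chain is one parent-child step, ending at the root directory.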

2 The UNIX File System

You are likely familiar with the Windows file system structure, which is organized into drives, like the "C" drive, and then items fall out from there. There can be many drives attached to the Windows file system, and you navigate under various drives by letter name.

On Unix, instead of having a "C" drive, everything begins at the root directory, or /. Unlike Windows, Unix uses forward slashes instead of backslashes to separate directories.

unixfs.png

2.1 The Key Components of UNIX File System

The base UNIX file system typically has the same basic structure:

  • /: The root of the file system. All files and directories fall under this.
  • /usr: Often expanded as Unix System Resources (historically it held user directories). Contains system utilities.
  • /sbin: System binaries. Contains essential system administration programs that are generally run by the superuser (on Windows the superuser is called the Administrator).
  • /opt: Optional software. Third party software is usually installed here. It's kind of like the "Program Files" directory in Windows.
  • /etc: System configuration files. This is where things like the password file, and global system configuration files live.
  • /home: Contains the user home directories. Like the "Documents and Settings" folder in Windows.
  • /tmp: Temporary files. On most systems, these are cleared when the system reboots.
  • /kernel: The core operating system. Like the "Windows" folder in Windows.
  • /usr/lib: Contains precompiled libraries for use by everyone on the system. For instance, in this directory you can find the file libstdc++.so.6, which is needed for C++ programs. (Remember linking from IC210?) This is a little bit like the folder containing all of the .DLL files in Windows.
  • Any directory that ends in bin: Contains binary executable files or links to them.
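
Not every Unix-like system has every directory in this list (for example, /kernel may be absent on your machine). As a rough check, and assuming nothing beyond the directory names listed above, the following sketch reports which of them exist where you run it. The variable name present is just an illustrative choice.

```shell
# Report which of the directories described above exist on this machine.
present=""
for dir in / /usr /sbin /opt /etc /home /tmp /kernel /usr/lib; do
    if [ -d "$dir" ]; then           # -d is true when the path is a directory
        echo "found:   $dir"
        present="$present $dir"
    else
        echo "missing: $dir"
    fi
done
```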

2.2 Home Directories

Each user on a UNIX system has a home directory in the /home folder.

  • My username is aviv, so my home directory is /home/aviv.
  • Your home directory will be something like /home/m16XXXXX.
  • The ~ (tilde) is shorthand for your home directory.
  • You can also refer to someone else's home directory via the tilde: ~aviv refers to aviv's home directory, and ~m16XXXXX may refer to your home directory.
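
To see the tilde shorthand at work, the short sketch below asks the shell to expand ~ and compares it with the HOME environment variable, which normally holds the same path. The file name notes.txt is made up for illustration and does not need to exist.

```shell
# The shell expands an unquoted ~ to your home directory before echo runs.
my_home=$(echo ~)
echo "tilde expands to: $my_home"
echo "HOME is set to:   $HOME"

# ~/notes.txt is shorthand for a file inside the home directory.
# notes.txt is a made-up name used only for illustration.
echo ~/notes.txt
```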

2.3 Unix Directory Paths

              +--- Root Directory
              |
              V
              /home/aviv/foo.txt
                 ^   ^    ^
Sub-Directory ---'---'    '-- Target

2.4 Parent and Current Directory

Every directory has two special sub-directories:

  • . : ("dot") The current directory
  • .. : ("dot-dot") The parent directory

Another way to interpret the dot and dot-dot is by replacing them with "current" and "parent". Consider the path below:

/home/aviv/../m165678

Reading it from left to right, you might say: "From the root, go to home directory, then to the aviv directory, then to the parent of aviv, then to m165678, the target." Here is another path:

/home/aviv/./././../aviv/./foo.txt

The dot is replaced with the current folder, so stringing several dots together has no effect on the path. In the example above, the first three dots all refer to the current directory, aviv. Following the rest of the path, the dot-dot then means go to the parent directory, home, but that is followed by traversing back into the aviv directory. Then the dot means stay put again, and finally we reach the target, foo.txt.

path_ex.png
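
One way to convince yourself that these paths behave as described is to build a small throwaway tree and follow them. The sketch below assumes a temporary directory from mktemp stands in for the root; the directory names aviv and m165678 are taken from the examples above.

```shell
# Build a throwaway tree that mirrors the examples: home/aviv and home/m165678.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/home/aviv" "$sandbox/home/m165678"   # -p also creates missing parents

# Follow the long path with dots, then the plain path, and compare where we land.
long=$(cd "$sandbox/home/aviv/./././../aviv/." && pwd -P)
plain=$(cd "$sandbox/home/aviv" && pwd -P)

# Do the same for the dot-dot example that ends in a sibling directory.
sibling=$(cd "$sandbox/home/aviv/../m165678" && pwd -P)
expected_sibling=$(cd "$sandbox/home/m165678" && pwd -P)

echo "long path resolves to:  $long"
echo "plain path resolves to: $plain"

rm -r "$sandbox"    # clean up the throwaway tree
```

Both echo lines print the same directory, since the dots and the dot-dot followed by aviv cancel out.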

3 The Shell Preliminaries

unix-parts.png

The shell or terminal is the primary user interface for interacting with the operating system through text input. You have already used various shells, like the Windows command terminal as well as the Unix shell. When you compile a C++ program, you type a command, and the shell tells the OS to execute it.

A shell is just another program running on the OS, like any other application. The most common shell is the program bash, which we will look at in detail later. What makes a shell special is that it is a program designed specifically to enable the user to launch other programs. In many ways, it is the primary user interface of the OS. Additionally, the shell and the OS on Unix provide a simple set of command line tools that enable you to navigate the file system, manipulate the file system by creating or deleting files and folders, read and parse files on the system, and monitor currently running processes and programs.

Today, we'll focus on the command line tools associated with navigating and manipulating the file system. As we work through this class, we will dive into the Unix system by emphasizing shell scripting and interacting with the Unix kernel through the C system call API.

3.1 Current Working Directory

The shell has a notion of location, called the current working directory or present working directory, that indicates where in the file system the shell is currently operating. When you first log into a computer, a shell is started for you with your home directory as the current working directory. You can change the shell's current working directory to view different parts of the file system.
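
As a small sketch of this idea, the lines below record the current working directory, move the shell to the root directory, and then return. It uses the pwd and cd commands; the variable names are illustrative only.

```shell
start=$(pwd)              # remember where the shell currently is
cd /                      # move the shell to the root directory
at_root=$(pwd)
echo "now in:  $at_root"
cd "$start"               # go back to where we began
back=$(pwd)
echo "back in: $back"
```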

3.2 Navigating the file system

There are three important commands for navigating the file system via the shell:

  • cd path : Change the current directory to the one specified by path, or go to your home directory if path is omitted
  • ls path : List the contents of the directory at that path, or of the current directory if path is omitted
  • pwd : Print to the screen your current working directory.

Here is a sample of using these commands to explore the file system:

aviv@zee:~$ pwd
/home/scs/aviv
aviv@zee:~$ ls
aviv-local@  class/      Downloads/  local/     Public/     test.c   VBox-Map/        #VM-notes.txt#
#.bashrc#    Desktop/    ic221/      Music/     Templates/  test.c~  Videos/          VM-notes.txt
bin/         Documents/  id_rsa.pub  Pictures/  test*       tmp/     VirtualBox VMs/  VM-notes.txt~
aviv@zee:~$ cd tmp/
aviv@zee:~/tmp$ ls
aviv@zee:~/tmp$ cd ..
aviv@zee:~$ ls ~blenk
ls: cannot open directory /home/scs/blenk: Permission denied
aviv@zee:~$ ls class/ic221/
hw/     lab/    submit/ 
aviv@zee:~$ ls class/ic221/hw
hw1.pdf

Note that the OS manages who can view what directories. When I tried to read CDR Blenkhorn's home directory, I received a permission denied error.

3.3 Understanding a Shell Prompt

All shells have a command prompt (or just "prompt"), which signals that the shell is waiting for the user to provide input. The prompt itself also provides some useful information about the shell, including things like the current working directory, your user name, and the host you are working on.

Here is an example prompt:

     + User Name
     |        +--Current Working Directory
     |        |  
     V        V
     aviv@zee:~$  
            ^  ^  ^
Hostname-.__|  |  |_____.--- Where you enter commands
               |
 The prompt----+

You can see this command prompt change as you navigate the file system:

aviv@zee:~$ cd tmp/
aviv@zee:~/tmp$ ls
aviv@zee:~/tmp$ cd ..
aviv@zee:~$ 

While your shell may have a slightly different command prompt, all the same information is likely there.
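
To make the pieces of the prompt concrete, here is a rough sketch that assembles a string of the same user@host:directory$ shape by hand. This is not how the shell builds its prompt internally (bash, for instance, takes the prompt format from its PS1 variable); prompt_string is a hypothetical helper name, and the host name is whatever uname -n reports, which may be longer than the short name in the example.

```shell
# Build a prompt-like string of the form user@host:directory$
# prompt_string is a hypothetical helper, not a standard command.
prompt_string() {
    dir=$PWD
    home=${HOME:-}
    if [ -n "$home" ]; then
        case $dir in
            "$home")                       # exactly the home directory
                dir='~' ;;
            "$home"/*)                     # somewhere beneath the home directory
                rest=${dir#"$home"}
                dir="~$rest" ;;
        esac
    fi
    printf '%s@%s:%s$ ' "$(id -un)" "$(uname -n)" "$dir"
}

prompt_string
echo    # finish the line, since the prompt itself has no newline
```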

4 File System Command Line Tools

Throughout this class, we will make use of a lot of standard Unix command line tools. These tools are common to nearly all Unix platforms. Above, we introduced three tools for navigating the file system. Now we will explore some more tools, as well as their options.

4.1 Dissecting a Command Line Argument

Some terminology regarding command line tools in the shell:

   +- Shell prompt, not included in the command
   |
   v                           
aviv@zee:~$ command arg1 arg2 arg3 ...
              ^      ^     ^    ^
              |      |_____|____|_____,-- The command arguments
              |       
              +-- The command, such as mv or cd

Many commands can run without any arguments, but arguments are how you tell a command what to operate on and how to behave.
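
The split between the command and its arguments can be observed from inside a shell function, which receives arguments the same way a command does. In the sketch below, show_args is a hypothetical name chosen for illustration; $# holds the number of arguments and "$@" expands to the arguments themselves.

```shell
# show_args is a made-up helper that reports the arguments it was given.
show_args() {
    echo "number of arguments: $#"
    for arg in "$@"; do
        echo "argument: $arg"
    done
}

show_args arg1 arg2 arg3
count_line=$(show_args arg1 arg2 arg3 | head -n 1)
```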

4.1.1 ls and its arguments

First consider ls, which has a number of different options to display different listings of a directory.

  • ls path : list contents of directory at path
  • ls -l path : long list the contents of the directory at path, which includes permissions, ownership, last modification time, and file size.
  • ls -a path : list all contents of directory at path including hidden files that start with a ".", such as .bashrc
  • ls -al path : long list all contents of directory at path including hidden files

aviv@zee:~$ ls
aviv-local@  class/      Downloads/  local/     Public/     test.c   VBox-Map/        #VM-notes.txt#
#.bashrc#    Desktop/    ic221/      Music/     Templates/  test.c~  Videos/          VM-notes.txt
bin/         Documents/  id_rsa.pub  Pictures/  test*       tmp/     VirtualBox VMs/  VM-notes.txt~
aviv@zee:~$ ls -a
./              bin/        .dmrc         .gnome2/          .launchpadlib/  .pulse-cookie  Videos/
../             .cache/     Documents/    .gnome2_private/  .lesshst        .ssh/          VirtualBox VMs/
aviv-local@     class/      Downloads/    .gtk-bookmarks    local/          .tcshrc        #VM-notes.txt#
.bash_history   .compiz/    .emacs        .gvfs/            .local/         Templates/     VM-notes.txt
.bash_profile   .compiz-1/  .emacs~       .hplip/           .mozilla/       test*          VM-notes.txt~
.bash_profile~  .config/    .emacs.d/     ic221/            Music/          test.c         .vmware/
.bashrc         .cshrc      .fontconfig/  .ICEauthority     Pictures/       test.c~        .Xauthority
.bashrc~        .dbus/      .gconf/       id_rsa.pub        Public/         tmp/           .xsession-errors
#.bashrc#       Desktop/    .gksu.lock    .inputrc          .pulse/         VBox-Map/      .xsession-errors.old

Notice all the files that start with . that are now visible with the -a option, as well as . and ..
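
The difference between ls and ls -a is easy to reproduce in a scratch directory. The sketch below assumes a temporary directory from mktemp and two made-up file names; touch, which creates an empty file, is covered later in these notes.

```shell
sandbox=$(mktemp -d)
touch "$sandbox/visible.txt" "$sandbox/.hidden"   # file names made up for the demo

plain_listing=$(ls "$sandbox")       # hidden entries are left out
full_listing=$(ls -a "$sandbox")     # includes .hidden, as well as . and ..

echo "ls:"
echo "$plain_listing"
echo "ls -a:"
echo "$full_listing"

rm -r "$sandbox"                     # clean up the scratch directory
```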

When using ls -l you get a lot of extra information.

aviv@zee:~$ ls -l
total 100
lrwxrwxrwx 1 aviv   17 2013-11-05 16:40 aviv-local -> /local/aviv-local
-rw------- 1 aviv 5969 2013-11-05 16:47 #.bashrc#
drwxr-xr-x 2 aviv 4096 2013-12-19 16:22 bin/
drwxr-xr-x 3 aviv 4096 2013-12-19 16:19 class/
drwx--x--x 2 aviv 4096 2013-11-11 18:03 Desktop/
drwx--x--x 2 aviv 4096 2013-10-17 19:06 Documents/
drwx--x--x 2 aviv 4096 2013-11-08 16:10 Downloads/
drwx--x--x 2 aviv 4096 2013-12-19 17:11 ic221/
-rw-r--r-- 1 aviv  412 2013-11-07 17:09 id_rsa.pub
drwx--x--x 2 aviv 4096 2013-11-05 16:48 local/
drwx--x--x 2 aviv 4096 2013-10-17 19:06 Music/
drwx--x--x 2 aviv 4096 2013-10-17 19:06 Pictures/
drwx--x--x 2 aviv 4096 2013-10-17 19:06 Public/
drwx--x--x 2 aviv 4096 2013-10-17 19:06 Templates/
-rwx--x--x 1 aviv 7848 2013-12-19 15:43 test*
-rw------- 1 aviv  147 2013-12-19 15:43 test.c
-rw------- 1 aviv  116 2013-12-19 15:43 test.c~
drwx--x--x 2 aviv 4096 2013-12-19 15:42 tmp/
drwxr-xr-x 2 aviv 4096 2013-11-11 17:59 VBox-Map/
drwx--x--x 2 aviv 4096 2013-10-17 19:06 Videos/
drwx--x--x 2 aviv 4096 2013-11-05 19:11 VirtualBox VMs/
-rw-r--r-- 1 aviv  595 2013-11-08 16:13 #VM-notes.txt#
-rw-r--r-- 1 aviv  465 2013-11-08 16:04 VM-notes.txt
-rw-r--r-- 1 aviv  270 2013-11-08 15:56 VM-notes.txt~

We can interpret this information as:

.- Directory?
|    .-------Permissions                   .- Directory Name
| ___|___     .----------- Owner           |
v/       \    V                            V
drwx--x--x 2 aviv 4096 2013-12-19 17:11 ic221/
-rw-r--r-- 1 aviv  412 2013-11-07 17:09 id_rsa.pub
                    ^  \______________/    ^   
File Size ----------'       |              '- File Name
  in bytes                  |              
                            |
   Last Modified -----------'
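
The first character of each long-listing line is what distinguishes a directory from a regular file. As a minimal sketch, assuming a scratch directory that borrows the names ic221 and id_rsa.pub from the listing above, the following extracts that character with cut. The -d option asks ls to describe the directory itself instead of its contents.

```shell
sandbox=$(mktemp -d)
mkdir "$sandbox/ic221"
touch "$sandbox/id_rsa.pub"

# -d lists the directory entry itself rather than what is inside it.
dir_type=$(ls -ld "$sandbox/ic221" | cut -c1)
file_type=$(ls -l "$sandbox/id_rsa.pub" | cut -c1)

echo "ic221 starts with:      $dir_type"    # d marks a directory
echo "id_rsa.pub starts with: $file_type"   # - marks a regular file

rm -r "$sandbox"                            # clean up the scratch directory
```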

On your own: Try using the following variants of the ls command:

  • ls -h
  • ls -k
  • ls *

4.2 File System Manipulation Commands

So far, we've looked at commands for navigating the file system. Now we are going to look at commands that can manipulate the file system.

If you think about it, there are four main ways we'd like to manipulate the file system:

  • Create an empty directory
  • Create an empty file
  • Copy or move directories or files from one directory to another
  • Remove directories or files

These actions match to a set of commands:

  • cp from to : Copy a file/directory from path from to path to
  • mv from to : Move a file/directory from path from to path to, also used to change the name of a file/directory
  • rm path : Remove the file at path
  • mkdir path : Make a directory at path
  • touch path : Create an empty file at path

Let's look at a demo of doing this:

aviv@zee:~/class/ic221/demo$ mkdir NewDir               <--- Create a new directory
aviv@zee:~/class/ic221/demo$ ls                         <--- Show it was created
NewDir/

aviv@zee:~/class/ic221/demo$ cd NewDir/                 <--- Change into that directory
aviv@zee:~/class/ic221/demo/NewDir$ ls                  <--- List contents, it's empty
aviv@zee:~/class/ic221/demo/NewDir$ touch foo.txt       <--- Create an empty file foo.txt
aviv@zee:~/class/ic221/demo/NewDir$ cp foo.txt baz.txt  <--- Copy foo.txt to baz.txt
aviv@zee:~/class/ic221/demo/NewDir$ ls                  <--- List contents of directory with foo.txt and baz.txt
baz.txt  foo.txt                  

aviv@zee:~/class/ic221/demo/NewDir$ mv baz.txt ..       <--- Move baz.txt to parent directory
aviv@zee:~/class/ic221/demo/NewDir$ ls                  <--- List contents of directory, no baz.txt
foo.txt

aviv@zee:~/class/ic221/demo/NewDir$ cd ..               <--- Change to parent directory
aviv@zee:~/class/ic221/demo$ ls                         <--- List contents, show baz.txt and NewDir
baz.txt  NewDir/                

aviv@zee:~/class/ic221/demo$ rm baz.txt                 <--- Remove baz.txt
rm: remove regular empty file `baz.txt'? y              <--- confirm its removal
aviv@zee:~/class/ic221/demo$ rm NewDir/foo.txt          <--- Remove foo.txt by using a path to it
rm: remove regular empty file `NewDir/foo.txt'? y       <--- confirm its removal
aviv@zee:~/class/ic221/demo$ rm NewDir/                 <--- Remove the directory
rm: cannot remove `NewDir/': Is a directory             <--- FAIL!

4.2.1 Handling Directories and Recursive (-r) Option

Note that rm cannot directly remove a directory. Instead, you have to use a special form of remove, the rmdir command.

aviv@zee:~/class/ic221/demo$ rmdir NewDir/
aviv@zee:~/class/ic221/demo$ ls
aviv@zee:~/class/ic221/demo$ 

However, you cannot rmdir a directory that still has contents:

aviv@zee:~/class/ic221/demo$ mkdir NewDir
aviv@zee:~/class/ic221/demo$ touch NewDir/foo.txt
aviv@zee:~/class/ic221/demo$ rmdir NewDir/
rmdir: failed to remove `NewDir/': Directory not empty

The rm command has an option, -r, which stands for recursive, that will recursively remove a directory and its contents.

aviv@zee:~/class/ic221/demo$ rm -r NewDir/
rm: descend into directory `NewDir/'? y
rm: remove regular empty file `NewDir/foo.txt'? y
rm: remove directory `NewDir'? y
aviv@zee:~/class/ic221/demo$ ls
aviv@zee:~/class/ic221/demo$ 
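
The same behavior can be checked in a script by looking at whether each command succeeds. The sketch below assumes a scratch directory from mktemp and reuses the names NewDir and foo.txt from the demo. One caution: the confirmation prompts in the transcript above presumably come from rm being set up to ask before deleting on that machine; a plain rm -r in a script typically deletes without asking, so point it only at directories you can afford to lose.

```shell
sandbox=$(mktemp -d)
mkdir "$sandbox/NewDir"
touch "$sandbox/NewDir/foo.txt"

# rmdir refuses to remove a directory that still has contents.
if rmdir "$sandbox/NewDir" 2>/dev/null; then
    rmdir_result=removed
else
    rmdir_result=refused
fi
echo "rmdir on a non-empty directory: $rmdir_result"

# rm -r removes the directory and everything beneath it.
rm -r "$sandbox/NewDir"
if [ -e "$sandbox/NewDir" ]; then
    rm_result=still-there
else
    rm_result=removed
fi
echo "rm -r on the same directory:    $rm_result"

rmdir "$sandbox"    # the sandbox is empty now, so plain rmdir works
```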

Similar issues occur when you try to copy a directory with cp: you need to specify the recursive option, -r. As you can see from the demo below, this also copies the entire contents of the directory.

aviv@zee:~/class/ic221/demo$ mkdir NewDir
aviv@zee:~/class/ic221/demo$ touch NewDir/foo.txt
aviv@zee:~/class/ic221/demo$ ls
NewDir/
aviv@zee:~/class/ic221/demo$ cp NewDir/ CopyDir
cp: omitting directory `NewDir/'
aviv@zee:~/class/ic221/demo$ ls
NewDir/
aviv@zee:~/class/ic221/demo$ cp -r NewDir/ CopyDir
aviv@zee:~/class/ic221/demo$ ls
CopyDir/  NewDir/
aviv@zee:~/class/ic221/demo$ ls CopyDir/
foo.txt

The move command does not require a recursive option when interacting with directories.

aviv@zee:~/class/ic221/demo$ ls
CopyDir/  NewDir/
aviv@zee:~/class/ic221/demo$ mv NewDir/ CopyDir/
aviv@zee:~/class/ic221/demo$ ls
CopyDir/
aviv@zee:~/class/ic221/demo$ ls CopyDir/
foo.txt  NewDir/
aviv@zee:~/class/ic221/demo$ ls CopyDir/NewDir/
foo.txt
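
Whether mv renames a directory or moves it inside another depends on whether the destination already exists as a directory. The sketch below illustrates both cases in a scratch directory; Renamed is a made-up name, while NewDir, CopyDir, and foo.txt follow the demo above. It uses echo with the * pattern simply to print the names in the current directory on one line.

```shell
start=$(pwd)
sandbox=$(mktemp -d)
cd "$sandbox"
mkdir NewDir CopyDir
touch NewDir/foo.txt

mv NewDir Renamed            # Renamed does not exist yet, so NewDir is renamed
after_rename=$(echo *)

mv Renamed CopyDir           # CopyDir exists, so Renamed is moved inside it
after_move=$(echo *)
inside=$(echo CopyDir/*)
moved_file=$(ls CopyDir/Renamed)

echo "after rename: $after_rename"
echo "after move:   $after_move"
echo "in CopyDir:   $inside"
echo "still holds:  $moved_file"

cd "$start"                  # leave the scratch directory before removing it
rm -r "$sandbox"
```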

5 Where Commands "Live"

Now that you are more familiar with navigating and manipulating the Linux file system, let's return to the basic structure of the Linux root file system.

unixfs.png

When you type the command ls or rm, you are really running a program binary that has to exist somewhere in the file system. Since these are binaries, by convention they exist in a directory that ends in bin. The shell finds a command by searching through a sequence of bin folders until it finds a match.

The search path for binaries is called the $PATH or just path. You can display your current path using the echo command, which just prints its arguments to the screen.

aviv@zee:~$ echo $PATH
/home/scs/aviv/bin:/opt/local/bin:/opt/local/sbin:usr/lib/jvm/java-6-sun/bin:/home/scs/aviv/bin:

So when you type a command like ls, the shell looks in each of the folders for a program named ls to run. It happens that ls exists in the base /bin folder, which means you can run it either by its short name or by its full path:

aviv@zee:~$ ls
aviv-local@  class/      Downloads/  local/     Public/     test.c   VBox-Map/        #VM-notes.txt#
#.bashrc#    Desktop/    ic221/      Music/     Templates/  test.c~  Videos/          VM-notes.txt
bin/         Documents/  id_rsa.pub  Pictures/  test*       tmp/     VirtualBox VMs/  VM-notes.txt~
aviv@zee:~$ /bin/ls
aviv-local  class      Downloads   local     Public     test.c   VBox-Map        #VM-notes.txt#
#.bashrc#   Desktop    ic221       Music     Templates  test.c~  Videos          VM-notes.txt
bin         Documents  id_rsa.pub  Pictures  test       tmp      VirtualBox VMs  VM-notes.txt~
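
The search the shell performs can be approximated by hand. The following is a simplified sketch, not the shell's actual implementation: find_in_path is a hypothetical helper that splits $PATH on colons and prints the first directory entry that holds an executable file with the requested name. It assumes the PATH entries contain no wildcard characters.

```shell
# find_in_path is a made-up helper that mimics the shell's search of $PATH.
# The body runs in a subshell (note the parentheses), so changing IFS here
# does not affect the rest of the script.
find_in_path() (
    name=$1
    IFS=:                       # split $PATH on colons
    for dir in $PATH; do
        [ -n "$dir" ] || dir=.  # an empty entry means the current directory
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            echo "$dir/$name"   # first match wins, as with the shell
            return 0
        fi
    done
    return 1                    # nothing in $PATH provides this command
)

find_in_path ls
```

The order of the folders in $PATH matters: if two folders both contain a program with the same name, the one listed first is the one that runs.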

5.1 The which command

Unix provides a command line utility for finding where a command lives, the which command.

aviv@zee:~$ which ls
/bin/ls
aviv@zee:~$ which rm
/bin/rm
aviv@zee:~$ which which
/usr/bin/which 

Most basic commands that are part of the base system are in /bin, but Unix system resource (usr) command line tools exist in /usr/bin, like which itself. Which of the other commands we have looked at are in /usr/bin?