IC221: Systems Programming (SP17)



Lab 02: Basic Bash Scripting

Preliminaries

Lab Learning Goals

  1. Setting up a bash script
  2. Basic bash scripting with variables and control flows
  3. Storing the output of execution with sub shells
  4. Loops and iteration
  5. Mastering your bash environment

Lab Setup

Run the following command

~aviv/bin/ic221-up

Change into the lab directory

cd ~/ic221/lab/02

All the material you need to complete the lab can be found in the lab directory. All material you will submit, you should place within the lab directory. Throughout this lab, we refer to the lab directory, which you should interpret as the above path.

Submission Folder

For this lab, all scripts for submission should be placed in the following folder:

~/ic221/lab/02

Provided Source Files

At times in this lab writeup, you will see the names of provided source files to the upper right of the source blocks. You can find those source files in the lab directory at the associated paths.

Test Script

To help you complete the lab, I have provided a test script that will run basic tests against your program. The script is not designed to be comprehensive, and you will be graded based on a larger array of tests. To execute the test script, run it from anywhere within the lab directory.

./test.sh

README file

You are required to complete and submit the README file.

Basics of Bash Scripts

What we have been doing so far is single operations per command line. Today, we are going to see how to combine multiple operations and some programming logic to build small scripts to perform basic utilities.

Hello World

Scripting is synonymous with programming (it is programming), except that it's generally considered more lightweight. For the scripts we'll write in this class, the commands will be stored in a file, and that file is interpreted by the bash shell. That means each line runs as a command line, executed in the order the lines appear in the file, as if they were typed onto the command line.

You must indicate that you wish for your script to be treated as such by designating which program should interpret the command lines. To make that designation, we use the #! symbol followed by the path to the program. For this lab, that program is going to be bash, namely the shell. Next, we want to write our first program, namely one that prints "Hello World," so let's take a look at that first.

#!/bin/bash

echo "Hello World"  # This is a comment

You can find this program in helloworld.sh in the examples folder in the lab directory. By convention, all bash scripts, or shell scripts, have the file extension .sh, and we will use that convention in this class. At the start of every bash script is the #! symbol and the path to the bash program that interprets the script; see line 1. The #!, pronounced "shebang", tells the shell to treat this file as a bash script when running it.

On line 3, there is the actual command to print to standard out, "Hello World", as well as a comment, anything following a "#". The echo command echoes back anything it is given to stdout, in this case, the string "Hello World."

Before you can execute your program, you have to make the file executable.

#> chmod +x helloworld.sh

Then you execute it with the ./ prefix:

#> ./helloworld.sh
Hello World

And there you have it, "Hello World" from bash.

Output to stdout vs. stderr

A bash script is just like other command line tools we've seen so far with respect to output and input. The same standard file names and file descriptor numbers are automatically provided as well as the ability to redirect them.

When you put a command in a script that writes to standard out, such as head or cat, its output becomes part of the standard output of the script, just like echo's did above. Similarly, if a command has an error and writes to standard error, that will also be part of the script's standard error.

If you want to write to standard error directly in your script, you will redirect the output of the echo command.

echo "ERROR: Something bad happened"  1>&2  #<-- redirect file descriptor 1 to 2

Recall that the standard files have numeric file descriptors: 0 for stdin, 1 for stdout, and 2 for stderr. When you redirect, you can choose based on the file descriptor number, and you can even redirect to another file descriptor using & followed by the numeric descriptor.
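As a small sketch of the redirection described above, the script below writes one line to each stream. The messages and the script name used afterwards are made up for illustration.

```shell
#!/bin/bash

# A normal message goes to stdout (file descriptor 1).
echo "This line goes to stdout"

# An error message: send echo's stdout to wherever stderr (fd 2) points.
echo "This line goes to stderr" 1>&2
```

If you saved this as streams.sh (a hypothetical name), running ./streams.sh > /dev/null should leave only the stderr line on the terminal, and ./streams.sh 2> /dev/null should leave only the stdout line.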

Task 1 (15 points)

The file /etc/passwd contains all the login information (not passwords) for users on the system. Each line looks a little like this:

username     groupid            home dir
 |             |                      |
 v             v                      v
aviv:x:35001:10120:Adam Aviv {}:/home/scs/aviv:/bin/bash
        ^               ^                         ^
        |               |                         |
       uid             name                    default shell

Write a script, allusers.sh, that will parse the /etc/passwd file and print a list of all the names (not usernames).

Bash Variables

You can assign variables in bash just like in C++, but you refer to variables using a $ symbol.

name=adam
echo "Your name is $name"

Note that spacing is important: there can be no spaces on either side of the equal sign. If there are spaces in the string value, use quotes.

name="Adam Aviv"
echo "Your name is $name"

Finally, variable replacement, that is substituting a variable for its value, occurs throughout a script, including in quotes, like above.

Typing: Strings vs. Arithmetic

One major difference between bash, which is a scripting language, and C/C++ is that bash is not strongly typed. That means you don't need to identify the type of the data you're storing in a variable. Instead, you have to provide some context for how you want the data to be interpreted; otherwise, by default, it will be treated as a string, and a + is just another character in that string.

n=1
n=1+$n
echo $n   #<-- prints "1+1" not 2!

To perform arithmetic operations, use the let command.

n=1
let n=1+$n #<-- use quotes if you need white space
echo $n   #<-- prints 2!

There are also no floats in bash arithmetic. Every number is an integer.
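Here is a short sketch of string building versus integer arithmetic. The variable names are made up, and the last line uses the $(( )) arithmetic expansion, which is not covered in this lab but is a standard bash alternative to let.

```shell
#!/bin/bash

total=7

# Without let, this is plain string building.
as_string=$total+1
echo "$as_string"        # prints 7+1

# With let, the right-hand side is evaluated as integer arithmetic.
let sum=$total+1
echo "$sum"              # prints 8

# Quotes allow white space inside the expression.
let "half = total / 2"
echo "$half"             # prints 3, since integer division truncates

# $(( )) evaluates an expression in place.
echo "$(( total * 2 ))"  # prints 14
```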

Empty Variables

Finally, as in most scripting languages, you don't need to declare your variables ahead of time. That means a variable can be referenced before it has ever received a value. In those cases, bash treats the variable as empty, equivalent to the empty string.

echo $foo

This prints only an empty line, since the variable foo was never set.
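To make the empty-variable behavior visible, the sketch below wraps an unset variable in brackets. The name undefined_var is made up, and the ${var:-text} form goes slightly beyond this lab: it is a standard bash expansion that substitutes text when the variable is unset or empty.

```shell
#!/bin/bash

# undefined_var (a made-up name) was never assigned, so it expands to nothing.
message="[$undefined_var]"
echo "$message"                      # prints []

# ${var:-text} expands to text when var is unset or empty.
greeting="Hello ${undefined_var:-stranger}"
echo "$greeting"                     # prints Hello stranger
```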

Environment Variables

In a bash script, and in bash shells in general, there are a number of environment variables available for convenience. One such environment variable we've seen previously is the PATH variable:

echo $PATH

By convention all environment variables are upper case. Below are some really useful ones to know:

  • $USER : the current user
  • $HOME : the home directory
  • $PWD : The current working directory
  • $SHELL : the path of your default (login) shell

For example, this program prints a helpful message

#!/bin/bash

echo "Hello $USER!"
echo "You are currently here: $PWD"
echo "But your home directory is here: $HOME"

Additionally, there are special variables for the arguments passed to the shell script. These are $0, $1, $2, etc., where the number refers to the position of the command line argument. For example, consider the script printargs.sh:

#!/bin/bash

echo "arg 0: $0"
echo "arg 1: $1"
echo "arg 2: $2"
echo "arg 3: $3"
echo "arg 4: $4"

If we were to run it:

arg#  0          1 2 3 4
      |          | | | |
      v          v v v v
#>./printargs.sh x y z  
arg 0: ./printargs.sh
arg 1: x
arg 2: y
arg 3: z
arg 4: 

Argument 0 always refers to the script itself, and $1, $2, and so on refer to the arguments to the script. The script prints the first 4 arguments, but there isn't a fourth argument, so $4 is treated as the empty string. To refer to all arguments, not including $0, use $*. $# is a special variable set to the number of arguments.

#!/bin/bash

echo "There are $# number of args"
echo "Arguments _not_ including the name of the script: $*"

Task 2 (20 points)

Write a script, getname.sh, that takes a username as an argument and prints the full name of that user. Here is some sample output.

aviv@saddleback: task2 $ ./getname.sh aviv
Adam Aviv {}
aviv@saddleback: task2 $ ./getname.sh tgarcia
Capt Tim Garcia {}
aviv@saddleback: task2 $ ./getname.sh m17XXXX
aviv@saddleback: task2 $ ./getname.sh roche
Daniel S. Roche {}
aviv@saddleback: task2 $

READ BELOW!!!! There are important instructions.

Because you want to match precise usernames, you can use the following grep regular expression in your script:

grep "^USERNAME:"

Where USERNAME is replaced by the username you are searching for as specified in the command line arguments, i.e., $1. If the user is not found, your script can print nothing.

For additional tests, you can compare your program to the finger command which does something similar. See the manual for more detail.

Conditional Control Flow

Like in any reasonable programming environment, you want to do more than just run commands one after another. We need a mechanism to change what the program does based on some condition, like if/else statements and looping.

If/Elif/Else Statements

The format of an if statement is like so:

if cmd
then 
    cmd
elif cmd
then
    cmd
else
    cmd  
fi        #<-- close off the if statement with fi

The if block runs if cmd succeeds, that is, if it does not exit with failure. In general, the cmd used in if blocks and elsewhere is the [ command (see man [), which checks conditions. With [ you can do simple comparisons and other tests. Here is an example script that checks if the script was run from the home directory:

#!/bin/bash

echo "Hello $USER"

if [ $HOME == $PWD ]
then
    echo "Good, you're in your home directory: $HOME"
else
    echo "What are you doing away from home?!?"

    cd $HOME

    echo "Now you are in your home directory: $PWD"
fi

There are two things of note here. First, [ $HOME == $PWD ] is a command that succeeds only when $HOME is the present working directory. Second, when you change directories within a script, you are not changing the directory of the shell you ran the script from, just the present working directory of the script itself.
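Since if only checks whether a command succeeded, any command can serve as the condition, not just [. The sketch below uses ls that way; /no/such/dir is a made-up path that is assumed not to exist.

```shell
#!/bin/bash

# The output of ls is thrown away; only its success or failure matters.
if ls /no/such/dir > /dev/null 2>&1
then
    result="the made-up directory exists"
elif ls / > /dev/null 2>&1
then
    result="only / could be listed"
else
    result="nothing could be listed"
fi

echo "$result"    # prints only / could be listed
```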

Comparators

Conditions in bash exist within brackets, [ ], but the [ is actually a command that succeeds (exits with status 0) if the conditions were met. For string comparators, you can use = and !=, but for numerics, you need to use options to the [ command, such as -eq and -lt. See below for the varied usages.

#!/bin/bash

#string
var=adam
if [ "$1" = "$var" ] ; then echo "string $1 equals $var" ; fi
if [ "$1" == "$var" ] ; then echo "string $1 equals $var" ; fi
if [ "$1" != "$var" ] ; then echo "string $1 does not equal $var" ; fi
if [ -z "$1" ] ; then echo "string $1 is empty!"; fi
if [ -n "$1" ] ; then echo "string $1 is not empty!"; fi

#numeric
a=1
if [ "$a" -eq "$1" ] ; then echo "number $1 equals $a" ; fi
if [ "$a" -ne "$1" ] ; then echo "number $1 does not equal $a" ; fi
if [ "$a" -gt "$1" ] ; then echo "$a is greater than $1" ; fi
if [ "$a" -lt "$1" ] ; then echo "$a is less than $1" ; fi

#file/dir properties
if [ -d "$1" ] ; then echo "$1 exists and is a directory!" ; fi
if [ -e "$1" ] ; then echo "$1 exists!" ; fi
if [ -f "$1" ] ; then echo "$1 exists and is a regular file!" ; fi
if [ -r "$1" ] ; then echo "$1 exists and is readable!" ; fi
if [ -s "$1" ] ; then echo "$1 exists and has size greater than zero!" ; fi
if [ -w "$1" ] ; then echo "$1 exists and is writable!" ; fi
if [ -x "$1" ] ; then echo "$1 exists and is executable!" ; fi

# NOTE: Since everything's on the same line, I need ;s between
# the if and the then and between the then and the fi.
# The quotes around $1 keep each test well-formed when $1 is
# empty or contains spaces; without them, [ -n $1 ] succeeds
# even when $1 is empty.

Additionally, you can combine or negate conditions with the standard logical operators, where condition1 and condition2 stand for any test expression:

if [ condition1 ] || [ condition2 ] ; then echo "either condition1 or condition2 is true"; fi
if [ condition1 ] && [ condition2 ] ; then echo "condition1 and condition2 are true"; fi
if [ ! condition1 ] ; then echo "condition1 is not true"; fi
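To see the comparators and logical operators working together, here is a minimal sketch. The values of path, limit, and empty are made up for illustration, and every variable is quoted so the tests stay well-formed even if a value is empty or contains spaces.

```shell
#!/bin/bash

# path and limit are made-up values for illustration.
path="/etc/passwd"
limit=3

# Combine a file test with a negated file test.
if [ -e "$path" ] && [ ! -d "$path" ]
then
    kind="exists and is not a directory"
else
    kind="missing or a directory"
fi
echo "$path: $kind"

# Numeric comparisons use options such as -ge and -le.
if [ "$limit" -ge 1 ] && [ "$limit" -le 5 ]
then
    range="in range"
else
    range="out of range"
fi
echo "$limit is $range"          # prints 3 is in range

# An empty string makes -z succeed, so the || never needs its right side.
empty=""
if [ -z "$empty" ] || [ "$empty" = "unused" ]
then
    state="empty"
fi
echo "state: $state"             # prints state: empty
```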

Task 3 (20 points)

Write a script, getsize.sh, which takes a path as an argument and prints out the size of the file/dir at that path. Your script must do error checking and it must print error messages to STDERR. Here is some sample output:

aviv@saddleback: task3 $ ./getsize.sh empty.txt 
0
aviv@saddleback: task3 $ ./getsize.sh larger.txt 
4000
aviv@saddleback: task3 $ ./getsize.sh medium.txt 
1847
aviv@saddleback: task3 $ ./getsize.sh getsize.sh 
253
aviv@saddleback: task3 $ ./getsize.sh file_does_not_exist
getsize.sh: ERROR: File file_does_not_exist does not exist
aviv@saddleback: task3 $ ./getsize.sh file_does_not_exist > /dev/null
getsize.sh: ERROR: File file_does_not_exist does not exist
aviv@saddleback: task3 $ ./getsize.sh file_does_not_exist 2> /dev/null

You should be able to use cut, ls, and/or wc to get the information you need. All errors should be written to stderr such that:

aviv@saddleback: task3 $ ./getsize.sh file_does_not_exist > /dev/null
getsize.sh: ERROR: File file_does_not_exist does not exist
aviv@saddleback: task3 $ ./getsize.sh file_does_not_exist 2> /dev/null

(HINT) You may find the tr command useful. Its squeeze option, -s, in particular can get rid of extra whitespace so that your cut fields are more consistent. For example:

CMD1 | tr -s ' ' | cut -d ' ' -f X

This squeezes each run of spaces into a single space before the data is sent to cut, which makes it easier to parse.
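To see why the squeeze matters, here is a small sketch on a made-up line of text with uneven spacing. It is not tied to any particular command's output.

```shell
#!/bin/bash

# A made-up line with uneven runs of spaces between the fields.
line="alpha   beta     gamma delta"

# Without squeezing, every extra space creates an empty field for cut,
# so field 2 is empty.
echo "$line" | cut -d ' ' -f 2               # prints an empty line

# tr -s ' ' collapses each run of spaces into one, so the fields line up.
echo "$line" | tr -s ' ' | cut -d ' ' -f 2   # prints beta
```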

Looping Control Flow

Like in C++, there are two standard loop structures: iterate until a condition is met (while loops) and iterate for each item (for loops).

While Loops

The syntax of a while loop is as follows:

while cmd
do
   #commands   
done
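As a concrete sketch of this syntax, the loop below adds up the numbers 1 through 3. Here the cmd is the [ command, and count and total are made-up variable names.

```shell
#!/bin/bash

count=1
total=0

# Keep looping for as long as the [ command succeeds.
while [ "$count" -le 3 ]
do
    let total=total+count
    let count=count+1
done

echo "Sum of 1 to 3 is $total"   # prints Sum of 1 to 3 is 6
```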

For loops

A for loop has a similar structure, but unlike for loops in C++, for loops in bash iterate for each item provided:

for var in str1 str2 .. strn
do
   # commands
   # $var is available for references, set to str1 -> strn in each loop
done

Here's a loop that iterates through all arguments.

#!/bin/bash

for arg in $*
do
    echo "arg: $arg"
done

Note that the special variable $* expands to the list of command line arguments.
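One caveat worth a sketch: an unquoted $* is split again on white space, so an argument containing spaces turns into several loop items. Bash also provides "$@", which is not covered above and which keeps each argument intact when quoted. The set -- line is only there to simulate command line arguments so the example runs on its own.

```shell
#!/bin/bash

# Simulate running:  ./script.sh one "two words" three
set -- one "two words" three

# Unquoted $* is split on white space, so "two words" becomes two items.
count_star=0
for arg in $*
do
    let count_star=count_star+1
done

# Quoted "$@" keeps each original argument as a single item.
count_at=0
for arg in "$@"
do
    let count_at=count_at+1
done

echo "Unquoted \$* produced $count_star items"   # prints 4 items
echo "Quoted \"\$@\" produced $count_at items"   # prints 3 items
```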

Task 4 (20 points)

Create a script called getallsizes.sh, which takes in any number of files on the command line and prints their sizes as follows. Here's some sample usage:

aviv@saddleback: task4 $ ls -l
total 12
-rw-r----- 1 aviv scs    0 Dec 29 14:56 empty.txt
-rwxr-x--- 1 aviv scs  277 Dec 29 14:56 getallsizes.sh
-rw-r----- 1 aviv scs 4000 Dec 29 14:56 larger.txt
-rw-r----- 1 aviv scs 1847 Dec 29 14:56 medium.txt
aviv@saddleback: task4 $ ./getallsizes.sh empty.txt 
empty.txt 0
aviv@saddleback: task4 $ ./getallsizes.sh *.txt
empty.txt 0
larger.txt 4000
medium.txt 1847
aviv@saddleback: task4 $ ./getallsizes.sh empty.txt doesnotexist.txt larger.txt 
empty.txt 0
getallsizes.sh: ERROR: File doesnotexist.txt does not exist
larger.txt 4000
aviv@saddleback: task4 $ ./getallsizes.sh empty.txt doesnotexist.txt larger.txt 2> /dev/null
empty.txt 0
larger.txt 4000
aviv@saddleback: task4 $ ./getallsizes.sh empty.txt doesnotexist.txt larger.txt > /dev/null
getallsizes.sh: ERROR: File doesnotexist.txt does not exist

(HINT) Check out the man page for echo to see how to print without a trailing newline using the -n option, so you can align each file name with its size.

Sub shells

While it is simple to have a shell script execute a set of commands and just write to the output, there are often situations where you wish to store the output of some command in a variable so that you can parse it or use it later. To do that, you use a subshell, written $(cmd); this is also known as command substitution.

The idea behind a subshell is that you can store the result of the computation, as written to stdout, in a variable:

local_files=$(ls)
permissions=$(ls -l | cut -d " " -f 1)

The variables local_files and permissions now hold the results of those computations.
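Here is a minimal sketch of capturing output and then using it like any other variable. The sample text and the variable names are made up.

```shell
#!/bin/bash

# Capture the output of a single command.
shell_name=$(basename /bin/bash)
echo "shell_name is $shell_name"        # prints shell_name is bash

# Capture the output of a pipeline.
third_word=$(echo "red green blue" | cut -d " " -f 3)
echo "third_word is $third_word"        # prints third_word is blue

# The captured text can be used in a test like any other string.
if [ "$third_word" = "blue" ]
then
    echo "matched"
fi
```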

Task 5 (25 points)

Write a script, isbiggerthan.sh, which takes a size and a path and determines if the file or directory is bigger than (or equal to) the given size. Here is the usage:

./isbiggerthan.sh size path

And here is some sample output

aviv@saddleback: task5 $ ./isbiggerthan.sh 10 empty.txt 
no
aviv@saddleback: task5 $ ./isbiggerthan.sh 0 empty.txt 
yes
aviv@saddleback: task5 $ ./isbiggerthan.sh 10 empty.txt 
no
aviv@saddleback: task5 $ ./isbiggerthan.sh 10 medium.txt
yes
aviv@saddleback: task5 $ ./isbiggerthan.sh 2000 medium.txt
no
aviv@saddleback: task5 $ ./isbiggerthan.sh 2000 larger.txt
yes

As before, you must do error checking. Here are the error cases:

aviv@saddleback: task5 $ ./isbiggerthan.sh 
isbiggerthan.sh: ERROR: Require path and size 
aviv@saddleback: task5 $ ./isbiggerthan.sh num
isbiggerthan.sh: ERROR: Require path and size 
aviv@saddleback: task5 $ ./isbiggerthan.sh num empty.txt
isbiggerthan.sh: ERROR: Require a number for num
aviv@saddleback: task5 $ ./isbiggerthan.sh -1 empty.txt
isbiggerthan.sh: ERROR: Require a positive number for -1

READ BELOW!!!! There are important instructions.

Because checking if a variable is a number is non-trivial, I have provided that code in the starter code, duplicated below.

if [ "$var" -eq "$var" ] 2> /dev/null # check for a number; send the error message to /dev/null
then
    echo "it's a number"
else
    echo "it's *not* a number"
fi
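To get a feel for what this check accepts, the sketch below runs it over a few made-up values. Note that a negative integer still counts as a number here, so the sign has to be checked separately.

```shell
#!/bin/bash

# Collect the values that pass the check; the sample values are made up.
numbers=""
for var in 42 -7 abc 3.5
do
    if [ "$var" -eq "$var" ] 2> /dev/null
    then
        echo "$var: it's a number"
        numbers="$numbers $var"
    else
        echo "$var: it's *not* a number"
    fi
done

echo "accepted:$numbers"    # prints accepted: 42 -7
```

The trick works because -eq only accepts integers: for anything else, [ reports an error and fails, and the error message is discarded by the redirect.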

Exit Status

Every program that runs in the shell, and every program generally, returns an exit code when it completes. For example, in a bash if statement, the condition counts as true when the command exits with a successful status.

On Unix, a successful exit status is 0, and all other values indicate different errors. The special variable $? holds the exit status of the most recent command. Here you can see exit statuses from the command line:

#> [ 1 -eq 2 ]
#> echo $?
1
#> [ 1 -eq 1 ]
#> echo $?
0
#> cat empty.txt
#> echo $?
0
#> cat doesnotexist
cat: doesnotexist: No such file or directory
#> echo $?
1

In a bash script, by default, the script exits with the exit status of the last command it ran. However, you can set the exit status explicitly, for example in cases of errors, using the exit command:

exit 0  #exit success
exit 1  #exit error
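The sketch below shows $? picking up different statuses. The exit is run inside parentheses, which start a subshell, so that the example itself keeps running; the status value 3 is arbitrary.

```shell
#!/bin/bash

# Run exit inside ( ... ) so only the subshell exits.
( exit 3 )
status=$?
echo "subshell exited with $status"     # prints subshell exited with 3

# $? is overwritten by every command, so save it right away if you need it.
[ 5 -lt 2 ]
saved=$?
echo "test returned $saved"             # prints test returned 1

# A saved status can drive later decisions.
if [ "$saved" -ne 0 ]
then
    echo "5 is not less than 2"
fi
```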

Task 6 BONUS (+5 points)

Copy your isbiggerthan.sh script from the task5 directory to the task6 directory. Update your isbiggerthan.sh script to exit with different status codes depending on whether the file is bigger than the size or there is an error. For example:

  • exit 0 : if the file is bigger (or equal) to the size
  • exit 1 : if the file is not bigger (or equal) to the size
  • exit 2 : if not enough arguments
  • exit 3 : did not receive a number for size
  • exit 4 : received a negative number for size
  • exit 5 : file does not exist

Once complete, the script isbiggerthanall.sh should function properly with these arguments.

isbiggerthanall.sh size path [path [...]]

The output lists the files at the given paths that are bigger than (or equal to) the specified size.

Here is some sample output:

aviv@saddleback: task6 $ ls -l
total 16
-rw-r----- 1 aviv scs    0 Dec 29 15:09 empty.txt
-rwxr-x--- 1 aviv scs  870 Dec 29 15:14 isbiggerthanall.sh
-rwxr-x--- 1 aviv scs  748 Dec 29 15:09 isbiggerthan.sh
-rw-r----- 1 aviv scs 4000 Dec 29 15:09 larger.txt
-rw-r----- 1 aviv scs 1847 Dec 29 15:09 medium.txt
aviv@saddleback: task6 $ ./isbiggerthanall.sh 0 *.txt
empty.txt
larger.txt
medium.txt
aviv@saddleback: task6 $ ./isbiggerthanall.sh 1 *.txt
larger.txt
medium.txt
aviv@saddleback: task6 $ ./isbiggerthanall.sh 2000 *.txt
larger.txt
aviv@saddleback: task6 $ ./isbiggerthanall.sh 9999 *.txt
aviv@saddleback: task6 $

And the error conditions should also still work:

aviv@saddleback: task6 $ ./isbiggerthanall.sh num larger.txt 
isbiggerthanall.sh: ERROR: Require a number for size not num 
aviv@saddleback: task6 $ ./isbiggerthanall.sh -1 larger.txt 
isbiggerthanall.sh: ERROR: Require a positive number 
aviv@saddleback: task6 $ ./isbiggerthanall.sh 
isbiggerthanall.sh: ERROR: Require a size and at least one file
aviv@saddleback: task6 $ ./isbiggerthanall.sh 1 doesnotexist.txt
isbiggerthanall.sh: ERROR: File doesnotexist.txt does not exist

In your README file, provide a description of how isbiggerthanall.sh works. DO NOT FORGET TO DO THIS.

(EXTRA on your own) Mastering your shell environment

Your shell environment is much like a script, and you can set many different environment variables to suit your needs. To make any of these changes permanent, you can place modifications to your shell environment in a file called .bashrc in your home directory.

Changing the prompt with PS1

The prompt is set via the environment variable $PS1. Let's take a look at my $PS1.

aviv@mich342csdtestu:~$echo $PS1
\[\033[01;32m\]\u@\h\[\033[00m\]:\[\033[01;34m\]\w\[\033[00m\]

Some of that extra stuff maps to colors, but I can change my prompt as I see fit.

aviv@mich342csdtestu:~$PS1="Command me baby$ "
Command me baby$ ls
accesslog.gz  bin      Documents  ic221  Pictures     syslog.dat  VBox-Map        VM-notes.txt
authlog.dat   class    Downloads  local  Public       Templates   Videos
aviv-local    Desktop  git        Music  public_html  tmp         VirtualBox VMs
Command me baby$ pwd
/home/scs/aviv
Command me baby$ 

Let's set it to something a bit more appropriate:

Command me baby$ PS1="\u@\h: \w $"
aviv@mich342csdtestu: ~ $

The escape characters, \u and \h stand for username and hostname, respectively, and \w is the current working directory. There are many cool things you can do with your shell prompt. Check out this link for more info.

Aliasing

Another really useful tool is command aliases. These are shortcuts to more complicated commands. For example, suppose you're a lot like me and make a lot of typos in the terminal, so you often type l instead of ls, and you just want l to be an alias for ls. You can set up a simple alias to do that:

aviv@mich342csdtestu: ~ $l
l: command not found
aviv@mich342csdtestu: ~ $alias l=ls
aviv@mich342csdtestu: ~ $l
accesslog.gz  bin      Documents  ic221  Pictures     syslog.dat  VBox-Map        VM-notes.txt
authlog.dat   class    Downloads  local  Public       Templates   Videos
aviv-local    Desktop  git        Music  public_html  tmp         VirtualBox VMs

Another really useful alias is ll to map to ls -l

aviv@mich342csdtestu: ~ $alias ll="ls -l"
aviv@mich342csdtestu: ~ $ll
total 212
-rw-r--r-- 1 aviv scs  13524 Dec 23 10:24 accesslog.gz
-rw-r----- 1 aviv scs 115373 Dec 23 10:22 authlog.dat
lrwxrwxrwx 1 aviv scs     17 Nov  5 16:40 aviv-local -> /local/aviv-local
drwxr-xr-x 2 aviv scs   4096 Jan  9 15:12 bin
drwxr-xr-x 3 aviv scs   4096 Dec 19 16:19 class
drwx--x--x 2 aviv scs   4096 Dec 22 11:58 Desktop
drwx--x--x 2 aviv scs   4096 Oct 17 19:06 Documents
drwx--x--x 2 aviv scs   4096 Dec 22 12:50 Downloads
drwx------ 3 aviv scs   4096 Dec 23 11:27 git
drwxr-xr-x 5 aviv scs   4096 Jan  7 13:08 ic221
drwx--x--x 4 aviv scs   4096 Dec 24 09:20 local
drwx--x--x 2 aviv scs   4096 Oct 17 19:06 Music
drwx--x--x 2 aviv scs   4096 Dec 22 14:51 Pictures
drwx--x--x 2 aviv scs   4096 Jan  7 07:40 Public
drwxr-xr-x 3 aviv scs   4096 Jan  7 07:43 public_html
-rw-r----- 1 aviv scs   4530 Dec 23 10:22 syslog.dat
drwx--x--x 2 aviv scs   4096 Oct 17 19:06 Templates
drwx--x--x 2 aviv scs   4096 Jan  8 15:12 tmp
drwxr-xr-x 2 aviv scs   4096 Nov 11 17:59 VBox-Map
drwx--x--x 2 aviv scs   4096 Oct 17 19:06 Videos
drwx--x--x 2 aviv scs   4096 Nov  5 19:11 VirtualBox VMs
-rw-r--r-- 1 aviv scs    465 Nov  8 16:04 VM-notes.txt

Adding to your $PATH

Finally, as you develop new tools for this class, you'd like to be able to call on them directly, like you do other command line tools. To do this, those scripts/programs need to exist on the $PATH.

The standard Unix way to do this is to create a directory in your home folder called bin and place all newly created commands there. Unfortunately, your bin directory may not be on the search path yet. To add it, you have to update the environment variable PATH:

export PATH=$HOME/bin:$PATH

This places your bin directory at the start of the search path while preserving the prior values on the path. The export command makes the assignment visible to all scripts and programs that are later run from the shell.
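Putting the pieces of this section together, a few lines like the following could go at the end of your ~/.bashrc. This is only a sketch of one possible setup: the prompt string and the aliases are the ones used in the examples above, and you should adjust them to taste.

```shell
# Prompt: user@host: working-directory $
PS1="\u@\h: \w $ "

# Shortcuts for common commands.
alias l=ls
alias ll="ls -l"

# Put your personal bin directory first on the search path.
export PATH=$HOME/bin:$PATH
```

Changes to .bashrc take effect in new shells; to apply them to the current shell, you can run source ~/.bashrc.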