Lab 06: Makefiles and File Statistic System Calls
Preliminaries
In this lab you will complete C programs that use operating system (OS) system calls, focusing on the file system and device management aspects of the operating system. There are five tasks; you will likely complete the first three in lab and finish the remaining tasks outside of lab.
Lab Learning Goals
In this lab, you will learn the following topics and practice C programming skills.
- Writing simple Makefiles
- Learning about file statistics
- File statistics: fstat() system call
- Altering file modes: chmod() system call
- Printing file information: getpwuid(), getgrgid(), strmode() library functions
- Altering file modification times: utimes(), gettimeofday() system calls
- Printing system call errors: perror() and errno
Lab Setup
Run the following command
~aviv/bin/ic221-up
Change into the lab directory
cd ~/ic221/lab/06
All the material you need to complete the lab can be found in the lab directory. All material you submit should be placed within the lab directory. Throughout this lab, we refer to the lab directory, which you should interpret as the above path.
Submission Folder
For this lab, all scripts for submission should be placed in the following folder:
~/ic221/lab/06
This directory contains 5 sub-directories: examples, makefile, mycp, myls, and mytouch. In the examples directory you will find the source code shown in this lab document. All lab work should be done in the remaining directories.
- Only source files found in the folder will be graded.
- Do not change the names of any source files
Finally, in the top level of the lab directory, you will find a README file. You must complete the README file, and include any additional details that might be needed to complete this lab.
Compiling your programs with gcc and make
You are required to provide your own Makefiles for this lab. Each of the source folders, mycp, myls, and mytouch, must have a Makefile. We should be able to compile your programs by typing make in each source directory.
README
In the top level of the lab directory, you will find a README file. You must fill out the README file with your name and alpha. Please include a short summary of each of the tasks and any other information you want to provide to the instructor.
Testing
You are provided a test script to test your submission. It is found in the base of the lab directory: test.sh.
Part 1: Makefiles
In the last lab, you were provided with a Makefile, but for this lab you are required to submit your own Makefiles. All subfolders for this submission must have their own Makefile. The only thing required to compile your programs is for the user (and grader, i.e., your instructor) to simply type make in that directory.
A Makefile is a small program that describes a compilation process. There are three main elements of a Makefile:
- Targets: This is the goal of a compilation process, such as an executable or object file
- Dependencies: Files which the target depends on, such as the source files
- Commands: What should be run to actually compile a file to produce a target.
Once the Makefile is in place, in that directory, you run the make command, which looks for a Makefile and attempts to make a specific target.
Let's look at a very simple example. Here's how we would use a Makefile to compile a helloworld.c program.
#helloworld/Makefile
all: helloworld

helloworld: helloworld.c
	gcc helloworld.c -o helloworld

clean:
	rm -f helloworld
All targets are set to the left and designated with the ":", so the targets in this Makefile are all, helloworld, and clean. Dependencies are found to the right of the targets. For example, the all target depends on generating the helloworld target, which, in turn, depends on the helloworld.c source file. Finally, commands are on the lines below a target and must be indented using the Tab key (very important: spaces will not work).
When reading the Makefile, the key thing is to follow the targets through their dependencies to the commands that need to be executed. For example, when we type make, the all target is executed by default because it is the first target in the file. The all target depends on producing the helloworld target, which depends on helloworld.c. Now, the file helloworld.c is not a target, it's a file, and by listing it as a dependency, we are saying "this target is not met whenever the file changes," like when we edit the source code. Assuming the helloworld.c source has changed, the helloworld target is not met, so the command is executed, which (re)compiles helloworld.c to produce the helloworld executable.
The last target, clean, does not have any dependencies. Instead, it just has the shell command to remove the executable. It is good practice to have a clean target in your Makefiles. You will often need to clean up the source by removing extraneous files, and the Makefile is a fast and convenient way to do this.
When we use the Makefile, we can just type make, which will compile all the targets associated with the all target. Or, we can type make target, which will just execute the commands to reach the given target. For example, to execute the clean target, we type make clean.
Task 1
Change into the makefile/simple directory. In there you will find a program called compileme.c.
- Write a Makefile that will compile compileme.c by typing make and also will clean up any stray executables by typing make clean.
- Test your Makefile by typing make, and then executing the program. What is the output? Type make again after executing the program. What happened?
- To test your Makefile dependencies, add an additional format print to the compileme.c source, and type make again. If compileme.c recompiles, you've done this right.
- Finally, add some options to the compilation so that you compile compileme.c with the debug flag (-g) and the warning all flag (-Wall), which is always a good thing to do.
You will submit your Makefile for grading.
Multipart Compilation
One of the advantages of C is that you can stage your compilation process. You did this already in the last lab when we had to compile simplefs into an object file filesystem.o and then compile that object file with other source, like the shell or testfile.
Let's review the compilation process. When we have source broken across multiple files, we first have to compile those files to object code, an intermediate compilation stage.
gcc -c source.c -o source.o
Next we can compile multiple object files to assemble an executable.
gcc source.o main.o -o executable
If you look at the above compilation command, you can see the target and dependencies. The target is the executable, executable, and the two dependencies are the object files, source.o and main.o, which each have dependencies of their own: the associated source (and header) files. Let's translate that into a Makefile.
all: executable

executable: source.o main.o
	gcc source.o main.o -o executable

source.o: source.c source.h
	gcc -c source.c -o source.o

main.o: main.c
	gcc -c main.c -o main.o
Tracing the dependencies and the commands starting with all, we can see that to reach the compilation command for executable, first source.c and main.c must be compiled to object files, source.o and main.o. You will also notice the header file, source.h, is listed as a dependency for source.o, which is common so that recompilation will occur whenever the header file changes.
Task 2
Change into the makefile/multi directory. In there you will find four source files and two header files. Two of the source files have a main() function.
- Write a Makefile to compile the binary executable called runme. You will also need to compile the dependencies, so inspect the source file and the associated headers to determine what those might be.
- Add another target to the all target so that it now compiles two executables, runme and runme_too.
- Include a clean target to remove all object files, those that end in .o, and executables, e.g., runme and runme_too.
You will submit this Makefile for grading.
Part 2: Retrieving and Altering File Statistics
In this part of the lab, we will use system calls to read and write files, retrieve file stats, and modify file properties. This will be done in three tasks. First you will reimplement your copy command line tool, mycp, but this time you will use system calls to copy the file with buffered I/O and preserve the permission mode. Next, you will implement an ls-like command line tool, myls, which will list the contents of the current directory. And finally, you will implement a touch-like tool that will update the modification time of a file.
None of these tasks individually requires a lot of code, but together they will expose you to the variety of system calls that support both device I/O (read(), write(), open(), close()) and file management (fstat(), chmod(), utimes(), gettimeofday()). You will also learn about some library tools that allow you to interpret file properties in a human readable way (strmode(), getpwuid(), getgrgid()). Finally, you will learn how to easily check and report errors for system calls via the error number reporting interface (errno, perror()).
stat()
The operating system maintains file information for each file on the system. You can retrieve this information with the stat() and fstat() system calls, as follows:
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

int stat(const char *path, struct stat *buf);
int fstat(int fd, struct stat *buf);
The stat() system call takes a path to a file and a pointer to a struct stat. It will then fill in the struct stat pointed to by buf with the file statistics. The fstat() system call does the same, but takes an open file descriptor rather than a path.
The struct stat, which is defined in the man pages, has the following fields, and a longer description of each is provided in the man pages.
struct stat {
    dev_t     st_dev;     /* ID of device containing file */
    ino_t     st_ino;     /* inode number */
    mode_t    st_mode;    /* protection */
    nlink_t   st_nlink;   /* number of hard links */
    uid_t     st_uid;     /* user ID of owner */
    gid_t     st_gid;     /* group ID of owner */
    dev_t     st_rdev;    /* device ID (if special file) */
    off_t     st_size;    /* total size, in bytes */
    blksize_t st_blksize; /* blocksize for file system I/O */
    blkcnt_t  st_blocks;  /* number of 512B blocks allocated */
    time_t    st_atime;   /* time of last access */
    time_t    st_mtime;   /* time of last modification */
    time_t    st_ctime;   /* time of last status change */
};
Of particular relevance to this lab are the st_mode and st_mtime fields. The former, st_mode, is the mode for the file, which defines the permissions of the file (who can read/write/exec the file) as well as the disposition of the file, such as whether it is a directory or a regular file. There are a number of macros defined in the sys/stat.h header file for interpreting the mode of the file. For example, here is a small program to test whether the provided path is a directory.
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char * argv[]){
  struct stat st;

  if( argc < 2){
    fprintf(stderr, "ERROR: Require a path\n");
    return 2; //return status error
  }

  if( stat(argv[1], &st) < 0){
    //error, cannot stat file
    perror(argv[0]); //report error with perror
    return 2; //return status error
  }

  if ( S_ISDIR(st.st_mode) ){
    printf("It's a directory!\n");
    return 0; //return status true
  }

  printf("Not a directory :(\n");
  return 1; //return status false
}
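The fstat() form described above works the same way once a file is already open. The following is a minimal sketch, not part of the lab's provided code: the helper name file_size() is hypothetical, and it simply reports st_size for whatever path it is given.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns the size in bytes of the file at path,
 * or -1 on error (after reporting the failing call with perror()). */
long file_size(const char *path){
  struct stat st;
  int fd = open(path, O_RDONLY);

  if( fd < 0 ){
    perror("open");
    return -1;
  }

  if( fstat(fd, &st) < 0 ){   /* fills st just as stat(path, &st) would */
    perror("fstat");
    close(fd);
    return -1;
  }

  close(fd);
  return (long) st.st_size;
}
```

Calling file_size(argv[1]) from main() and printing the result would give the same number that ls -l shows in its size column.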
chmod()
The mode of a file, since it is maintained by the operating system, can only be changed via a system call. The system call to do that is chmod(), which shares its name with the command line tool chmod you've already used. To view the man page, be sure to look in section 2 of the manual:
#> man 2 chmod
Here are the two forms, one taking a file path and the other taking a file descriptor, just like stat() and fstat():
#include <sys/stat.h>

int chmod(const char *path, mode_t mode);
int fchmod(int fd, mode_t mode);
The mode_t mode argument is the same type as the st_mode from the stat() output, and is built by ORing constants together, as with file creation. Here are the relevant constants:
S_IRUSR  (00400)  read by owner
S_IWUSR  (00200)  write by owner
S_IXUSR  (00100)  execute/search by owner ("search" applies for directories, and means that entries within the directory can be accessed)
S_IRGRP  (00040)  read by group
S_IWGRP  (00020)  write by group
S_IXGRP  (00010)  execute/search by group
S_IROTH  (00004)  read by others
S_IWOTH  (00002)  write by others
S_IXOTH  (00001)  execute/search by others
So to set the mode of a file to read/write for the owner and read for the group:
chmod( "path/to/file", S_IRUSR | S_IWUSR | S_IRGRP);
But, as you can see from the constants, they are also defined as octal numbers, like how we use chmod on the command line, and the following is equivalent to the above:
chmod( "path/to/file", 0640);
It's an octal number, so the leading 0 is important and tells C that this number should be interpreted in octal.
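To see that the constant form and the octal form really produce the same mode, a sketch like the following can help. The helper name create_with_mode() is hypothetical; it creates a scratch file, applies chmod(), and reads the permission bits back with stat().

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create path if needed, force its permission
 * bits to mode with chmod(), and return the bits read back by stat().
 * Returns -1 on error. */
int create_with_mode(const char *path, mode_t mode){
  struct stat st;
  int fd = open(path, O_WRONLY | O_CREAT, 0600);

  if( fd < 0 ){
    perror("open");
    return -1;
  }
  close(fd);

  /* open() filters its mode through the umask; chmod() does not */
  if( chmod(path, mode) < 0 ){
    perror("chmod");
    return -1;
  }

  if( stat(path, &st) < 0 ){
    perror("stat");
    return -1;
  }

  return (int) (st.st_mode & 0777);   /* drop the file type bits */
}
```

With this helper, create_with_mode(path, S_IRUSR | S_IWUSR | S_IRGRP) and create_with_mode(path, 0640) should return the same value. The mask with 0777 matters because st_mode also encodes the file type.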
Error checking system calls
Most system calls follow the same convention for their return value. They return an integer: on success, 0 (or another non-negative value, such as a file descriptor or a byte count) is returned, and on failure, -1 is returned. This means we can check for system call errors using the same pattern:
if( stat(argv[1], &st) < 0){
  //error, cannot stat file
  perror(argv[0]); //report error with perror
  return 2; //return status error
}
Simply place the system call in an if statement and check whether the return value is less than 0. If so, we want to report that error. We can then exit the program, if that is the appropriate action to take; it isn't always, depending on the task.
Due to the simplicity of the return value, the actual cause of the error is not reported via the return value. Instead, there exists a global variable errno which is set to the value of the error. For example, here are the possible values of errno for a failure of stat() (note these are #define'ed constants):
EACCES        Search permission is denied for one of the directories in the path prefix of path. (See also path_resolution(7).)
EBADF         fd is bad.
EFAULT        Bad address.
ELOOP         Too many symbolic links encountered while traversing the path.
ENAMETOOLONG  path is too long.
ENOENT        A component of path does not exist, or path is an empty string.
ENOMEM        Out of memory (i.e., kernel memory).
ENOTDIR       A component of the path prefix of path is not a directory.
EOVERFLOW     (stat()) path refers to a file whose size cannot be represented in the type off_t. This can occur when an application compiled on a 32-bit platform without -D_FILE_OFFSET_BITS=64 calls stat() on a file whose size exceeds (1<<31)-1 bytes.
The most likely error to occur is EACCES, cannot access the file at the path, or ENOENT, the file or directory doesn't exist. It's good practice to report the precise error to the user so that the error can be corrected, but it's a huge pain to have to type these error messages yourself into every program. Instead, the C standard library has a built-in error reporting tool: perror(), or print error. It will automatically check the value of errno and print an appropriate message. Here is a sample output of isdir with a bad path.
#> ./isdir bad/path
./isdir: No such file or directory
Notice that it prints useful information. I passed to perror() the name of the program, argv[0], so that perror() will output the name of the program in addition to printing the error. This way it looks like a real command line tool.
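When a program needs to react to a specific error rather than just print it, errno can be inspected directly. The sketch below is only an illustration: stat_errno() is a hypothetical helper, and it uses the standard strerror() function to obtain the same message text that perror() prints.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: stat() the path and return 0 on success or the
 * errno value on failure, printing the message for that errno value. */
int stat_errno(const char *path){
  struct stat st;

  if( stat(path, &st) < 0 ){
    int saved = errno;   /* save it: later calls may overwrite errno */
    fprintf(stderr, "%s: %s\n", path, strerror(saved));
    return saved;
  }

  return 0;
}
```

A caller could then compare the result against ENOENT or EACCES to decide what to do next. Copying errno into a local variable first is a common precaution, since any later library call is allowed to change it.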
Task 3: mycp
Change into the mycp directory. In there you will find skeleton code for the start of the mycp command. The usage of the mycp command is as follows:
#> mycp source dest
Your mycp must be able to complete the following tasks.
- You must use buffered I/O to complete the copy. That means using the read() and write() system calls and opening the source and destination files with open() and close(). The buffer size should be 4096 bytes, which matches a common file system block size and gives good performance. Check out APUE for some sample code.
- It should be able to copy a source file to a destination file, preserving the mode of the file. You can directly use the st_mode from the stat() of the source file and use that as the argument to chmod() or fchmod() of the destination file:
fchmod(dest_fd, src_stat.st_mode);
Here is some sample output:
#> ls -l
total 12
drwx--x--x 2 aviv scs 4096 Feb 4 11:17 sub
-rw-r--r-- 1 aviv scs 9022 Feb 1 17:07 test.txt
#> ../mycp test.txt test_cp.txt
#> ls -l
total 28
drwx--x--x 2 aviv scs 4096 Feb 4 11:17 sub
-rw-r--r-- 1 aviv scs 9022 Feb 4 11:15 test_cp.txt
-rw-r--r-- 1 aviv scs 9022 Feb 1 17:07 test.txt
#> diff test_cp.txt test.txt
#> ../mycp sub/ sub_cp
../mycp: sub/: Is a directory
- If the destination file already exists, mycp should truncate the file and overwrite it with the source file, like cp does.
- If the source file is a directory, you should exit with an error message built from the executable name and the source path. Use the macro S_ISDIR() for that check, given the st_mode of the source file. For example:
if( S_ISDIR(fs.st_mode) ){
  fprintf(stderr, "%s: %s: Is a directory\n", argv[0], argv[1]);
  return 1;
}
- The error conditions of all system calls should be checked; doing so will greatly help your debugging. Any errors should be reported. Use perror() liberally.
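The buffered read()/write() pattern mentioned in the first requirement can be sketched as follows. This is one possible shape for the loop, not the required solution: copy_fd() and BUF_SIZE are hypothetical names, and the function works on any two open file descriptors, leaving open(), fchmod(), and error reporting to the caller.

```c
#include <unistd.h>

#define BUF_SIZE 4096

/* Hypothetical helper: copy everything readable from in_fd to out_fd.
 * Returns 0 on success and -1 if read() or write() fails. */
int copy_fd(int in_fd, int out_fd){
  char buf[BUF_SIZE];
  ssize_t n;

  while( (n = read(in_fd, buf, BUF_SIZE)) > 0 ){
    ssize_t done = 0;

    while( done < n ){   /* write() may write fewer bytes than asked */
      ssize_t w = write(out_fd, buf + done, n - done);
      if( w < 0 ){
        return -1;
      }
      done += w;
    }
  }

  return (n < 0) ? -1 : 0;   /* n == 0 means end of file */
}
```

The inner loop is there because write() is allowed to accept only part of the buffer; for regular files that is rare, but handling it keeps the copy correct for pipes and other descriptors too.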
Part 3: Human readable forms of the stat fields
We want to continue exploring the output of the stat() system call, investigating the other fields of the struct stat data type. Unfortunately, we aren't computers, and we'd like to view these fields in a human readable way. Of particular relevance are the following fields:
mode_t st_mode; /* protection */
uid_t  st_uid;  /* user ID of owner */
gid_t  st_gid;  /* group ID of owner */
strmode()
We've already discussed the st_mode field, which stores the various dispositions of the file. It's just a number really, but we'd like to view that information in a human readable form, which can be done with the library function strmode(), which will convert a mode_t into a -rwxrwxrwx string, just like in ls. Here is an example program:
/*examples/printmode.c*/
#include <stdio.h>
#include <stdlib.h>
#include <bsd/string.h>

int main(int argc, char * argv[]){
  char smode[12]; //mode strings are always 11 chars long
                  // +1 for the NUL terminator, makes 12!

  strmode(0644, smode);
  printf("0644 : %s\n", smode);

  strmode(0742, smode);
  printf("0742 : %s\n", smode);

  return 0;
}
To compile a program that calls strmode() on a Linux system, which you will be using, you need to use the bsd library. Add the -lbsd option to gcc to link the library to your executable. Here is the sample compilation from the Makefile for printmode:
printmode: printmode.c
	gcc printmode.c -o printmode -lbsd
getpwuid()
Each file has an owner (a user) and a group. As humans, we like to refer to these values as strings and not numbers. The owner is aviv and the group is scs, for example. The operating system doesn't think like a human, and instead stores these values as numbers. Each user has a uid and can be a member of any number of groups, each identified by a number, the gid. Similarly, files also have an associated user and group for ownership purposes, st_uid and st_gid in the struct stat.
To convert these numbers to human readable formats, we could look in the /etc/passwd and /etc/group files like we did when programming bash, but that's way, way too much work. Fortunately, C provides two library functions to do that conversion for us.
Let's start with retrieving the username. We first need to look up the password file entry for that user. We use getpwuid() for that.
#include <sys/types.h>
#include <pwd.h>

struct passwd *getpwuid(uid_t uid);
Provided the uid, which we have from the stat() output, getpwuid() returns a pointer to a struct passwd data type. It has the following fields:
struct passwd {
    char  *pw_name;   /* username */
    char  *pw_passwd; /* user password */
    uid_t  pw_uid;    /* user ID */
    gid_t  pw_gid;    /* group ID */
    char  *pw_gecos;  /* user information */
    char  *pw_dir;    /* home directory */
    char  *pw_shell;  /* shell program */
};
The relevant field for this lab is pw_name, which is a string referencing the user name. Here is a sample program to print a username. My uid on the Linux lab machines is 35001. (BTW, I know what you're thinking, "oohh, password!" But, the pw_passwd field is the encrypted password, not the plain text password. Nice try, though, and even then, the actual encrypted password is stored in a different file called /etc/shadow.)
/*examples/printusername.c*/
#include <sys/types.h>
#include <pwd.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char * argv[]){
  uid_t my_uid = 35001;
  struct passwd * pwd;

  pwd = getpwuid(my_uid);
  if( pwd == NULL ){ //no entry for this uid
    fprintf(stderr, "%s: no such user id\n", argv[0]);
    return 1;
  }

  printf("My username is: %s\n", pwd->pw_name);
  return 0;
}
getgrgid()
Retrieving the string version of the gid uses a similar process to that of retrieving the username. We use the getgrgid() library function:
#include <sys/types.h>
#include <grp.h>

struct group *getgrgid(gid_t gid);
Given a gid, the getgrgid() function returns a pointer to a struct group, which has the following fields:
struct group {
    char   *gr_name;   /* group name */
    char   *gr_passwd; /* group password */
    gid_t   gr_gid;    /* group ID */
    char  **gr_mem;    /* group members */
};
The relevant field is gr_name, which is the human readable name of the group. Here is a sample program to print the group name for the SCS group, which is group id 10120:
#include <sys/types.h>
#include <grp.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char * argv[]){
  gid_t my_gid = 10120;
  struct group * grp;

  grp = getgrgid(my_gid);
  if( grp == NULL ){ //no entry for this gid
    fprintf(stderr, "%s: no such group id\n", argv[0]);
    return 1;
  }

  printf("My groupname is: %s\n", grp->gr_name);
  return 0;
}
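Both getpwuid() and getgrgid() return NULL when the ID has no matching entry, which can happen for files owned by deleted accounts. One way to cope, sketched below with the hypothetical helpers user_name() and group_name(), is to fall back to printing the number itself, which is also what ls -l does in that situation.

```c
#include <grp.h>
#include <pwd.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: write the user name for uid into buf, or the
 * number itself if the lookup finds no entry. */
void user_name(uid_t uid, char *buf, size_t len){
  struct passwd *pwd = getpwuid(uid);

  if( pwd != NULL ){
    snprintf(buf, len, "%s", pwd->pw_name);
  }else{
    snprintf(buf, len, "%lu", (unsigned long) uid);
  }
}

/* Hypothetical helper: same idea for a group ID. */
void group_name(gid_t gid, char *buf, size_t len){
  struct group *grp = getgrgid(gid);

  if( grp != NULL ){
    snprintf(buf, len, "%s", grp->gr_name);
  }else{
    snprintf(buf, len, "%lu", (unsigned long) gid);
  }
}
```

Given a struct stat st, calling user_name(st.st_uid, buf, sizeof(buf)) and group_name(st.st_gid, buf, sizeof(buf)) yields printable strings in either case.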
Task 4: myls
Change into the myls directory. In there you will find skeleton code for the start of the myls command. The usage of the myls command is as follows:
#> myls
It takes no arguments, and will only list the contents of the current directory. Actually iterating through the contents of a directory is beyond the scope of this lab, and code for doing that is provided for you. Upon each iteration of the while loop, the entry structure will reference a different file/dir. You can retrieve the name of that file/dir with entry->d_name. Read the comments for more details.
Your myls must be able to complete the following tasks.
- It should list all the contents of the current working directory, from which myls is run. The code for iterating through the current directory is provided for you, so the task is parsing the stat() structures when called on each of the files/directories therein.
- The myls program should do a long list, like ls -l, which outputs the permission modes, name of file, username of the owner, groupname of the file, the size of the file, and the last modification time (st_mtime). You can use ctime() to print the time. Each item must be separated with tabs, i.e., "\t". Sample output is below, run from the test_dir in the myls directory.
#> ../myls
-rw-------	rand	aviv	scs	7331	Tue Feb 4 09:32:34 2015
-rw-r--r--	a.txt	aviv	scs	38	Tue Feb 4 09:34:26 2015
drwx--x--x	.	aviv	scs	4096	Tue Feb 4 09:34:45 2015
drwx--x--x	..	aviv	scs	4096	Tue Feb 4 09:33:21 2015
drwx--x--x	subdir	aviv	scs	4096	Tue Feb 4 09:32:59 2015
-rw-rw----	empty.txt	aviv	scs	0	Tue Feb 4 09:32:07 2015
Your output may look different, e.g., different user names, group names, and time values, but that is fine and to be expected. Also, don't worry about misalignment. As long as the output is tab separated, you're good. Here's how to use ctime() again:
ctime(&(st->st_mtime)); //returns a reference to a string like "Tue Feb 4 09:34:45 2015\n"
                        //Note it has a newline for free.
- You should check all error conditions from system calls and the like, and exit on error, reporting useful information. Use perror() liberally.
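The newline that ctime() appends is convenient at the end of a line but gets in the way if the time is printed anywhere else. As a small illustration, the hypothetical helper mtime_string() below copies the text into a caller-supplied buffer and trims the newline.

```c
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical helper: copy the modification time of path into buf as
 * text, without the trailing newline that ctime() appends.
 * Returns 0 on success, -1 on failure. */
int mtime_string(const char *path, char *buf, size_t len){
  struct stat st;
  char *text;

  if( stat(path, &st) < 0 ){
    perror("stat");
    return -1;
  }

  text = ctime(&st.st_mtime);   /* points into a static buffer */
  if( text == NULL ){
    return -1;
  }

  snprintf(buf, len, "%s", text);
  buf[strcspn(buf, "\n")] = '\0';   /* cut at the newline, if present */
  return 0;
}
```

Copying the result right away is a good habit, because ctime() reuses the same static buffer on every call.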
Part 4: Modifying File Access Times
The last part of the stat() output that we want to interpret and manipulate is the access, modification, and status change times. The relevant fields of the struct stat are below:
time_t st_atime; /* time of last access */
time_t st_mtime; /* time of last modification */
time_t st_ctime; /* time of last status change */
As usual, these time values are just large numbers (typically long values) that count the number of seconds since the epoch, Jan. 1st 1970. We are not allowed to set the status change time directly, since it is managed automatically by the operating system, but we can alter the modification and access times. We've done this already with the touch command, and there is an associated system call that can alter the modification time, like touch.
utimes()
The utimes() system call changes a file's last access and modification times. Here is the prototype from the man page:
#include <sys/types.h>
#include <sys/time.h>

int utimes(const char *filename, const struct timeval times[2]);
It takes an argument filename, which is the path to the file, and an array of struct timeval. The size of the array is 2, and times[0] is the new access time and times[1] is the new modification time. Note that struct timeval is a different type than the time_t data type we've been using for managing time stamps.
gettimeofday()
A struct timeval has the following fields:
struct timeval {
    long tv_sec;  /* seconds */
    long tv_usec; /* microseconds */
};
The tv_sec field is like a time_t, seconds since the epoch, and the timeval offers even finer precision: the tv_usec field holds the additional microseconds. To get the current timeval from the system clock, you use the gettimeofday() system call.
#include <sys/time.h>

int gettimeofday(struct timeval *tv, struct timezone *tz);
The gettimeofday() system call takes a pointer to a timeval and a timezone. It will set the current time at the memory referenced by those pointers. We don't really care about the timezone, so we'll generally call gettimeofday() like this:
struct timeval tv;
gettimeofday(&tv, NULL);
Putting it all together, here is a sample program to print the current time using ctime() and gettimeofday():
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <time.h>

int main(int argc, char * argv[]){
  struct timeval tv;

  gettimeofday(&tv, NULL);

  //ctime takes a pointer to the seconds since epoch
  printf("%s", ctime(&(tv.tv_sec)));

  return 0;
}
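The utimes() call and the two-element struct timeval array fit together as in the sketch below. To keep the result predictable it sets a fixed timestamp rather than the current time; set_times() is a hypothetical helper, and a real tool would fill the array from gettimeofday() instead.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <sys/types.h>

/* Hypothetical helper: set both the access and modification time of
 * path to secs seconds after the epoch. Returns the modification time
 * read back with stat(), or -1 on error. */
long set_times(const char *path, long secs){
  struct timeval times[2];
  struct stat st;

  times[0].tv_sec  = secs;   /* times[0] is the access time */
  times[0].tv_usec = 0;
  times[1].tv_sec  = secs;   /* times[1] is the modification time */
  times[1].tv_usec = 0;

  if( utimes(path, times) < 0 ){
    perror("utimes");
    return -1;
  }

  if( stat(path, &st) < 0 ){
    perror("stat");
    return -1;
  }

  return (long) st.st_mtime;
}
```

For example, set_times("some_file", 1000000000L) should leave ls -l showing a date in September 2001 for that file. Note that utimes() fails with ENOENT if the file does not exist; it never creates one.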
Task 5: mytouch
Change into the mytouch directory. In there you will find skeleton code in mytouch.c. The usage of mytouch is as follows:
#> mytouch path
Where path is the file path to the file to be touched. Your mytouch must be able to complete the following tasks:
- If a file exists, it should update the modification time of the file using utimes() and gettimeofday().
- It should output the modification time before and after the call to utimes(), using ctime(). Sample output is below:
#> ./mytouch mytouch.c
Last Modified: Tue Feb 4 08:59:44 2014
New Modified: Tue Feb 4 12:41:11 2014
- If the file does not exist, an error should be reported rather than creating a new file.
- All errors for system calls should be checked.
- (5 pts EXTRA CREDIT): If the file does not exist, create it if possible, and report an error if not possible.