IC221: Systems Programming (SP14)


Lab 10: Linking and File Seeking

1 Preliminaries

In this lab you will complete a set of C programs to expose you to file seeking, using both fseek() and lseek(). Additionally, you will use ln and ls to set and identify file links.

1.1 Lab Learning Goals

In this lab, you will learn the following topics and practice C programming skills.

  1. Using ln and identifying links
  2. Using lseek() and manipulating file descriptor offsets
  3. Using fseek() and manipulating FILE * pointer offsets

1.2 Lab Setup

Run the following command

~aviv/bin/ic221-up

Change into the lab directory

cd ~/ic221/labs/10

All the material you need to complete the lab can be found in the lab directory. All material you will submit, you should place within the lab directory. Throughout this lab, we refer to the lab directory, which you should interpret as the above path.

1.3 Submission Folder

For this lab, all submissions should be placed in the following folder:

~/ic221/labs/10/

This directory contains the sub-directories for this lab: examples, makelinks, cleanlinks, llines, and flines. In the examples directory you will find any source code in this lab document. All lab work should be done in the remaining directories.

  • Only source files found in the folder will be graded.
  • Do not change the names of any source files.

Finally, in the top level of the lab directory, you will find a README file. You must complete the README file, and include any additional details that might be needed to complete this lab.

1.4 Compiling your programs with clang and make

You are not required to provide your own Makefiles for this lab.

1.5 README

In the top level of the lab directory, you will find a README file. You must fill out the README file with your name and alpha. Please include a short summary of each of the tasks and any other information you want to provide to the instructor.

1.6 Test Script

You are provided a test script which prints pass/fail information for a set of tests for your programs. Note that passing all the tests does not mean you will receive a perfect score: other tests will be performed on your submission. To run the test script, execute test.sh from the lab directory.

./test.sh

You can comment out individual tests while working on different parts of the lab. Open up the test script and place comments at the bottom where appropriate.

1.7 Working versions for comparisons

Working versions of all the programs described in this lab document can be found here:

~aviv/lab10-bin/llines
~aviv/lab10-bin/flines

Run these programs as a comparison point.

2 PART 1: File Linking

In this part of the lab you will create hard and symbolic links to match a given specification, and you will also analyze existing links. Recall that links are created on the command line with the ln command.

2.1 Hard Links

ln path1 path2

This will create a new file at path2 that is hard linked with the file at path1.

#> touch f
#> ls -l
total 0
-rw-r----- 1 aviv scs 0 Mar 25 08:55 f
#> ln f hl
#> ls -l
total 0
-rw-r----- 2 aviv scs 0 Mar 25 08:55 f
-rw-r----- 2 aviv scs 0 Mar 25 08:55 hl

2.2 Symbolic Links

A symbolic link is created in a similar way, but with ln -s.

#> ln -s f sl
#> ls -l
total 0
-rw-r----- 2 aviv scs 0 Mar 25 08:55 f
-rw-r----- 2 aviv scs 0 Mar 25 08:55 hl
lrwxrwxrwx 1 aviv scs 1 Mar 25 08:55 sl -> f

2.3 Hard Links to Symbolic Links

And we can also mix linking by creating a hard link to a symbolic link, which is the same as duplicating the symbolic link:

#> ln sl g
#> ls -l
total 0
-rw-r----- 2 aviv scs 0 Mar 25 08:55 f
lrwxrwxrwx 2 aviv scs 1 Mar 25 08:55 g -> f
-rw-r----- 2 aviv scs 0 Mar 25 08:55 hl
lrwxrwxrwx 2 aviv scs 1 Mar 25 08:55 sl -> f

Finally, we can analyze all the linking by viewing the i-node information with ls -il:

#> ls -il
total 0
33426412 -rw-r----- 2 aviv scs 0 Mar 25 08:55 f
33426413 lrwxrwxrwx 2 aviv scs 1 Mar 25 08:55 g -> f
33426412 -rw-r----- 2 aviv scs 0 Mar 25 08:55 hl
33426413 lrwxrwxrwx 2 aviv scs 1 Mar 25 08:55 sl -> f

Both f and hl have the same i-node (33426412) and g and sl have the same i-node (33426413).

2.4 Relative vs. Absolute Symbolic Links

Symbolic links can be relative or absolute. We will use relative links for this lab. Recall that the following does not work:

#> echo "Hello World" > a
#> cat a
Hello World
#> mkdir dir1
#> ln -s a dir1/a
#> ls -l dir1
total 0
lrwxrwxrwx 1 aviv scs 1 Mar 25 09:03 a -> a

The link a -> a refers to itself, not the a in the top level directory. Trying to follow this link results in a loop and an error:

#> cat dir1/a 
cat: dir1/a: Too many levels of symbolic links

Instead, the relative path needs to be established:

#> ln -s ../a dir1/a
#> ls -l dir1
total 0
lrwxrwxrwx 1 aviv scs 4 Mar 25 09:07 a -> ../a

It is conceptually easier to always change into the directory where the link is to be created, which makes the relative linking more obvious:

#> cd dir1/
#> ln -s ../a b
#> ls -l
total 0
lrwxrwxrwx 1 aviv scs 4 Mar 25 09:07 a -> ../a
lrwxrwxrwx 1 aviv scs 4 Mar 25 09:12 b -> ../a

2.4.1 Task 1 Creating Linked Files

For this task, change into the makelinks directory, where you will find two bash script files:

  • clean.sh : execute this to reset the directory
  • setup.sh : place all your shell commands here to set up the proper linking.

You will add all bash commands to setup.sh, which will be graded. Running the script should generate a directory structure with the following linking:

#> ./setup.sh
#> ls -li *
33425506 -rw-r----- 4 aviv scs    0 Mar 25 09:23 a
33426379 -rw-r----- 1 aviv scs    0 Mar 25 09:23 b
33426380 -rw-r----- 1 aviv scs    0 Mar 25 09:23 c
33425522 -rwxr-x--- 1 aviv scs  192 Mar 24 18:29 clean.sh
33426384 -rw-r----- 1 aviv scs    0 Mar 25 09:23 d
33426385 -rw-r----- 1 aviv scs    0 Mar 25 09:23 e
33426421 lrwxrwxrwx 1 aviv scs    1 Mar 25 09:23 s -> a
33425528 -rwxr-x--- 1 aviv scs  201 Mar 24 18:29 setup.sh
33426427 lrwxrwxrwx 1 aviv scs    1 Mar 25 09:23 t -> s

dir1:
total 0
33425506 -rw-r----- 4 aviv scs 0 Mar 25 09:23 a

dir2:
total 0
33425506 -rw-r----- 4 aviv scs 0 Mar 25 09:23 a

dir3:
total 0
33425506 -rw-r----- 4 aviv scs 0 Mar 25 09:23 a
33426429 lrwxrwxrwx 1 aviv scs 4 Mar 25 09:23 c -> ../c
33426428 lrwxrwxrwx 1 aviv scs 4 Mar 25 09:23 d -> ../d
33426430 lrwxrwxrwx 1 aviv scs 4 Mar 25 09:23 gonavy -> ../t

Note that the i-node values you see will not be the same, but the structure (that is, which files are hard links of one another and which are symbolic links) must be the same. The whole directory can be reset with a call to clean.sh:

#> ./clean.sh 
#> ls -li *
33425506 -rwxr-x--- 1 aviv scs 192 Mar 25 09:24 clean.sh
33426379 -rwxr-x--- 1 aviv scs 201 Mar 25 09:24 setup.sh

2.4.2 Task 2: Identifying Linked Files

For this task, change into the cleanlinks directory, where you will find 2 scripts and 1 binary executable:

  • clean.sh : execute this to reset the directory
  • init : execute this to initialize the directory
  • setup.sh : place all your shell commands here to set up the proper linking.

You will add all bash commands to setup.sh, which will be graded. After running init, you will find the following setup:

#> ls -li *
33426408 -rw-r----- 6 aviv scs    8 Mar 25 09:27 a
33426408 -rw-r----- 6 aviv scs    8 Mar 25 09:27 b
33426408 -rw-r----- 6 aviv scs    8 Mar 25 09:27 c
33425530 -rwxr-x--- 1 aviv scs  332 Mar 25 08:49 clean.sh
33426408 -rw-r----- 6 aviv scs    8 Mar 25 09:27 d
33426409 lrwxrwxrwx 2 aviv scs    1 Mar 25 09:27 e -> d
33425578 -rwxr-x--- 1 aviv scs 8168 Mar 25 08:49 init
33426409 lrwxrwxrwx 2 aviv scs    1 Mar 25 09:27 k -> d
33425577 -rwxr-x--- 1 aviv scs  161 Mar 25 08:49 setup.sh

dir1:
total 4
33426408 -rw-r----- 6 aviv scs 8 Mar 25 09:27 f

dir2:
total 4
33426408 -rw-r----- 6 aviv scs 8 Mar 25 09:27 g

dir3:
total 12
33426410 -rw-r----- 1 aviv scs 10 Mar 25 09:27 h
33426421 -rw-r----- 2 aviv scs  6 Mar 25 09:27 i
33426421 -rw-r----- 2 aviv scs  6 Mar 25 09:27 j

Your task is to replace all hard links with symbolic links, where the linking file (the one that becomes the symbolic link) is the one whose name comes later in the alphabet. For example:

33426408 -rw-r----- 6 aviv scs    8 Mar 25 09:27 a
33426408 -rw-r----- 6 aviv scs    8 Mar 25 09:27 b

a and b are hard links, but since b comes after a alphabetically, b should become a symbolic link to a:

-rw-r----- 5 aviv scs    8 Mar 25 09:27 a
lrwxrwxrwx 1 aviv scs    1 Mar 25 09:29 b -> a

You must convert all links, including those in subdirectories, being mindful of relative paths. All commands to do so should be in setup.sh, so that executing setup.sh completes the task.

3 PART 2: File Seeking

In this part of the lab, you will write two programs that manipulate the file read/write offset, both for the low level file descriptor interface and the standard library FILE * interface.

3.1 REVIEW: File Descriptors vs. File Streams

Recall that there exist two primary ways of interacting with files in C on Unix. The first is the low-level file descriptor interface: we open a file with open(), which returns a file descriptor, and then read and write with read() and write(). All of these are system calls. For example:

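#include <fcntl.h>   // open() and the O_* flags
#include <string.h>  // strlen()
#include <unistd.h>  // write(), close()

int main(){
  char hello[] = "Hello World!\n";
  // O_CREAT creates the file if it does not already exist
  int fd = open("hello.txt", O_CREAT | O_WRONLY, 0644);
  if (fd < 0) return 1; // open failed

  write(fd, hello, strlen(hello));

  close(fd);

}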

There also exists a standard library interface for files, called file streams, which we also use. Here, we open a file with fopen() and read from and write to it using an assortment of standard library routines, such as fprintf(), fscanf(), fgetc(), and fputc(). Here is an example:

#include <stdio.h>  // fopen(), fprintf(), fclose()

int main(){
  FILE * stream = fopen("hello.txt", "w");

  fprintf(stream, "Hello World!\n");

  fclose(stream);
}

Depending on the task, you may choose to use file descriptors or file streams, but it is important to understand how to use both. Of course, under the surface, a file stream must interact with a file descriptor, since that is the standard interface for reading and writing files on Unix.

3.2 lseek() adjusting the offset for file descriptors

For file descriptors, the system call that adjusts the offset is lseek(), which has the following function prototype:

off_t lseek(int fd, off_t offset, int whence);

It takes a file descriptor fd, an offset of type off_t (a signed integer type, typically a long integer), and an integer whence. The best way to think about adjusting the offset is that you are moving the cursor within the file to a new position measured from whence; that is, the cursor is placed at the position named by whence plus offset. To help with this process, there are three constants you can use for whence:

  • SEEK_SET : The offset is set to offset bytes, that is, whence refers to the start of the file.
  • SEEK_CUR : The offset is set relative to the current location, that is, whence refers to the current location when adjusting the offset.
  • SEEK_END : The offset is set relative to the size of the file, that is, whence refers to the end of the file. You can seek beyond the end of a file, but it is not advisable for this lab.

The return value of lseek() is the new offset in the file, or the current location of the read/write cursor. Let's look at some example uses of lseek():

  • lseek(fd, 0, SEEK_SET) will reset the offset to the start of the file, since SEEK_SET is used for whence with an additive offset of 0.
  • lseek(fd, 10, SEEK_SET) will set the offset 10 bytes into the file.
  • lseek(fd, -10, SEEK_CUR) will move the offset 10 bytes back toward the start of the file from the current location.
  • lseek(fd, 0, SEEK_END) will move the offset to the end of the file.
  • lseek(fd, 0, SEEK_CUR) will not move the offset at all, but will return the current offset.

3.2.1 Task 3: llines

Change into the llines directory, where you will find the following files:

  • llines.c : source file you will complete
  • Makefile : the makefile
  • BeatArmy.txt : a text file for testing
  • GoNavy.txt : a text file for testing

Your goal is to complete the source for llines, a program that mixes head and tail and prints the bytes between a start offset and an end offset. You must use file descriptors and lseek(), as well as read() and write(), to complete this task.

The llines command has the following specification:

llines -s 10 -e 20 file : print the bytes between offset 10 and 20
llines -s 20 -e 10 file : ERROR: end offset cannot be less than start offset
llines -s 10 file : print bytes between offset 10 and the end of the file
llines -e 10 file : print the last 10 bytes of the file
llines file : print all bytes of file, like cat
llines -e -10 : ERROR: cannot have negative offsets

In the source code, the following structure is generated for you which contains the command line options.

typedef struct{
  int start; //start offset, -1 if not set
  int end; //end offset, -1 if not set
  int f_index; //ignore
} opt_t;

Here are some sample runs:

#> ./llines GoNavy.txt 
Beat Army!
#> ./llines -s 3 GoNavy.txt 
t Army!
#> ./llines -s 10 GoNavy.txt 

#> ./llines -s 3 -e 5 GoNavy.txt 
t #> ./llines -e 5 GoNavy.txt 
rmy!
#> ./llines -s 7 -e 5 GoNavy.txt 
ERROR: Invalid start and end (end < start)
#> ./llines 
ERROR: Require a file argument

3.3 fseek() with file streams

The standard library mechanism for adjusting the offset is fseek() which has a nearly identical function declaration:

int fseek(FILE *stream, long offset, int whence);

The stream is the file stream whose offset is adjusted, and offset is the new offset to set relative to whence. Again, SEEK_SET, SEEK_CUR, and SEEK_END are the constants that can be used for whence. Unlike lseek(), fseek() returns 0 on success rather than the new offset. There are two additional useful functions:

long ftell(FILE *stream);

void rewind(FILE *stream);

ftell() returns the current offset in the file, while rewind() will set the offset to the start of the file.

3.3.1 Task 4: flines

Change into the flines directory, where you will find the following files:

  • flines.c : source file you will complete
  • Makefile : the makefile
  • BeatArmy.txt : a text file for testing
  • GoNavy.txt : a text file for testing

Your goal is to complete the source for flines, a program that mixes head and tail and prints the bytes between a start offset and an end offset. You must use file streams and fseek(), as well as the appropriate file stream reading and writing functions, to complete this task. The specification for flines is otherwise the same as for llines. See above for details and sample output.