IC221: Systems Programming (SP14)


Home Policy Calendar Syllabus Resources Piazza

Lab 8: Pipeline Unrolling

Table of Contents

1 Preliminaries

In this lab you will complete a set of C programs that will expose you to the set of system calls that are used in shell pipelines, such as process grouping, pipes, and filed descriptor duplication. There are 2 tasks, and you will likely complete neither of the tasks in lab, You will need to finish the remaining asks outside of the lab period.

1.1 Lab Learning Goals

In this lab, you will learn the following topics and practice C programming skills.

  1. Unroll a pipeline with all children in the same process group
  2. Use the setpgid() and getpgid() system calls
  3. Basic pipeline parsing with strtok()
  4. Inter process communication via pipe() and dup()
  5. Setting up a pre-parsed pipeline with inter process communcation
  6. Use of preprocessor and #if for compilation

1.2 Lab Setup

Run the following command

~aviv/bin/ic221-up

Change into the lab directory

cd ~/ic221/labs/08

All the material you need to complete the lab can be found in the lab directory. All material you will submit, you should place within the lab directory. Throughout this lab, we refer to the lab directory, which you should interpret as the above path.

1.3 Submission Folder

For this lab, all ubmission should be placed in the following folder:

~/ic221/labs/08/

This directory contains 4 sub-directories; examples, timer, term-status, and mini-sh. In the examples directory you will find any source code in this lab document. All lab work should be done in the remaining directories.

  • Only source files found in the folder will be graded.
  • Do not change the names of any source files

Finally, in the top level of the lab directory, you will find a README file. You must complete the README file, and include any additional details that might be needed to complete this lab.

1.4 Compiling your programs with clang and make

You are not required to provide your own Makefiles for this lab.

1.5 README

In the top level of the lab directory, you will find a README file. You must fill out the README file with your name and alpha. Please include a short summary of each of the tasks and any other information you want to provide to the instructor.

1.6 Test Script

You are provided a test script which prints pass/fail information for a set of tests for your programs. Note that passing all the tests does not mean you will receive a perfect score: other tests will be performed on your submission. To run the test script, execute test.sh from the lab directory.

./test.sh

You can comment out individual tests while working on different parts of the lab. Open up the test script and place comments at the bottom where appropriate.

1.7 Working versions for comparisons

Working versions of all the programs described in this lab document can be found here:

~aviv/lab8-bin/unroll
~aviv/lab8-bin/babypipe

Run these programs as a comparison point.

2 PART 1: The Sleeping Pipeline

In the previous lessons, we discussed the following pipeline in detail:

sleep 10 | sleep 20 | sleep 30 | sleep 50

We identified that the entire pipeline runs as single job in the shell, and each part of the pipeline is a seperate process running in parallel within a process group. The process group is important for terminal signaling, and foreground process are identified based on process group.

In this part of the lab, you implement a standard pipeline unrolling, which is the process of parsing a pipeline based on | and forking each of the children in their own process group. The unrolling code will only concern itself with the individual forks, and not inter-process communication; that's the next part.

2.1 Task 1 sleep-unroll

For this task, change into the unroll directory in the lab directory.In there, you will find the source file sleep-unroll.c, which you will complete. Your sleep-unroll program most meet the following specification:

  1. It must fork each part of the pipeline as its own process with proper arguments.
  2. All process in the pipeline must be in its own process group identified by the pid of the first fork child process. This process group should be different than the parent's process group.
  3. Your program must be able to execute a set of sleep commands, at a mininum, but should be general enough to execute other commands.

A general outline of the algorithm you need to complete is provided in the source file. Parsing code for tokenizing a string based on "|" is provided, but you must complete the parsing of individual commands youself. We recomend you use a do-while loop, which will simplifie some process, but you may use other strategies if you like. Also, we recommend you take advantage of the 0 argument options to setpgid(), which can help simplify logic.

Below, is some sample runs of the program to compare against. I use the command line tool time to demonstrate how long a pipeline should take, and notice that the pipeline is specified as the first argument to sleep-unroll and must be quote designated:

#> ./sleep-unroll "sleep 1"
#> time ./sleep-unroll "sleep 1"

real	0m1.010s
user	0m0.003s
sys	0m0.005s
#> time ./sleep-unroll "sleep 1 | sleep 3"

real	0m3.011s
user	0m0.004s
sys	0m0.006s
#> time ./sleep-unroll "sleep 1 | sleep 3 | sleep 2"

real	0m3.012s
user	0m0.006s
sys	0m0.009s
#> time ./sleep-unroll "sleep 1 | BAD | sleep 2"
exec: No such file or directory

real	0m2.012s
user	0m0.004s
sys	0m0.008s
#> time ./sleep-unroll "head -c 5 /dev/zero"

real	0m0.013s
user	0m0.002s
sys	0m0.005s
#> ./sleep-unroll "sleep 200 | sleep 300 | sleep 400" &
[1] 17012
#> ps -o pid,pgid,args
  PID  PGID ARGS
17012 17012 ./sleep-unroll sleep 200 
17016 17016 sleep 200
17017 17016 sleep 300
17018 17016 sleep 400
(...)
#> fg
./sleep-unroll "sleep 200 | sleep 300 | sleep 400"
^C
#> ps -o pid,pgid,args
  PID  PGID ARGS
17016 17016 sleep 200
17017 17016 sleep 300
17018 17016 sleep 400
(...)
#> killall sleep
#> ps -o pid,pgid,args
  PID  PGID ARGS
(... NO sleep running ...)

3 PART 2: babypipe

In this part of the lab, you will focus on the inter-process communication of the pipeline by implementing a pipeline of commands. This part is not focused on parsing the pipeline, and, instead, you are provided with three pre-parsed pipelines to work with that are defined under pre-compiler directives.

To complete this task, you will need to think about how to unroll a pipeline such that there exists a pipe between each of the processes. This requires one call a sequence of calls to pipe() to and calls to dup2() for duplicating the write/read end of the pipes to stdin/stdout of each of the processes appropriately. This will require careful attention to the start, end and middle of the pipeline.

Here is some sample code demonstrating a one step pipeline, child to parent, with an exec. You should be able to adapt this code to complete the lab:

#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>

int main(){

  // cat sample-db | grep MD
  char * cmd1[] = {"cat", "sample-db.csv", NULL};
  char * cmd2[] = {"grep", "MD", NULL};

  int pipe_fd[2];
  int status;

  pid_t cpid1,cpid2;

  //open the pipe in parent
  if(pipe(pipe_fd) < 0){
    perror("pipe");
    _exit(1);
  }

  cpid1 = fork();
  if( cpid1 ==  0){
    /* CHILD 1*/

    //redirect output to the pipe
    close(pipe_fd[0]); //close read end of pipe    
    close(1);//close stdout
    dup2(pipe_fd[1], 1); //duplicate pipe to stdout

    //execute the comand
    execvp(cmd1[0], cmd1);
    perror("exec");
    _exit(1);
  }else if(cpid1 < 0){
    /* ERROR */
    perror("fork");
    _exit(1);
  }

  cpid2 = fork();
  if(cpid2 == 0){
    /* CHILD 2*/

    //redirect input from the pipe
    close(pipe_fd[1]); //close write end of pipe
    close(0); //close stdin
    dup2(pipe_fd[0],0); //duplicate pipe to stdin

    //execute the comand
    execvp(cmd2[0], cmd2);
    perror("exec");
    _exit(1);
  }else if(cpid2 < 0){
    /* ERROR */
    perror("fork");
    _exit(1);
  }

  /* PARENT */

  //widow pipe by closing write end of pipe
  close(pipe_fd[1]);

  //wait on all children
  while(wait(&status) > 0 );

  //sucess
  return 0;
}

The program above generate a one-step pipe line for the command:

cat sample-db | grep MD

And if you execute it, you get the desired result:

#>./one-step 
Kris,Marrier,King, Christopher A Esq,228 Runamuck Pl #2808,Baltimore,Baltimore City,MD,21224,410-655-8723,410-804-4694,kris@gmail.com,http://www.kingchristopheraesq.com
Ezekiel,Chui,Sider, Donald C Esq,2 Cedar Ave #84,Easton,Talbot,MD,21601,410-669-1642,410-235-8738,ezekiel@chui.com,http://www.siderdonaldcesq.com
Ilene,Eroman,Robinson, William J Esq,2853 S Central Expy,Glen Burnie,Anne Arundel,MD,21061,410-914-9018,410-937-4543,ilene.eroman@hotmail.com,http://www.robinsonwilliamjesq.com
Fernanda,Jillson,Shank, Edward L Esq,60480 Old Us Highway 51,Preston,Caroline,MD,21655,410-387-5260,410-724-6472,fjillson@aol.com,http://www.shankedwardlesq.com
Sylvia,Cousey,Berg, Charles E,287 Youngstown Warren Rd,Hampstead,Carroll,MD,21074,410-209-9545,410-863-8263,sylvia_cousey@cousey.org,http://www.bergcharlese.com
Loreta,Timenez,Kaminski, Katherine Andritsaki,47857 Coney Island Ave,Clinton,Prince Georges,MD,20735,301-696-6420,301-392-6698,loreta.timenez@hotmail.com,http://www.kaminskikatherineandritsaki.com
Kaitlyn,Ogg,Garrison, Paul E Esq,2 S Biscayne Blvd,Baltimore,Baltimore City,MD,21230,410-665-4903,410-773-3862,kaitlyn.ogg@gmail.com,http://www.garrisonpauleesq.com
Izetta,Dewar,Lisatoni, Jean Esq,2 W Scyene Rd #3,Baltimore,Baltimore City,MD,21217,410-473-1708,410-522-7621,idewar@dewar.com,http://www.lisatonijeanesq.com

The key to properly setting up a pipe between process is widow'ing the pipe. This is where one end of the pipe is closed, normally the end that is not in use. For example:

//redirect input from the pipe
 close(pipe_fd[1]); //close write end of pipe
 close(0); //close stdin
 dup2(pipe_fd[0],0); //duplicate pipe to stdin

Here, the write end of the pipe is closed in the second child since it is not in use. This makes the read end of the pipe a widow in the second child, which means, once the write end of the pipe is closed in the first child, EOF is generated. All instances of the pipe must be widowed, this includes the copy that exists in the parent, and if the pipe isn't widowed, the entire pipeline will hang.

3.1 Task 2 babypipe

For this task, change into the pipedup directory in the lab directory. In there, you will find the source file babypipe.c, which you will complete. Your program must meet the following specification:

  1. It must use pipe() and dup2() to set up the pipeline for inter-process communication between the child process.
  2. All process must run in parallel, but you are not required to have all process run in a seperate group.
  3. The first process in the pipeline must leave the stdin file descriptor unaltered, and the last process in the pipeline must leave the stdout file descriptor unaltered.

To assist in development, we have provided four pre-parsed pipelines that can be used with different pre-compiler directives. To compile each of the pipeline choices you can use:

  • make 0 compile with the following pipeline:
    cat /etc/passwd | cut -d : -f 7 | sort | uniq 
    
  • make 1 compile with the following pipeline:
    cat sample-db.csv | cut -d , -f 8 | sort | uniq | wc -l
    
  • make 2 compile with the following pipeline:
    cat sample-db.csv | cut -d , -f 10 | cut -d - -f 1 | sort | uniq |wc -l
    
  • make 3 compile with the following pipeline:
    cat | cut -d , -f 10 | cut -d - -f 1 | sort | uniq |wc -l
    

Here are sample compilation and executions of the program to compare against:

#> ./babypipe
/bin/sh
/usr/bin/false
/bin/nologin
/bin/bash
/bin/csh
#> make 1
clang -g -Wall -DPIPE=1 babypipe.c -o babypipe
#> ./babypipe
      32
#> make 2
clang -g -Wall -DPIPE=2 babypipe.c -o babypipe
#> ./babypipe
      87
#> make 3
clang -g -Wall -DPIPE=3 babypipe.c -o babypipe
#> cat sample-db.csv | ./babypipe
      87
#> make clean