Lab 8: Pipeline Unrolling
1 Preliminaries
In this lab you will complete a set of C programs that will expose you to the system calls used in shell pipelines, such as process grouping, pipes, and file descriptor duplication. There are 2 tasks, and you will likely not finish them during the lab period. You will need to finish the remaining tasks outside of the lab period.
1.1 Lab Learning Goals
In this lab, you will learn the following topics and practice C programming skills.
- Unroll a pipeline with all children in the same process group
- Use the setpgid() and getpgid() system calls
- Basic pipeline parsing with strtok()
- Inter-process communication via pipe() and dup()
- Setting up a pre-parsed pipeline with inter-process communication
- Use of the preprocessor and #if for compilation
1.2 Lab Setup
Run the following command
~aviv/bin/ic221-up
Change into the lab directory
cd ~/ic221/labs/08
All the material you need to complete the lab can be found in the lab directory. All material you will submit, you should place within the lab directory. Throughout this lab, we refer to the lab directory, which you should interpret as the above path.
1.3 Submission Folder
For this lab, all submissions should be placed in the following folder:
~/ic221/labs/08/
This directory contains 4 sub-directories: examples, timer, term-status, and mini-sh. In the examples directory you will find any source code in this lab document. All lab work should be done in the remaining directories.
- Only source files found in the folder will be graded.
- Do not change the names of any source files
Finally, in the top level of the lab directory, you will find a README file. You must complete the README file, and include any additional details that might be needed to complete this lab.
1.4 Compiling your programs with clang and make
You are not required to provide your own Makefiles for this lab.
1.5 README
In the top level of the lab directory, you will find a README file.
You must fill out the README file with your name and alpha.
Please include a short summary of each of the tasks and any other
information you want to provide to the instructor.
1.6 Test Script
You are provided a test script which prints pass/fail information
for a set of tests for your programs. Note that passing all the
tests does not mean you will receive a perfect score: other tests
will be performed on your submission. To run the test script,
execute test.sh from the lab directory.
./test.sh
You can comment out individual tests while working on different parts of the lab. Open up the test script and place comments at the bottom where appropriate.
1.7 Working versions for comparisons
Working versions of all the programs described in this lab document can be found here:
~aviv/lab8-bin/unroll
~aviv/lab8-bin/babypipe
Run these programs as a comparison point.
2 PART 1: The Sleeping Pipeline
In the previous lessons, we discussed the following pipeline in detail:
sleep 10 | sleep 20 | sleep 30 | sleep 50
We identified that the entire pipeline runs as a single job in the shell, and each part of the pipeline is a separate process running in parallel within one process group. The process group is important for terminal signaling, and foreground processes are identified based on their process group.
In this part of the lab, you will implement a standard pipeline unrolling, which is the process of parsing a pipeline based on | and forking each of the children into a process group of their own, separate from the parent's. The unrolling code will only concern itself with the individual forks, and not inter-process communication; that's the next part.
2.1 Task 1 sleep-unroll
For this task, change into the unroll directory in the lab directory. In there, you will find the source file sleep-unroll.c, which you will complete. Your sleep-unroll program must meet the following specification:
- It must fork each part of the pipeline as its own process with proper arguments.
- All processes in the pipeline must be in the same process group, identified by the pid of the first forked child process. This process group should be different from the parent's process group.
- Your program must be able to execute a set of sleep commands, at a minimum, but should be general enough to execute other commands.
A general outline of the algorithm you need to complete is provided in the source file. Parsing code for tokenizing a string based on "|" is provided, but you must complete the parsing of individual commands yourself. We recommend you use a do-while loop, which will simplify some of the processing, but you may use other strategies if you like. Also, we recommend you take advantage of passing 0 as an argument to setpgid(), which can help simplify the logic: a pid of 0 refers to the calling process, and a pgid of 0 sets the group id to that process's own pid.
Below are some sample runs of the program to compare against. I use the command line tool time to demonstrate how long a pipeline should take. Notice that the pipeline is specified as the first argument to sleep-unroll and must be enclosed in quotes:
#> ./sleep-unroll "sleep 1"
#> time ./sleep-unroll "sleep 1"
real 0m1.010s
user 0m0.003s
sys 0m0.005s
#> time ./sleep-unroll "sleep 1 | sleep 3"
real 0m3.011s
user 0m0.004s
sys 0m0.006s
#> time ./sleep-unroll "sleep 1 | sleep 3 | sleep 2"
real 0m3.012s
user 0m0.006s
sys 0m0.009s
#> time ./sleep-unroll "sleep 1 | BAD | sleep 2"
exec: No such file or directory
real 0m2.012s
user 0m0.004s
sys 0m0.008s
#> time ./sleep-unroll "head -c 5 /dev/zero"
real 0m0.013s
user 0m0.002s
sys 0m0.005s
#> ./sleep-unroll "sleep 200 | sleep 300 | sleep 400" &
[1] 17012
#> ps -o pid,pgid,args
PID PGID ARGS
17012 17012 ./sleep-unroll sleep 200
17016 17016 sleep 200
17017 17016 sleep 300
17018 17016 sleep 400
(...)
#> fg
./sleep-unroll "sleep 200 | sleep 300 | sleep 400"
^C
#> ps -o pid,pgid,args
PID PGID ARGS
17016 17016 sleep 200
17017 17016 sleep 300
17018 17016 sleep 400
(...)
#> killall sleep
#> ps -o pid,pgid,args
PID PGID ARGS
(... NO sleep running ...)
3 PART 2: babypipe
In this part of the lab, you will focus on the inter-process communication of the pipeline by implementing a pipeline of commands. This part is not focused on parsing the pipeline; instead, you are provided with four pre-parsed pipelines to work with, which are selected with preprocessor directives.
To complete this task, you will need to think about how to unroll a pipeline such that there exists a pipe between each pair of adjacent processes. This requires a sequence of calls to pipe(), and calls to dup2() to duplicate the write end of a pipe onto stdout and the read end onto stdin of the appropriate processes. This will require careful attention to the start, middle, and end of the pipeline.
Here is some sample code demonstrating a one-step pipeline between two child processes, each of which calls exec. You should be able to adapt this code to complete the lab:
#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(){

  // cat sample-db.csv | grep MD
  char * cmd1[] = {"cat", "sample-db.csv", NULL};
  char * cmd2[] = {"grep", "MD", NULL};

  int pipe_fd[2];
  int status;
  pid_t cpid1,cpid2;

  //open the pipe in parent
  if(pipe(pipe_fd) < 0){
    perror("pipe");
    _exit(1);
  }

  cpid1 = fork();
  if( cpid1 == 0){
    /* CHILD 1*/

    //redirect output to the pipe
    close(pipe_fd[0]); //close read end of pipe
    close(1); //close stdout
    dup2(pipe_fd[1], 1); //duplicate pipe to stdout

    //execute the command
    execvp(cmd1[0], cmd1);
    perror("exec");
    _exit(1);

  }else if(cpid1 < 0){
    /* ERROR */
    perror("fork");
    _exit(1);
  }

  cpid2 = fork();
  if(cpid2 == 0){
    /* CHILD 2*/

    //redirect input from the pipe
    close(pipe_fd[1]); //close write end of pipe
    close(0); //close stdin
    dup2(pipe_fd[0],0); //duplicate pipe to stdin

    //execute the command
    execvp(cmd2[0], cmd2);
    perror("exec");
    _exit(1);

  }else if(cpid2 < 0){
    /* ERROR */
    perror("fork");
    _exit(1);
  }

  /* PARENT */

  //widow pipe by closing write end of pipe
  close(pipe_fd[1]);

  //wait on all children
  while(wait(&status) > 0 );

  //success
  return 0;
}
The program above generates a one-step pipeline for the command:
cat sample-db.csv | grep MD
And if you execute it, you get the desired result:
#> ./one-step
Kris,Marrier,King, Christopher A Esq,228 Runamuck Pl #2808,Baltimore,Baltimore City,MD,21224,410-655-8723,410-804-4694,kris@gmail.com,http://www.kingchristopheraesq.com
Ezekiel,Chui,Sider, Donald C Esq,2 Cedar Ave #84,Easton,Talbot,MD,21601,410-669-1642,410-235-8738,ezekiel@chui.com,http://www.siderdonaldcesq.com
Ilene,Eroman,Robinson, William J Esq,2853 S Central Expy,Glen Burnie,Anne Arundel,MD,21061,410-914-9018,410-937-4543,ilene.eroman@hotmail.com,http://www.robinsonwilliamjesq.com
Fernanda,Jillson,Shank, Edward L Esq,60480 Old Us Highway 51,Preston,Caroline,MD,21655,410-387-5260,410-724-6472,fjillson@aol.com,http://www.shankedwardlesq.com
Sylvia,Cousey,Berg, Charles E,287 Youngstown Warren Rd,Hampstead,Carroll,MD,21074,410-209-9545,410-863-8263,sylvia_cousey@cousey.org,http://www.bergcharlese.com
Loreta,Timenez,Kaminski, Katherine Andritsaki,47857 Coney Island Ave,Clinton,Prince Georges,MD,20735,301-696-6420,301-392-6698,loreta.timenez@hotmail.com,http://www.kaminskikatherineandritsaki.com
Kaitlyn,Ogg,Garrison, Paul E Esq,2 S Biscayne Blvd,Baltimore,Baltimore City,MD,21230,410-665-4903,410-773-3862,kaitlyn.ogg@gmail.com,http://www.garrisonpauleesq.com
Izetta,Dewar,Lisatoni, Jean Esq,2 W Scyene Rd #3,Baltimore,Baltimore City,MD,21217,410-473-1708,410-522-7621,idewar@dewar.com,http://www.lisatonijeanesq.com
The key to properly setting up a pipe between processes is widowing the pipe. This is where one end of the pipe is closed, normally the end that is not in use. For example:
//redirect input from the pipe
close(pipe_fd[1]); //close write end of pipe
close(0); //close stdin
dup2(pipe_fd[0],0); //duplicate pipe to stdin
Here, the write end of the pipe is closed in the second child since it is not in use. This makes the read end of the pipe a widow in the second child, which means that once the write end is also closed in the first child (for example, when cat exits), a read from the pipe returns EOF. Every copy of the pipe must be widowed, including the copy that exists in the parent: the reader sees EOF only after every copy of the write end, in every process, has been closed. If any copy stays open, the reader blocks forever and the entire pipeline hangs.
3.1 Task 2 babypipe
For this task, change into the pipedup directory in the lab directory. In there, you will find the source file babypipe.c, which you will complete. Your program must meet the following specification:
- It must use pipe() and dup2() to set up the pipeline for inter-process communication between the child processes.
- All processes must run in parallel, but you are not required to place them in a separate process group.
- The first process in the pipeline must leave the stdin file descriptor unaltered, and the last process in the pipeline must leave the stdout file descriptor unaltered.
To assist in development, we have provided four pre-parsed pipelines that are selected with different preprocessor directives. To compile each of the pipeline choices you can use:
make 0
compile with the following pipeline: cat /etc/passwd | cut -d : -f 7 | sort | uniq
make 1
compile with the following pipeline: cat sample-db.csv | cut -d , -f 8 | sort | uniq | wc -l
make 2
compile with the following pipeline: cat sample-db.csv | cut -d , -f 10 | cut -d - -f 1 | sort | uniq | wc -l
make 3
compile with the following pipeline: cat | cut -d , -f 10 | cut -d - -f 1 | sort | uniq | wc -l
Here are sample compilation and executions of the program to compare against:
#> ./babypipe
/bin/sh
/usr/bin/false
/bin/nologin
/bin/bash
/bin/csh
#> make 1
clang -g -Wall -DPIPE=1 babypipe.c -o babypipe
#> ./babypipe
32
#> make 2
clang -g -Wall -DPIPE=2 babypipe.c -o babypipe
#> ./babypipe
87
#> make 3
clang -g -Wall -DPIPE=3 babypipe.c -o babypipe
#> cat sample-db.csv | ./babypipe
87
#> make clean