IC221: Systems Programming (SP15)


Lec. 14: exec()/fork()/wait() cycles for Process Management

1 Executing a Program

In the last lesson, we briefly discussed how a program loads into a process. We continue that discussion now by overviewing the exec family of system calls.

Recall that an exec call will load a new program into the process and replace the currently running program with the one specified. The examples in this section use exec to run the ls -l command in the current directory.

There are three main versions of exec which we will focus on:

  • execv(const char * path, char * const argv[]) : given the path to the program and an argument array, load and execute the program
  • execvp(const char * file, char * const argv[]) : given the file name of the program and an argument array, search the directories listed in the environment variable PATH for the file and execute the program
  • execvpe(const char * file, char * const argv[], char * const envp[]) : given a file name, an argument array, and an array of environment settings, search the PATH for the program named file and execute it with the given arguments and environment (this one is a GNU extension)

Each version of exec provides slightly different functionality. For this discussion, we will focus on execv and execvp; we will discuss execvpe later in the semester.

1.1 Using execv and execvp

The primary difference between execv and execvp is that with execv you have to provide the full path to the binary file (i.e., the program). The argv[] array is the same otherwise. For example:

#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>

int main(int argc, char * argv[]){

  //argv array for: /bin/ls -l
  char * ls_args[] = { "/bin/ls" , "-l", NULL};
  //                                  ^
  //       all argv arrays must be ___| 
  //       NULL terminated       

  //execute the program
  execv(   ls_args[0],     ls_args);
  //           ^              ^
  //           |              |
  // Name of program        argv array
  // is ls_args[0]          for ls_args


  //only get here on error
  perror("execv");
  return 2;
}
aviv@saddleback: demo $ ./execv_ls-l 
total 120
-rwxr-x--- 1 aviv scs  9890 Feb 24 14:13 exec_other
-rw-r----- 1 aviv scs   151 Feb 24 11:43 exec_other.c
-rwxr-x--- 1 aviv scs  9977 Feb 24 14:13 execv_ls-l
-rw-r----- 1 aviv scs   559 Feb 24 11:42 execv_ls-l.c
-rwxr-x--- 1 aviv scs  9979 Feb 24 14:13 execvp_ls-l
-rw-r----- 1 aviv scs   360 Feb 24 11:59 execvp_ls-l.c
-rw-r----- 1 aviv scs   559 Feb 24 11:58 execvp_ls-l.c~
-rwxr-x--- 1 aviv scs 10023 Feb 24 14:13 first_fork
-rw-r----- 1 aviv scs   532 Feb 23 08:06 first_fork.c
-rwxr-x--- 1 aviv scs 10345 Feb 24 14:13 fork_exec_wait
-rw-r----- 1 aviv scs  1158 Feb 23 08:06 fork_exec_wait.c
-rwxr-x--- 1 aviv scs 10278 Feb 24 14:13 get_exitstatus
-rw-r----- 1 aviv scs  1379 Feb 23 08:06 get_exitstatus.c
-rwxr-x--- 1 aviv scs  9985 Feb 24 14:13 get_pid_ppid
-rw-r----- 1 aviv scs   294 Feb 23 08:06 get_pid_ppid.c
-rw-r----- 1 aviv scs    99 Feb 23 08:06 Makefile

With execvp, you do not need to specify the full path because execvp will search the directories listed in the environment variable PATH for the executable. Recall that this is how the shell command which works:

aviv@saddleback: demo $ which ls
/bin/ls

The which command looks for the named command in each directory along the path:

aviv@saddleback: demo $ echo $PATH
/home/scs/aviv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games

In the case of ls, it is found in /bin. execvp performs this lookup for you, which simplifies the code somewhat:

aviv@saddleback: demo $ cat execvp_ls-l.c
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>

int main(int argc, char * argv[]){

  //argv array for: ls -l
  char * ls_args[] = { "ls" , "-l", NULL};
  //                    ^ 
  //  use the name ls
  //  rather than the
  //  path to /bin/ls
  execvp(   ls_args[0],     ls_args);


  //only get here on error
  perror("execvp");
  return 2;
}
aviv@saddleback: demo $ ./execvp_ls-l 
total 120
-rwxr-x--- 1 aviv scs  9890 Feb 24 14:13 exec_other
-rw-r----- 1 aviv scs   151 Feb 24 11:43 exec_other.c
-rwxr-x--- 1 aviv scs  9977 Feb 24 14:13 execv_ls-l
-rw-r----- 1 aviv scs   559 Feb 24 11:42 execv_ls-l.c
-rwxr-x--- 1 aviv scs  9979 Feb 24 14:13 execvp_ls-l
-rw-r----- 1 aviv scs   360 Feb 24 11:59 execvp_ls-l.c
-rw-r----- 1 aviv scs   559 Feb 24 11:58 execvp_ls-l.c~
-rwxr-x--- 1 aviv scs 10023 Feb 24 14:13 first_fork
-rw-r----- 1 aviv scs   532 Feb 23 08:06 first_fork.c
-rwxr-x--- 1 aviv scs 10345 Feb 24 14:13 fork_exec_wait
-rw-r----- 1 aviv scs  1158 Feb 23 08:06 fork_exec_wait.c
-rwxr-x--- 1 aviv scs 10278 Feb 24 14:13 get_exitstatus
-rw-r----- 1 aviv scs  1379 Feb 23 08:06 get_exitstatus.c
-rwxr-x--- 1 aviv scs  9985 Feb 24 14:13 get_pid_ppid
-rw-r----- 1 aviv scs   294 Feb 23 08:06 get_pid_ppid.c
-rw-r----- 1 aviv scs    99 Feb 23 08:06 Makefile

You might be wondering: why use execv at all when you have execvp? There are a few good reasons, but the most relevant is security. The user can change PATH and so control which program is found during lookup. For example, suppose there is another program called ls earlier along the path, and that program removes the whole file system. execvp would run the wrong ls … and boom. execv forces the issue by requiring the full path to the executable.

1.2 The argv[] argument to execv and execvp

The last item to consider in the exec calls is the argv array. This is the same as the argv array argument to main; essentially, when you call exec you are calling that program's main function.

Just like in main, the argv array must be NULL terminated. So when we do this:

char * ls_args[] = { "ls" , "-l", NULL};

We are setting up the argv array like so:

            .-----.
ls_args ->  |  .--+--> "ls"
            |-----|
            |  .--+--> "-l"
            |-----|
            |  .--+--> NULL
            '-----'

Because the argv array for exec is the same as for main, it becomes quite trivial to write a program that just executes another program as specified on the command line. For example:

aviv@saddleback: demo $ ./exec_other ls -l
total 120
-rwxr-x--- 1 aviv scs  9890 Feb 24 14:13 exec_other
-rw-r----- 1 aviv scs   151 Feb 24 11:43 exec_other.c
-rwxr-x--- 1 aviv scs  9977 Feb 24 14:13 execv_ls-l
-rw-r----- 1 aviv scs   559 Feb 24 11:42 execv_ls-l.c
-rwxr-x--- 1 aviv scs  9979 Feb 24 14:13 execvp_ls-l
-rw-r----- 1 aviv scs   360 Feb 24 11:59 execvp_ls-l.c
-rw-r----- 1 aviv scs   559 Feb 24 11:58 execvp_ls-l.c~
-rwxr-x--- 1 aviv scs 10023 Feb 24 14:13 first_fork
-rw-r----- 1 aviv scs   532 Feb 23 08:06 first_fork.c
-rwxr-x--- 1 aviv scs 10345 Feb 24 14:13 fork_exec_wait
-rw-r----- 1 aviv scs  1158 Feb 23 08:06 fork_exec_wait.c
-rwxr-x--- 1 aviv scs 10278 Feb 24 14:13 get_exitstatus
-rw-r----- 1 aviv scs  1379 Feb 23 08:06 get_exitstatus.c
-rwxr-x--- 1 aviv scs  9985 Feb 24 14:13 get_pid_ppid
-rw-r----- 1 aviv scs   294 Feb 23 08:06 get_pid_ppid.c
-rw-r----- 1 aviv scs    99 Feb 23 08:06 Makefile
aviv@saddleback: demo $ ./exec_other cat exec_other.c 
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>

int main(int argc, char * argv[]){

  execvp( (argv+1)[0], argv+1);
  //only get here on error
  perror("execvp");
  return 2;
}

As you can see in the program (which used cat to output itself!), we are using pointer manipulation to set up the argv array. At the start, argv looks like this:

         .-----.
argv ->  |  .--+--> "./exec_other"
         |-----|
         |  .--+--> "ls"
         |-----|
         |  .--+--> "-l"
         |-----|
         |  .--+--> NULL
         '-----'

After pointer manipulation:

           .-----.
           |  .--+--> "./exec_other"
           |-----|
argv+1 ->  |  .--+--> "ls"
           |-----|
           |  .--+--> "-l"
           |-----|
           |  .--+--> NULL
           '-----'

Which is a valid argv array for executing ls.

2 Creating a new Process

So far, we've only loaded programs and executed them in an already running process. This does not create a new process; for that we need a new system call. The fork() system call duplicates the calling process, creating a new process with a new process identifier.

2.1 fork()

With the exception of two O.S. processes, the kernel and the init process, all processes are spawned from another process. The procedure of creating a new process is called forking: an exact copy of the process, including its memory values and open resources, is produced. The original process that forked is called the parent, while the newly created duplicate process is called the child. Let's look at an example of a process forking using the fork() system call:

#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

int main(){

  pid_t c_pid;


  c_pid = fork(); //duplicate
  if( c_pid == 0 ){
    //child: The return of fork() is zero
    printf("Child: I'm the child: %d\n", c_pid);

  }else if (c_pid > 0){
    //parent: The return of fork() is the process id of the child

    printf("Parent: I'm the parent: %d\n", c_pid);

  }else{
    //error: The return of fork() is negative

    perror("fork failed");
    _exit(2); //exit failure, hard

  }

  return 0; //success
}

The fork() system call is unlike any other function call you've seen so far. It returns twice, once in the parent and once in the child, and it returns different values in the parent and the child.

To follow the logic, you first need to realize that once fork() is called, the Operating System creates a whole new process which is an exact copy of the original process. At this point, fork() still hasn't returned because the O.S. is context switched in, and now it must return from fork() twice, once in the child process and once in the parent, so that execution can continue in both processes.

2.2 Process identifiers or pid

Every process has a unique identifier, the process identifier or pid. This value is assigned by the operating system when the process is created. It is stored in a signed integer type with its own typedef, pid_t, which we will use; the exact width depends on the system (on Linux it is typically a 4-byte int, not a short).

In the above sample code, after the call to fork(), the parent's return value from fork() is the process id of the newly created child process. The child, however, has a return value of 0. On error, fork() returns -1 and no child is created; the sample then bails with _exit() because something terrible happened.

One nice way to visualize the parent-child relationships between processes is the shell command pstree:

#> pstree -ah
init
  ├─NetworkManager
  │   ├─dhclient -d -4 -sf /usr/lib/NetworkManager/nm-dhcp-client.action -pf /var/run/sendsigs.omit.d/network-manager.dhclient-eth0.pid -lf...
  │   └─2*[{NetworkManager}]
  ├─accounts-daemon
  │   └─{accounts-daemon}
(...)

At the top is the init process, which is the ancestor of all other processes. Somewhere down the tree is my login shell:

(...)
  ├─sshd -D
  │   └─sshd   
  │       └─sshd    
  │           └─bash
  │               ├─emacs get_exitstatus.c
  │               ├─emacs foursons.c
  │               ├─emacs Makefile
  │               ├─emacs get_exitstatus.c
  │               ├─emacs fork_exec_wait.c
  │               ├─emacs mail_reports.py
  │               └─pstree -ah
(...)

And you can see that getting to a bash shell via ssh requires a number of forks and child processes.

2.3 Retrieving Process Identifiers: getpid() and getppid()

With fork(), the parent learns the process id of the child from the return value, but the return value tells the child neither its own process id (or pid) nor its parent's process id. For that matter, it doesn't tell the parent its own process id either. There are two system calls to retrieve this information:

//retrieve the current process id
pid_t getpid(void);

//retrieve the parent's process id
pid_t getppid(void);

There is no way for a process to directly retrieve the pids of its children because any process may have multiple children. Instead, a process must maintain that information itself using the values returned from fork(). Here is a sample program that prints the current process id and the parent's process id.

#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

int main(){
  pid_t pid, ppid;

  //get this process's pid
  pid = getpid();

  //get the pid of this process's parent
  ppid = getppid();


  printf("My pid is: %d\n",pid);
  printf("My parent's pid is %d\n", ppid);

  return 0;
}

If we run this program a bunch of times, we will see output like this:

#> ./get_pid_ppid 
My pid is: 14307
My parent's pid is 13790

#> ./get_pid_ppid 
My pid is: 14308
My parent's pid is 13790

#> ./get_pid_ppid 
My pid is: 14309
My parent's pid is 13790

Every time the program runs, it has a different process id (or pid). Every process must have a unique pid, and the O.S. applies a policy for reusing process ids as processes terminate. But the parent's pid is the same. If you think for a second, this makes sense: What's the parent of the program? The shell! We can see this by echo'ing $$, which is a special bash variable that stores the pid of the shell:

#> echo $$
13790

Whenever you execute a program on the shell, what's really going on is that the shell forks and the new child execs the new program. One thing to consider, though, is that when a process forks, the parent and the child continue executing in parallel, so why doesn't the shell come back immediately and ask the user to enter a new command? The shell instead waits for the child process to finish before prompting again, and there is a system call, wait(), to do just that.

3 Waiting on a child with wait()

The wait() system call is used by a parent process to wait for the status of a child to change. A status change can occur for a number of reasons, such as the program being stopped or continued, but we'll only concern ourselves with the most common status change: the program terminated or exited. (We will discuss stopped and continued in later lessons.)

#include <sys/types.h>
#include <sys/wait.h>

pid_t wait(int *status);

Once the parent calls wait(), it will block until a child changes state. In essence, it is waiting on its children to terminate. This is described as a blocking function because it blocks and does not continue until an event is complete.

Once it returns, wait() returns the pid of the child process that terminated, or -1 on error (for example, if the process has no children). wait() takes an integer pointer as an argument, and at that memory address it stores the termination status of the child process. As mentioned in the previous lesson, part of the termination status is the exit status, but it also contains other information about how a program terminated, such as whether it was killed by a segmentation fault.

3.1 Checking the Status of children

To learn about the exit status of a program we can use the macros from sys/wait.h, which check the termination status and extract the exit status. From the manual page:

WIFEXITED(status)
       returns true if the child terminated normally, that is, 
       by calling exit(3) or _exit(2), or by returning from main().

WEXITSTATUS(status)
       returns  the  exit  status of the child.  This consists of the least significant 
       8 bits of the status argument that the child specified in a call to exit(3) or _exit(2) or as the
       argument for a return statement in main().  
       This macro should only be employed if WIFEXITED returned true.

There are other checks of the termination status; refer to the manual page for more detail. Below is some example code for checking the exit status of a forked child. You can see that the child delays its exit by 2 seconds with a call to sleep.

#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

#include <sys/types.h>
#include <sys/wait.h>

int main(){

  pid_t c_pid, pid;
  int status;

  c_pid = fork(); //duplicate

  if( c_pid == 0 ){
    //child
    pid = getpid();

    printf("Child: %d: I'm the child\n", pid);
    printf("Child: sleeping for 2-seconds, then exiting with status 12\n");

    //sleep for 2 seconds
    sleep(2);

    //exit with status 12
    exit(12);

  }else if (c_pid > 0){
    //parent

    //waiting for child to terminate
    pid = wait(&status);

    if ( WIFEXITED(status) ){
      printf("Parent: Child exited with status: %d\n", WEXITSTATUS(status));
    }

  }else{
    //error: The return of fork() is negative
    perror("fork failed");
    _exit(2); //exit failure, hard
  }

  return 0; //success
}

4 Fork/Exec/Wait Cycle

We now have all the parts to write a program that will execute another program and wait for that program to finish. This reminds me of another program we've already used in this class… the shell, but you'll get to that later in the lab.

For now, consider the example code below, which executes ls -l on the /bin directory:

#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>

#include <sys/types.h>
#include <sys/wait.h>

int main(int argc, char * argv[]){
  //arguments for ls, will run: ls -l /bin
  char * ls_args[] = { "ls", "-l", "/bin", NULL} ;
  pid_t c_pid, pid;
  int status;

  c_pid = fork();

  if (c_pid == 0){
    /* CHILD */

    printf("Child: executing ls\n");

    //execute ls
    execvp( ls_args[0], ls_args);
    //only get here if exec failed
    perror("execvp failed");
    _exit(1); //exit with failure rather than returning success
  }else if (c_pid > 0){
    /* PARENT */

    if( (pid = wait(&status)) < 0){
      perror("wait");
      _exit(1);
    }

    printf("Parent: finished\n");

  }else{
    perror("fork failed");
    _exit(1);
  }

  return 0; //return success
}

And the execution:

aviv@saddleback: demo $ ./fork_exec_wait
Child: executing ls
total 5120
-rwxr-xr-x  2 root  wheel    18480 Sep  9 18:44 [
-r-xr-xr-x  1 root  wheel   628736 Sep 26 22:03 bash
-rwxr-xr-x  1 root  wheel    19552 Sep  9 18:57 cat
-rwxr-xr-x  1 root  wheel    30112 Sep  9 18:50 chmod
-rwxr-xr-x  1 root  wheel    24768 Sep  9 18:49 cp
-rwxr-xr-x  2 root  wheel   370096 Sep  9 18:40 csh
-rwxr-xr-x  1 root  wheel    24400 Sep  9 18:44 date
-rwxr-xr-x  1 root  wheel    27888 Sep  9 18:50 dd
-rwxr-xr-x  1 root  wheel    23472 Sep  9 18:49 df
-r-xr-xr-x  1 root  wheel    14176 Sep  9 19:27 domainname
-rwxr-xr-x  1 root  wheel    14048 Sep  9 18:44 echo
-rwxr-xr-x  1 root  wheel    49904 Sep  9 18:57 ed
-rwxr-xr-x  1 root  wheel    19008 Sep  9 18:44 expr
-rwxr-xr-x  1 root  wheel    14208 Sep  9 18:44 hostname
-rwxr-xr-x  1 root  wheel    14560 Sep  9 18:44 kill
-r-xr-xr-x  1 root  wheel  1394560 Sep  9 19:59 ksh
-rwxr-xr-x  1 root  wheel    77728 Sep  9 19:32 launchctl
-rwxr-xr-x  2 root  wheel    14944 Sep  9 18:49 link
-rwxr-xr-x  2 root  wheel    14944 Sep  9 18:49 ln
-rwxr-xr-x  1 root  wheel    34640 Sep  9 18:49 ls
-rwxr-xr-x  1 root  wheel    14512 Sep  9 18:50 mkdir
-rwxr-xr-x  1 root  wheel    20160 Sep  9 18:49 mv
-rwxr-xr-x  1 root  wheel   106816 Sep  9 18:49 pax
-rwsr-xr-x  1 root  wheel    46688 Sep  9 18:59 ps
-rwxr-xr-x  1 root  wheel    14208 Sep  9 18:44 pwd
-r-sr-xr-x  1 root  wheel    25216 Sep  9 19:27 rcp
-rwxr-xr-x  2 root  wheel    19760 Sep  9 18:49 rm
-rwxr-xr-x  1 root  wheel    14080 Sep  9 18:49 rmdir
-r-xr-xr-x  1 root  wheel   628800 Sep 26 22:03 sh
-rwxr-xr-x  1 root  wheel    14016 Sep  9 18:44 sleep
-rwxr-xr-x  1 root  wheel    28064 Sep  9 18:59 stty
-rwxr-xr-x  1 root  wheel    34224 Sep  9 21:59 sync
-rwxr-xr-x  2 root  wheel   370096 Sep  9 18:40 tcsh
-rwxr-xr-x  2 root  wheel    18480 Sep  9 18:44 test
-rwxr-xr-x  2 root  wheel    19760 Sep  9 18:49 unlink
-rwxr-xr-x  1 root  wheel    14112 Sep  9 19:32 wait4path
-rwxr-xr-x  1 root  wheel   551232 Sep  9 19:19 zsh
Parent: finished

The parent first forks a child process. In the child process, the execution is replaced by ls, which prints the output. Meanwhile, the parent waits for the child to complete before continuing.

Imagine now this process occurring in a loop, and instead of running ls, the user provides the program that should run. That's a shell, and that's what you will be doing in the next lab.