Lab 08: Process State Monitoring
Preliminaries
In this lab you will complete a set of C programs that will expose you to termination status, fork-exec-wait loops, and tokenizing strings. There are 3 tasks; you will likely complete only task 1 in lab and begin task 2. You will need to finish the remaining tasks outside of lab.
Lab Setup
Run the following command
~aviv/bin/ic221-up
Change into the lab directory
cd ~/ic221/labs/08
All the material you need to complete the lab can be found in the lab directory. All material you will submit, you should place within the lab directory. Throughout this lab, we refer to the lab directory, which you should interpret as the above path.
Submission Folder
For this lab, all submissions should be placed in the following folder:
~/ic221/labs/08
This directory contains 4 sub-directories: examples, timer, term-status, and mini-sh. In the examples directory you will find any source code in this lab document. All lab work should be done in the remaining directories.
- Only source files found in the folder will be graded.
- Do not change the names of any source files
Finally, in the top level of the lab directory, you will find a README file. You must complete the README file, and include any additional details that might be needed to complete this lab.
Compiling your programs with clang and make
You are required to provide your own Makefiles for this lab. Each of the source folders, myps, mypstree, and fg-shell, must have a Makefile. We should be able to compile your programs by typing make in each source directory. The following executables should be generated:
- myps: executable for the myps directory
- mypstree: executable for the mypstree directory
When compiling fg-shell, you will need to link against the readline library. Add -lreadline to your compile command; you can refer to the previous lab (Lab 6) for an example.
README
In the top level of the lab directory, you will find a README file. You must fill out the README file with your name and alpha. Please include a short summary of each of the tasks and any other information you want to provide to the instructor.
Test Script
You are provided a test script which prints pass/fail information
for a set of tests for your programs. Note that passing all the
tests does not mean you will receive a perfect score: other tests
will be performed on your submission. To run the test script, execute test.sh from the lab directory.
./test.sh
You can comment out individual tests while working on different parts of the lab. Open up the test script and place comments at the bottom where appropriate.
Process State Monitoring via /proc/[pid]
We survived the "nuclear" apocalypse, and now on safer ground, we can start our investigation into process state. In this part of the lab, you will write two small programs that, given a process id, will retrieve status information to display … a lot like ps and pstree.
The /proc File System
The /proc file system is special on Linux. It does not exist on disk, but rather is a pseudo-file system. It's the place where the O.S. allows users to take a peek at the internal workings. For example, you can check out information about memory usage:
cat /proc/meminfo
MemTotal:        7862628 kB
MemFree:         6207280 kB
Buffers:          408632 kB
Cached:           865164 kB
SwapCached:            0 kB
Active:           634824 kB
Inactive:         724844 kB
Active(anon):      86628 kB
Inactive(anon):    38332 kB
Active(file):     548196 kB
Inactive(file):   686512 kB
Unevictable:           4 kB
Mlocked:               4 kB
SwapTotal:      12753916 kB
SwapFree:       12753916 kB
Dirty:                 0 kB
(...)
You can also see information about the processor:
cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 58
model name      : Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
stepping        : 9
microcode       : 0xc
cpu MHz         : 1600.000
cache size      : 8192 KB
physical id     : 0
siblings        : 8
core id         : 0
cpu cores       : 4
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms
bogomips        : 6783.84
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:
(...)
If we take a look at the directory more closely, we can see that there is all sorts of information in there:
#> ls -F /proc/
1/      1257/  1375/   1637/  1841/  250/  36/   477/  674/   acpi/        ioports        sched_debug
10/     1267/  1393/   1640/  1871/  253/  37/   48/   675/   asound/      irq/           schedstat
1003/   1339/  14/     1656/  19/    254/  38/   499/  68/    buddyinfo    kallsyms       scsi/
1010/   1349/  15/     1662/  1924/  257/  39/   50/   69/    bus/         kcore          self@
1014/   1353/  1525/   1666/  2/     259/  4/    51/   7/     cgroups      key-users      slabinfo
1021/   1360/  1528/   1670/  20/    26/   406/  52/   7059/  cmdline      kmsg           softirqs
1024/   1361/  1545/   1695/  2010/  262/  41/   53/   7066/  consoles     kpagecount     stat
1027/   1362/  15609/  17/    2040/  268/  412/  54/   7067/  cpuinfo      kpageflags     swaps
1038/   1363/  16/     1700/  22/    27/   42/   549/  8/     crypto       latency_stats  sys/
1040/   1364/  16077/  1702/  23/    278/  428/  55/   88/    devices      loadavg        sysrq-trigger
1042/   1365/  16095/  1704/  2354/  28/   43/   558/  887/   diskstats    locks          sysvipc/
1052/   1366/  16119/  1709/  2361/  281/  44/   56/   888/   dma          mdstat         timer_list
1071/   1367/  16130/  1725/  2362/  282/  441/  566/  89/    dri/         meminfo        timer_stats
11/     1368/  16131/  1727/  24/    29/   448/  6/    897/   driver/      misc           tty/
1110/   1369/  1618/   1729/  245/   3/    45/   623/  9/     execdomains  modules        uptime
11209/  1370/  1623/   1735/  246/   30/   450/  639/  90/    fb           mounts@        version
11511/  1371/  1624/   1737/  247/   31/   455/  65/   907/   filesystems  mtrr           version_signature
12/     1372/  1626/   18/    248/   32/   46/   660/  923/   fs/          net@           vmallocinfo
1200/   1373/  1632/   1835/  249/   34/   469/  661/  953/   interrupts   pagetypeinfo   vmstat
1224/   1374/  1635/   1840/  25/    35/   47/   67/   957/   iomem        partitions     zoneinfo
Most of this is not of interest to us, but the numerical directories are. Every time a process is created, the kernel creates a new directory in the /proc file system. The process id is the same as the directory name. For example, I can run ps and see the process id's for the processes in my terminal.
#> ps
  PID TTY          TIME CMD
 7067 pts/1    00:00:00 bash
16133 pts/1    00:00:00 ps
I can also look in the directory for the bash process:
ls -F /proc/7067/
attr/       cmdline          environ  latency     mem         ns/            pagemap      sessionid  status
autogroup   comm             exe@     limits      mountinfo   numa_maps      personality  smaps      syscall
auxv        coredump_filter  fd/      loginuid    mounts      oom_adj        root@        stack      task/
cgroup      cpuset           fdinfo/  map_files/  mountstats  oom_score      sched        stat       wchan
clear_refs  cwd@             io       maps        net/        oom_score_adj  schedstat    statm
Each of these files provides some information about that process. For example, the comm file holds the name of the command, and the fd/ directory stores the open file descriptors.
#> cat /proc/7067/comm
bash
#> ls -F /proc/7067/fd
0@ 1@ 2@ 255@
Parsing /proc/[pid]/stat
Of relevance to this lab task is the /proc/[pid]/stat file. This file contains the current status of a running process. Here is the status for bash:
cat /proc/7067/stat
7067 (bash) S 7066 7067 7067 34817 16543 4202496 15913 163332 0 5 35 8 176 78 20 0 1 0 112555708 21057536 1440 18446744073709551615 4194304 5111460 140734366799328 140734366797904 139797077691534 0 65536 3686404 1266761467 18446744071579207412 0 0 17 0 0 0 0 0 0 7212552 7248528 7630848 140734366801521 140734366801527 140734366801527 140734366801902 0
While this may seem like just a bunch of numbers, to the trained eye there is a ton of information in here. You can read about it using man 5 proc. The relevant fields are listed below, along with the scanf() format code for reading each one.
- pid : "%d" : process id
- comm : "%s" : command name, with parentheses
- state : "%c" : process state, such as S for sleeping, R for running, or T for stopped
- ppid : "%d" : parent process id
In code, it's fairly straightforward to open the stat file and scan in this information:
FILE * stat_f = fopen("/proc/7067/stat", "r");
fscanf(stat_f, "%d %s %c %d", &pid, comm, &stat, &ppid);
fclose(stat_f); /* fclose(), not close(), releases a FILE * stream */
And, that is exactly what you are going to do in the next two tasks.
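The snippet above assumes the file opens and every field parses. As a minimal sketch of a more defensive version: fopen() returns NULL when the file cannot be opened, and fscanf() returns the number of fields it converted. The helper name read_stat() and the 256-byte buffer are assumptions made for illustration; they are not part of the lab's starter code.

```c
#include <stdio.h>

/* Hypothetical helper (not from the starter code): reads the first four
 * fields of a stat file. Returns 0 on success, -1 on any failure. */
int read_stat(const char *path, int *pid, char comm[256], char *state, int *ppid)
{
    FILE *stat_f = fopen(path, "r");
    if (stat_f == NULL)
        return -1;               /* e.g. the process does not exist */

    /* %255s bounds the write into comm. fscanf returns the number of
     * fields converted, so anything other than 4 is a parse failure. */
    int n = fscanf(stat_f, "%d %255s %c %d", pid, comm, state, ppid);
    fclose(stat_f);
    return (n == 4) ? 0 : -1;
}
```

Note that "%s" stops at whitespace, so a command name that itself contains a space would not parse correctly with this format.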
String Format Printing
For the tasks below, you will need to be able to perform a format print into a string. So far, we've been format printing to the terminal or an open file, but you can also format print into a string to save it for other uses. Here is the function definition for snprintf(), the string printf function:
int snprintf(char *str, size_t size, const char *format, ...);
The first argument, str, is a pointer to the string to store the result of the formatting. The second argument, size, is the size of str so we don't overflow the string buffer. The rest is the same as other format printing. Here's a sample program:
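#include <string.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char * argv[]){
  char hello[1024];

  /* argv[1] is NULL when no argument is given */
  if(argc < 2){
    fprintf(stderr, "Usage: %s message\n", argv[0]);
    return 1;
  }

  snprintf(hello, 1024, "You said, \"%s\"\n", argv[1]);
  printf("%s\n",hello);
  return 0;
}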
You'll find snprintf() very useful because you'll receive a process id from the command line, and need to open the stat file /proc/[pid]/stat, which means you'll need to write a number into a string before using fopen(). This is exactly what snprintf() does best.
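A minimal sketch of that step is shown here. The helper name stat_path() is hypothetical, chosen for illustration. It relies on the return value of snprintf(), which is the length the full string would have had, so a result of size or more means the output was truncated.

```c
#include <stdio.h>

/* Hypothetical helper: writes "/proc/<pid>/stat" into buf.
 * Returns 0 on success, -1 if the path would not fit in size bytes. */
int stat_path(char *buf, size_t size, int pid)
{
    /* snprintf never writes more than size bytes, including the '\0' */
    int n = snprintf(buf, size, "/proc/%d/stat", pid);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

With a 64-byte buffer and pid 7067, buf would hold /proc/7067/stat, ready to pass to fopen().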
Task 1 myps
Change into the myps directory in the lab directory, in which you'll find the source code myps.c. You should provide a Makefile to compile the source code to myps.
The myps command will accept process id's on the command line and will print information about those processes in a tab separated format. You should use the format print below in order to pass the tests:
printf("%d\t%s\t%c\t%d\n", pid, clean_comm(comm), state, ppid);
The clean_comm() function is provided for you to remove the "(" and ")". You must read all information from the /proc/[pid]/stat file.
Here is some sample usage.
aviv@saddleback: myps $ ps
PID TTY TIME CMD
15201 pts/2 00:00:30 emacs
17724 pts/2 00:00:00 ps
27538 pts/2 00:00:09 bash
aviv@saddleback: myps $ ./myps 15201
PID COMM STATE PPID
15201 emacs T 27538
aviv@saddleback: myps $ ./myps 27538
PID COMM STATE PPID
27538 bash S 27537
aviv@saddleback: myps $ ./myps 27538 15201
PID COMM STATE PPID
27538 bash S 27537
15201 emacs T 27538
aviv@saddleback: myps $ ./myps BAD_PID
PID COMM STATE PPID
ERROR: Invalid pid BAD_PID
aviv@saddleback: myps $ ./myps 27538 BAD_PID 15201
PID COMM STATE PPID
27538 bash S 27537
ERROR: Invalid pid BAD_PID
15201 emacs T 27538
aviv@saddleback: myps $ ./myps 11111
PID COMM STATE PPID
ERROR: Invalid pid 11111
If a bad process id is provided, an error message should be printed to stderr.
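One possible way to catch a non-numeric argument such as BAD_PID before touching /proc is to convert it with strtol() and check where parsing stopped. This is only a sketch: the helper name parse_pid() is hypothetical, and the lab does not require this approach.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: converts a command line argument to a pid.
 * Returns the pid, or -1 if arg does not parse as a positive number. */
int parse_pid(const char *arg)
{
    char *end;
    errno = 0;
    long val = strtol(arg, &end, 10);

    /* end == arg: no digits at all; *end != '\0': trailing junk ("12ab") */
    if (end == arg || *end != '\0' || errno == ERANGE)
        return -1;
    if (val <= 0 || val > INT_MAX)
        return -1;
    return (int)val;
}
```

A pid that parses but does not belong to a running process, such as 11111 in the sample above, would still need to be caught when opening its stat file fails.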
The process family hierarchy
All processes start with the kernel, but we can also view the processes as a tree, from init down. The pstree command is really quite good at this. Here's an example just for the processes I'm using:
pstree -ua aviv
sshd
  └─bash
sshd
  └─bash
      ├─cat
      ├─emacs
      ├─emacs myps.c
      └─pstree -ua aviv
If you think about it, all this is doing is traversing from a given
process to the parent process, all the way up to init. The /proc
file system has all this information, and it's fairly simple to adapt
the code above to recurse the parent-child tree.
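To illustrate the traversal idea only, here is a rough sketch that walks from a process up through its parents and counts the steps. It is written as a loop rather than a recursion and does not build a list, so it is not the structure Task 2 asks for. The function name depth_to_root() and the buffer sizes are assumptions made for this example.

```c
#include <stdio.h>

/* Hypothetical helper: counts the processes on the path from pid up to
 * the root of the process tree, inclusive. Returns -1 if pid is not
 * positive or a stat file cannot be read. */
int depth_to_root(int pid)
{
    if (pid <= 0)
        return -1;

    int depth = 0;
    while (pid > 0) {            /* the root process reports a ppid of 0 */
        char path[64], comm[256], state;
        int cur, ppid;

        snprintf(path, sizeof path, "/proc/%d/stat", pid);
        FILE *stat_f = fopen(path, "r");
        if (stat_f == NULL)
            return -1;
        int n = fscanf(stat_f, "%d %255s %c %d", &cur, comm, &state, &ppid);
        fclose(stat_f);
        if (n != 4)
            return -1;

        depth++;
        pid = ppid;              /* step up to the parent */
    }
    return depth;
}
```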
Double Linked Lists
To store and manage paths from a leaf to a root in the process tree, we will use a double linked list. A double linked list is just like the linked lists we used in previous labs except that instead of having only a forward pointer, from one node to the next node, it also has a previous pointer, from one node to the previous node in the list. Additionally, the list structure references both the head of the list and the tail of the list.
Visually, a double linked list looks like so:
     list
     .--------.
  .--+- head  |
  |  |  tail -+-------------------------------.
  |  '--------'                               |
  '----.                                      |
       V                                      V
      .------.   .------.       .------.   .------.
      | next +-->| next +-> ... | next +-->| next +->NULL
NULL<-+ prev |<--+ prev |     <-+ prev |<--+ prev |
      '------'   '------'       '------'   '------'
This structure is particularly useful for the task of implementing pstree because you will traverse from the child up to the parent, and then to print out the path, you have to iterate backwards. A double linked list can easily iterate in both the forward and reverse direction because of the double pointer structure.
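The pointer bookkeeping can be sketched as follows. The type and function names here (struct node, struct list, push_back(), clear()) are hypothetical and chosen for illustration; the lab's linkedlist.h may name and lay out its structures differently, so treat this as a picture of the idea rather than the expected solution.

```c
#include <stdlib.h>

/* Hypothetical types; the lab's linkedlist.h may differ. */
struct node {
    int pid;
    struct node *next;           /* toward the tail */
    struct node *prev;           /* toward the head */
};

struct list {
    struct node *head;
    struct node *tail;
};

/* Adds a node holding pid at the tail.
 * Returns 0 on success, -1 if malloc fails. */
int push_back(struct list *l, int pid)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->pid = pid;
    n->next = NULL;
    n->prev = l->tail;           /* old tail becomes the predecessor */
    if (l->tail != NULL)
        l->tail->next = n;
    else
        l->head = n;             /* list was empty */
    l->tail = n;
    return 0;
}

/* Frees every node and leaves the list empty. */
void clear(struct list *l)
{
    struct node *cur = l->head;
    while (cur != NULL) {
        struct node *next = cur->next;   /* save before freeing cur */
        free(cur);
        cur = next;
    }
    l->head = l->tail = NULL;
}
```

Iterating in reverse then amounts to starting at tail and following prev until NULL is reached.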
Task 2 mypstree
Change into the mypstree directory in the lab directory, in which you'll find the source code mypstree.c as well as the header file linkedlist.h and the source file linkedlist.c. You should provide a Makefile to compile the source code for mypstree that will depend on compiling both source files.
Your program must meet the following requirements:
- You should complete the append() and del_list() functions for the double linked list in the linkedlist.c source file. You will use this structure to store process parent/child paths.
- In mypstree you should complete the functions path_to_parent() and main().
  - The path_to_parent() function is a recursive function, as in it will call itself. The recursive step occurs for each parent process until the base case is reached, i.e., init is reached. For each recursive call, a process will be added to the list.
  - In the main function, you will need to create a new list, call path_to_parent(), print the list with the function provided, and deallocate the list. Use the linked list library functions where appropriate.
- The pid is the command line argument, and multiple pid's can be provided, which will print multiple trees. Errors should be reported and printed to STDERR, using the format print below. This can occur, for example, if you are unable to open the stat file.
fprintf(stderr, "ERROR: Invalid pid %d\n", parent);
Some sample output is provided below:
aviv@saddleback: mypstree $ ps
PID TTY TIME CMD
15201 pts/2 00:00:30 emacs
20955 pts/2 00:00:00 cat
20956 pts/2 00:00:00 ps
27538 pts/2 00:00:09 bash
[2]+ Stopped cat
aviv@saddleback: mypstree $ ./mypstree 15201
init
└─sshd
└─sshd
└─sshd
└─bash
└─emacs
aviv@saddleback: mypstree $ ./mypstree 20955
init
└─sshd
└─sshd
└─sshd
└─bash
└─cat
aviv@saddleback: mypstree $ ./mypstree 15201 20955
init
└─sshd
└─sshd
└─sshd
└─bash
└─emacs
init
└─sshd
└─sshd
└─sshd
└─bash
└─cat
aviv@saddleback: mypstree $ ./mypstree BAD_PID
ERROR: Invalid pid BAD_PID
aviv@saddleback: mypstree $ ./mypstree 15201 BAD_PID 20955
init
└─sshd
└─sshd
└─sshd
└─bash
└─emacs
ERROR: Invalid pid BAD_PID
init
└─sshd
└─sshd
└─sshd
└─bash
└─cat