IC221: Systems Programming (SP15)



Lec 11: User Space, Kernel Space, and the System Call API


1 The Operating System as a Resource

In the past lessons, we've been learning about the C programming language, but now we turn attention back to the operating system and the relationship between the programmer and the operating system.

Recall from earlier lessons that an operating system is a software unit that controls and manages the hardware and system resources of a computer. The operating system provides two primary features for the programmer:

  1. Abstraction : The OS provides an abstract execution environment in which programs run and use system resources through a unified interface, regardless of the underlying hardware.
  2. Isolation : The OS ensures that the execution of one program doesn't interfere with the execution of other programs, and that programs can execute concurrently.

To provide these features, the OS applies a security policy that controls and coordinates access to system resources so that programmers do not unintentionally break the abstraction and isolation requirements. The OS's enforcement of the security policy is implemented through the system call API. Instead of having the programmer directly access resources, an API is used by which the programmer asks the OS to perform protected actions on their behalf. The OS, unlike the programmer, is trusted to perform these actions in a way that will not break anything, and the unified framework also simplifies the programmer's experience.

The actions that can be performed by the programmer and those that must be performed by the OS are divided between user-space and kernel-space. Understanding this boundary from a system programming and OS resource perspective is the theme of this lesson.

1.1 OS System Resources

The function of the OS is to manage system resources. What are system resources? These are the hardware components of the computer that support the execution of a program or the organization of information. Typically, we describe the set of system resources coordinated by the OS as:

  • Device Management : Hardware devices, such as keyboards, monitors, printers, hard drives, etc., are resources managed by the OS. When a programmer wishes to interact with these devices, a unified interface provided by the OS is used.
  • Process Management : The invocation and execution of a program, a process, is managed by the OS, including managing its current state, running or stopped, as well as the loading of code.
  • Memory Management : Access to physical and virtual memory is controlled by the OS, and a program's memory layout and current allocations are carefully managed.
  • File System Management : The OS is responsible for ensuring that programs can read from and write to the file system, but also that programs don't corrupt the file system or access files and directories that they do not have permission to access.

So far in this class, we've written programs (either in Bash or C) that have required access to those resources. For example, we've read user input from the keyboard (device management); we've invoked and executed programs through bash (process management); we've allocated and deallocated memory in C using malloc(), calloc(), and free() (memory management); and we've read, written, and created files in both C and Bash (file system management).

In each of those cases, while it is nice to think that we, as the programmer, have done these things, in fact, the operating system has performed these actions on our behalf in a supervisory role. This is mostly for our protection and convenience. Would you really want to have to read directly from the keyboard driver in order to get input from the user? Would you want to write to the display driver to print information back? Maybe you do, if you're nuts about computing, but most of us don't. And further, if you do want to perform these low-level actions yourself, it's really easy to mess it up, at which point your computer may be broken forever. For example, if you had to manipulate the file system directly and you made a mistake: oops, you just lost all your files!

1.2 Kernel Space vs. User Space

The kernel of the OS is a program that is trusted to perform all the protected system resource actions. The kernel is trusted software and executes in supervisory mode, and all the basic OS functionality is implemented within the kernel software. We describe the domain of the kernel as kernel-space. Actions that can be performed without privilege, that is, actions that do not require the kernel, are described as part of the user domain, or user-space.

The distinction between these two domains is important. For example, adding two numbers together, an operation completed by an add instruction on the processor, is unprivileged and is performed in user-space. Similarly, iterating through an array and reading and writing data already allocated in memory is also unprivileged and performed in user-space. But the allocation of new memory, by adjusting the break point, for example, is a privileged operation and must be completed by the kernel.

2 System Calls

When a privileged access is required, a context switch between the user program and the kernel must be performed. A context switch occurs when the user program's execution is stopped, its current state is saved and offloaded from the processor, and the kernel is swapped in to complete the protected task. Once the kernel completes the request, it stages any results to be returned to the user process, and the kernel is swapped out in favor of the user process. Execution continues from that point.

A system call is a function stub that is the entry point for requesting OS services. So far, we've been using functions that are defined in the C standard library, such as those in stdlib.h, but supporting these operations are system calls, defined in unistd.h, the Unix standard library. For example, managing memory allocation is the domain of the operating system, but so far we've just been using malloc() and calloc() to perform these tasks.


The C memory allocation routine is really about how to manage the memory that has already been allocated. As programs free and allocate new memory all the time, malloc() attempts to find contiguous memory to fulfill each new request. There are many ways to do this. For example, the allocator can find the first region of unallocated space that is large enough, even if it is too big, and use that (first fit), or it can look for the region of unallocated memory whose size is closest to the request size while still being large enough (best fit). Both strategies are fine, and the operating system is not involved in that process. However, when there is no more space in the heap, the break point needs to be adjusted, and then the operating system needs to get involved. The system call that moves the break point is called sbrk(), and it is a function from the Unix standard library. Whenever malloc() cannot fill an allocation request, it calls sbrk(), which adjusts the break point, effectively allocating more memory.

2.1 Kernel Traps

The invocation of the kernel to perform a context switch occurs through a trap. A trap is a special instruction that signals to the processor that an operation is needed from the kernel. The processor interrupts the current execution of the program, saves the state, and invokes the kernel with the trap information. In the running example, this will be a trap for the kernel function sys_sbrk(), which was invoked via the system call sbrk().


The kernel will then fulfill the request via the kernel function. Once that function returns, the kernel is context switched out, the user process is context switched in, and execution continues as before.

2.2 How to recognize system calls using the man pages

So far in this class we haven't been using the system call interface directly, but rather we have used the C standard library to interface for us. This is going to change as we explore the Unix system, and it is important that you can identify the differences between library functions and system calls.

The easiest way to do this is via the manuals. The man pages are divided into sections to better organize the plethora of manuals available. There are 8 sections in total, and below are the relevant ones.

  • Section 1: General commands, such as those found in the bash environment
  • Section 2: System calls, such as sbrk()
  • Section 3: Library functions, such as malloc()
  • Section 8: System Administration, … get to that later

For example, if we type, man malloc, and inspect the header of the manual, we can learn a lot of information:

MALLOC(3)           Linux Programmer's Manual                    MALLOC(3)

NAME
       malloc, free, calloc, realloc - Allocate and free dynamic memory

SYNOPSIS
       #include <stdlib.h>

       void *malloc(size_t size);
       void free(void *ptr);

First, we can see that malloc() is in section 3 of the manual via the MALLOC(3) header. Also, from the synopsis, we see that it is a part of the C standard library via the #include <stdlib.h>. As a comparison, let's look at the manual for sbrk().

BRK(2)              Linux Programmer's Manual                        BRK(2)

NAME
       brk, sbrk - change data segment size

SYNOPSIS
       #include <unistd.h>

       int brk(void *addr);

       void *sbrk(intptr_t increment);

The sbrk() function, along with brk(), is in section 2 of the manual, as shown by the BRK(2) header, and it is also in the Unix standard library, which we know from #include <unistd.h>.

One problem you may encounter is that there are manuals that have the same name. For example, there is the read command for bash, which is a general command in section 1 of the manual, and there is also the system call read(), which is in section 2 of the manual. By default, the man command retrieves the manual from the lowest-numbered section that has a match. For example,

man read

will display the bash read command and not the system call read(). To access the system call manual for read(), use

#> man 2 read
READ(2)                  Linux Programmer's Manual                     READ(2)

NAME
       read - read from a file descriptor

SYNOPSIS
       #include <unistd.h>

       ssize_t read(int fd, void *buf, size_t count);