Processes

Definition

One of the most fundamental concepts in an operating system is the process. Many definitions have been used; here's my favorite:

A process is a program, a protection domain, and one or more threads of execution.
Looking at each of these parts separately:

Process Creation

The text talks in general terms about process creation, and gives four ways a process can be created. In a modern OS, only two of these actually happen: an initial process is created as part of the boot process, and a process can create a child process.

In a Unix-derived OS, the first process created is called init, and it has process ID 1. When a traditional System V-style init begins executing, it reads a file called /etc/inittab to find out which initial set of processes to create. Details are in the init man page.

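In Unix-derived operating systems, the mechanism for creating a new process is the fork() system call, which creates a child process that is a near-identical copy of its parent (the child gets its own process ID). Running a different program therefore typically requires a fork() followed by an exec() in the child.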
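
To make the fork()/exec() pairing concrete, here is a minimal sketch in Python, whose os module wraps the same Unix calls. The helper name run_program and the exit code 127 for a failed exec are my own choices for illustration (127 follows the shell's convention); neither comes from the text. It only runs on Unix-like systems.

```python
import os
import sys


def run_program(argv):
    """Fork a child, exec argv in it, and return the child's exit status."""
    pid = os.fork()              # from here on, two processes run this code
    if pid == 0:
        # Child: replace this process image with the requested program.
        try:
            os.execvp(argv[0], argv)
        except OSError:
            os._exit(127)        # exec failed; exit without parent-style cleanup
    # Parent: wait for the child to finish, then decode its status word.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)  # child was killed by a signal


if __name__ == "__main__":
    code = run_program([sys.executable, "-c", "print('hello from the child')"])
    print("child exited with status", code)
```

Note that a successful exec() never returns: the only way the child reaches the line after os.execvp() is if the exec failed.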

Linux handles this a little bit differently. New processes are created in Linux with a system call called clone(). When you call this routine, you specify a function for the child to execute (the child terminates when that function returns) and the location of the child's stack in memory. You can also pass a whole bunch of flags to determine which parts of the parent's protection domain will be shared with the child. This gives you the ability to create a new thread (by sharing pretty much everything) or a totally new process (by sharing nothing). fork() is implemented as a call to clone(). There is also a library called pthreads that implements threaded programming on top of the clone() call (and does a lot more besides!).
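
Python doesn't expose clone() directly, so the following is only a sketch of the two extremes the flags let you choose between: a thread (memory shared with the parent) and a forked child (memory copied). The names counter, bump, bump_in_thread, and bump_in_child_process are made up for this illustration. It is Unix-only because of os.fork().

```python
import os
import threading

counter = 0  # lives in this process's address space


def bump():
    """Increment the module-level counter in whatever context calls us."""
    global counter
    counter += 1


def bump_in_thread():
    """Run bump() in a new thread, which shares the parent's memory."""
    t = threading.Thread(target=bump)
    t.start()
    t.join()
    return counter       # the parent sees the thread's update


def bump_in_child_process():
    """Run bump() in a forked child, which works on its own copy of memory."""
    pid = os.fork()
    if pid == 0:
        bump()           # changes only the child's copy of counter
        os._exit(0)
    os.waitpid(pid, 0)
    return counter       # the parent's copy is unchanged
```

Calling bump_in_thread() raises the value the parent sees by one; calling bump_in_child_process() leaves it where it was, because the increment happened in a separate address space.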

In the Windows world, a process creates a child using the CreateProcess() API call, which combines the functions of the Unix fork() and exec().
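
For contrast with the two-step Unix style, here is a small sketch of the one-step style using Python's subprocess module, which on Windows is built on CreateProcess(). The helper name spawn is hypothetical. The point is only the shape of the interface: the program to run is named at creation time, so the child never executes a copy of the parent's code.

```python
import subprocess
import sys


def spawn(argv):
    """Create a child that starts out running argv; return its exit status."""
    completed = subprocess.run(argv)   # create the process and wait for it
    return completed.returncode


if __name__ == "__main__":
    print(spawn([sys.executable, "-c", "print('hello from the child')"]))
```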

Process States

We can model process activity in terms of a state machine. At its very simplest, a process has two states: running and not-running (only as many processes as we have CPUs can actually be running at a time; the rest have to be not-running).

The text uses three possible states for a process.

Ready
A Ready process is one that has all of the resources it needs to run, save the CPU.
Running
A Running process is, well, running. You can only have as many running processes as you have CPUs.
Blocked
A Blocked process is one that is waiting for an event, such as input to arrive or output to finish.
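
The three-state model can be sketched as a small transition table. The notes above don't enumerate the transitions, so the four used here (and the event names dispatch, preempt, wait, and event) are assumptions based on how this model is usually drawn, not a quotation from the text.

```python
from enum import Enum


class State(Enum):
    READY = "Ready"
    RUNNING = "Running"
    BLOCKED = "Blocked"


# Legal transitions in the three-state model; event names are illustrative.
TRANSITIONS = {
    (State.READY, "dispatch"): State.RUNNING,   # scheduler gives it the CPU
    (State.RUNNING, "preempt"): State.READY,    # scheduler takes the CPU away
    (State.RUNNING, "wait"): State.BLOCKED,     # process must wait for an event
    (State.BLOCKED, "event"): State.READY,      # the awaited event has occurred
}


def step(state, event):
    """Return the next state, or raise ValueError for an illegal transition."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not valid in state {state.value}") from None
```

Notice that there is no entry taking a Blocked process straight to Running: once its event arrives it has everything except the CPU, which is exactly the definition of Ready.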

We can certainly have more complex models than this. When you run the ps command under Linux, it will show a process as being in one of seven states:

R    Running or runnable (on run queue)
This corresponds to Tanenbaum's "Running" and "Ready" states.
D    Uninterruptible sleep (usually IO)
S    Interruptible sleep (waiting for an event to complete)
These correspond to Tanenbaum's "Blocked" state. The idea is that a process waiting on a slow device, such as keyboard input, should be put in an interruptible sleep; it will be awakened if a signal arrives. Processes waiting on fast devices such as disks should be in an uninterruptible sleep; the process won't be awakened by the delivery of a signal.
T    Stopped, either by a job control signal or because it is being traced.
So if you give a process a ^Z it goes into this state.
W    paging (not valid since the 2.6.xx kernel)
X    dead (should never be seen)
Z    Defunct ("zombie") process, terminated but not reaped by its parent.
(from the ps man page).
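
As a quick way to line the ps codes up with the three-state model, here is a sketch of a lookup helper. The names PS_TO_MODEL and model_state are made up for illustration. It looks only at the first character because ps may append modifier characters to the state code in its STAT column (for example "Ss" or "R+").

```python
# First character of the ps state code -> state in the three-state model.
# Only the codes that the notes map onto Tanenbaum's states are listed.
PS_TO_MODEL = {
    "R": "Running or Ready",
    "D": "Blocked",
    "S": "Blocked",
}


def model_state(stat):
    """Translate a ps state string such as 'Ss' into the three-state model."""
    code = stat[:1]
    return PS_TO_MODEL.get(code, "no equivalent in the three-state model")


if __name__ == "__main__":
    for stat in ["R+", "Ss", "D", "T", "Z"]:
        print(stat, "->", model_state(stat))
```

The T, W, X, and Z codes fall through to the default because the simple model has no state for a stopped, paging, dead, or zombie process.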
Last modified: Mon Aug 29 10:53:15 MDT 2005