CS 474 - Deadlock

A new problem: Dining Philosophers. We have five philosophers sitting around a table, eating spaghetti and thinking. There is a fork to each side of each philosopher; in order to eat, the philosopher must pick up both forks. A naive solution would have each philosopher pick up the fork to his left first, then the one to his right. So it goes something like


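while (1) {
    think();
    P(fork[me]);                  /* pick up the left fork */
    P(fork[(me+1) % numphil]);    /* pick up the right fork */
    eat();
    V(fork[me]);                  /* put the left fork down */
    V(fork[(me+1) % numphil]);    /* put the right fork down */
}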

If all the philosophers pick up their left forks at the same time, nothing ever happens again. This is called deadlock. For deadlock to occur, four conditions must be met. The text divides these four conditions into three policy decisions, and one circumstance that can occur as a result of the actual execution of processes.

  1. Mutual exclusion. Only one process can have a given resource at a time (if two philosophers were able to somehow share a fork, the problem would go away).
  2. Hold and wait. A process may hold a resource at the same time it requests another one (if the philosophers were constrained to pick up both of their forks at the same time, the problem would go away).
  3. No preemption. Resources are only released by explicit action of the process, not by somebody else (if one philosopher could yank a fork out of another's hand, the problem would, again, go away. Faculty luncheons would also be much more lively).
  4. Circular waiting. If we define a graph with two node types (square for resource, round for process) with an edge from resource to process if the process holds the resource, and an edge from process to resource if the process is trying to get the resource, and there is a cycle in the graph, we have deadlock (if the philosophers grab the lower-numbered of their two forks first, the cycle would be broken).
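
The lower-numbered-fork rule from condition 4 can be sketched as below. This is only an illustration: fork_order is a hypothetical helper that does not appear in the text, and it assumes the forks are numbered 0 through numphil-1 as in the loop above.

```c
#include <assert.h>

/* Hypothetical helper (not from the text): decide which fork philosopher
 * `me` picks up first and which second, so that forks are always
 * acquired in increasing numerical order. */
void fork_order(int me, int numphil, int *first, int *second)
{
    int left, right;

    assert(numphil > 1 && me >= 0 && me < numphil);
    left = me;
    right = (me + 1) % numphil;

    if (left < right) {
        *first = left;
        *second = right;
    } else {
        /* Only the last philosopher gets here: right wraps around to 0. */
        *first = right;
        *second = left;
    }
}
```

A philosopher would then do P(fork[first]) followed by P(fork[second]) instead of always going left then right. Since everybody who is waiting holds a lower-numbered fork than the one they are waiting for, a cycle of waits would need fork numbers that keep increasing and still come back to where they started, which can't happen.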

What do we do about deadlock?

Prevention

Deadlock can be prevented a priori by using a system design that prevents at least one of the four conditions from ever occurring.

A prevention strategy tends to under-utilize resources.

Avoidance

The difference between a prevention strategy and an avoidance strategy is subtle: where a prevention strategy prevents deadlock by making policy decisions that make sure that at least one of the requirements for deadlock can never occur, an avoidance strategy lets any of the four conditions occur, but examines each resource request to make sure that the four conditions don't become simultaneously true.

The text gives a particular implementation of an avoidance strategy, called "maximum claim." In this strategy, when a process is created it has to announce the maximum amount of resources of each type the process may ever use; this information is used by the resource manager. Now, whenever the process requests a resource, the resource manager is able to determine if there is some sequence of requests (and releases) that will let every process run to completion eventually.

Whenever any process requests a resource, the resource manager determines whether, if that process were to go on and request all the resources it has staked in its maximum claim, deadlock would occur. If so, the state is "unsafe;" if not, it's "safe." Requests are permitted only if they leave the system in a safe state. A well-known avoidance strategy is the Banker's Algorithm, modeled after lending policies at banks (I expect my father the accountant would take serious exception to the claim this looks anything like how a bank does business). A bank only has so much money to lend; customers have a line of credit. Now, if a customer borrows some money against their credit line, and then requests more money, the first amount borrowed will only be paid back if the new loan is approved.

If there is some sequence of loans and repayments in which at least one customer could borrow up to their full line of credit and then repay the loan, assume they will. After this customer has repaid, go on to the other customers. If all the customers could exercise their full line of credit, the state is safe.

We can implement this by using an allocation table in which a row represents a process and a column represents a resource class; an element of the table is the amount of resource j held by process i (allocation[i][j]).

Now we can have another table in which an element is the max claim of process i on resource j (claim[i][j]).

We also need two vectors, resource[j] which says the total amount of resource j that exists, and available[j] which says how much is available at the moment.
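
The relationship between these tables is simple arithmetic: what's available is whatever exists minus whatever has been handed out. Here is a minimal sketch of that computation; the function name compute_available and the sizes NPROC and NRES are made up for the illustration.

```c
#include <assert.h>

#define NPROC 3   /* number of processes (illustrative size) */
#define NRES  2   /* number of resource classes (illustrative size) */

/* available[j] = resource[j] - (sum over all processes i of allocation[i][j]) */
void compute_available(int allocation[NPROC][NRES],
                       const int resource[NRES],
                       int available[NRES])
{
    int i, j;

    for (j = 0; j < NRES; j++) {
        available[j] = resource[j];
        for (i = 0; i < NPROC; i++)
            available[j] -= allocation[i][j];
        /* The processes can never hold more than exists. */
        assert(available[j] >= 0);
    }
}
```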

Now, the algorithm works like this:

  1. Copy the allocation table to a new table (allocation').
  2. Compute the available vector based on allocation'.
  3. If allocation' is all 0, you've got a safe state. Otherwise, find a process i that can successfully exercise its maximum claim to all resources, that is, one for which claim[i][j] - allocation'[i][j] is no more than available[j] for every j. If there is no such process you've got an unsafe state.
  4. Set allocation'[i][j] (for all j) to 0, and go back to step 2 (of course, the process you just cleared out can't be used for later iterations of the algorithm).
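
The steps above might be coded along the following lines. Treat it as a sketch under some assumptions rather than the text's implementation: the name is_safe and the sizes NPROC and NRES are invented here, no process claims more than exists, and a done[] array keeps track of the processes that have been cleared out instead of testing allocation' for all zeros.

```c
#include <assert.h>
#include <string.h>

#define NPROC 3   /* number of processes (illustrative size) */
#define NRES  2   /* number of resource classes (illustrative size) */

/* Returns 1 if the state is safe, 0 if it is unsafe. */
int is_safe(int allocation[NPROC][NRES],
            int claim[NPROC][NRES],
            const int resource[NRES])
{
    int alloc[NPROC][NRES];    /* step 1: working copy (allocation') */
    int available[NRES];
    int done[NPROC] = { 0 };   /* processes already cleared out */
    int remaining = NPROC;
    int i, j;

    memcpy(alloc, allocation, sizeof alloc);

    while (remaining > 0) {
        int found = -1;

        /* Step 2: available = resource minus the column sums of allocation'. */
        for (j = 0; j < NRES; j++) {
            available[j] = resource[j];
            for (i = 0; i < NPROC; i++)
                available[j] -= alloc[i][j];
            assert(available[j] >= 0);
        }

        /* Step 3: look for a process whose remaining claim fits. */
        for (i = 0; i < NPROC && found < 0; i++) {
            int fits = !done[i];
            for (j = 0; j < NRES && fits; j++)
                if (claim[i][j] - alloc[i][j] > available[j])
                    fits = 0;
            if (fits)
                found = i;
        }
        if (found < 0)
            return 0;          /* nobody can finish: unsafe */

        /* Step 4: assume it runs to completion and releases everything. */
        for (j = 0; j < NRES; j++)
            alloc[found][j] = 0;
        done[found] = 1;
        remaining--;
    }
    return 1;                  /* every process could finish: safe */
}
```

To decide on a request, the resource manager could tentatively add the request to the allocation table, call a check like this one, and grant the request only if the result is safe.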

Deadlock Detection

Deadlock avoidance is conservative; it assumes that every process will need, at some point, to have all of its resources at the same time. A less conservative strategy watches to see if a deadlock does occur.

We can do this by using the same allocation matrix and available vector as in the avoidance strategy. We'll also add a matrix we'll call Q; q[i][j] is the amount of resource j requested by process i (but not yet granted). We don't need a claim matrix any more.

The idea is to look through all the processes for the ones that aren't deadlocked. We'll let M be a vector of processes that have been marked as non-deadlocked. Here's how we go:

  1. Initially, M is empty.
  2. Every process whose row in the allocation matrix is all 0 can be marked.
  3. Make a temporary vector avail', initially equal to the avail vector.
  4. Find a process i such that i is currently unmarked, and every element in the i-th row of Q is less than or equal to the corresponding element of avail'. If you can't find one, you're done.
  5. Mark process i and add the corresponding row of the allocation matrix to avail'. Return to step 4.

If there are any unmarked processes left when you finish this, you've got a deadlock.

The idea behind this algorithm is that you look for any processes whose outstanding resource requests can be satisfied, and assume that once the resources are granted the process will run to completion and release them (there's no guarantee it will really do this, of course, but if it could then the process is not involved in a deadlock).
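
Here is one way the marking procedure could look in C. Again this is just a sketch: the name detect_deadlock and the sizes NPROC and NRES are assumptions of the example, M is represented as an array of flags, and work[] plays the part of avail'.

```c
#include <assert.h>

#define NPROC 3   /* number of processes (illustrative size) */
#define NRES  2   /* number of resource classes (illustrative size) */

/* Sets marked[i] to 1 for every process found not to be deadlocked and
 * returns the number of processes left unmarked (0 means no deadlock). */
int detect_deadlock(int allocation[NPROC][NRES],
                    int q[NPROC][NRES],
                    const int avail[NRES],
                    int marked[NPROC])
{
    int work[NRES];            /* avail' */
    int i, j, progress, unmarked = 0;

    /* Steps 1-2: a process is marked at the start only if it holds nothing. */
    for (i = 0; i < NPROC; i++) {
        marked[i] = 1;
        for (j = 0; j < NRES; j++)
            if (allocation[i][j] != 0)
                marked[i] = 0;
    }

    /* Step 3: avail' starts out equal to avail. */
    for (j = 0; j < NRES; j++) {
        assert(avail[j] >= 0);
        work[j] = avail[j];
    }

    /* Steps 4-5: keep marking processes whose requests can be met. */
    do {
        progress = 0;
        for (i = 0; i < NPROC; i++) {
            int fits = !marked[i];
            for (j = 0; j < NRES && fits; j++)
                if (q[i][j] > work[j])
                    fits = 0;
            if (fits) {
                marked[i] = 1;
                /* Assume it finishes and gives back what it holds. */
                for (j = 0; j < NRES; j++)
                    work[j] += allocation[i][j];
                progress = 1;
            }
        }
    } while (progress);

    for (i = 0; i < NPROC; i++)
        if (!marked[i])
            unmarked++;
    return unmarked;
}
```

With two processes that each hold one resource and request the one the other holds, and nothing available, both stay unmarked; make one unit of either resource available and both get marked.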

You could run the detection algorithm only occasionally if deadlock is not expected to occur frequently, or as often as on every resource allocation request.

Recovery

Of course, discovering you've got a deadlock does you no good whatever unless you have some sort of recovery mechanism. Some possibilities, ranging from least-intrusive (but most difficult to implement) to most-intrusive but easy, include:

Final Note

Final note on deadlock: as a practical matter in modern interactive environments, we distinguish between deadlocks in the OS vs. in user processes. Within the OS, it becomes possible to sort resources by some criterion, and either eliminate the possibility of deadlock or reduce it to a small number of resources. So, in the case of Minix, so far as I know the only possibility of deadlock is on messages. This special case is watched for, and handled in an ad hoc manner.

Deadlock within user processes is considered to be a user error, and the users are on their own.

A Resource Allocation Spreadsheet

I've built a little spreadsheet in gnumeric, which accounts for resources in Tanenbaum's example on pp. 179-180. You're welcome to use it, or modify it, as you'd like. It's at http://www.cs.nmsu.edu/~pfeiffer/classes/474/notes/resources.gnumeric


Last modified: Sun Oct 2 12:54:38 MDT 2005