CS474 - Semaphores

Dijkstra defined the basic critical section interface that serves as the foundation for low-level process synchronization. He defined two primitives, called P and V (abbreviations for Dutch words), that operate on an abstract data type called a semaphore. The text translates the operations into English as down (replacing P) and up (replacing V). A down waits until the semaphore's value is positive and then decrements it; an up increments the value, which may let a waiting process proceed. Both are defined to be atomic operations.

There are two sub-types of semaphores. Binary semaphores can only take the values 0 or 1 and are used strictly for mutual exclusion. Counting semaphores can take any non-negative value, and are also used for resource management. (Note that the text's description of mutexes pretty much follows the definition of a binary semaphore.)

The text presents a variety of classic problems and their solutions using semaphores. Here's one of them, which uses both types of semaphore: the producer-consumer problem (this is a very slightly different formulation of the solution than that given in the book). Here, mutex is a binary semaphore, used to control access to the buffer; lock and unlock are its down and up. The counting semaphores space and full keep track of how much space is available and how many slots are already in use, respectively, so space starts at the number of slots in the buffer and full starts at 0.
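Producer:
while (1) {
    item = produce();
    down(space);      /* wait for a free slot */
    lock(mutex);      /* enter the critical section */
    put(item);
    unlock(mutex);    /* leave the critical section */
    up(full);         /* one more slot is in use */
}

Consumer:
while (1) {
    down(full);       /* wait for a slot that is in use */
    lock(mutex);      /* enter the critical section */
    item = get();
    unlock(mutex);    /* leave the critical section */
    up(space);        /* one more slot is free */
    consume(item);
}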
        
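
For anyone who wants to try this, here is a minimal runnable sketch of the same scheme using POSIX threads and unnamed POSIX semaphores, assuming a Linux system. The names SLOTS, ITEMS, buffer, in_pos, out_pos, consumed_sum and run_producer_consumer are made up for this illustration and do not come from the text; sem_wait and sem_post play the roles of down and up, and a pthread mutex stands in for the binary semaphore.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>

#define SLOTS 4     /* buffer capacity (hypothetical value) */
#define ITEMS 100   /* number of items to transfer (hypothetical value) */

static int buffer[SLOTS];
static int in_pos, out_pos;     /* next slot to fill / to empty */
static sem_t space, full;       /* counting semaphores */
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static long consumed_sum;       /* written only by the consumer thread */

static void *producer(void *arg) {
    (void)arg;
    for (int item = 1; item <= ITEMS; item++) {
        sem_wait(&space);               /* down(space) */
        pthread_mutex_lock(&mutex);     /* lock(mutex) */
        buffer[in_pos] = item;          /* put(item) */
        in_pos = (in_pos + 1) % SLOTS;
        pthread_mutex_unlock(&mutex);   /* unlock(mutex) */
        sem_post(&full);                /* up(full) */
    }
    return NULL;
}

static void *consumer(void *arg) {
    (void)arg;
    for (int i = 0; i < ITEMS; i++) {
        sem_wait(&full);                /* down(full) */
        pthread_mutex_lock(&mutex);     /* lock(mutex) */
        int item = buffer[out_pos];     /* item = get() */
        out_pos = (out_pos + 1) % SLOTS;
        pthread_mutex_unlock(&mutex);   /* unlock(mutex) */
        sem_post(&space);               /* up(space) */
        assert(item == i + 1);          /* items arrive in FIFO order */
        consumed_sum += item;           /* consume(item) */
    }
    return NULL;
}

/* Runs one producer and one consumer to completion and returns the
 * sum of everything consumed, or -1 if setup fails. */
long run_producer_consumer(void) {
    pthread_t p, c;
    in_pos = out_pos = 0;
    consumed_sum = 0;
    if (sem_init(&space, 0, SLOTS) != 0) return -1;  /* all slots free */
    if (sem_init(&full, 0, 0) != 0) return -1;       /* no slots in use */
    if (pthread_create(&p, NULL, producer, NULL) != 0) return -1;
    if (pthread_create(&c, NULL, consumer, NULL) != 0) return -1;
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    sem_destroy(&space);
    sem_destroy(&full);
    return consumed_sum;
}
```

Unlike the loops above, this version stops after ITEMS items so that it terminates. Older toolchains may need the -pthread flag when compiling.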

Semaphores in Unix

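Unix also provides a semaphore data type which processes can use to synchronize with each other. Three system calls under Linux manipulate these (System V) semaphores: semget creates or looks up a set of semaphores, semop performs up and down operations on the set, and semctl handles control operations such as setting an initial value or removing the set.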
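
As a rough sketch of how the three calls fit together, the following creates a private set holding one semaphore, sets it to 1, does a down and an up, and removes the set. It assumes Linux, where the caller has to declare union semun itself; the helper names sem_change and sysv_demo are invented for this example.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* On Linux the program must define this union for semctl's 4th argument. */
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Add delta to semaphore 0 of the set: -1 is a down, +1 is an up. */
static int sem_change(int semid, short delta) {
    struct sembuf op = { .sem_num = 0, .sem_op = delta, .sem_flg = 0 };
    return semop(semid, &op, 1);
}

/* Returns 0 if the semaphore held the expected values, -1 otherwise. */
int sysv_demo(void) {
    /* A private set containing one semaphore, accessible to the owner only. */
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (semid == -1)
        return -1;

    union semun arg = { .val = 1 };
    int ok = semctl(semid, 0, SETVAL, arg) != -1;    /* initial value 1 */

    ok = ok && sem_change(semid, -1) != -1;          /* down: 1 -> 0 */
    int during = semctl(semid, 0, GETVAL);
    ok = ok && sem_change(semid, +1) != -1;          /* up: 0 -> 1 */
    int after = semctl(semid, 0, GETVAL);

    semctl(semid, 0, IPC_RMID);                      /* remove the set */
    return (ok && during == 0 && after == 1) ? 0 : -1;
}
```

The set outlives the process unless it is removed, which is why the sketch always ends with IPC_RMID.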

Implementing Semaphores

One of the most important contributions Dijkstra made with semaphores was his decoupling of the notions of "what happens when you've got mutual exclusion" from "how do you do mutual exclusion?" He simply assumed a mutual exclusion mechanism that worked, and went on from there. However... I feel a bit more comfortable with these constructs if I can see how they can be done.

What we have to recognize is just that the code implementing a semaphore operation is, itself, a critical section. It's just a really small one. So we can use a less-efficient mechanism to guard access to the semaphore...

We can do a good job of this if we consider the single-processor and the multiple-processor cases separately. In the single-processor case, we can guard the code implementing the semaphore by disabling interrupts, since nothing else can run until they are re-enabled. In the multiple-processor case, disabling interrupts only affects the local processor, so we guard the code by disabling interrupts (on the local processor) and also acquiring a spin-lock implemented with something like a TSL (test-and-set-lock) instruction. This, in fact, is what Linux does to implement kernel locks.
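
To make the spin-lock idea concrete, here is a toy user-space semaphore whose counter is guarded by a spin-lock. It is only a sketch under some loud assumptions: C11's atomic_flag_test_and_set stands in for the TSL instruction, user code cannot disable interrupts so that part is left out, and down busy-waits where a real kernel would put the caller to sleep. All the toy_sem names are invented for this illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_flag guard;  /* spin-lock protecting count */
    int count;          /* the semaphore's value, never negative */
} toy_sem;

/* Static initializer: guard starts clear, count starts at n. */
#define TOY_SEM_INIT(n) { ATOMIC_FLAG_INIT, (n) }

/* Spin until the previous value of the flag was clear (TSL-style). */
void toy_guard_acquire(toy_sem *s) {
    while (atomic_flag_test_and_set_explicit(&s->guard, memory_order_acquire))
        ;  /* someone else is inside the tiny critical section */
}

void toy_guard_release(toy_sem *s) {
    atomic_flag_clear_explicit(&s->guard, memory_order_release);
}

/* One attempt at a down: decrement and return true if count was positive. */
bool toy_sem_trydown(toy_sem *s) {
    toy_guard_acquire(s);
    bool ok = s->count > 0;
    if (ok)
        s->count--;
    toy_guard_release(s);
    return ok;
}

/* down: retry until the decrement succeeds (a kernel would block instead). */
void toy_sem_down(toy_sem *s) {
    while (!toy_sem_trydown(s))
        ;  /* busy-wait, with the guard released between attempts */
}

/* up: increment under the guard. */
void toy_sem_up(toy_sem *s) {
    toy_guard_acquire(s);
    s->count++;
    toy_guard_release(s);
}
```

Note that the guard is only held for the few instructions that examine and change count, never while waiting for the semaphore itself. That is what keeps the less-efficient spinning confined to a really small critical section.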


Last modified: Fri Sep 9 11:24:31 MDT 2005