CS 574

Homework 3: Aggie DSM

Due Last Day of Class

For this assignment, you will implement a simple form of software distributed shared memory. A practical DSM system faces many issues that add a huge degree of complexity to the implementation; this system quite deliberately fails to deal with nearly all of them in the interest of simplicity. We simply won't worry about load balancing, trying to reduce the number of messages sent, integrating the DSM with the memory protection mechanisms, and a host of similar issues.

Processing Model

The basic idea here is that client processes will be able to allocate and deallocate DSM objects of arbitrary sizes. A daemon process will be responsible for keeping track of locks, and maintaining a master copy of the DSM objects.

When the first process to want to use a particular DSM object requests its creation, the daemon will allocate space for the daemon's master copy of the object, and the process will also allocate local space for it. Any further clients wishing to use the object will also be able to request it; they will allocate local space for it as before, but the daemon will not need to allocate more space (it only needs a single master copy of each DSM object, regardless of how many clients are using it). The daemon will maintain a reference count of all the clients with references to the object; when the reference count drops to 0 the daemon will be able to free its master copy.

DSM objects will be identified by 32-bit ``keys.'' A key is four ASCII characters; beyond that, it can be anything the programmer wants. So one programmer might use four-character ASCII strings such as "aaaa" and "aaab," another might use keys that correspond to something about the object size... anything they want, so long as it's four ASCII characters. When client processes are communicating with the daemon about DSM objects, they will identify them by key. If two processes want to refer to the same object, they use the same key when referring to it.
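To make the key idea concrete, here is a minimal sketch of going back and forth between four ASCII characters and a 32-bit key. The function names are hypothetical, and it uses uint32_t from <stdint.h> where the API below says __uint32_t; it also assumes the key's bytes in memory are simply the four characters in order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Pack four ASCII characters into a 32-bit key by copying the bytes as-is,
 * so the key's in-memory bytes are the characters themselves. */
uint32_t key_from_chars(const char chars[4])
{
    uint32_t key;
    assert(chars != NULL);
    memcpy(&key, chars, sizeof key);
    return key;
}

/* Unpack a key into a null-terminated string of four characters. */
void key_to_string(uint32_t key, char out[5])
{
    assert(out != NULL);
    memcpy(out, &key, sizeof key);
    out[4] = '\0';
}
```

The null-terminated form is handy anywhere a C string is expected; the packed form is what travels in messages.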

There will be two types of locks on DSM objects: read locks and write locks. As usual, any number of processes may share read locks, while only a single process can have a write lock (and if a process has a write lock, no further processes can have read locks).

Proper use of locks will be strictly voluntary: a process is free to read or even write a DSM object even if it doesn't have the necessary locks, but if it reads the object there is no guarantee the data won't be stale, and if it writes the object its changes won't be seen by other processes.

DSM Clients

It'll probably be clearest if we begin by describing the DSM system as seen by a client. There are three issues here: the application programmer's interface (API), the client-side data structures, and the messages used to communicate between the client and the daemon. This section will describe the first two; messages will be discussed later.

API

The programmer will use the following seven calls to manipulate the DSM system:

int dsm_init();
This call is used to connect to the DSM server, and to initialize any data structures necessary. It will return a 0 on success, and a -1 on failure.
int dsm_exit();
This will close the client's connection to the daemon. It will return a 0 on success, and a -1 on failure.
void *dsm_malloc(__uint32_t key, size_t bytes);
This call is in analogy to the malloc() C library call. It allocates a DSM object of the specified size, and associates that object with a key value. The key value is an arbitrary 32-bit value chosen by the programmer; if two processes use the same key value, then they refer to the same DSM object, which will be shared between the two processes. Of course, all the processes sharing a DSM object have to request that it be the same size.
The function returns a pointer to the allocated object if successful, or NULL on failure. The intent is that a programmer will use this function in a similar manner to the normal malloc() call; typically allocating something like a struct or an array.
void dsm_free(void *object);
This call is in analogy to the usual free() call. It notifies the DSM daemon that this process no longer has a pointer to the DSM object, and frees its local copy.
I'll happily agree with anybody who wants to argue that it would make more sense for free() to have some sort of return code, so the programmer has a fighting chance of knowing there's a program bug in the memory management code. But free() doesn't have a return value, so neither will dsm_free().
int dsm_none(__uint32_t key);
Give up all locks on the DSM object identified by the key; in effect, notify the DSM daemon that the process will not be reading or writing the DSM object until some future time (at which time the process will use one of the remaining calls to acquire necessary locks).
This will return a 0 on success, or a -1 on failure (examples of conditions on which this call could fail would be if the key didn't correspond to a DSM object in the system, or if the process already held no locks on the object).
int dsm_read(__uint32_t key);
Request a read lock on the DSM object identified by the key. If the process doesn't have a lock already, the daemon will respond by sending the client the current contents of the object. If the process previously had a write lock, the lock is downgraded.
This will not return until the daemon is able to give the client a read lock on the object. It will return a 0 on success, and a -1 on failure. Examples of conditions which would cause a failure would include if the key didn't correspond to a DSM object in the system, or if the process already held a read lock on the object.
int dsm_write(__uint32_t key);
Request a write lock on the DSM object identified by the key. If the process doesn't already have a lock on the object, the DSM server will send the process the current object contents. If the process previously held a read lock, the lock is upgraded.
This will not return until the daemon is able to give the client a write lock on the object. It will return a 0 on success, or a -1 on failure. Examples of conditions under which the call could fail would include if the key didn't refer to a DSM object in the system, or if the process already held a write lock on the object.

Client Data Structures

The client will need to keep track of the file descriptor used to communicate with the daemon, the locks on the DSM objects, the mapping from object keys to data areas, and the size of the data areas. This should be maintained as some sort of lookup table; you will probably find the hcreate(), hdestroy(), and hsearch() man pages very useful. Actually, it's these functions that are the reason I'm requiring keys be ASCII values, since they require null-terminated ASCII strings as keys.

These data structures are an exception to the normal rule of ``no global variables.'' These should be global to the DSM module, though they should be declared static so programs linked to the DSM module can't tamper with them.
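As an illustration of the lookup table, here is a minimal sketch built on hcreate() and hsearch(). The struct layout, the LockState enum, and the client_table_* names are all hypothetical; your table can look quite different. The file descriptor for the daemon connection would sit next to table_ready as another static variable.

```c
#include <assert.h>
#include <search.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef enum { LOCK_NONE, LOCK_READ, LOCK_WRITE } LockState; /* hypothetical */

typedef struct {
    char      key[5];  /* four ASCII characters plus '\0' for hsearch() */
    void     *data;    /* local copy of the DSM object */
    size_t    size;    /* size of the data area in bytes */
    LockState lock;    /* lock this client currently holds */
} ClientObject;

static int table_ready = 0;  /* static: hidden from programs linking the module */

/* Create the hash table once; returns 0 on success, -1 on failure. */
int client_table_init(size_t max_objects)
{
    if (!table_ready && hcreate(max_objects) != 0)
        table_ready = 1;
    return table_ready ? 0 : -1;
}

/* Record a new object under its key; returns NULL on failure or duplicate. */
ClientObject *client_table_add(const char key[4], size_t size)
{
    assert(key != NULL);
    ClientObject *obj = calloc(1, sizeof *obj);
    if (obj == NULL)
        return NULL;
    memcpy(obj->key, key, 4);        /* obj->key[4] is already '\0' */
    obj->data = calloc(1, size);
    obj->size = size;
    obj->lock = LOCK_NONE;

    /* The table keeps the key pointer, so the key lives inside the struct. */
    ENTRY item = { .key = obj->key, .data = obj };
    ENTRY *found = (obj->data != NULL) ? hsearch(item, ENTER) : NULL;
    if (found == NULL || found->data != obj) {
        free(obj->data);
        free(obj);
        return NULL;
    }
    return obj;
}

/* Look an object up by key; returns NULL if the key is unknown. */
ClientObject *client_table_find(const char key[4])
{
    assert(key != NULL);
    char name[5] = { 0 };
    memcpy(name, key, 4);
    ENTRY query = { .key = name, .data = NULL };
    ENTRY *found = hsearch(query, FIND);
    return found ? found->data : NULL;
}
```

One thing to watch: the POSIX hsearch() interface has no way to delete a single entry, so dsm_free() would need some other convention, such as marking the entry as freed.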

The DSM Daemon

This section will describe the structure of the DSM daemon. This includes its data structures and an outline of its processing.

Data Structures

The daemon's job is to maintain the DSM. This means that it will need to maintain the master copy of the DSM data, and will need to keep track of three things: what processes have access to parts of it, what processes need to acknowledge that their access to it has changed, and what processes have requested changes to their access which haven't been serviced yet.

In order to accomplish this, each allocated object will need to be represented by a struct with all the information relating to the locks, and a pointer to a buffer containing the object's data. There will have to be a lookup table mapping keys to these structs. As with the client data structures, you will probably find it very helpful to take a look at the man page for the hcreate(), hdestroy(), and hsearch() function calls.

This lookup table is, of course, an exception to the normal rules regarding global variables. It makes the most sense for it to be a global variable within the daemon.

How the Daemon Works

The daemon will operate somewhat similarly to the toupper_daemon of several weeks ago. When the daemon is first started, it will:

  1. Open a port at address 1024 or above
  2. Create a file named DSM_port in the same directory from which the daemon was started. Write whatever information you want, in whatever format you want, to allow a client to connect to your server.
  3. Now the daemon will sit and wait for connections, again like before.
  4. Things get different once a connection is made: the DSM daemon won't fork off any children. Instead, it will wait for connections, and wait for messages, in a single loop. You'll need to use the select() system call to implement this.
  5. As messages come in, the daemon will process them. We'll discuss the messages it has to handle in a later section.
  6. When the daemon receives a SIGHUP signal, it should delete the DSM_port file and terminate.
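The single-loop structure in steps 3 through 5 might be sketched with two small helpers around select(). The helper names are hypothetical, and this leaves out message handling, the DSM_port file, and the SIGHUP handler entirely.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Block until the listening socket or some client socket is readable.
 * On return, *ready holds the readable descriptors; the return value is
 * whatever select() returned (count of ready descriptors, or -1). */
int wait_for_ready(int listen_fd, const fd_set *clients, int max_fd,
                   fd_set *ready)
{
    assert(clients != NULL && ready != NULL);
    *ready = *clients;               /* select() overwrites its argument */
    FD_SET(listen_fd, ready);
    if (listen_fd > max_fd)
        max_fd = listen_fd;
    return select(max_fd + 1, ready, NULL, NULL, NULL);
}

/* Accept one pending connection and start watching it.
 * Returns the new descriptor, or -1 on failure. */
int add_client(int listen_fd, fd_set *clients, int *max_fd)
{
    assert(clients != NULL && max_fd != NULL);
    int fd = accept(listen_fd, NULL, NULL);
    if (fd < 0)
        return -1;
    if (fd >= FD_SETSIZE) {          /* too large to store in an fd_set */
        close(fd);
        return -1;
    }
    FD_SET(fd, clients);
    if (fd > *max_fd)
        *max_fd = fd;
    return fd;
}
```

The daemon's main loop would call wait_for_ready(), call add_client() if the listening descriptor is set in the ready set, and read one message from every other descriptor that is set.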

On the other hand: one of the students in the class has gotten RPC working. If you'd prefer, you can use RPC. An example of how to set up an RPC client and server is in the directory /user/pfeiffer/RPC. If you use RPC, the system will only work under Solaris (I don't know what the problem is, but when I build it under Solaris, or I build it at home, everything works great. When I build it under Linux at school, there is a protection violation when I start the client). If you choose to use RPC, I'll accept it even though it will only run under Solaris.

When the daemon receives a request for a change in the locks held on an object, it must determine whether the request can be granted:

Releasing locks
This request can always be granted. Once the lock has been released, the server looks to see if there is a queue of pending requests; if so, the requests in the queue are processed until one is encountered that can't be satisfied immediately.
Read lock
This can be granted if there are no locks, or the only locks currently on the object are read locks and there is no request queue built up. If the lock can't be granted, the request is placed on the queue. One exception to this rule is that if the process holding the write lock wishes to downgrade its lock to a read lock, the request is granted immediately. Also, in this case, the requests in the queue are examined and granted until one is encountered that can't be granted.
Write lock
This can be granted only if there are no locks already on the object, or the client requesting the write lock currently holds the only read lock on the object. In this latter case, the request can be granted immediately even if there is a non-empty request queue on the object.
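The grant rules above can be written down as two small predicates. This is only a sketch: the LockInfo struct, the Held enum, and the function names are hypothetical, and the request queue is represented by nothing more than its length.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { HELD_NONE, HELD_READ, HELD_WRITE } Held; /* requester's lock */

typedef struct {
    int readers;     /* number of clients holding read locks */
    int has_writer;  /* nonzero if some client holds the write lock */
    int queued;      /* number of requests waiting in the queue */
} LockInfo;

/* Read lock: granted if the object is unlocked, or if only read locks are
 * held and nothing is queued. A write-lock holder downgrading is always
 * granted immediately. */
int can_grant_read(const LockInfo *o, Held requester)
{
    assert(o != NULL);
    if (requester == HELD_WRITE)
        return 1;
    if (o->readers == 0 && !o->has_writer)
        return 1;
    return !o->has_writer && o->queued == 0;
}

/* Write lock: granted if the object is unlocked, or if the requester holds
 * the only read lock (in which case the queue is ignored). */
int can_grant_write(const LockInfo *o, Held requester)
{
    assert(o != NULL);
    if (o->readers == 0 && !o->has_writer)
        return 1;
    return requester == HELD_READ && o->readers == 1 && !o->has_writer;
}
```

A request for which the predicate returns 0 goes on the queue; after a release or a downgrade, the daemon would run the same predicates over the queued requests in order until one fails.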

Messages

At last, the interaction between the client and the daemon. When the programmer requests various locks, messages are sent to the daemon. In response, the daemon sends messages back to the client, and updates the master copy of the DSM objects. In some cases, it may need to send an acknowledgement back to the client, and may also need to send data to the client.

All data in messages will be sent in native host order.

The philosophy being followed here is that messages are sent between the client and the daemon only when necessary. If the client makes a request that can always be satisfied immediately (releasing or downgrading locks, allocating space, freeing space) the daemon does not need to acknowledge it. It is assumed that both the client and the daemon ``know'' the status of the client's locks on the objects, so the client will also know when to expect a response to a message and when not to.

Message Format: Client to Daemon

Messages from the client to the daemon use one byte to identify the message type followed by four bytes giving the key identifying the DSM object. Messages requesting allocation of DSM objects will then have four bytes giving the size of the object; messages giving up write locks will be followed by the new contents of the object (however large that may be).
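Here is a minimal sketch of laying out a client-to-daemon message in a buffer. The encode_* names are hypothetical. The sketch also assumes the size is carried as a uint32_t copied in host order, so a size_t passed to dsm_malloc() would have to be narrowed to fit the four bytes the format allows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum {MALLOC, FREE, NONE, READ, WRITE} MessType;

/* Write the fixed part of every request: one type byte, then the four
 * key bytes. Returns the number of bytes written (always 5). */
size_t encode_header(unsigned char *buf, MessType type, const char key[4])
{
    assert(buf != NULL && key != NULL);
    buf[0] = (unsigned char)type;
    memcpy(buf + 1, key, 4);
    return 5;
}

/* MALLOC request: the header followed by the object size as four bytes
 * in native host order. Returns the total length (always 9). */
size_t encode_malloc(unsigned char *buf, const char key[4], uint32_t size)
{
    size_t n = encode_header(buf, MALLOC, key);
    memcpy(buf + n, &size, sizeof size);
    return n + sizeof size;
}
```

A message giving up a write lock would use encode_header() and then append the object's contents directly after the key.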

Message Format: Daemon to Client

Since the daemon only sends messages to the client in response to client requests, it isn't necessary for them to include a message type or key. A plain acknowledgement message will just be one byte containing all 0's; messages sending the contents of a DSM object will be the size of the object.

Message Types

The message types are

typedef enum {MALLOC, FREE, NONE, READ, WRITE} MessType;

Client-Daemon Conversations

The possible conversations are as follows:

Allocating a DSM Object

Client:
MALLOC key size
Daemon:
the daemon does not need to respond

Freeing a DSM Object

Client:
FREE key
Daemon:
the daemon does not need to respond

Releasing All Locks

There are two cases. If the client previously held a read lock, the conversation is:
Client:
NONE key
If the client previously held a write lock:
Client:
NONE key data
where the data is an update to the contents of the object. In both cases, the daemon does not need to respond.

Requesting a Read Lock

Again, there are two cases. If the client previously held a write lock:

Client:
READ key data
Daemon:
the daemon does not need to respond

If the client previously held no lock:

Client:
READ key
Daemon:
data

Requesting a Write Lock

You guessed it: two cases. This time, the client's message is the same in both cases, but the daemon's response varies. If the client previously held no lock:

Client:
WRITE key
Daemon:
data

If the client previously held a read lock:

Client:
WRITE key
Daemon:
a plain acknowledgement (one byte containing all 0's)
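Since the client has to know what, if anything, to wait for after each request, the conversations above can be summarized as a pair of lookup functions. This is a sketch with hypothetical names (Hold, Reply, and both functions); it assumes the library has already rejected a request for a lock the client already holds.

```c
#include <assert.h>

typedef enum {MALLOC, FREE, NONE, READ, WRITE} MessType;
typedef enum { HOLD_NONE, HOLD_READ, HOLD_WRITE } Hold;    /* current lock */
typedef enum { REPLY_NOTHING, REPLY_ACK, REPLY_DATA } Reply;

/* Does the request carry the object's contents after the key?
 * Only when the client is giving up a write lock. */
int request_carries_data(MessType type, Hold held)
{
    return held == HOLD_WRITE && (type == NONE || type == READ);
}

/* What the client should wait for after sending the request. */
Reply expected_reply(MessType type, Hold held)
{
    /* Re-requesting a lock already held is an error caught earlier. */
    assert(!(type == READ && held == HOLD_READ));
    assert(!(type == WRITE && held == HOLD_WRITE));

    switch (type) {
    case READ:   /* downgrade from write: no reply; from nothing: data */
        return held == HOLD_NONE ? REPLY_DATA : REPLY_NOTHING;
    case WRITE:  /* upgrade from read: acknowledgement; from nothing: data */
        return held == HOLD_NONE ? REPLY_DATA : REPLY_ACK;
    default:     /* MALLOC, FREE, NONE: the daemon does not respond */
        return REPLY_NOTHING;
    }
}
```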

The Assignment

Oh, yes, the actual assignment: I want you to write the DSM daemon, and the DSM client-side library. We'll be discussing how to turn them in to me later. I'll also be writing a main program which, if your client-side library is written correctly, will link with the library properly. But there's time to spare on that.

I'm calling it Aggie DSM in honor of all the jokes in which Aggies are, shall we say, ``simple.''


Last modified: Thu Nov 29 21:59:10 MST 2001