C: User-defined Types, Structures and Linked Structures • Jonathan Cook

From old CSCI 2210 (CS 271) notes…

User Defined Data Types

Creating a program essentially boils down to two tasks: expressing the problem’s data in the types given by a programming language and then expressing the problem’s behavioral solution in the actions of the programming language.

Most of us think of programming as the second part, but the first part, the data expression, is probably more important! Frederick Brooks, a famous computer scientist who wrote The Mythical Man-Month said something like “show me your program without showing me your data representation and I’ll never understand your program, but show me your data representation first and the program will naturally follow.”

All programming languages allow users to define new types of data, and in C this is done with the typedef statement. New types are always built out of existing types, so that ultimately all data is represented by C’s built-in datatypes. Nevertheless, building your own types is very important in large programming projects. New type definition is also the basis for Object-Oriented Programming (although plain C does not take it that far). Typedef is used as:

typedef oldtype newtype

The above statement allows you to add a new name to C’s type system, called newtype, and whenever the compiler see’s it in your program, it knows that it is a synonym for the type oldtype. The following code is an example:

typedef int feet;
typedef int inches;
...
feet length;
inches extra;
...
length += extra*12;

Above, we have created two new types for our program, feet and inches, which are both represented by the built-in type int. We then declare two variables of these new types, and use them in an assignment statement.

What will C do if we had written:

length += extra;

The answer is: C doesn’t care! All that C sees is that you have two integer variables, and so it thinks the assignment is fine. It doesn’t understand anything about how your new types relate to each other. This is very unfortunate, because C could easily have been made to disallow such assignments. C is a weakly typed language, which means that it doesn’t do all it could in checking your usage of types in the program, and this is one of the main points against it.

So why are typedef’s useful? Well, for one it helps with the readability of your program. Another programmer seeing a variable defined as feet will understand alot more than if they saw the variable defined as int. Secondly, if you need to modify the program to, say, use long for feet rather than int, you only need to change it in one place. Finally, typedef’s become very important with aggregate data types, as we’ll see next.

Aggregate Data Types

Arrays are aggregate data types. They allow you to declare a variable that holds a collection of values, all of the same type. (Alternatively, you can think of an array as a collection of variables, all of the same type).

Structures allow you to declare variables that hold a collection of values of different types. (Alternatively, a collection of variables of different types). Structures are very important. In C a structure is declared using the struct statement.

Structures also define new types. In old C, you had to use a typedef along with a struct to define the new type. C++ allows the struct alone to define the new type. A structure definition looks like (plain C on left (first), C++ on right (second)):

typedef struct point_struct  
{  
   int x;  
   int y;  
   int z;  
   float color;
} Point;

struct Point
{
   int x;
   int y;
   int z;
   float color;
};

Linked Structures

The two ideas of structures for holding data records and pointers pointing to allocated data items can be combined to form very powerful data modeling and access methods in programs.

The basic idea for this is to create structures which contain as fields pointers to other structures. These other structures might be of a different type, or might be of the same type as themselves. In this way we can create large connected structures of data.

Linked Lists

The most basic type of linked structure is a linked list. In this simple mechanism, each structure contains a pointer to one other structure of the same type. The pointer field of one data item points to another data item, whose pointer points to the next, and so on. Thus, the data items form a linked chain of items, which computer scientists call a linked list.

The first item in a linked list is called the head of the list.

An example of a struct declaration that will be used as a linked list is:

typedef struct PointStruct
{
   int X;
   int Y;
   int Z;
   struct PointStruct *Next;
} PointStruct;

This struct contains the position information about a point in a graphics scene, along with a pointer to the next point. We could use this linked list to represent the corners of a polygon. Our polygon definition could be something like:

typedef struct PolygonStruct
{
   int FillColor;
   int BorderColor;
   struct PointStruct *Corners;
   struct PolygonStruct *Next;
} PolygonStruct;

With this definition, each polygon records points to a list of point records that are its corners, and it points to the next polygon record.

The last data item in a list must somehow signal that there are no more items in the list. As we said before, a NULL pointer value (a pointer value of 0), indicates that the pointer is invalid. For linked structures, this value indicates that the current items is the last item in the list. The last item MUST have a next pointer field with a value of NULL!

Processing a linked list

Lists are typically used when one needs to store an unbounded number of data items (and thus cannot use an array), but where no efficient searching and sorting methods are needed. List are generally processed completely each time something needs to be done with them. For example, if we wanted to draw all of our polygons, we would could do the following:

PolygonStruct TmpPoly;
TmpPoly = ScenePolygons; // assume a global pointer to head of list
while (TmpPoly != NULL)
{
   DrawPolygon(TmpPoly);
   TmpPoly = TmpPoly->Next;
}

This loop is a typical linked list processing loop. It simply walks a temporary pointer down the list, and calls a function DrawPolygon on each item in the list. It stops when the temporary pointer is NULL, indicating the end of the list. The function DrawPolygon would look something like:

int DrawPolygon(PolygonStruct *Poly)
{
   PointStruct TmpPoint;
   TmpPoint = Poly->Corners;
   // while there is a next point
   while (TmpPoint->Next != NULL)
   {
      // draw line from current point to next point
      DrawLine(TmpPoint, TmpPoint->Next, Poly->BorderColor);
      TmpPoint = TmpPoint->Next;
   }
   // draw line from last point to first point
   DrawLine(TmpPoint, Poly->Corners, Poly->BorderColor);
   return 0;
}

This function uses the list of points in a slightly more complicated manner than simply walking the list of points. This is because each side of the polygon is a line between two consecutive points, so we need a pair of points to draw the line.

So, rather than checking if our current pointer is null, we check wether the next pointer is null. If it is not, we know we have at least one more point, and thus we can draw a line between the current point and the next one. At the end, to close the polygon we have to draw a line from the last point to the first. Note that this function does nothing with the FillColor field of the polygon struct. Doing filled polygons is a little too complicated for this introduction!

Other resources

About.com's struct help page