C: User-defined Types, Structures and Linked Structures
From old CS271 notes…
User Defined Data Types
Creating a program essentially boils down to two tasks: expressing the problem’s data in the types given by a programming language and then expressing the problem’s behavioral solution in the actions of the programming language.
Most of us think of programming as the second part, but the first part, the data expression, is probably more important! Frederick Brooks, a famous computer scientist who wrote The Mythical Man-Month said something like “show me your program without showing me your data representation and I’ll never understand your program, but show me your data representation first and the program will naturally follow.”
All programming languages allow users to define new types of
data, and in C this is done with the typedef
statement.
New types are always built out of existing types, so that
ultimately all data is represented by C’s built-in datatypes.
Nevertheless, building your own types is very important in large
programming projects. New type definition is also the basis
for Object-Oriented Programming (although plain C does not take
it that far). Typedef is used as:
typedef oldtype newtype
The above statement allows you to add a new name to C’s
type system, called newtype
, and whenever the compiler
see’s it in your program, it knows that it is a synonym for
the type oldtype
. The following code is an example:
typedef int feet;
typedef int inches;
...
feet length;
inches extra;
...
length += extra*12;
Above, we have created two new types for our program, feet
and inches
, which are both represented by the built-in
type int
. We then declare two variables of these new
types, and use them in an assignment statement.
What will C do if we had written:
length += extra;
The answer is: C doesn’t care! All that C sees is that you have two integer variables, and so it thinks the assignment is fine. It doesn’t understand anything about how your new types relate to each other. This is very unfortunate, because C could easily have been made to disallow such assignments. C is a weakly typed language, which means that it doesn’t do all it could in checking your usage of types in the program, and this is one of the main points against it.
So why are typedef
’s useful? Well, for one it helps with
the readability of your program. Another programmer seeing a
variable defined as feet
will understand alot more than
if they saw the variable defined as int
. Secondly, if
you need to modify the program to, say, use long
for
feet rather than int, you only need to change it in one place.
Finally, typedef’s become very important with aggregate data
types, as we’ll see next.
Aggregate Data Types
Arrays are aggregate data types. They allow you to declare a variable that holds a collection of values, all of the same type. (Alternatively, you can think of an array as a collection of variables, all of the same type).
Structures allow you to declare variables that hold a
collection of values of different types. (Alternatively,
a collection of variables of different types). Structures are
very important. In C a structure is declared using
the struct
statement.
Structures also define new types. In old C, you had to use a
typedef
along with a struct
to define the
new type. C++ allows the struct
alone to define the
new type. A structure definition looks like (plain C on left (first),
C++ on right (second)):
typedef struct point_struct
{
int x;
int y;
int z;
float color;
} Point;
struct Point
{
int x;
int y;
int z;
float color;
};
Linked Structures
The two ideas of structures for holding data records and pointers pointing to allocated data items can be combined to form very powerful data modeling and access methods in programs.
The basic idea for this is to create structures which contain as fields pointers to other structures. These other structures might be of a different type, or might be of the same type as themselves. In this way we can create large connected structures of data.
Linked Lists
The most basic type of linked structure is a linked list. In this simple mechanism, each structure contains a pointer to one other structure of the same type. The pointer field of one data item points to another data item, whose pointer points to the next, and so on. Thus, the data items form a linked chain of items, which computer scientists call a linked list.
The first item in a linked list is called the head of the list.
An example of a struct declaration that will be used as a linked list is:
typedef struct PointStruct
{
int X;
int Y;
int Z;
struct PointStruct *Next;
} PointStruct;
This struct contains the position information about a point in a graphics scene, along with a pointer to the next point. We could use this linked list to represent the corners of a polygon. Our polygon definition could be something like:
typedef struct PolygonStruct
{
int FillColor;
int BorderColor;
struct PointStruct *Corners;
struct PolygonStruct *Next;
} PolygonStruct;
With this definition, each polygon records points to a list of point records that are its corners, and it points to the next polygon record.
The last data item in a list must somehow signal that there are no more items in the list. As we said before, a NULL pointer value (a pointer value of 0), indicates that the pointer is invalid. For linked structures, this value indicates that the current items is the last item in the list. The last item MUST have a next pointer field with a value of NULL!
Processing a linked list
Lists are typically used when one needs to store an unbounded number of data items (and thus cannot use an array), but where no efficient searching and sorting methods are needed. List are generally processed completely each time something needs to be done with them. For example, if we wanted to draw all of our polygons, we would could do the following:
PolygonStruct TmpPoly;
TmpPoly = ScenePolygons; // assume a global pointer to head of list
while (TmpPoly != NULL)
{
DrawPolygon(TmpPoly);
TmpPoly = TmpPoly->Next;
}
This loop is a typical linked list processing loop. It simply
walks a temporary pointer down the list, and calls a
function DrawPolygon
on each item in the list. It stops
when the temporary pointer is NULL, indicating the end of the
list. The function DrawPolygon
would look something
like:
int DrawPolygon(PolygonStruct *Poly)
{
PointStruct TmpPoint;
TmpPoint = Poly->Corners;
// while there is a next point
while (TmpPoint->Next != NULL)
{
// draw line from current point to next point
DrawLine(TmpPoint, TmpPoint->Next, Poly->BorderColor);
TmpPoint = TmpPoint->Next;
}
// draw line from last point to first point
DrawLine(TmpPoint, Poly->Corners, Poly->BorderColor);
return 0;
}
This function uses the list of points in a slightly more complicated manner than simply walking the list of points. This is because each side of the polygon is a line between two consecutive points, so we need a pair of points to draw the line.
So, rather than checking if our current pointer is null, we check
wether the next pointer is null. If it is not, we know we
have at least one more point, and thus we can draw a line between
the current point and the next one. At the end, to close the
polygon we have to draw a line from the last point to the first.
Note that this function does nothing with the FillColor
field of the polygon struct. Doing filled polygons is a little
too complicated for this introduction!