To test understanding of arrays and pointers, especially the use of an array of pointers to characters (strings).
Write a program in C to build a concordance from a piece of arbitrary text. A concordance is a list of all the different words that occur in a piece of text.
Assuming the concordance will essentially be an array of pointers to characters (i.e. strings), a number of problems will have to be solved:
The success of your program is dependent on answering these questions correctly, incorporating them into a design, and then writing the appropriate code.
Your design should use at least three functions, including the main function (see the section below).
A pointer to each new word that you read from the input stream must be stored in the array of pointers. Your design should include a separate function to allocate space for the word. I will be sending you the correct code for this operation via email. You should take my code and incorporate it, with suitable comments, into your function. Here is what you will get:
/***********************************************************************
This function takes a pointer to a character that is assumed to be the
start of an array. The length of the string (up to but not including the
null character at the end) is calculated and space allocated for a copy
of the string using calloc, a standard library function. The
string is then copied into the new space by strcpy, a string library
function. Finally, a pointer to the new string is returned. To use this
function include both <stdlib.h> and <string.h> in your program.
************************************************************************/
char *AllocateString(char *s)
{
    char *t = calloc(strlen(s) + 1, sizeof(char));
    strcpy(t, s);
    return t;
}
Note that you will need to include both stdlib.h and string.h for this to work.
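As a rough sketch of how a copied word might be placed into the array of pointers: the helper StoreWord and the limit MAX_WORDS below are hypothetical names, not part of the handout, and the capacity of 1000 is only an assumption. AllocateString is repeated so that the fragment compiles on its own.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_WORDS 1000   /* hypothetical capacity; the handout gives no limit */

/* Copy of the handout's function, repeated so this sketch compiles alone. */
char *AllocateString(char *s)
{
    char *t = calloc(strlen(s) + 1, sizeof(char));
    strcpy(t, s);
    return t;
}

/* Hypothetical helper: store a private copy of word in the next free slot.
   Returns 1 on success, 0 if the array is already full. */
int StoreWord(char *list[], int *count, char *word)
{
    if (*count >= MAX_WORDS)
        return 0;
    list[*count] = AllocateString(word);   /* keep the copy, not the buffer */
    (*count)++;
    return 1;
}
```

The copy matters because the buffer used for reading is normally reused for the next word; storing a pointer to the buffer itself would leave every entry pointing at the same text.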
So that the concordance list has no duplicates, you must check the list every time a word is read from the input stream to see whether the word is already present. Use the library function strcmp to do this; it is declared in the header file string.h, which you must include at the top of your file.
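One possible shape for the duplicate check is a simple linear search. The function name FindWord and its parameters are hypothetical; the only library call used is strcmp, which returns 0 when two strings are equal.

```c
#include <string.h>

/* Hypothetical helper: return the index of word in list, or -1 if absent. */
int FindWord(char *list[], int count, const char *word)
{
    int i;

    for (i = 0; i < count; i++) {
        if (strcmp(list[i], word) == 0)   /* 0 means the strings are equal */
            return i;
    }
    return -1;
}
```

A new word would then be stored only when the search returns -1.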
Although case conversion and character testing can be handled easily using your own C code, the library header ctype.h contains a number of useful routines for handling characters. In particular, "tolower" converts a character to lower case, while "toupper" converts it to upper case. In addition, there are testing functions such as "isdigit" and "isalpha". Type "man ctype" at the UNIX prompt to get a full list of these routines.
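A minimal sketch of how isalpha and tolower might be combined is given here. It assumes that words should be compared without regard to case and that anything other than a letter is dropped; the handout does not state either rule, and the name NormalizeWord is hypothetical.

```c
#include <ctype.h>

/* Hypothetical helper: keep only the letters of word and convert them
   to lower case, rewriting the string in place. */
void NormalizeWord(char *word)
{
    char *src = word;
    char *dst = word;

    while (*src != '\0') {
        /* The ctype routines expect an unsigned char value (or EOF). */
        unsigned char c = (unsigned char)*src;

        if (isalpha(c))
            *dst++ = (char)tolower(c);
        src++;
    }
    *dst = '\0';
}
```

With this rule "The" and "the" become the same entry in the list.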
The input will be from standard input (no file opening and closing is necessary) and can be any piece of text. Do not use a C program as input. Instead you should use a piece of normal English text. We will test your program on a simple piece of text that we will choose.
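Reading one word at a time could be sketched as follows, on the assumption that a word is an unbroken run of letters. The name ReadWord, its parameters, and the choice to discard the excess letters of an over-long word are all hypothetical. The stream is passed in as a parameter so that the function can be tried on any stream; in the program itself it would be called with stdin.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: read the next run of letters from in into buf
   (at most size - 1 characters, always null-terminated).
   Returns 1 if a word was read, 0 at end of input. */
int ReadWord(FILE *in, char *buf, int size)
{
    int c;
    int n = 0;

    /* Skip everything that is not a letter. */
    while ((c = getc(in)) != EOF && !isalpha(c))
        ;
    if (c == EOF)
        return 0;

    /* Collect letters; any that do not fit in buf are discarded. */
    do {
        if (n < size - 1)
            buf[n++] = (char)c;
    } while ((c = getc(in)) != EOF && isalpha(c));

    buf[n] = '\0';
    return 1;
}
```

A typical loop would be: while (ReadWord(stdin, buf, sizeof buf)) { ... }, with the body checking the list and storing the word if it is new.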
Your program should print a list of the different words found in the text, for example:
1: the
2: cat
3: sat
4: on
5: mat
6: she
7: was
8: sleepy
9: and
10: comfortable
...
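The output step might look like the sketch below. It assumes one numbered entry per line, counting from 1; the name PrintConcordance is hypothetical, and the output stream is a parameter only so that the function is easy to try out (the program would pass stdout).

```c
#include <stdio.h>

/* Hypothetical helper: print each stored word on its own line,
   numbered from 1, in the order the words were first seen. */
void PrintConcordance(FILE *out, char *list[], int count)
{
    int i;

    for (i = 0; i < count; i++)
        fprintf(out, "%d: %s\n", i + 1, list[i]);
}
```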
April 16th, 1997, before 5:00pm.