CS177/457
C++ Programming
Spring 1998
A Text Analyzer
Write a program to read any text file and print out a table listing
the frequency of occurrence of each letter of the alphabet. The program
should also print a count of all the whitespace characters, and all the
non-alphabetic, non-whitespace characters in the file, and the text itself.
Procedure
-
Study the attached class declaration of CType and derive three new
classes from it: one for alphabetic characters, one for whitespace characters
and one for all others. You will need to use the constructor and member
functions addChar and addRange for CType to do this. Each class should
encapsulate a counting mechanism specialized for it.
-
Design an application class to prompt for a file name, read the text file
and analyze it using the classes you have derived from CType.
-
Write and compile the program.
-
Test your program using suitable text files.
Hints
-
The crucial part of this assignment is understanding CType and how to use
it. Leave the code for CType untouched--you will not need to change it.
-
Use the same function for counting (with different definitions) in each
of your derived classes. This way the code for the application (the run
function) will be uniform across all three classes.
-
For alphabetic characters, use an integer array to hold the character counts,
indexed by the character's ASCII value offset by the value for 'a', i.e.
the count for 'a' will be in element 0, the count for 'b' in element 1,
and so on. You will also have to treat upper and lower case
letters the same, e.g. 'N' is the same as 'n', 'P' the same as 'p', and
so on. Use the values of 'a' and 'A' to figure this out. You can also use
the functions in the C library with header file ctype.h to do this. (Note
that this has nothing to do with my class CType.) For whitespace and all other
characters, a simple count is all that is needed.
-
Make sure you handle end of file correctly. It is not a character, and
should not be counted with any of the other groups.
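The indexing scheme in the alphabetic hint can be sketched as follows. The function names isLetter, letterIndex and tallyLetters are hypothetical, and the code assumes an ASCII-like character set in which 'a'..'z' and 'A'..'Z' are each contiguous. The ctype.h functions isalpha and tolower could replace the hand-written tests.

```cpp
#include <cassert>

// True for the 52 ASCII letters (assumes contiguous letter codes).
bool isLetter(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

// Map a letter to an index 0..25, folding upper case onto lower case
// by using the values of 'A' and 'a'.
int letterIndex(char c) {
    assert(isLetter(c));                 // precondition: c must be a letter
    if (c >= 'A' && c <= 'Z')
        c = static_cast<char>(c - 'A' + 'a');
    return c - 'a';
}

// Add the letters of a C string to counts[0..25]; other characters are skipped.
void tallyLetters(const char *text, int counts[26]) {
    for (const char *p = text; *p != '\0'; ++p)
        if (isLetter(*p))
            ++counts[letterIndex(*p)];
}
```

With this mapping, 'N' and 'n' both land in element 13, so a single 26-element array covers both cases.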
Reading from a text file is just as easy as using standard input. e.g.
    #include <iostream.h>   // standard input and output
    #include <fstream.h>    // file handling capabilities
    ...
    char fn[256];
    cin >> fn;              // reads the file name from the user
    ifstream in(fn);        // opens the file for reading
    ...
    char c;
    in.get(c)...            // reads a single character from the file
    ...
    if (in.eof())...        // tests for end of file condition
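One way to put the fragments above together is to let the result of get() control the loop, so the loop body never runs for the end-of-file condition. This is a sketch under stated assumptions, not the required design: countChars and countCharsInFile are hypothetical names, and the code uses the standard headers <fstream> and <istream> with the std:: prefix, where older compilers would use iostream.h and fstream.h.

```cpp
#include <cassert>
#include <fstream>
#include <istream>

// Count the characters available on a stream.
// get() returns the stream, which tests false once a read fails,
// so end of file is never counted as a character.
long countChars(std::istream &in) {
    long n = 0;
    char c;
    while (in.get(c))
        ++n;
    return n;
}

// Hypothetical wrapper: open a named file and count its characters.
// Returns -1 if the file cannot be opened.
long countCharsInFile(const char *fileName) {
    assert(fileName != 0);               // precondition: a name was supplied
    std::ifstream in(fileName);
    if (!in)
        return -1;
    return countChars(in);
}
```

In the analyzer, the body of the loop would hand each character to the three derived classes instead of incrementing a single counter.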
Deliverables
-
A printout of your program source code.
-
Printouts of your program running on your test files.
-
Printouts of the test files you used.
Due Date
Hand in your documents to me (RTH) on Monday, May 11th, before 5:00
pm. Mail your source code to the grader: login hhuang.
Here is the code for the class CType (download it):
class CType {
private:
    class CSubRange {              // a private nested class
    private:
        char low, high;            // the upper and lower bounds of the range
    public:
        CSubRange(int l = 0, int h = 0) : low(l), high(h) {}
        bool inRange(char c) { return c >= low && c <= high; }
    };
protected:
    char *singles;                 // a pointer to an array of characters
    CSubRange *ranges;             // a pointer to an array of sub-ranges
    int nSingles, nRanges;         // the number of characters in singles and
                                   // the number of sub-ranges in ranges
public:
    CType(int nMaxRanges = 0, int nMaxSingles = 0) :
        singles(new char[nMaxSingles]), ranges(new CSubRange[nMaxRanges]),
        nSingles(0), nRanges(0) {}
    ~CType() { delete [] singles; delete [] ranges; }
    void addChar(char c) {         // add a new character
        singles[nSingles++] = c;
    }
    void addRange(int l, int h) {  // add a new sub-range
        ranges[nRanges++] = CSubRange(l, h);
    }
    bool contains(char c) const;   // returns true if c is a member of
                                   // the type
};
bool CType::contains(char c) const {
    // is it one of the single characters?
    for (int s = 0; s < nSingles; s++)
        if (c == singles[s])
            return true;
    // is it in one of the ranges?
    for (int r = 0; r < nRanges; r++)
        if (ranges[r].inRange(c))
            return true;
    return false;
}
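As a rough illustration of how a class might be derived from CType, the sketch below builds a whitespace type from one sub-range and one single character and adds a simple counter. The names CWhitespace, count and getTotal are hypothetical, not part of the handout, and the choice of whitespace characters (tab through carriage return, plus space, assuming ASCII) is an assumption. CType is repeated in condensed form, with unchanged behavior, only so that the fragment compiles on its own. Because CType does no bounds checking, the capacities passed to its constructor must cover every addRange and addChar call.

```cpp
#include <cassert>

// CType as given above, condensed; behavior is unchanged.
class CType {
private:
    class CSubRange {
    private:
        char low, high;
    public:
        CSubRange(int l = 0, int h = 0) : low(l), high(h) {}
        bool inRange(char c) { return c >= low && c <= high; }
    };
protected:
    char *singles;
    CSubRange *ranges;
    int nSingles, nRanges;
public:
    CType(int nMaxRanges = 0, int nMaxSingles = 0) :
        singles(new char[nMaxSingles]), ranges(new CSubRange[nMaxRanges]),
        nSingles(0), nRanges(0) {}
    ~CType() { delete [] singles; delete [] ranges; }
    void addChar(char c) { singles[nSingles++] = c; }
    void addRange(int l, int h) { ranges[nRanges++] = CSubRange(l, h); }
    bool contains(char c) const;
};

bool CType::contains(char c) const {
    for (int s = 0; s < nSingles; s++)
        if (c == singles[s]) return true;
    for (int r = 0; r < nRanges; r++)
        if (ranges[r].inRange(c)) return true;
    return false;
}

// Hypothetical derived class: whitespace characters with a simple count.
class CWhitespace : public CType {
private:
    int total;                            // whitespace characters seen so far
public:
    CWhitespace() : CType(1, 1), total(0) {   // room for 1 range, 1 single
        addRange('\t', '\r');             // tab, newline, VT, FF, CR (ASCII 9..13)
        addChar(' ');
        assert(nRanges == 1 && nSingles == 1);  // stayed within the capacities
    }
    // Count c if it belongs to this type; returns true when it was counted.
    bool count(char c) {
        if (!contains(c)) return false;
        ++total;
        return true;
    }
    int getTotal() const { return total; }
};
```

Calling count(c) for each character read and getTotal() at the end would give the whitespace figure; the other two classes could follow the same pattern with their own ranges and their own definition of count.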