This assignment tests your understanding of arrays and strings, especially the use of an array of objects. The pages to read in the textbook are 68-71.
Write a program in Java to build and print a concordance from a piece of arbitrary English text. A concordance is a list of all the different words that occur in a piece of text, together with the number of times each occurs.
Assuming the concordance will essentially be an array of objects, each of which contains a string and a count, a number of problems will have to be solved:
The success of your program is dependent on answering these questions correctly, incorporating them into a design, and then writing the appropriate code.
Your design should use three classes: one that stores a word and its frequency count, one that stores the concordance list itself, and one that contains the main method.
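The three-class layout described above might be sketched as follows. This is only one possible arrangement, not a required design: the class names WordEntry, ConcordanceList and ConcordanceDemo, the initial capacity of 16, and the decision to double the array when it fills up are all assumptions made for illustration. The main method feeds a few fixed words rather than reading a file.

```java
// Hypothetical names throughout: WordEntry, ConcordanceList, ConcordanceDemo.

/** Stores one word and the number of times it has been seen. */
class WordEntry {
    private final String word;
    private int count;

    WordEntry(String word) {
        this.word = word;
        this.count = 1; // an entry is created on the first sighting of a word
    }

    String getWord() { return word; }
    int getCount() { return count; }
    void increment() { count++; }
}

/** Stores the concordance as an array of WordEntry objects. */
class ConcordanceList {
    private WordEntry[] entries = new WordEntry[16]; // assumed initial capacity
    private int size = 0;

    /** Adds a new word, or increments the count if the word is already listed. */
    void add(String word) {
        for (int i = 0; i < size; i++) {
            if (entries[i].getWord().equals(word)) {
                entries[i].increment();
                return;
            }
        }
        if (size == entries.length) {
            // Assumption: when the array is full, copy it into one twice as large.
            WordEntry[] bigger = new WordEntry[entries.length * 2];
            System.arraycopy(entries, 0, bigger, 0, size);
            entries = bigger;
        }
        entries[size++] = new WordEntry(word);
    }

    int size() { return size; }
    WordEntry get(int i) { return entries[i]; }
}

/** Holds the main method; here it uses a fixed word list instead of a file. */
class ConcordanceDemo {
    public static void main(String[] args) {
        ConcordanceList list = new ConcordanceList();
        String[] sample = {"the", "cat", "sat", "on", "the", "mat"};
        for (String w : sample) {
            list.add(w);
        }
        for (int i = 0; i < list.size(); i++) {
            WordEntry e = list.get(i);
            System.out.println((i + 1) + ": " + e.getWord() + " " + e.getCount());
        }
    }
}
```

Keeping the array private inside ConcordanceList means the main class never needs to know how the words are stored or how the array grows.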
So that the concordance list has no duplicates, you must check the list every time a word is read from the input stream to see whether the word is already present. To do this you will have to compare two strings. Since instances of the String class are objects, you cannot use the == operator, which only tests whether two references point to the same object. You must use the equals method, which compares the characters. See the documentation for class String for all the String methods.
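A small illustration of the difference between == and equals, together with a linear search of the kind the duplicate check needs. The class name StringCompareDemo and the helper indexOf are hypothetical and exist only for this sketch.

```java
class StringCompareDemo {
    /** Returns the index of target in words[0..size-1], or -1 if it is absent. */
    static int indexOf(String[] words, int size, String target) {
        for (int i = 0; i < size; i++) {
            if (words[i].equals(target)) { // compares characters, not references
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String a = "cat";
        String b = new String("cat"); // a distinct object with the same characters
        System.out.println(a == b);      // false: two different objects
        System.out.println(a.equals(b)); // true: same sequence of characters

        String[] words = {"the", "cat", "sat"};
        System.out.println(indexOf(words, 3, b)); // 1
    }
}
```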
Although case conversion and character tests can be handled easily with your own Java code, the Character class contains a number of useful routines for handling characters. In particular, the class method "toLowerCase" converts a character to lower case, while "toUpperCase" converts it to upper case. In addition, there are testing functions such as "isDigit" and "isLetter". Visit the documentation page for the class Character.
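As a rough sketch of how the Character routines could be used, the hypothetical helper below keeps only the letters of a token and converts them to lower case. Dropping every non-letter character, including apostrophes and digits, is an assumption of this sketch; the assignment does not say how such characters should be treated.

```java
class WordCleaner {
    /** Returns the letters of raw, in order, converted to lower case. */
    static String clean(String raw) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (Character.isLetter(c)) {       // skip punctuation, digits, etc.
                sb.append(Character.toLowerCase(c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(clean("The"));  // the
        System.out.println(clean("mat.")); // mat
    }
}
```

A token that contains no letters at all comes back as the empty string, so the caller would need to skip empty results before adding a word to the concordance.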
The input will be from a file and can be any piece of text. Do not use a Java program as input. Instead you should use a piece of normal English text. We will test your program on a simple piece of text that we will choose.
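One possible way to read whitespace-separated tokens from a file is with java.util.Scanner, sketched below. The class name WordReader and the method countTokens are hypothetical, and taking the file name from the first command-line argument is an assumption; the assignment does not say how the file is named. The method accepts any Reader so that it can be tried on a string as well as on a file.

```java
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Scanner;

class WordReader {
    /**
     * Counts the whitespace-separated tokens in source.
     * A real program would pass each token on to the concordance instead.
     */
    static int countTokens(Reader source) {
        int n = 0;
        Scanner in = new Scanner(source);
        while (in.hasNext()) {
            in.next(); // the next token, e.g. "mat." with its punctuation still attached
            n++;
        }
        in.close(); // also closes the underlying reader
        return n;
    }

    public static void main(String[] args) throws IOException {
        if (args.length > 0) {
            // Assumption: the input file name is given as the first argument.
            System.out.println(countTokens(new FileReader(args[0])));
        } else {
            System.out.println(countTokens(new StringReader("The cat sat on the mat."))); // 6
        }
    }
}
```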
Your program should print a list of the different words found in the text, and the number of times each occurs. For example, for the text:
The cat sat on the mat. She was sleepy and comfortable on the mat.
You should get:
1: the 3
2: cat 1
3: sat 1
4: on 2
5: mat 2
6: she 1
7: was 1
8: sleepy 1
9: and 1
10: comfortable 1
September 26th, 1997, before 5:00pm.