In the following cases, a SAX parser has the advantage over a DOM parser.
- The input document is too big for the available memory (in this case SAX is effectively your only choice).
- You can process the document in small contiguous chunks of input. You do not need the entire document before you can do useful work.
- You just want to use the parser to extract the information of interest, and all your computation will be based entirely on data structures you create yourself. In most of our applications, we create our own data structures, which are usually not as complicated as the DOM tree. In this sense, I think a DOM parser is needed less often than a SAX parser.
In the following cases, a DOM parser has the advantage over a SAX parser.
- Your application needs to access widely separated parts of the document at the same time.
- Your application would probably use an internal data structure that is almost as complicated as the document itself.
- Your application has to modify the document repeatedly.
- Your application has to store the document for a significant amount of time through many method calls.
Example (use a DOM parser or a SAX parser?):
Assume that an instructor has an XML document containing all the personal information of his students
as well as the points each student earned in his class, and he is now assigning final grades to the
students using an application. What he wants to produce is a list of SSNs and grades. We also
assume that in his application the instructor uses no data structure, such as an array, to store the students'
personal information and points.
If the instructor decides to give A's to those who earned the class
average or above, and B's to the others, then he had better use a DOM parser in his application. The
reason is that he has no way of knowing the class average before the entire document has been
processed. What he probably needs to do in his application is first look through all the students' points
and compute the average, and then look through the document again and assign the final grade to each
student by comparing the points that student earned to the class average.
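The two passes described above can be sketched with Python's standard `xml.dom.minidom` module. This is only an illustration under stated assumptions: the element names `students`, `student`, `ssn` and `points` are hypothetical, since the example does not specify the document layout, and the helper names are mine.

```python
from xml.dom import minidom

# Hypothetical document layout; the element names are assumptions.
SAMPLE = """<students>
  <student><ssn>111-11-1111</ssn><points>95</points></student>
  <student><ssn>222-22-2222</ssn><points>70</points></student>
  <student><ssn>333-33-3333</ssn><points>85</points></student>
</students>"""


def text_of(parent, tag):
    """Return the text inside the first <tag> element under parent."""
    return parent.getElementsByTagName(tag)[0].firstChild.data.strip()


def grade_by_average(xml_text):
    """Yield 'SSN grade' lines; A for the class average or above, B otherwise."""
    doc = minidom.parseString(xml_text)  # the whole tree is now in memory
    students = doc.getElementsByTagName("student")

    # Pass 1: walk the tree to compute the class average.
    # Assumes the document contains at least one student.
    total = 0
    count = 0
    for student in students:
        total += int(text_of(student, "points"))
        count += 1
    average = total / count

    # Pass 2: walk the same tree again and grade each student.
    for student in students:
        grade = "A" if int(text_of(student, "points")) >= average else "B"
        yield f"{text_of(student, 'ssn')} {grade}"


if __name__ == "__main__":
    for line in grade_by_average(SAMPLE):
        print(line)
```

The application keeps no arrays of its own here. The DOM tree is the only storage, which is why the second pass is possible without reading the input again.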
If, however, the instructor adopts a
grading policy under which students who got 90 points or more are assigned A's and the others are assigned
B's, then he would probably be better off using a SAX parser. The reason is that, to assign each student a final grade, he
does not need to wait for the entire document to be processed. He can assign a grade to a
student as soon as the SAX parser reads that student's points.
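A fixed-threshold policy like this one might look as follows with Python's standard `xml.sax` module. Again this is a sketch, not a definitive implementation: the element names `ssn` and `points` are hypothetical, the class name `ThresholdGrader` is mine, and the code assumes each student's `ssn` element appears before the `points` element.

```python
import xml.sax

# Hypothetical document layout; the element names are assumptions.
SAMPLE = """<students>
  <student><ssn>111-11-1111</ssn><points>95</points></student>
  <student><ssn>222-22-2222</ssn><points>70</points></student>
  <student><ssn>333-33-3333</ssn><points>85</points></student>
</students>"""


class ThresholdGrader(xml.sax.ContentHandler):
    """Emit 'SSN grade' as soon as a student's points have been read."""

    def __init__(self, emit, threshold=90):
        super().__init__()
        self.emit = emit            # callback that receives each output line
        self.threshold = threshold
        self.current = None         # element whose text is being collected
        self.buffer = ""
        self.ssn = ""

    def startElement(self, name, attrs):
        if name in ("ssn", "points"):
            self.current = name
            self.buffer = ""

    def characters(self, content):
        # The parser may deliver one text run in several pieces, so accumulate.
        if self.current is not None:
            self.buffer += content

    def endElement(self, name):
        if name == "ssn":
            self.ssn = self.buffer.strip()
        elif name == "points":
            # Assumes <ssn> came before <points> for this student.
            grade = "A" if int(self.buffer) >= self.threshold else "B"
            self.emit(f"{self.ssn} {grade}")
        self.current = None


if __name__ == "__main__":
    xml.sax.parseString(SAMPLE.encode("utf-8"), ThresholdGrader(print))
```

The handler holds only the current student's SSN and a small text buffer, so memory use does not grow with the size of the document.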
In the above analysis, we assumed that the instructor created no data structure of his own. What if he creates his own data structures, such as an array of strings to store the SSNs and an array of integers to store the points? In this case, I think SAX is the better choice, because it could save both memory and time and still get the job done.
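To illustrate, here is a sketch of the class-average policy done with SAX plus two plain lists, using Python's standard `xml.sax` module. The element names `ssn` and `points` and the names `Collector` and `grade_by_average` are assumptions made for the illustration.

```python
import xml.sax

# Hypothetical document layout; the element names are assumptions.
SAMPLE = """<students>
  <student><ssn>111-11-1111</ssn><points>95</points></student>
  <student><ssn>222-22-2222</ssn><points>70</points></student>
  <student><ssn>333-33-3333</ssn><points>85</points></student>
</students>"""


class Collector(xml.sax.ContentHandler):
    """Copy only the fields of interest into two parallel lists."""

    def __init__(self):
        super().__init__()
        self.ssns = []      # array of strings
        self.points = []    # array of integers
        self.current = None
        self.buffer = ""

    def startElement(self, name, attrs):
        if name in ("ssn", "points"):
            self.current = name
            self.buffer = ""

    def characters(self, content):
        if self.current is not None:
            self.buffer += content

    def endElement(self, name):
        if name == "ssn":
            self.ssns.append(self.buffer.strip())
        elif name == "points":
            self.points.append(int(self.buffer))
        self.current = None


def grade_by_average(xml_text):
    """Return 'SSN grade' lines; A for the class average or above, B otherwise."""
    collector = Collector()
    xml.sax.parseString(xml_text.encode("utf-8"), collector)
    # Assumes at least one student, and one <points> per <ssn>.
    average = sum(collector.points) / len(collector.points)
    return [
        f"{ssn} {'A' if pts >= average else 'B'}"
        for ssn, pts in zip(collector.ssns, collector.points)
    ]


if __name__ == "__main__":
    print("\n".join(grade_by_average(SAMPLE)))
```

The document is read once, and only the SSNs and points are kept, rather than a full tree of every element and text node.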
One more consideration on this example: what if the instructor wants not to print a list, but to save the original document back with each student's grade updated? In this case, a DOM parser should be the better choice no matter which grading policy he adopts. He does not need to create any data structure of his own. All he needs to do is first modify the DOM tree (i.e., set the value of the 'grade' node) and then save the whole modified tree. If he chooses a SAX parser instead, he has to create a data structure almost as complicated as a DOM tree before he can get the job done.
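A minimal sketch of this modify-and-save step, using Python's standard `xml.dom.minidom`, could look like the following. The layout is an assumption: each hypothetical `student` element is taken to contain a `points` element and an existing `grade` element (possibly empty), and the fixed threshold of 90 is used only to keep the example short.

```python
from xml.dom import minidom

# Hypothetical document layout; the element names are assumptions.
SAMPLE = (
    "<students>"
    "<student><ssn>111-11-1111</ssn><points>95</points><grade/></student>"
    "<student><ssn>222-22-2222</ssn><points>70</points><grade>A</grade></student>"
    "</students>"
)


def set_text(doc, element, value):
    """Replace all children of element with a single text node."""
    while element.firstChild is not None:
        element.removeChild(element.firstChild)
    element.appendChild(doc.createTextNode(value))


def update_grades(xml_text, threshold=90):
    """Return the document as a string with every <grade> node updated."""
    doc = minidom.parseString(xml_text)
    for student in doc.getElementsByTagName("student"):
        points = int(student.getElementsByTagName("points")[0].firstChild.data)
        grade_node = student.getElementsByTagName("grade")[0]
        set_text(doc, grade_node, "A" if points >= threshold else "B")
    # Serialize the whole modified tree; everything else is kept as parsed.
    return doc.toxml()


if __name__ == "__main__":
    print(update_grades(SAMPLE))
```

Everything the application does not touch (SSNs, points, any other personal information) is carried through by the tree itself, which is the work a SAX-based version would have to redo by hand.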