Java 12 - Mergesort and Performance Analysis
Analysis of Mergesort
- Seems a lot like binary search: it divides the problem in two at every
step.
- So, it has about log2(N) levels of recursive Mergesort calls
- After the log2(N)'th level, each array left to sort has size 1
- But it does this for every portion of the array; it doesn't "throw
away" any portion
- But, in each step, it doesn't just do one compare. It merges.
- The Merge call at each step scans through all of the elements that
that step is for.
- Across all steps at a given level, it scans a total of about N elements
- E.g., after two divisions, the original array is split into fourths,
and each fourth is merged, so across all four fourths, all N elements are
merged. This is true at every level.
- So (N elements scanned per level) * (log2(N) levels) means that
Mergesort takes about N*log2(N) operations to sort N elements
- For sorting, this is actually really good. It's much better than N^2
- Unlike searching, for sorting you basically have to visit every
element at least once. So it can never be less than linear (searching
can get away with looking at far fewer elements).
- E.g., for a million elements, BubbleSort would take 1.0e12 operations,
or a trillion, while MergeSort would take 1.99e7, or about 20 million.
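
The numbers in the last example can be checked with a few lines of Java. This is only a sketch: the class name OperationCounts and its two helper methods are made up for illustration, and the formulas are the rough estimates used above (N*N and N*log2(N)), not exact operation counts for any particular implementation.

```java
// Rough operation-count estimates from the analysis above.
// OperationCounts, quadratic, and nLogN are illustrative names, not part of any library.
class OperationCounts {
    // Quadratic estimate (e.g., BubbleSort): N * N.
    static double quadratic(long n) {
        return (double) n * n;
    }

    // Mergesort estimate: N * log2(N).
    // Math has no log2 method, so convert from the natural log.
    static double nLogN(long n) {
        return n * (Math.log(n) / Math.log(2));
    }

    public static void main(String[] args) {
        long n = 1_000_000;
        System.out.printf("N^2       = %.2e%n", quadratic(n));
        System.out.printf("N*log2(N) = %.2e%n", nLogN(n));
    }
}
```

For N = 1,000,000 the two estimates come out to about 1.0e12 and 1.99e7, a ratio of roughly 50,000 to 1.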
Performance Analysis
- In class, we analyzed how well different algorithms do, like linear
search and binary search, and insertion sort and merge sort
- But how do we know our programs really do this? And if our program
isn't one of these simple algorithms, how do we know "how well"
it is performing?
- Performance analysis is the act of determining how well the
program runs on different sizes of inputs -- how fast it runs, how much
memory it uses, etc.
- Thus, programming is more than just writing a correct program;
it is also
- Selecting efficient algorithms for the problem,
- Programming them correctly,
- Analyzing the performance of the program,
- And using that analysis to improve the program (if it needs it)
- Sometimes, performance analysis means you have to add statements to
your program that just gather statistics, but do not contribute to solving
the problem
- We call this instrumenting your code
- Like the electric meter on your house/apartment: it measures usage
but does none of the work
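
As a small illustration of instrumenting code, here is a sketch that adds a counter to linear search and binary search. The class name InstrumentedSearch and the choice to count one comparison per array element examined are assumptions made for this example. The counter lines do not help find the key; they only gather statistics.

```java
// Sketch of instrumented searches. The names here are illustrative only.
class InstrumentedSearch {
    // Statistics only: this counter does not affect the search result.
    static long comparisons = 0;

    // Linear search: examines elements one by one from the front.
    static int linearSearch(int[] a, int key) {
        for (int i = 0; i < a.length; i++) {
            comparisons++;              // instrumentation
            if (a[i] == key) return i;
        }
        return -1;
    }

    // Binary search on a sorted array: halves the range on each step.
    static int binarySearch(int[] a, int key) {
        int lo = 0, hi = a.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            comparisons++;              // instrumentation: one element examined
            if (a[mid] == key) return mid;
            if (a[mid] < key) lo = mid + 1;
            else hi = mid - 1;
        }
        return -1;
    }

    public static void main(String[] args) {
        int n = 1024;
        int[] sorted = new int[n];
        for (int i = 0; i < n; i++) sorted[i] = i;

        comparisons = 0;
        linearSearch(sorted, n - 1);    // worst case: key is the last element
        System.out.println("linear search examined " + comparisons + " elements");

        comparisons = 0;
        binarySearch(sorted, n - 1);
        System.out.println("binary search examined " + comparisons + " elements");
    }
}
```

Resetting the counter before each call keeps the two measurements separate, so the counts can be compared against the N and log2(N) estimates from class.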
Analyzing Sorting Algorithms
- In class, we just looked at how many times each element was accessed
- In our program, we can count different types of operations
- Comparison operations
- Assignment operations
- In your lab, you need to instrument your mergesort program to count
the number of assignments that take place.
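
To show what instrumenting a sort might look like, here is a sketch of a mergesort that counts comparison operations. It is not the lab's starter code: the class name CountingMergeSort, the use of one shared temporary array, and the decision to count only element-to-element comparisons are all assumptions made for illustration. Counting assignments would follow the same pattern, with a second counter incremented wherever an array element is written.

```java
// Sketch of a mergesort instrumented to count comparisons.
// All names here are illustrative; this is not the lab's starter code.
class CountingMergeSort {
    // Statistics only: counts comparisons between two array elements.
    static long comparisons = 0;

    static void mergeSort(int[] a) {
        if (a.length < 2) return;
        int[] tmp = new int[a.length];   // scratch space shared by all merges
        sort(a, tmp, 0, a.length - 1);
    }

    // Recursively sorts a[lo..hi] by sorting each half and merging them.
    private static void sort(int[] a, int[] tmp, int lo, int hi) {
        if (lo >= hi) return;            // size 0 or 1: already sorted
        int mid = lo + (hi - lo) / 2;
        sort(a, tmp, lo, mid);
        sort(a, tmp, mid + 1, hi);
        merge(a, tmp, lo, mid, hi);
    }

    // Merges the sorted runs a[lo..mid] and a[mid+1..hi].
    private static void merge(int[] a, int[] tmp, int lo, int mid, int hi) {
        int i = lo, j = mid + 1, k = lo;
        while (i <= mid && j <= hi) {
            comparisons++;               // instrumentation
            if (a[i] <= a[j]) tmp[k++] = a[i++];
            else tmp[k++] = a[j++];
        }
        while (i <= mid) tmp[k++] = a[i++];   // copy leftovers from the left run
        while (j <= hi) tmp[k++] = a[j++];    // copy leftovers from the right run
        for (k = lo; k <= hi; k++) a[k] = tmp[k];
    }

    public static void main(String[] args) {
        int n = 1024;
        int[] data = new int[n];
        for (int i = 0; i < n; i++) data[i] = n - i;   // reverse-sorted input
        mergeSort(data);
        long estimate = (long) (n * (Math.log(n) / Math.log(2)));
        System.out.println("comparisons = " + comparisons);
        System.out.println("N*log2(N)   = " + estimate);
    }
}
```

The measured count should stay at or below the N*log2(N) estimate, since each merge step makes at most one comparison per element it places.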