ECS110 Lecture Notes for Friday, November 10, 1995
Professor Rogaway conducting
Lecture 17
Scribe of the day: Hemal Patel
Today: Continuing with Heap Sorting
----------------------------------------------------------------------------
Given a typical binary heap as follows:
50
/ \
30 20
/ \ /\
10 5 3 1
/ \
7 6
we can do a procedure called a "sift-down" which helps us delete the
maximum element from the tree. It follows these steps:
a) Remove the max element at the root.
b) Move the last element (the rightmost leaf on the bottom level) up into the root.
c) Sift it down, swapping it with its larger child at each level, until it is
   no smaller than its children.
For example, if we were to kill 50 in our tree above, the result would be:
30
/ \
10 20
/ \ /\
7 5 3 1
/
6
Similarly, we can do a procedure called a "sift-up", which traverses the tree upwards
rather than downwards; instead of deleting the max, it inserts an element into the tree.
The new element is placed in the next open leaf position and then swapped with its parent
as long as it is larger. If we were to insert 25 into the tree we just obtained, the result would be:
30
/ \
25 20
/ \ / \
10 5 3 1
/ \
6 7
Both these procedures would take O(log n) time.
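Here is a minimal C sketch of both procedures, assuming the heap is stored in
an array with the root at index 1 and the children of node i at indexes 2i and
2i+1 (the array representation discussed later in these notes). The names
sift_down, sift_up, heap and size are mine, not from the lecture.

    /* Swap two integers in place. */
    static void swap(int *x, int *y) { int t = *x; *x = *y; *y = t; }

    /* Sift-down: push the element at index i down, swapping it with its
       larger child at each level, until neither child is bigger. */
    static void sift_down(int heap[], int size, int i)
    {
        while (2 * i <= size) {
            int child = 2 * i;                       /* left child            */
            if (child + 1 <= size && heap[child + 1] > heap[child])
                child++;                             /* pick the larger child */
            if (heap[i] >= heap[child])
                break;                               /* heap order restored   */
            swap(&heap[i], &heap[child]);
            i = child;
        }
    }

    /* Sift-up: pull the element at index i up, swapping it with its parent
       (index i/2), until the parent is at least as large. */
    static void sift_up(int heap[], int i)
    {
        while (i > 1 && heap[i / 2] < heap[i]) {
            swap(&heap[i / 2], &heap[i]);
            i /= 2;
        }
    }

Each loop descends or climbs one level per iteration, which is where the
O(log n) bound comes from.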
---------------------------------------------------------------------------
INSERTING INTO A TREE
Given the elements 42, 7, 13, 22, and 5, we want to insert them into a tree one at a
time. Begin with the first element, 42, and place it at the root. Next is 7; make it
the leftchild of 42. Next is 13; make it the rightchild of 42. 22 goes into the next
open position, the leftchild of 7, but being greater than 7 it cannot stay there
(a child cannot be greater than its parent), so 22 and 7 are swapped. So 22 is now the
leftchild of 42, and 7 is in turn the leftchild of 22. 5, the last element, becomes
the rightchild of 22, and we have our heap.
42
/ \
7 13
/ \
22 5
BEFORE
42
/ \
22 13
/ \
7 5
AFTER
DELETE MAX:
Now we want to destroy our recently built heap. Begin with the root, 42; it being
the greatest, delete it from the tree. After the deletion, 22 moves up to become the
root, with 7 as its leftchild and 13 as its rightchild. Continue the procedure, each
time deleting the root and sifting elements to restore the heap. Finally, we'll be
left with no tree, no elements!
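As a concrete check of the two walkthroughs above, here is a hedged sketch that
builds the heap by inserting 42, 7, 13, 22, 5 one at a time and then deletes the
maximum until nothing is left. It assumes the sift_up and sift_down sketches from
earlier are in the same file; heap_insert and delete_max are my own names.

    #include <stdio.h>

    /* Insert: place the key in the next open slot, then sift it up. */
    static void heap_insert(int heap[], int *size, int key)
    {
        heap[++*size] = key;
        sift_up(heap, *size);
    }

    /* Delete-max: save the root, move the last element into the root,
       shrink the heap by one, and sift the new root down. */
    static int delete_max(int heap[], int *size)
    {
        int max = heap[1];
        heap[1] = heap[(*size)--];
        sift_down(heap, *size, 1);
        return max;
    }

    int main(void)
    {
        int heap[16];                       /* slot 0 unused; root at index 1 */
        int size = 0;
        int input[] = {42, 7, 13, 22, 5};

        for (int i = 0; i < 5; i++)
            heap_insert(heap, &size, input[i]);

        while (size > 0)
            printf("%d ", delete_max(heap, &size));   /* prints 42 22 13 7 5 */
        printf("\n");
        return 0;
    }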
-----------------------------------------------------------------------------------------
AN ARRAY-REPRESENTATION OF HEAPS:
Going back to our elements 42, 7, 13, 22, 5, we can sort them using an array
representation of the heap. The node locations are numbered level by level, left to
right, and used as the indexes of the array; the node at index i has its children at
indexes 2i and 2i+1. We begin with the original unsorted array:
  __1____2____3____4____5__
 |    |    |    |    |    |
 | 42 |  7 | 13 | 22 |  5 |   UNSORTED ARRAY
 |____|____|____|____|____|

  __1____2____3____4____5__
 |    |    |    |    |    |
 | 42 | 22 | 13 |  7 |  5 |   SORTED ARRAY
 |____|____|____|____|____|
This procedure looks at each element in turn. 42 is first; since it is the greatest
(and has no parent), the one-element prefix is already a good heap. 7 is next; it is
smaller than its parent 42, so it stays in place, and so does 13. But 22 is larger
than its parent, 7 at location 2, so the two are swapped: 22 is now at location 2 and
7 is at location 4. 5 is also in place, and the array now satisfies the heap property
(for this particular input it also happens to be in decreasing order).
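In code, this left-to-right scan is just a repeated sift-up over a growing prefix
of the array. A minimal sketch, reusing the sift_up routine from earlier in these
notes (build_heap is my own name):

    /* Build the heap in place: heap[1..i-1] is already a heap, so sifting
       heap[i] up restores the heap property for heap[1..i]. */
    static void build_heap(int heap[], int n)
    {
        for (int i = 2; i <= n; i++)
            sift_up(heap, i);
    }

Starting from 42, 7, 13, 22, 5 in slots 1 through 5, the only swap performed is
22 with 7, giving 42, 22, 13, 7, 5 as in the picture above.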
-----------------------------------------------------------------------------------------------
DELETING MAX FROM AN ARRAY:
Just like inserting, we can also delete max from the array. Repeatedly take the
maximum from the root (location 1) and stick each deleted element at the end of the
current heap, in the slot the shrinking heap just gave up. Because the deleted
elements reuse those slots, the whole procedure is done in place and requires no
extra space.
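A minimal sketch of that idea, reusing the sift_down routine from earlier and
assuming the array already satisfies the heap property (heapsort_extract is my own
name). Each deleted maximum is parked in the slot the shrinking heap just gave up,
so the array ends up in increasing order with no extra space used.

    static void heapsort_extract(int heap[], int size)
    {
        while (size > 1) {
            int max = heap[1];             /* current maximum                */
            heap[1] = heap[size];          /* last element goes to the root  */
            heap[size--] = max;            /* park the old max past the heap */
            sift_down(heap, size, 1);      /* restore the heap property      */
        }
    }

Applied to 42, 22, 13, 7, 5 this leaves 5, 7, 13, 22, 42 in the array.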
____________________________________________________________________________________________
LOWER BOUND ON SORTING TIMES:
Theorem:
Every comparison-based sort takes Omega(n log n) time; this is a lower bound on its
worst-case running time. In fact, it must make at least ceiling(log2(n!)) comparisons.
Let's suppose we are sorting three numbers, namely a, b and c.
Begin by comparing a and b; either a <= b or a > b. If a <= b, then compare
b and c. If b <= c, we know a <= b <= c; but if b > c, we still have to
compare a and c, which tells us either a <= c <= b or c <= a <= b. Similarly,
if a > b, compare b and c: if b > c we know c <= b <= a, and if b <= c one
more comparison of a and c decides between b <= a <= c and b <= c <= a.
                            a:b
                 <= /                \ >
                b:c                    b:c
           <= /     \ >           <= /     \ >
       a<=b<=c      a:c          a:c       c<=b<=a
               <= /    \ >    <= /    \ >
          a<=c<=b   c<=a<=b  b<=a<=c   b<=c<=a
Therefore, we can generally conclude: the decision tree for a comparison-based
sort of n items is a binary tree with n! leaves (one for each possible ordering
of the inputs), so it has n! - 1 internal nodes and 2n! - 1 nodes in total.
A binary tree with n! leaves must have depth >= ceiling(log2(n!)), so at least
ceiling(log2(n!)) comparisons are made in the worst case. By Stirling's formula,
    n! ~ (n/e)^n * sqrt(2 pi n)
so
    log2(n!) >= log2( (n/e)^n ) = n log2(n) - n log2(e)
                                = n log2(n) - 1.44 n,
which is Omega(n log n).
// we cannot sort faster than n log n time in a comparison-based sort.
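As a quick sanity check of the bound (my own arithmetic, not from the lecture),
take the three-number example above:
    ceiling(log2(3!)) = ceiling(log2(6)) = 3
and indeed the longest root-to-leaf path in the decision tree makes 3 comparisons.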
---------------------------------------------------------------------------------
SORTING BY BUCKET-SORTING:
Suppose we want to sort n numbers a1, a2, ..., an, each between 1 and 100.
Construct an array of counters indexed from 1 to 100 and scan the input,
adding one to the counter of each value seen. In the example below, the
value 1 never appears, 2 appears twice, 3 appears three times, 4 never
appears, ..., 99 appears once, and 100 appears twice:
       _____
     1|  0  |
      |_____|
     2|  2  |
      |_____|
     3|  3  |
      |_____|
     4|  0  |
      |_____|
      | ... |
      |_____|
    99|  1  |
      |_____|
   100|  2  |
      |_____|
Reading the counters back in index order gives the sorted list 2, 2, 3, 3, 3, ..., 99, 100, 100.
Bucket Sorting n numbers in the range [1 .. m] takes:
O(n+m) time
O(m) space
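A minimal C sketch of the procedure just described, for n numbers known to lie
in the range 1..m (bucket_sort and count are my own names): one pass tallies
occurrences, a second pass writes the values back out in index order.

    #include <stdio.h>
    #include <stdlib.h>

    static void bucket_sort(int a[], int n, int m)
    {
        int *count = calloc(m + 1, sizeof *count);   /* count[v] = occurrences of v  */
        if (count == NULL)
            return;

        for (int i = 0; i < n; i++)                  /* O(n): tally each value       */
            count[a[i]]++;

        int out = 0;
        for (int v = 1; v <= m; v++)                 /* O(m): sweep buckets in order */
            while (count[v]-- > 0)
                a[out++] = v;

        free(count);
    }

    int main(void)
    {
        int a[] = {3, 99, 2, 100, 3, 2, 100, 3};
        int n = sizeof a / sizeof a[0];

        bucket_sort(a, n, 100);
        for (int i = 0; i < n; i++)
            printf("%d ", a[i]);                     /* 2 2 3 3 3 99 100 100 */
        printf("\n");
        return 0;
    }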
______________________________________________________________________________
EXAMPLE USING BUCKET SORTING:
Sort the following 10 numbers :
251, 231, 114, 231, 423, 513, 452, 155, 355, 423.
Start by sorting on the first (most significant) digit:
_____
1| |->114->155
|___|
2| |->251->231->231
|___|
3| |->355
|___|
4| |->423->452->423
|___|
5| |->513
|___|
Hence we have a partially sorted list of numbers.
Sort again, but this time based on the second digit:
_____
1| |->114->513
|___|
2| |->423->423
|___|
3| |->231->231
|___|
4| |
|___|
5| |->155->251->355->452
|___|
Taking 114, it is already sorted by its first and second digits, so output it. Next
is 155; move it to bucket 5, since its second digit is 5. Next is 251: first output
155, then put 251 into bucket 5. 231 goes to bucket 3, and since there are two 231s,
a third sort (on the last digit) is carried out between these same numbers. 355 is
next: output 251 first, then stick 355 in bucket 5. Continue until all ten numbers
have been output, and we'll get a sorted list:
114, 155, 231, 231, 251, 355, 423, 423, 452, 513.
The above method of sorting is called the Forward Radix Sort, otherwise
called the "most significant first Radix Sort".
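Here is a hedged C sketch of a most-significant-digit-first radix sort for the
three-digit numbers above; the recursion on each bucket plays the role of the
"third sort" mentioned in the walkthrough. All names (digit, msd_radix_sort,
DIGITS) are mine, and the digit extraction assumes every key has exactly three
decimal digits.

    #include <stdio.h>

    #define DIGITS 3                      /* keys are assumed to be 3 digits long */

    /* d = 0 selects the most significant digit, d = DIGITS-1 the least. */
    static int digit(int x, int d)
    {
        int div = 1;
        for (int i = 0; i < DIGITS - 1 - d; i++)
            div *= 10;
        return (x / div) % 10;
    }

    /* Bucket a[0..n-1] by digit d, then recurse on each bucket with digit d+1. */
    static void msd_radix_sort(int a[], int n, int d)
    {
        if (n <= 1 || d >= DIGITS)
            return;

        int count[11] = {0};              /* count[k+1] = #keys whose digit d is k */
        int start[11];                    /* start[k]   = where bucket k begins    */
        int tmp[n];                       /* scratch space for the distribution    */

        for (int i = 0; i < n; i++)
            count[digit(a[i], d) + 1]++;
        for (int k = 1; k <= 10; k++)
            count[k] += count[k - 1];     /* prefix sums give bucket boundaries    */
        for (int k = 0; k <= 10; k++)
            start[k] = count[k];

        for (int i = 0; i < n; i++)       /* stable distribution into buckets      */
            tmp[count[digit(a[i], d)]++] = a[i];
        for (int i = 0; i < n; i++)
            a[i] = tmp[i];

        for (int k = 0; k < 10; k++)      /* refine each bucket on the next digit  */
            msd_radix_sort(a + start[k], start[k + 1] - start[k], d + 1);
    }

    int main(void)
    {
        int a[] = {251, 231, 114, 231, 423, 513, 452, 155, 355, 423};
        int n = sizeof a / sizeof a[0];

        msd_radix_sort(a, n, 0);
        for (int i = 0; i < n; i++)
            printf("%d ", a[i]);          /* 114 155 231 231 251 355 423 423 452 513 */
        printf("\n");
        return 0;
    }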