-
Notifications
You must be signed in to change notification settings - Fork 2
/
Copy pathheaps.tex
119 lines (103 loc) · 6.09 KB
/
heaps.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
\chapter{Heaps}
For the purposes of generality, instead of referring to
elements that are ``greater than" or ``less than" others,
we will simply say that they are ``better than" or ``worse than"
others. For any particular ordering, the ``best" element is desired first.
A Heap is a binary tree that satisfies the following properties:
\begin{itemize}
\item The root of a heap is better than its two children (the heap property)
\item The children of the root are also heaps
\item A heap is a complete binary tree (only the last level may not be full,
and all elements in the last level are on the left)
\end{itemize}
A heap differs from most binary trees in that it provides no particular
ordering of the elements, but rather guarantees that the best element is
at the root. Further, while heaps are usually discussed and defined
as binary trees, they need not be implemented as such.
A heap can in fact be implemented as an array with no performance reduction.
Contrary to standard theoretical convention, we will be using 0-indexed arrays
rather than 1-indexed arrays. That is to say, the indexing will reflect how
most programming languages perform indexing (0 is the first element, not 1).
To represent a heap as an array, we can implement a simple indexing.
If an element is at the $i$th
index in the array, its left and right child are at the $2i+1$th and $2i+2$th
indexes respectively. By placing the root at the $0$th index, all others follow.
This is depicted below.
{
\includegraphics[scale=0.5]{heapTreeArray}
%\caption{The mapping of a binary tree to an array}
\label{fig:heapTreeArray}
}
Heaps are commonly used to implement priority queues, because they do the
minimum amount of work required to keep track of the best element.
\section{How to Implement a Heap}
A heap must support the following operations
\begin{itemize}
\item $best()$: returns the best element in the heap
\item $pop-best()$: removes the best element in the heap
\item $insert(x)$: inserts $x$ into the heap
\end{itemize}
From the definition of a heap, we know the we can easily implement
$best()$ by returning the root of the heap, which should take $O(1)$
time. However the other operations are less obvious.
To $insert(x)$ recall that a heap must be complete, therefore, if a
new element is added, it must be added to the left-most available
space in the last row of the heap. However, the heap property may now
be violated. If the new element $x$ is worse than its parent $y$, the
heap property is satisfied and we may stop. However if it is not
satisfied we may swap $x$ with $y$ and recurse on $x$'s new
position. This works because we know that $y$ is better than all of
$x$'s children, because the heap property was satisfied before $x$ was
added. Further, because $x$ is better than $y$, it is also better than
all of $y$'s children. However $x$ may still be better than its new
parent, so we must recurse. If $x$ is the new best element, it will
eventually reach the root. Because heaps are complete, this operation
will take $O(\log n)$ time, as this is the height of the heap.
To $pop-best()$, we may simply remove the root, however this
completely destroys the entire heap. Instead, we will swap the root
with the bottom-left-most element, $y$. Now removing the best element
leaves us with a still complete tree. However the heap property has
likely been violated once more. If $y$ is better than both its
children, then we may stop. However, if not, we shall swap $y$ with
its best child and recurse on $y's$ new position. Because the element
we swap with $y$ is better than both $y$ and the other child, the heap
property has been satisfied for this sub-heap. However, the heap
property may still be violated for $y's$ new sub-heap, so we must do
this again, until $y$ is the root of a valid heap. Once more, this
operation requires $O(\log n)$ time, as it must at worse traverse the
entire height of the tree.
Therefore, a heap may support $best()$ in $O(1)$ time, $insert(x)$
in $O(\log n)$ time, and $pop-best()$ in $O(\log n)$ time.
\section{Building a Heap}
Now that we can support all the operations that a heap must implement,
it would be nice to be able to actually construct one given a list of $n$
elements. A na\"ive approach is to simply call $insert(x)$ on every element
in the list. However, since $insert(x)$ requires $O(\log n)$ time, this will
require $O(n \log n)$ time. These seems pretty bad, considering one can find
the best element in a list by brute force in $O(n)$ time. Can we achieve a
construction time comparable to the brute force time?
Instead of building the heap top down with $insert$, we can build it from the
bottom up. Remark that a single element is a valid heap. If we were to try to
build from the bottom up, we could first take the last $n/2$ elements in the
list. All of these elements are their own valid and complete heaps,
and we therefore do not need to do anything to them.
To add the next $n/4$ elements, we simply perform
the procedure we did in $pop-best$ to fix the fact that the new root might
be violating the heap property, knowing that all the elements below it are
valid heaps. By repeating this process until we reach the first element in
the list, we will have created a valid heap on $n$ elements.
Because we are doing very little work for the majority of the elements, we end
up doing only $O(n)$ work over all, which is optimal, as this is the amount of
time required to find the best element.
\section{Heapsort}
Another nice property of a heap is that once one has been implemented,
it provides a very simple procedure for sort elements.
A simple algorithm to do this to construct a heap on the list and then simply
return $best$ and then call $pop-best$ over and over until there are no more
elements in the heap.
In fact, since our algorithm for building the heap is in-place and takes $O(n)$
time, and our remove method leaves the best element at the end of the array,
by simply building a heap on the input array and calling $pop-best$ $n$ times,
we will be left with a reverse sorted array in $O(n \log n)$ time.
This algorithm is particularly excellent because it requires no extra space,
runs deterministically, and is worst-case optimal.