Lecture 3
Asymptotic Notation
The result of the analysis of an algorithm is usually a formula giving
the amount of time, in terms of seconds, number of memory accesses,
number of comparisons or some other metric, that the algorithm takes.
This formula often contains unimportant details that don't really tell
us anything about the running time. For instance, when we analyzed
selection sort, we found that it took T(n) = n^2 + 3n - 4 array
accesses. For large values of n, the 3n - 4 part is insignificant
compared to the n^2 part.
When comparing the running times of two algorithms, these lower order
terms are unimportant when the higher order terms are different. Also
unimportant are the constant coefficients of higher order terms; an
algorithm that takes a time of 100n^2 will still
be faster than an algorithm that takes n^3 for
any value of n larger than 100. Since we're interested in
the asymptotic behavior of the growth of the function,
the constant factor can be ignored.
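To see these two claims with actual numbers, here is a small Python sketch. The function names are made up for illustration and are not part of the lecture. It reports what fraction of T(n) comes from the lower order part 3n - 4, and checks on which side of n = 100 the 100n^2 algorithm beats the n^3 one.

```python
def selection_sort_accesses(n):
    """T(n) = n^2 + 3n - 4, the array-access count quoted in the text."""
    return n * n + 3 * n - 4


def lower_order_share(n):
    """Fraction of T(n) contributed by the lower order part 3n - 4."""
    return (3 * n - 4) / selection_sort_accesses(n)


def quadratic_is_faster(n):
    """True when 100n^2 is strictly smaller than n^3."""
    return 100 * n * n < n ** 3


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, selection_sort_accesses(n), lower_order_share(n))
    for n in (50, 100, 101, 1000):
        print(n, quadratic_is_faster(n))
```

The share of the lower order part shrinks toward zero as n grows, and the comparison flips exactly when n passes 100.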
Upper Bounds: Big-O
We need a formal way of expressing these intuitive notions. One popular
way is "big-Oh" notation. It tells us that a certain function will never
exceed another, simpler function beyond a constant multiple and for large
enough values of n. For example, we can simplify
3n^2 + 4n - 10 to O(n^2). We write "3n^2 + 4n - 10 = O(n^2)" and say
"three n squared plus four n minus ten is big-Oh of n squared."
We might also say "...is in big-Oh...", but we don't say
"...is equal to big-Oh..."; the equal sign in this case is more
like a set-membership sign.
Let's define big-Oh more formally:
O(g(n)) = { the set of all f such that there exist positive constants
c and n_0 satisfying 0 <= f(n) <= c g(n) for all n >= n_0 }.
This means that, for example, O(n^2) is a set, or family, of functions
like n^2 + n, 4n^2 - n log n + 12, n^2/5 - 100n, n log n, 50n, and so
forth. Every function f(n) bounded above by some constant multiple of
g(n) for all values of n greater than a certain value
is in O(g(n)).
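The definition can be turned into a quick numerical sanity check. The sketch below is only an illustration (the helper name and the cutoff n_max are assumptions, not part of the lecture): it tests a proposed pair (c, n_0) over a finite range of n. Passing the check is evidence, not a proof, since the definition requires the inequality for all n >= n_0. Failing it does show that the particular pair does not work.

```python
def holds_as_big_oh_witness(f, g, c, n0, n_max=10_000):
    """Check 0 <= f(n) <= c*g(n) for every integer n in [n0, n_max].

    Hypothetical helper: a finite check only, not a proof of membership.
    """
    return all(0 <= f(n) <= c * g(n) for n in range(n0, n_max + 1))


def square(n):
    return n * n


if __name__ == "__main__":
    # n^2 + n <= 2n^2 for n >= 1
    print(holds_as_big_oh_witness(lambda n: n * n + n, square, c=2, n0=1))
    # 50n <= 50n^2 for n >= 1
    print(holds_as_big_oh_witness(lambda n: 50 * n, square, c=50, n0=1))
    # n^2/5 - 100n is negative below n = 500, so n0 = 1 fails the
    # "0 <= f(n)" half of the definition, while n0 = 500 passes.
    f = lambda n: n * n / 5 - 100 * n
    print(holds_as_big_oh_witness(f, square, c=1, n0=1))
    print(holds_as_big_oh_witness(f, square, c=1, n0=500))
```

The last two calls show why n_0 is part of the definition: the same function and the same c can fail for a small n_0 and succeed for a larger one.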
Examples:
- Show 3n^2 + 4n - 2 = O(n^2).
We need to find c and n_0 such that:
3n^2 + 4n - 2 <= c n^2 for all n >= n_0 .
Divide both sides by n^2, getting:
3 + 4/n - 2/n^2 <= c for all n >= n_0 .
The left side never increases as n grows past 1, so if we choose n_0
equal to 1 it is enough to find a value of c that works at n = 1:
3 + 4 - 2 <= c
Any c >= 5 will do; we can set c equal to 6. Now we have:
3n^2 + 4n - 2 <= 6n^2 for all n >= 1 .
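As a check on this example, the following sketch (helper names are illustrative assumptions) computes the ratio 3 + 4/n - 2/n^2 exactly with Python's fractions module and searches a finite range for an n that breaks a proposed pair (c, n_0). It also shows that the choice of constants is not unique: c = 4 fails with n_0 = 1 but passes the finite check with n_0 = 4.

```python
from fractions import Fraction


def ratio(n):
    """Exact value of (3n^2 + 4n - 2) / n^2, i.e. 3 + 4/n - 2/n^2."""
    return Fraction(3 * n * n + 4 * n - 2, n * n)


def smallest_failure(c, n0, n_max=10_000):
    """First n in [n0, n_max] with 3n^2 + 4n - 2 > c*n^2, or None.

    Finite search only; None means no failure was found up to n_max.
    """
    for n in range(n0, n_max + 1):
        if 3 * n * n + 4 * n - 2 > c * n * n:
            return n
    return None


if __name__ == "__main__":
    print([ratio(n) for n in (1, 2, 3, 10)])
    print(smallest_failure(6, 1))   # the pair chosen in the text
    print(smallest_failure(4, 1))   # too small a constant for n0 = 1
    print(smallest_failure(4, 4))   # same constant, larger n0
    print(smallest_failure(3, 500)) # c = 3 fails no matter how large n0 is
```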
- Show n^3 != O(n^2).
Let's assume to the contrary that
n^3 = O(n^2)
Then there must exist constants c and n_0 such that
n^3 <= c n^2 for all n >= n_0.
Dividing by n^2, we get:
n <= c for all n >= n_0.
But this is not possible; we can never choose a constant c
large enough that n will never exceed it, since n
can grow without bound. Thus, the original assumption, that
n^3 = O(n^2), must be wrong, so n^3 != O(n^2).
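The contradiction argument is constructive: whatever constants c and n_0 someone proposes, we can name an n that breaks the inequality. A minimal sketch, with a hypothetical function name:

```python
import math


def counterexample(c, n0):
    """Return an n >= n0 with n^3 > c * n^2, for positive c and n0.

    Any integer n larger than c works, because n^3 > c*n^2 is the
    same as n > c once we divide by n^2.
    """
    n = max(n0, math.floor(c) + 1)
    assert n >= n0 and n ** 3 > c * n ** 2
    return n


if __name__ == "__main__":
    for c, n0 in ((10, 1), (2.5, 100), (1000, 5)):
        print(c, n0, counterexample(c, n0))
```

No matter how large c is made, the function still returns a valid n, which is exactly why no pair (c, n_0) can satisfy the definition.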
Big-Oh gives us a formal way of expressing asymptotic upper bounds,
a way of bounding from above the growth of a function. Knowing where
a function falls within the big-Oh hierarchy allows us to compare it
quickly with other functions and gives us an idea of which algorithm has
the best time performance.
Lower Bounds: Omega
Another way of grouping functions, like big-Oh, is to give an
asymptotic lower bound. Given a complicated function
f, we find a simple function g that, within a
constant multiple and for large enough n, bounds f
from below. Define Ω:
Ω(g(n)) = { the set of all f such that there exist positive constants
c and n_0 satisfying 0 <= c g(n) <= f(n) for all n >= n_0 }.
This gives us a somewhat different family of functions; now
any function f that grows at least as fast as g is in
Ω(g). So, for example, n^3 = Ω(n^2).
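The lower-bound definition can be spot-checked the same way as an upper bound. The sketch below (helper name and cutoff are assumptions) uses the usual non-strict comparison c g(n) <= f(n) and only examines a finite range, so a True result is evidence rather than proof.

```python
def holds_as_omega_witness(f, g, c, n0, n_max=10_000):
    """Check 0 <= c*g(n) <= f(n) for every integer n in [n0, n_max].

    Hypothetical helper: a finite check only, not a proof of membership.
    """
    return all(0 <= c * g(n) <= f(n) for n in range(n0, n_max + 1))


def square(n):
    return n * n


if __name__ == "__main__":
    # n^3 is bounded below by n^2 from n = 1 onward.
    print(holds_as_omega_witness(lambda n: n ** 3, square, c=1, n0=1))
    # n^2 bounds itself from below with c = 1.
    print(holds_as_omega_witness(square, square, c=1, n0=1))
    # 50n eventually drops below n^2, so this pair fails.
    print(holds_as_omega_witness(lambda n: 50 * n, square, c=1, n0=1))
```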
Tight Bounds: Theta
Neither big-Oh nor Omega is completely satisfying; we would like a tight
bound on how quickly our function grows. To say it doesn't grow any
faster than something doesn't help us know how slowly it grows, and
vice-versa. So we need something to give us a tighter bound; something
that bounds a function from both above and below. We can combine
big-Oh and Omega to give us a new set of functions, Theta:
Θ(g(n)) = { the set of functions f(n) such that
f(n) = O(g(n)) and f(n) = Ω(g(n)) } .
It is equivalent to say:
Θ(g(n)) = { the set of all f such that there exist positive constants
c_1, c_2 and n_0 satisfying 0 <= c_1 g(n) <= f(n) <= c_2 g(n) for
all n >= n_0 }.
Whenever possible, we try to bound the running time of an algorithm
from both above and below with Theta notation.
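To illustrate the two-sided bound, here is a sketch (the function name and the finite cutoff are assumptions for illustration) that tests the lower and upper halves of the Theta definition separately, so we can see which side fails when a function is not tightly bounded.

```python
def theta_check(f, g, c1, c2, n0, n_max=10_000):
    """Return (lower_ok, upper_ok) over the integers in [n0, n_max].

    lower_ok: 0 <= c1*g(n) <= f(n) throughout the range
    upper_ok: f(n) <= c2*g(n) throughout the range
    Hypothetical helper; a finite check, not a proof.
    """
    rng = range(n0, n_max + 1)
    lower_ok = all(0 <= c1 * g(n) <= f(n) for n in rng)
    upper_ok = all(f(n) <= c2 * g(n) for n in rng)
    return lower_ok, upper_ok


def square(n):
    return n * n


if __name__ == "__main__":
    # 3n^2 <= 3n^2 + 4n - 2 <= 6n^2 for n >= 1: both sides hold.
    print(theta_check(lambda n: 3 * n * n + 4 * n - 2, square, 3, 6, 1))
    # n^3 outgrows any fixed multiple of n^2: the upper side fails.
    print(theta_check(lambda n: n ** 3, square, 1, 1000, 1))
    # 50n falls below n^2: the lower side fails.
    print(theta_check(lambda n: 50 * n, square, 1, 50, 1))
```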
Convenient Theorems
For polynomials, and indeed a large class of functions that are the sums
of monotonically increasing terms, we can simplify the process of
finding O, Ω and Θ bounds by applying these rules. Below, p and q
are arbitrary polynomials:
- c p(n) = Θ(p(n)).
- p(n) + q(n) = Θ(max(p(n), q(n))).
- n^k = O((1 + ε)^n) for any constant k and any positive constant ε
(i.e., any polynomial is bounded by any exponential).
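The last rule can feel surprising when the exponential's base is barely above 1, so here is a rough numerical sketch. The function name and the search cutoff are assumptions. It finds the largest n, within the cutoff, at which the polynomial n^k is still ahead of the exponential. It compares logarithms, k ln n against n ln(1 + eps), to avoid overflow; because it uses floating point, results at exact ties should not be trusted.

```python
import math


def last_n_where_polynomial_wins(k, eps, n_max=100_000):
    """Largest n in [1, n_max] with n^k > (1 + eps)^n, or None.

    Compares k*ln(n) with n*ln(1 + eps) instead of the raw values.
    Hypothetical helper; floating point, so exact ties are unreliable.
    """
    rate = math.log1p(eps)
    last = None
    for n in range(1, n_max + 1):
        if k * math.log(n) > n * rate:
            last = n
    return last


if __name__ == "__main__":
    # n^3 against 2^n
    print(last_n_where_polynomial_wins(3, 1.0))
    # n^2 against 1.01^n: the polynomial leads for much longer,
    # but the exponential still takes over eventually.
    print(last_n_where_polynomial_wins(2, 0.01))
```

A smaller eps pushes the crossover point further out, but within the searched range the polynomial never regains the lead once the exponential has passed it.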