Wednesday, March 21, 2012

Building Blocks, Analysis of Algorithms & Rates of Growth –Part 2


In the previous tutorial (Part 1) we briefly described how to mathematically describe an algorithms running time or number of basic operations based on a input “n”, although informally.
I’ve read several books on Algorithms & computation and several dozen articles or papers on the subject, however I find the book “Introduction to Algorithms” By Cormen, Leiserson, Rivest and Stein to be the best in terms of explaining Algorithmic analysis.

This book has become so popular that it is known as CLRS (Acronym formed by the authors 1st names).
Let’s briefly review rates of growth exhibited by most Algorithms:

Rates of growth in order:
Log2 n < n < n log2 n < n2 < n3 < 2n
So let’s again, informally describe the code sample below:

Now let’ analyze the above code snippet :
1)      There is a 1 time variable declaration in the outer loop “int i=0”
2)      The expression “i<10” is evaluated 10 times, and if true “i” value is printed 10 times
3)      The “int k=0” is declared for the inner loop per every “i” outer loop’s iteration
4)      The expression “k<10” is evaluated 10 times, and if true “k” value is printed 10 times
5)      The inner loop iterates “k++” 10 times , per every “i” outer loop’s iteration
6)      The outer loop iterates 10 times “i++”
There are about 430 basic operations, and that can vary based on implementation.
Roughly speaking,  each loop iterates 10 times & prints the value of “I” per iteration and per every iteration the inner loop  iterates 10 times & executes 10 prints, so 20x20=400. If we replaced 10 with n in the “i<0” and “k<n” , then we get n *n , n2 executions.
As you can tell although correct, our observation is very wordy and inconsistent because there are many ways to implement the same loop, therefore we need a consistant , implementation dependent, and easily understood method to describe algorithms.

Formal method for analyzing algorithms:
 According to CLRS “When we look at input sizes large enough to make only the order of growth of
the running time relevant, we are studying the asymptotic efficiency of algorithms.
That is, we are concerned with how the running time of an algorithm increases with the size of the input in the limit...”.
 In our explanation we calculated that the outer & inner loops has 430 operations, but taking into account the major feature of the code snippet can be simplified to n2 operations, therefore we are ignoring insignificant operations known as constants “c”, for large input “n”.
We have 3 major notations to describe algorithms (Summarized from CLRS):
Θ - notation (Theta)
For a given function g(n) we denote by Theta (g(n)) , pronounced “Theta g of n”
Theta(g(n)) = {f(n) such that there exist positive constants c1,c2,and n0 , such that 0<=c1g(n)<=f(n)<=c2g(n) for all n >= n0.
“A function f(n) belongs to the set Theta(g(n)) if there exist positive constants c1 and c2 such that it can be sandwiched between c1g(n) and c2g(n) for sufficiently large n”(CLRS).
In terms of our outer loop & inner loop example , the code snippet code be describe as Theta(n2), meaning there is no way the code will execute in less than n2 operations and no more than n2 + some constants c, hence Theta(n2).
O- notation (Big O)
Big-O notation for describing maximum number of operations, or max running time (upper bound)
For a given function g(n) we denote by Theta (g(n)) , pronounced “Big-O , g of n ”
O(g(n)) = {f(n) such that there exist positive constants c and c0 such that 0<=f(n)<cg(n) for all n>=n0.
In terms of our outer loop & inner loop example, the code snippet code be describe as O(n2),
Meaning there is no way the code will execute in more than n2 + some constants c . Since the loop is the dominating feature and there is n operation per loop and inner loop is nested in the outer loop therefore n2 + constants operations, there exist a third nested loop or any other feature that dominates n2, lets say n3.
Ω - notation (Big Omega)
Big-Omega notation for describing minimum number of operations, or minimum time (lower bound)
(g(n)) = {f(n) such that there exist positive constants c and n0 such that 0<=cg(n)<=f(n) for all n>= n0.
In terms of our outer loop & inner loop example, the code snippet code be describe as (n2),
Meaning given large enough input or there exists inputs, such that the running will not less than n2 operations, regardless of implementation.
If a function or algorithms complexity can be described as Theta(n), Theta (log n), or Theta etc.., than it also implies O(n),O(log n), O etc and Ω(n), Ω(log n), Ω etc, this is because range in Theta automatically implies Big-O and Omega, but not the other way around.
I heavily referred to CLRS chapters 2 & 3 , for more detailed explanation please read URL:
Going forward we will be primarily using O “Big-O” therefore describing the upper bound for running time of algorithms.
Now let’s state the running time of basic mathematical operations addition, subtraction, multiplication, division, etc.

Upper Bound

Addition
O(n)
Basic or School book method
Subtraction
O(n)
School book method
Multiplication
O(n2)
School book method
Division
O(n2)
School book method
(http://en.wikipedia.org/wiki/Computational_complexity_of_mathematical_operations)
Note: There are more efficient methods for multiplication and division , we will encounter them later.
Examples:
1.For each of the following code fragments, give a Big-Oh analysis of the running time in terms of N:
    (a)
        for ( int i = 0, Sum = 0; i < N; i++ )     
            for( int j = 0; j < N; j++ )           
                Sum++;           
Well each loop is O(n) operations or linear running time, and we have a nested loops which terminated based on the variable “N” , therefore n2 operations + addition which is n operations , therefore O(n2) running time or quadratic.
(b)
       for( int i = 0, Sum = 0; i < N * N; i++ )  
            for( int j = 0; j < N; j ++ )          
                Sum++;     
The 1st loops running time is dominated by multiplication which is O(n2). The nested loops execution is based on this expression “j<N”. N’s value in the inner loop is determined by multiplication, from the outer loop thus O(n2)*n , simplifies to O(n3) running time,  or cubic.
(c)
      for( int i = 0, Sum = 0; i < N; i++ )      
            for( int j = 0; j < i; j ++ )          
                Sum++;                             
The 1st loop has O(n) running time, however the running of the nested loop is defined by analyzing the expression “j<i” , “j” must be “i-1” for the inner loop to execute at each iteration thus n * n -1 = n2 – n , simplifies to 0(n2), thus quadratic.
         (d)
         for( int i = 0, Sum = 0; i < N; i++ )      
            for( int j = 0; j < N * N; j++ )       
                for( int k = 0; k < j; k++ )                           
                          Sum++;
The 1st loop has n operations, the 2nd loop is nested and performs n2 operations, and the third loop performs n-1 addition operations, thus f(n) = n[n2(n-1+n)] ,simplifies to f(n)=n5 operations, or O(n5) running time, thus exponential.
2. An algorithm takes 0.5 milliseconds for input size 100. How long will it take for input size 500 if the running time is:
 a. Linear
f(n) = 0.5 millisecs * (n/100)
f(500) = 2.5 millisecs
 b. O( N log N) 
f(n) = [0.5 (n log n) / (100 log 100) ]
f(500) = 3.373  millisecs      
c. Quadratic       
f(n) = 0.5n2  / 10,000      
f(500) = 12.5 millisecs
 d. Cubic
f(n) = 0.5n3 / 100,000  
f(500) = 625 millisecs
3. Find the Big-O for the following functions:
    (a) f(x) = 2x3 + x2 log x           
O(x3)
    (b) f(x) = (x4 – 34x3 + x2 -20)
O(x4)    
    (c) f(x) = x3 – 1/log(x)                       
O(x3)            
We are finally finished with part 2.
In part 3, we will discuss Recursion and its running in detail.
References
1.       Cormen, Thomas, Charles Leiserson, et al. Introduction To Algorithms. 2nd. Massachusetts: MIT Press, 2009. 41-48. Print.