
3. Algorithm Analysis [1]

  • Asymptotic algorithm analysis:
    • Is a method to compare the resource consumption of two algorithms.
    • It does not show slight differences in resource consumption, but it flags the significant differences.
    • It is an estimation of the cost of an algorithm.
    • It measures the efficiency of an algorithm or its implementation (program) as the input size grows.
    • It analyzes:
      • The time required for an algorithm to run (or an instantiation of an algorithm into a program to execute).
      • The space required for a data structure.
  • Largest-value sequential search:

      /** @return Position of the largest value in array A */
      static int largest(int[] A) {
          int currlarge = 0;                  // Holds largest element position
          for (int i = 1; i < A.length; i++)  // For each element
              if (A[currlarge] < A[i])        // if A[i] is larger
                  currlarge = i;              // remember its position
          return currlarge;                   // Return largest position
      }
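
  • A minimal usage sketch (hypothetical input, not from the source); the loop visits each of the \(n\) elements exactly once, so the cost is \(c \cdot n\) for some constant \(c\):

        int[] data = {7, 42, 3, 42, 10};   // hypothetical array
        int pos = largest(data);           // pos == 1 (strict < keeps the first 42 on ties)
        System.out.println("Largest at index " + pos);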
    
  • The growth rate for an algorithm is the rate at which the cost of the algorithm grows as the size of its input grows.

  • Growth rates (sorted from slowest-growing, i.e. cheapest, to fastest-growing, i.e. most expensive):
    • Linear: \(O(n)\) or \(c \cdot n\) where \(c\) is a constant. Practically, the value of \(c\) is not important.
    • Linearithmic: \(O(n \log n)\) or \(n \log n\).
    • Quadratic: \(O(n^2)\) or \(n^2\).
    • Exponential: \(O(2^n)\) or \(2^n\).
    • Factorial: \(O(n!)\) or \(n!\).
  • (Figure: growth-rate curves for the classes above.)
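  • To see the separation concretely, here is a small illustrative sketch (class name GrowthDemo is made up) that prints each cost function as \(n\) grows:

        public class GrowthDemo {
            public static void main(String[] args) {
                for (int n = 10; n <= 1000; n *= 10) {
                    double nLogN = n * (Math.log(n) / Math.log(2)); // log base 2
                    System.out.printf("n=%-5d  n log n=%-7.0f  n^2=%-8d  2^n has %d digits%n",
                            n, nLogN, (long) n * n,
                            java.math.BigInteger.valueOf(2).pow(n).toString().length());
                }
            }
        }

    Even at \(n = 1000\), \(n^2\) is only a million, while \(2^n\) already has 302 digits.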
  • Notes:
    • Comparing \(c_1 \cdot f(n)\) with \(c_2 \cdot g(n)\) gives the same asymptotic result as comparing \(f(n)\) with \(g(n)\). So, we can ignore the constants \(c_1\) and \(c_2\).
    • \(n^a\) grows faster than \(\log^a n\) or \(\log(a^n)\) for \(a > 1\) ⇒ quadratic grows faster than logarithmic.
    • \(a^n\) grows faster than \(n^a\), where \(n\) is the input size and \(a > 1\) is a constant ⇒ exponential grows faster than quadratic.
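    • A one-line check of the first note: \(c_1 \cdot n < c_2 \cdot n^2\) exactly when \(n > c_1 / c_2\), so past the crossover point \(n_0 = c_1 / c_2\) the quadratic term dominates no matter which constants are chosen.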
  • When running an algorithm, there are three cases:
    • Best case. The input (of a given size) on which the algorithm does the least work. Ignored in practice, but there are rare cases where analyzing the best case is important.
    • Average case. The cost averaged over all inputs of a given size, as when running the algorithm a large number of times on randomized inputs. If the program is expected to run a huge number of times on different inputs, then the average case is the most representative of the algorithm's performance, and analyzing it is important.
    • Worst case. The input (of a given size) on which the algorithm does the most work. Important when a performance guarantee is needed.
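  • The classic illustration of the three cases (a sketch, not from the source) is a sequential search for a key K among \(n\) values: best case the key sits in the first slot (1 comparison), worst case it is in the last slot or absent (\(n\) comparisons), and averaging over all positions gives about \(n/2\) comparisons:

        /** @return Index of K in A, or -1 if K is absent. */
        static int sequentialSearch(int[] A, int K) {
            for (int i = 0; i < A.length; i++) // Best case: hit at i == 0
                if (A[i] == K)                 // Average case: ~n/2 probes
                    return i;
            return -1;                         // Worst case: all n probes fail
        }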
  • How using a 10X-faster computer affects the input size an algorithm can handle in a fixed time budget:
    • Linear: 10X computer ⇒ input 10 times larger.
    • \(n \log n\): 10X computer ⇒ input between \(\sqrt{10}\) and \(10\) times larger.
    • Quadratic: 10X computer ⇒ input \(\sqrt{10} \approx 3.16\) times larger.
    • Exponential: 10X computer ⇒ only an additive increase in input size; essentially no multiplicative change.
  • So a 10X computer buys a lot for linear and \(n \log n\) algorithms, noticeably less for quadratic algorithms, and almost nothing for exponential algorithms.
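  • Where the \(\sqrt{10}\) comes from (a short check): if the quadratic algorithm handles size \(n\) within the time budget, the 10X machine handles \(n'\) with \(c \cdot n'^2 = 10 \cdot c \cdot n^2\), so \(n' = \sqrt{10} \cdot n \approx 3.16\,n\). For the exponential case, \(2^{n'} = 10 \cdot 2^n\) gives only \(n' = n + \log_2 10 \approx n + 3.3\).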
  • (Figure: running times on 1X and 10X computers for the growth rates above.)
  • Do NOT change the computer to improve the running time of an algorithm; change the algorithm instead.

3.4 Asymptotic Analysis

  • The Upper Bound:
    • It indicates the highest growth rate of an algorithm.
    • It is the growth rate at the high end of the cost range for the case being analyzed (e.g., the worst case).
    • We say: “This algorithm has an upper bound to its growth rate of \(n^2\) in the average case”.
    • Described by the big-O notation or \(O\) notation.
    • Big-Oh notation: \(O(f(n))\) describes an upper bound on the growth rate only; it says nothing about a lower bound.
    • There is no strict equality in the big-Oh notation. We say that the growth rate is in \(O(f(n))\), not that it is \(O(f(n))\) or that it equals \(O(f(n))\).
  • The Lower Bound:
    • It indicates the least amount of resources the algorithm requires for the case being analyzed (e.g., its worst case).
    • It is the growth rate at the low end of the cost range for that case.
    • Described by the big-omega notation or \(\Omega\) notation which is similar to the big-Oh notation.
  • The Theta Notation or \(\Theta\) notation:
    • It indicates the exact growth rate of an algorithm (where the upper bound and lower bound are equal).
    • It applies when the big-Oh and big-Omega bounds meet, so the growth rate for the case being analyzed is pinned down exactly.
    • It indicates no meaningful difference between the upper bound and lower bound growth rates.
  • When it applies, the Theta notation is preferable to the big-Oh notation because it is more precise.
  • Separate lower and upper bounds matter most when we have little knowledge about the algorithm; the more we know, the more the Theta notation takes over.
  • The worst, average, and best cases are not the same as the upper, lower, and theta bounds:
    • The worst, average, and best cases describe which inputs of a given size are considered; each case corresponds to a set of inputs.
    • The upper, lower, and theta bounds describe growth rates over those inputs; they are expressed with the Big-Oh, Big-Omega, and Theta notations.
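  • For reference, the standard formal definitions behind the three notations (as in Shaffer, Chapter 3): \(f(n)\) is in \(O(g(n))\) if there exist positive constants \(c\) and \(n_0\) such that \(f(n) \le c \cdot g(n)\) for all \(n > n_0\); \(f(n)\) is in \(\Omega(g(n))\) if \(f(n) \ge c \cdot g(n)\) for all \(n > n_0\); and \(f(n)\) is in \(\Theta(g(n))\) when both hold. For example, \(3n^2 + 20n\) is in \(\Theta(n^2)\): take \(c = 4\), \(n_0 = 20\) for the upper bound and \(c = 3\), \(n_0 = 1\) for the lower.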
  • Rules:
    1. Transitivity: if \(f(n)\) is in \(O(g(n))\) and \(g(n)\) is in \(O(h(n))\), then \(f(n)\) is in \(O(h(n))\).
    2. Constants are ignored: if \(f(n)\) is in \(O(c \cdot g(n))\) for a constant \(c > 0\), then \(f(n)\) is in \(O(g(n))\).
    3. When adding two functions, the highest growth rate is kept: if \(f_1(n)\) is in \(O(g(n))\) and \(f_2(n)\) is in \(O(h(n))\), then \(f_1(n) + f_2(n)\) is in \(O(\max(g(n), h(n)))\).
    4. When multiplying two functions, the growth rates multiply: if \(f_1(n)\) is in \(O(g(n))\) and \(f_2(n)\) is in \(O(h(n))\), then \(f_1(n) \cdot f_2(n)\) is in \(O(g(n) \cdot h(n))\).
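  • A short worked example using rules 2 and 3: a program that does an initialization pass costing \(c_1 \cdot n\) followed by a doubly nested loop costing \(c_2 \cdot n^2\) has total cost \(c_1 n + c_2 n^2\), which by rule 2 is in \(O(n + n^2)\) and by rule 3 is in \(O(\max(n, n^2)) = O(n^2)\).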

3.5 Calculating the running time of a program

  • Big-Oh costs of statements that run in sequence are added (+), while the cost of code nested inside another construct (e.g., a loop body) is multiplied (x) by the cost of the enclosing construct.
  • Loops whose counter increments by 1 (e.g., i++) run \(n\) times, so they have a growth rate of \(O(n)\).
  • Loops whose counter is multiplied by 2 each iteration (e.g., i *= 2) run about \(\log_2 n\) times, so they have a growth rate of \(O(\log n)\).
  • \(O(n \log n)\) grows faster than \(O(n)\), so for large enough inputs an \(O(n)\) algorithm is faster than an \(O(n \log n)\) one.
  • Examples:
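  • A minimal sketch of the patterns above (the loop bodies are assumed to cost a constant):

        static void loopExamples(int n) {
            int sum = 0;
            for (int i = 0; i < n; i++)       // increments by 1: O(n)
                sum++;
            for (int i = 1; i < n; i *= 2)    // doubles each step: O(log n)
                sum++;
            for (int i = 0; i < n; i++)       // nested loops multiply:
                for (int j = 0; j < n; j++)   //   O(n) * O(n) = O(n^2)
                    sum++;
            // The three loops run one after another, so their costs add:
            // O(n) + O(log n) + O(n^2) = O(n^2).
        }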

Recurrences [2]

  • Recursion is breaking down an object into smaller objects of the same type.
  • A recurrence equation describes the value of a function in terms of the values of that function at smaller inputs.
  • The recurrence equation for the Tower of Hanoi problem:

        T(1) = 1
        T(n) = 2T(n-1) + 1   (for n >= 2)

  • Guess-and-verify is a method for solving recurrence equations:
    1. Guess the solution.
    2. Verify the guess using mathematical induction.
  • Looking at the recurrence equation for the Tower of Hanoi problem, we can guess that the solution is \(T(n) = 2^n - 1\) and then verify it using mathematical induction.
Claim:
    if:
        T(1) = 1
        T(n) = 2T(n-1) + 1 (for n >= 2)
    then:
        T(n) = 2^n - 1

Proof:
    Let P(n) be the statement that T(n) = 2^n - 1
    Base Case:
        T(1) = 1 = 2^1 - 1
        P(1) is true
    Inductive Step:
        Assume P(n) is true for some n >= 1, and prove P(n+1):
        T(n+1) = 2T(n) + 1       // by the recurrence, since n+1 >= 2
               = 2(2^n - 1) + 1  // replace T(n) with 2^n - 1, using the assumption P(n)
               = 2*2^n - 2 + 1
               = 2^(n+1) - 1
        P(n+1) is true

Conclusion:
    By the principle of mathematical induction, P(n) is true for all n >= 1
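
A quick numeric check of the claim (a sketch; the method names are illustrative), computing T(n) directly from the recurrence and comparing it with the closed form:

        /** T(n) computed directly from the recurrence. */
        static long T(int n) {
            return (n == 1) ? 1 : 2 * T(n - 1) + 1;
        }

        public static void main(String[] args) {
            for (int n = 1; n <= 10; n++) {
                long closedForm = (1L << n) - 1;  // 2^n - 1
                System.out.printf("T(%d) = %d  vs  2^%d - 1 = %d%n",
                        n, T(n), n, closedForm);
            }
        }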

References


  1. Shaffer, C. (2011). A Practical Introduction to Data Structures and Algorithm Analysis. Blacksburg, VA: Virginia Tech. Chapter 3. https://people.cs.vt.edu/shaffer/Book/JAVA3e20130328.pdf

  2. Devadas, S., & Lehman, E. (2005). Mathematics for Computer Science (6.042J/18.062J). Lecture Notes, March 17, 2005. https://dspace.mit.edu/bitstream/handle/1721.1/104427/6-042j-spring-2005/contents/lecture-notes/l12_recur2.pdf