\( \newcommand{\eqb}[1]{\begin{eqnarray*}#1\end{eqnarray*}} \newcommand{\eqbn}[1]{\begin{eqnarray}#1\end{eqnarray}} \newcommand{\bb}[1]{\mathbf{#1}} \newcommand{\nchoose}[2]{\left(\begin{array}{c} #1 \\ #2 \end{array}\right)} \newcommand{\defn}{\stackrel{\vartriangle}{=}} \newcommand{\eql}{\;\; = \;\;} \definecolor{dkblue}{RGB}{0,0,120} \definecolor{dkred}{RGB}{120,0,0} \definecolor{dkgreen}{RGB}{0,120,0} \newcommand{\bigsp}{\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;} \newcommand{\plss}{\;\;+\;\;} \newcommand{\miss}{\;\;-\;\;} \newcommand{\implies}{\Rightarrow\;\;\;\;\;\;\;\;\;\;\;\;} \)


Module 1: Introduction


1.1   A simple problem and some solutions

 

Consider the largest distance problem:

 

In-Class Exercise 1: In this exercise you will write code to find the distance between the two furthest points among a set of points:

Note: do not read any further until you solve this problem. (Yes, there is a solution described below, but please develop your own.)
 

Now consider this algorithm:

Let's take this one step at a time:
  1. What is a convex hull?
    • First: a (simple) polygon is an ordered set of edges such that successive edges and only successive edges share a single endpoint.

    • Second: a polygon is convex if no internal angle is larger than 180 degrees.

    • Finally: for a set of points, the convex hull is the smallest convex polygon completely enclosing the points.

  2. Using the Graham-Scan algorithm to compute the convex hull:
    • First find the lowest point (rightmost amongst them):

    • Then, compute angles to other points from the horizontal line through "lowest".

    • Sort points by angle.
    • Place all points temporarily in the hull.
    • Consider groups of three consecutive points, e.g.,

      If the center point is at a "right turn", discard that point.

    • Scan through the points once, discarding points that make a right turn.
    • Note: when discarding, you need to re-examine the current group of 3 vertices, since earlier points may now also need to be discarded (use a stack).

  3. Find distances between antipodal pairs:
    • For a given vertex, its antipodal vertices are those lying on a supporting line parallel to one through the given vertex (a supporting line touches the hull but does not cross it).

    • Key observation: the two farthest points in the set must be antipodal pairs on the hull.
    • To find these distances, compute distances from points to each hull edge:

    • Another key observation: antipodal pairs can be found in a single sweep
           ⇒ as the scan advances counterclockwise from one vertex to the next, the antipodal vertex can only advance in the same direction, never backward.
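The "right turn" test at the heart of the Graham scan can be done with a cross product. Here is a minimal sketch (the class and method names here are ours, not from the course code):

```java
// Sketch of the orientation test used in a Graham scan.
// For three points a, b, c the sign of the cross product
// (b - a) x (c - a) gives the turn direction at b.
public class Turn {

    // Returns > 0 for a left (counterclockwise) turn,
    // < 0 for a right (clockwise) turn, and 0 if collinear.
    public static double cross (double ax, double ay,
                                double bx, double by,
                                double cx, double cy)
    {
        return (bx - ax) * (cy - ay) - (by - ay) * (cx - ax);
    }

    public static boolean isRightTurn (double ax, double ay,
                                       double bx, double by,
                                       double cx, double cy)
    {
        return cross (ax, ay, bx, by, cx, cy) < 0;
    }

    public static void main (String[] args)
    {
        // (0,0) -> (1,0) -> (1,1) turns left; (0,0) -> (1,0) -> (2,-1) turns right.
        System.out.println (isRightTurn (0,0, 1,0, 1,1));    // false
        System.out.println (isRightTurn (0,0, 1,0, 2,-1));   // true
    }
}
```

Points that make a right turn are exactly the ones the scan discards.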

Now, let's take a first look at the code:

  1. The AntiPodal Algorithm: (source file):
    
    public class AntiPodalAlgorithm {
    
      static double distance (Pointd p, Pointd q)
      {
        // Return the Euclidean distance between the two points. 
      }
    
      static double findMaxDistance3 (Pointd[] points, int a, int b, int c)
      {
        // Compute the distance between each of ab, ac and bc,  
        // and return the largest. 
      }
    
      boolean turnComplete = false;
    
      int nextCounterClockwise (Pointd[] points, int i)
      {
        // Find the next point in the list going counterclockwise, 
        // that is, in increasing order. Go back to zero if needed. 
        // Return the array index. 
      }
    
      int prevCounterClockwise (Pointd[] points, int i)
      {
        // Similar. 
      }
      
    
      int findAntiPodalIndex (Pointd[] hullPoints, int currentIndex, int startAntiPodalIndex)
      {
        // Given a convex polygon, an index into the vertex array (a particular vertex), 
        // find its antipodal vertex, the one farthest away. Start the search from 
        // a specified vertex, the "startAntiPodalIndex" 
      }
    
    
      public double findLargestDistance (Pointd[] points)
      {
        // 1. Compute the convex hull: 
        Hull hull = new Hull (points);
    
        // 2. Extract the hull points: 
        Pointd[] hullPoints = hull.getPoints();
    
        // 3. If it's exactly three points, we have a method just for that: 
        if (hullPoints.length == 3)
          return findMaxDistance3 (hullPoints, 0, 1, 2);
        
        // Otherwise, we start an antipodal scan. 
        boolean over = false;
    
        // 4. Start the scan at vertex 0, using the edge ending at 0: 
        int currentIndex = 0;
        int prevIndex = prevCounterClockwise (hullPoints, currentIndex);
    
        // 5. Find the antipodal vertex for edge (n-1,0): 
        int antiPodalIndex = findAntiPodalIndex (hullPoints, currentIndex, 1);
    
        // 6. Set the current largest distance: 
        double maxDist = findMaxDistance3 (hullPoints, currentIndex, prevIndex, antiPodalIndex);
        // We'll stop once we've gone around and come back to vertex 0. 
        double dist = 0;
        turnComplete = false;
    
        // 7. While the turn is not complete: 
        while (! over) {
    
          // 7.1 Find the next edge: 
          prevIndex = currentIndex;
          currentIndex = nextCounterClockwise (hullPoints, currentIndex);
    
          // 7.2 Get its antipodal vertex: 
          antiPodalIndex = findAntiPodalIndex (hullPoints, currentIndex, antiPodalIndex);
    
          // 7.3 Compute the distance: 
          dist = findMaxDistance3 (hullPoints, currentIndex, prevIndex, antiPodalIndex);
    
          // 7.4 Record maximum: 
          if (dist > maxDist)
            maxDist = dist;
    
          // 7.5 Check whether turn is complete: 
          if (turnComplete)
            over = true;
        } // end-while 
    
        // 8. Return largest distance found. 
        return maxDist;
    
      }
    
    
    } // End-class 
      
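The helper bodies above are elided; here is one plausible sketch, using a minimal stand-in for the course's Pointd class (the real code defines its own Pointd, and its nextCounterClockwise also sets turnComplete once the scan wraps past vertex 0):

```java
public class AntiPodalHelpers {

    // Minimal stand-in for the course's Pointd class (an assumption,
    // made only so this sketch is self-contained).
    static class Pointd {
        double x, y;
        Pointd (double x, double y) { this.x = x;  this.y = y; }
    }

    // Euclidean distance between two points.
    static double distance (Pointd p, Pointd q)
    {
        double dx = p.x - q.x,  dy = p.y - q.y;
        return Math.sqrt (dx*dx + dy*dy);
    }

    // Largest pairwise distance among the three points at indices a, b, c.
    static double findMaxDistance3 (Pointd[] points, int a, int b, int c)
    {
        double d1 = distance (points[a], points[b]);
        double d2 = distance (points[a], points[c]);
        double d3 = distance (points[b], points[c]);
        return Math.max (d1, Math.max (d2, d3));
    }

    // Next index counterclockwise (increasing order, wrapping to 0).
    static int nextCounterClockwise (Pointd[] points, int i)
    {
        return (i + 1) % points.length;
    }

    // Previous index counterclockwise (wrapping to the last index).
    static int prevCounterClockwise (Pointd[] points, int i)
    {
        return (i - 1 + points.length) % points.length;
    }
}
```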

  2. Computing the convex hull:
    
    import java.util.*;
    
    public class Hull {
    
      // Maintain hull points internally: 
      private Pointd[] hull;
    
      // The data set is given to the constructor. 
    
      public Hull (Pointd[] points)
      {
        Pointd[] vertices;
        int i, n;
    
        // Record the number of vertices: 
        n = points.length;
    
        // 1. Make a copy of the vertices because we'll need to sort: 
        vertices = new Pointd[n];
        System.arraycopy (points, 0, vertices, 0, n);
    
        // 2. Find the rightmost, lowest point (it's on the hull). 
        int low = findLowest (vertices);
    
        // 3. Put that in the first position and sort the rest by 
        //    angle made from the horizontal line through "low". 
        swap (vertices, 0, low);	
        HullSortComparator comp = new HullSortComparator (vertices[0]);
        Arrays.sort (vertices, 1, vertices.length, comp);
    
        // 4. Remove collinear points: 
        n = removeCollinearPoints (vertices);		
    
        // 5. Now compute the hull. 
        hull = grahamScan (vertices, n);
      }
    
      
      Pointd[] grahamScan (Pointd[] p, int numPoints)
      {
        // 1. Create a stack and initialize with first two points: 
        HullStack hstack = new HullStack (numPoints);
        hstack.push (p[0]);
        hstack.push (p[1]);
    
        // 2. Start scanning points. 
        int i = 2;
    
        // 3. While scan not complete: 
        while (i < numPoints)
        {
          // 3.1 If the current point is on the hull, push next one. 
          //     We know a point is potentially on the hull if 
          //     the angle is convex (a left turn). 
          if ( hstack.isHull (p[i]) )
            hstack.push (p[i++]);
          //     Else remove it. 
          else
            hstack.pop ();
    
          // NOTE: the isHull() method looks for "left" and "right" turns. 
        }
    
        // 4. Return all points still on the stack. 
        return hstack.hullArray();
      }
    
      private int findLowest (Pointd[] v)
      {
        // 1. Scan through points: 
        //    1.1 If y-value is lower, the point is lower. If the y-values 
        //        are the same, check that the x value is further to the right: 
        // 2. Return lowest point found: 
      }
      
      int removeCollinearPoints (Pointd[] p)
      {
        // Not shown 
      }
      
      
      void swap (Pointd[] data, int i, int j)
      {
        // Not shown 
      }
    
    }
     

Next, consider this idea:

Note:

 

In-Class Exercise 2: What is this pathological test data? Give an example.
 

Summary:

 


1.2   Improving a program's speed

 

Consider the following problem:

As a first attempt to solve this problem, consider (source file):


  static void algorithm1 (double[] A)
  {
    // 1. Set the initial maximum to the difference between the first two:  
    double max = Math.abs (A[0] - A[1]);

    // 2. Record the actual values: 
    double x = A[0],  y = A[1];

    // 3. Now compare each pair of numbers: 
    for (int i=0; i < A.length-1; i++){
      for (int j=i+1; j < A.length; j++) {

        // 3.1 For each such pair, see if its difference is larger 
        //     than the current maximum. 
        double diff = Math.abs (A[i] - A[j]);

        if (diff > max) {
          // 3.1.1 If so, record the new maximum and the values 
          //       that achieved it. 
          max = diff;
          x = A[i];
          y = A[j];
        } 

      } // inner-for 
    } // outer-for 

    // 4. Output: 
    System.out.println ("Algorithm 1: the numbers: " + x + "," + y 
                        + " difference=" + max);
  }
  

Here are some timings for different array sizes (in milliseconds):
      Array size    Execution time
        1000              84
        2000             258
        3000             554
        4000             985
        5000            1551
        6000            2228
        7000            3031
        8000            3962
        9000            5020
       10000            6247

How could we improve this program?

 

In-Class Exercise 3: In what other ways can the original Java program be improved?
 

Instead, let's consider some algorithmic improvements:
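One such improvement rests on a simple observation: the largest absolute difference in an array is just the maximum element minus the minimum element, and both can be found in one pass. The sketch below illustrates the idea (it may or may not match the notes' "algorithm 2" or "algorithm 3", which are not reproduced here):

```java
public class MaxDifference {

    // Returns the largest |A[i] - A[j]| over all pairs, in a single
    // O(n) pass: the answer is always max(A) - min(A).
    static double largestDifference (double[] A)
    {
        double min = A[0],  max = A[0];
        for (int i = 1; i < A.length; i++) {
            if (A[i] < min)  min = A[i];
            if (A[i] > max)  max = A[i];
        }
        return max - min;
    }

    public static void main (String[] args)
    {
        double[] A = {3.5, -1.0, 7.25, 2.0};
        System.out.println (largestDifference (A));   // 8.25
    }
}
```

Compare: the nested-loop version above examines every pair, while this version examines each element once.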

 

In-Class Exercise 4: Why are algorithms 2 and 3 faster? How many comparisons are made by each algorithm with an array of n elements?
 

In-Class Exercise 5: Take your code for Exercise 1, and count the number of times "distance" is computed. For n points, how many times is distance computed?
 


1.3   Course outline

 

What this course is about:

  • Part of the science in computer science.

  • Algorithm analysis: both for execution time and other resources (like memory).

  • Problem-solving techniques.

  • Exposure to classic computer science problems and techniques.

  • Exposure to useful data structures.

  • Exposure to important types of combinatorial objects (graphs, strings, geometry).

  • Exposure to discrete optimization problems.

  • Performance monitoring and code tuning.

  • Writing clean, efficient code.
In addition, you will:
  • Learn to think about problems and algorithms independent of code.
         ⇒ Algorithmic thinking

  • Improve your analytic skills.

  • Improve your programming skills:
    • More challenging debugging (e.g., "tree" code).
    • Reusable code.
    • Think at the right level of detail.

  • Increase your ability to learn new material by yourself.
 

Key principles:

  • Insight: Analysis of problem structure.

  • Data organization: possibly sort the data, use an efficient data structure.

  • Divide and conquer:
    • Break down the problem in the "right" way.
    • Solve smaller problems first.
    • Put together solution from sub-solutions.

  • Simple, effective code:
    • Exploit the power of recursion.
    • Reusable data structures.
    • Simple, clear interfaces to data structures, and subproblem algorithms.

  • Statistics: exploit statistical characteristics in data.
 

Algorithms in the context of the rest of computer science:

  • The science of algorithms is an intellectual cornerstone of computer science.

  • Hard problems have been analysed and clever algorithms have made some computations routinely fast.
    Example: sorting

  • A large class of (but not all) algorithms go hand-in-hand with data structures.
    These data structures tend to be used again and again.

  • Often, some applications demand slight modifications in data structures.
    (So, it's not enough to use a package).

  • Algorithmic insight is a key skill.

  • Algorithm analysis is an integral part of many other courses.
 


1.4   Algorithms

 

What is an algorithm?

  • An algorithm is NOT a program.

  • An algorithm is: the key ideas underlying a program that remain the same no matter what language or machine is used.

  • An algorithm sometimes consists of sub-algorithms
         e.g., finding the convex hull is part of finding the largest distance.

  • Algorithms are typically described in pseudocode.

  • Algorithms are analyzed for speed independent of machine/language.

  • Algorithms are usually analyzed along with the problems they solve.
    (Analyzing a problem is often harder).
 


1.5   Describing algorithms: pseudocode

 

What is pseudocode?

  • A language-independent description of an algorithm.

  • Various "levels" of pseudocode:
    • Very-high level pseudocode, e.g., for Algorithm1 above:
      
          1. Scan through all possible pairs of numbers in the array.
          2. For each such pair, compute the absolute difference.
          3. Record the largest such difference.
             
    • English-like structured pseudocode:
      
          Input: an array of numbers, A
          1. Set the initial maximum to the difference between the first two
             in the array.
          2. Loop through all possible pairs of numbers:
          2.1 For each such pair, see if its difference is larger
              than the current maximum.
                  2.1.1 If so, record the new maximum and the values
                        that achieved it.
          3. Output the maximum found.
             

    • Detailed pseudocode, as found in many algorithms books:
      
          Algorithm1 (A)
            Input: A, an array of numbers.
      
                 // Set the initial maximum to the difference between the first two:  
            1.   max = Absolute-Difference (A[0], A[1])
      
                 //  Record the actual values: 
            2.   x = A[0],  y = A[1]
      
                 // Now compare each pair of numbers: 
            3.   for i=0 to length-1
      
            4.     for j=i+1 to length
      
                 // For each such pair, see if its difference is larger 
                 // than the current maximum. 
            5.       diff = Absolute-Difference (A[i], A[j])
      
            6.       if diff > max
      
                       // If so, record the new maximum and the values 
                       // that achieved it. 
            7.         max = diff
            8.         x = A[i], y = A[j]
      
            9.       endif
      
            10.    endfor
      
            11.  endfor
      
      Output: x, y, max
             
      Note:
      • Such pseudocode usually does not contain variable declarations or data-types
        (They are usually implied or left to the reader)
      • The pseudocode "language" is fictitious.
      • Boldface is usually used for keywords, and array indexing usually starts at 1.
      • Not all methods/functions are shown, especially when the meaning is obvious.

  • Pseudocode (all levels) is extremely useful in describing algorithms.

  • Notice how easy it is to express the key idea in high-level pseudocode.

  • Recommended practice:
    • Provide high-level pseudocode at the very minimum.
    • Provide separate detailed pseudocode where possible, usually in manuals or written documentation.
    • Include the high-level pseudocode at the beginning of a program.
    • Include the detailed pseudocode as comments in the program.
 


1.6   Analysing algorithms

 

Algorithms are analysed using the so-called "order notation":

  • The order-notation is initially confusing to learn.
  • Once mastered, it makes analysis and presentation easier.
  • The key idea: pay attention only to important details.

Consider the following example:


    for (int i=0; i < A.length-1; i++){
      for (int j=i+1; j < A.length; j++) {
        double diff = Math.abs (A[i] - A[j]);
        if (diff > max) 
          max = diff;
      }
    }
  
Let's analyse this piece of code:
  • The outer loop is executed A.length-1 times.
  • For the i-th outer-loop iteration, the inner one is executed A.length-1-i times.
  • In the inner loop, the Math.abs function is called every time.
  • The assignment inside the if is not always executed.
  • Assume it takes about s seconds for one iteration of the inner loop.
  • How many seconds overall?
\begin{eqnarray} \sum_{i=0}^{\mbox{A.length-2}} \sum_{j=i+1}^{\mbox{A.length-1}} s & \;\;\; = \;\;\; & s \sum_{i=0}^{\mbox{A.length-2}} (\mbox{A.length-1} - i) \\ & \;\;\;= \;\;\; & s \sum_{j=0}^{\mbox{A.length-1}} j \\ & \;\;\; = \;\;\; & s \frac{\mbox{(A.length)(A.length-1)}}{2} \end{eqnarray}

First, it's preferable to use math symbols:

\begin{eqnarray} \sum_{i=0}^{n-2} \sum_{j=i+1}^{n-1} s & \;\;\;= \;\;\; & s \sum_{i=0}^{n-2} (n - i - 1) \\ & \;\;\; = \;\;\; & s \sum_{j=0}^{n-1} j \\ & \;\;\; = \;\;\; & s \frac{n(n-1)}{2}\\ \end{eqnarray}

It is possible to simplify further by observing that $$ \frac{n(n-1)}{2} = \frac{n^2}{2} - \frac{n}{2} $$

Observe: the term that dominates is $$ \frac{n^2}{2} $$

Accordingly, for purposes of approximate analysis:

  • Ignore the "lower-order" terms.
  • Ignore constants.
⇒ The algorithm requires \(O(n^2)\) steps.
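We can confirm the count empirically by replacing the loop body with a counter (a quick sanity check, not part of the original notes):

```java
public class LoopCount {

    // Count how many times the body of the nested loop executes
    // for an array of length n.
    static long countIterations (int n)
    {
        long count = 0;
        for (int i = 0; i < n - 1; i++)
            for (int j = i + 1; j < n; j++)
                count++;
        return count;
    }

    public static void main (String[] args)
    {
        // The count matches n(n-1)/2 exactly:
        for (int n : new int[] {10, 100, 1000})
            System.out.println (n + ": " + countIterations (n)
                                + " vs " + (long) n * (n - 1) / 2);
    }
}
```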
 

Let's try a thought experiment: first, we'll rewrite the nested for-loop as:


    for (int i=0; i < A.length-1; i++){
      for (int j=i+1; j < A.length; j++) {
        // Instead of:
        //double diff = Math.abs (A[i] - A[j]);
        //if (diff > max) 
          //max = diff;
        // We'll represent the "work done at each step" as:
        System.out.print (" X");
      }
      System.out.println ();
    }
  
 

In-Class Exercise 6: What is the resulting output? Think of each X as a visual way of representing the "work done" inside the earlier version of the loop. Draw the result for n=8. For an array of length \(n\), how many X's are printed?
 


1.7   Background math

 

A quick math review:

  • What is a function?
    • A formal definition requires more set-theory definitions.

    • But we have an intuitive idea:
      e.g., \(f(x) = x^2\) implies \(f(3) = 3^2 = 9\)

    • Informally:
      • A domain of possible arguments - "what the function takes".
      • A range - "what the function produces".
      • For any argument, a unique result.
 

In-Class Exercise 7: Use a calculator and plot the following functions in the range (0,100) by "joining the dots" of function values computed at 0, 10, 20, ..., 100:

  1. \(f(x) = x\)
  2. \(f(x) = x^2\)
  3. \(f(x) = \log_2 (x) \)
  4. \(f(x) = x \; \log_2(x) \)
  5. \(f(x) = 2^x\)
  6. \(f(x) = e^x\)
 

  • Useful exponentiation rules: $$\eqb{ x^a x^b \eql x^{a+b} \\ \frac{x^a}{x^b} \eql x^{a-b} \\ (x^a)^b \eql x^{ab} }$$

  • Logarithms:
    • What exactly are they?
           ⇒ the "inverse of exponentiation"

    • Definition: \(y = \log_a(x)\;\;\) if \(a^y = x\).

    • Example: What is \(\log_2(256)\)?
      \(2^? = 256\)
      \(2^8 = 256\)
      Thus, \(\log_2(256) = 8\)

    • Another way of thinking about it:
      • Divide 256 by 2 to get: 256 / 2 = 128.
      • Then, 128 / 2 = 64
      • ...
      • How many times do we need to repeatedly divide by 2, starting with 256, to get 1?
        \(\log_2(256) = 8\) times.

    • What do logarithms have to do with algorithms?
      (Other than anagrammatically)

    • Example: searching
      • Consider an algorithm to search a data set of size n.
      • Break up the data into two pieces (size \(\frac{n}{2}\) each).
      • Figure out (quickly) which piece has the data.
      • Continue the search with data set of size \(\frac{n}{2}\).
      • Repeat the procedure.
      How long does this procedure take?
      How many times can you recursively break up the data into two pieces?
      Ans: \(\log(n)\) times.
      Thus, the procedure takes about \(\log(n)\) steps.

  • Note: \(\log_e(x)\) is sometimes written as \(\ln (x)\).
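The halving idea above is easy to check in code: count how many times a number can be halved before reaching 1 (a small sketch, not from the notes):

```java
public class HalvingCount {

    // How many times can we halve n (integer division) before reaching 1?
    // This is floor(log_2(n)).
    static int halvings (int n)
    {
        int count = 0;
        while (n > 1) {
            n = n / 2;
            count++;
        }
        return count;
    }

    public static void main (String[] args)
    {
        System.out.println (halvings (256));    // 8, i.e., log_2(256)
        System.out.println (halvings (1024));   // 10
    }
}
```

This is exactly the number of rounds the search procedure above performs.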
 

In-Class Exercise 8: Use the definition of logarithms and the exponentiation rules to show that: \(\log_a(xy) = \log_a(x) + \log_a(y)\).
 

  • Useful sums:
    • \(1 + 2 + 3 + ... + n \eql \frac{n(n+1)}{2}\)

    • \(1 + a + a^2 + a^3 + ... + a^n \eql \frac{(a^{n+1} - 1)}{(a-1)}\)

    • \(1 + a + a^2 + a^3 + \ldots \eql \frac{1}{1-a}\) when \(0 < a < 1\) (infinite sum)

    • \(1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \ldots + \frac{1}{n} \approx \log_e n\).
    Please memorize these. Better yet, try to derive them yourself.
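As reassurance while you memorize them, the first two sums are easy to check numerically (a throwaway sketch):

```java
public class SumCheck {

    // 1 + 2 + 3 + ... + n, computed the long way.
    static long sumToN (int n)
    {
        long s = 0;
        for (int i = 1; i <= n; i++)  s += i;
        return s;
    }

    // 1 + a + a^2 + ... + a^n, computed the long way.
    static double geometric (double a, int n)
    {
        double s = 0;
        for (int i = 0; i <= n; i++)  s += Math.pow (a, i);
        return s;
    }

    public static void main (String[] args)
    {
        int n = 100;
        // Both lines should print equal values:
        System.out.println (sumToN (n) + " = " + (long) n * (n + 1) / 2);
        double a = 3.0;
        System.out.println (geometric (a, 10) + " = "
                            + (Math.pow (a, 11) - 1) / (a - 1));
    }
}
```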
 


1.8   The order notation

 

The key things to remember about order notation:

  • It's a way to compare two functions (not numbers).

  • One of the functions will describe the running time (counted as "number of steps") of an algorithm (as a function of the input size).

  • The other function will be a "nice" and simple function like \(n^2\) or \(n\log(n)\).

  • The whole purpose is to have a simple way to describe how an algorithm performs as the data (problem) size grows.
 
Consider the functions:
  • \(f(n) = 9n^2 + 1000 n\)
  • \(g(n) = n^3\)
Think of \(f(n)\) as the time taken by some algorithm, and \(g(n)\) as a candidate "nice" function used for analysis.
 

An important thing to remember:

  • Because we are interested in algorithm analysis, functions like \(f(n)\) and \(g(n)\) will represent the number of steps taken by an algorithm.

  • Because of this, we will assume that \(f(n) \geq 1\) for any such function (an algorithm will take at least one step).

  • This additional assumption can sometimes create a minor technical inconvenience:
    • Suppose \(f(n) = \log_{10}(n) \).
    • Then, for \(n=2, f(2) = \log_{10}2 \approx 0.301 \).
    • Clearly, it doesn't make sense for an algorithm to take 0.301 steps.

  • We could make the math more consistent with added complication, for example, by using the "ceiling" operator:
    • \(f(n) = \lceil \log_{10}(n) \rceil \).
    • But this will clutter the math.

  • So, to keep it simple, whenever these functions are less than 1, we'll assume a "rounding up to 1".
 

In-Class Exercise 9: Download and execute TwoFunction.java. You will also need Function.java and SimplePlotPanel.java.

  • Confirm that the functions \(f(n) = 9n^2 + 1000 n\) and \(g(n) = n^3\) are being plotted.
  • At what value of \(n\) does \(g(n)\) rise above \(f(n)\)?
  • Can you derive this value of \(n\) by hand?
 

We'll now try and work towards a formal definition:

  • We sense from the above example that \(f(n)\) is "less than" \(g(n)\).

  • But that is not always true.
         ⇒ e.g., not true for \(n=10\).

  • Possible definition 1: \(f(n) = O(g(n))\) if \(f(n) \leq g(n)\) for all large enough \(n\).

  • This definition is often written as: \(f(n) = O(g(n))\) if there is some \(N\) such that \(f(n) \leq g(n)\) for all \(n > N\)

  • How to say it aloud: \(f(n)\) is "Oh-of" \(g(n)\) or "Big-Oh-of" \(g(n)\).

    In-Class Exercise 10: Consider \(f(n) = 9n^2\) and \(g(n) = n^2\). Is \(f(n) = O(g(n)) \)?

  • However, the dominant term in \(f(n) = 9n^2\) is \(n^2\)
         ⇒ We'd like to say that \(f(n) = 9n^2\) is "of the order of" \(g(n) = n^2\).

  • They are of the same order except for a constant multiplying factor.

  • Possible definition 2: \(f(n) = O(g(n))\) if there exists a constant \(c\) such that \(f(n) \leq c g(n)\).

    In-Class Exercise 11: Consider \(f(n)=9n^2\) and \(g(n) = n^2\). What choice of \(c\) suffices to show \(f(n) = O(g(n))\) by the above definition?

  • Does this always work?

    In-Class Exercise 12: Consider \(f(n) = 9n^2 + 1000n\) and \(g(n) = n^2\). Is there a choice of \(c\) such that \(f(n) = O(g(n))\) by the above definition?

  • Possible definition 3: \(f(n) = O(g(n))\) if there exist constants \(c\) and \(N\) such that \(f(n) \leq c g(n)\) for all \(n > N\).

    In-Class Exercise 13: Consider \(f(n) = 9n^2 + 1000n\) and \(g(n) = n^2\). Suppose \(c=10\). Is there a value of \(N\) such that \(f(n) \leq c g(n)\) for all \(n > N\)? Plot f and g by modifying the code in TwoFunction.java. Change the upper limit of the for-loop to 2000.

  • Suppose we pick \(c=100\). Then $$ 100 g(n) \eql 100 n^2 $$ which we are going to write as $$\eqb{ 100 g(n) & \eql & 100 n^2 \\ & \eql & 9n^2 + 91n^2 }$$ Now $$ 100 g(n) \geq f(n) $$ will be true if $$ 9n^2 + 91n^2 \;\; \geq \;\; 9n^2 + 1000n $$ And this will be true for values of \(n\) where $$ 91n^2 \geq 1000 n $$ which holds when $$ n \geq 11 $$ Thus \(c=100\) and \(N=11\) suffice to use Definition 3 to prove that \(f(n) = O(g(n))\).
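These constants can also be checked mechanically. The sketch below (a sanity check, not a proof) evaluates both functions at the boundary and spot-checks a range of larger values:

```java
public class BigOhCheck {

    // f and g from the running example.
    static double f (double n) { return 9*n*n + 1000*n; }
    static double g (double n) { return n*n; }

    public static void main (String[] args)
    {
        double c = 100;

        // The bound fails just below the threshold and holds from n = 11 on:
        System.out.println (f(10) <= c * g(10));   // false
        System.out.println (f(11) <= c * g(11));   // true

        // Spot-check a range of larger n:
        boolean holds = true;
        for (int n = 11; n <= 1000000; n++)
            if (f(n) > c * g(n))  holds = false;
        System.out.println (holds);                // true
    }
}
```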

  • Thus, the higher we pick \(c\), the lower \(N\) needs to be.

    In-Class Exercise 14: What value of \(N\) is sufficient when \(c=1000\)?

 
All of the above reasoning leads to this definition:

  • Definition 4: We say that \(f(n) = O(g(n))\) if there exists a constant \(c\) such that \(f(n) \leq c g(n)\) for all \(n > 1\).
 
Note:
  • Many books use definition 3 above. We will use definition 4.

    In-Class Exercise 15: Can you show that definitions 3 and 4 are equivalent?

  • Useful facts:
    • Fact 1: If \(f_1(n) = O(g_1(n))\) and \(f_2(n) = O(g_2(n))\) then
      1. \(f_1(n) f_2(n) = O(g_1(n) g_2(n))\)
            ⇒ Simple product rule

      2. \(f_1(n) + f_2(n) = O(\max(g_1(n), g_2(n)))\)
            ⇒ Use the bigger of the two \(g_i\)'s

    • Fact 2: If \(f(n) = a_kn^k + a_{k-1}n^{k-1} + \ldots + a_1 n + a_0\) then \(f(n) = O(n^k)\)
          ⇒ The "order" of a polynomial is dominated by its largest power.

    In-Class Exercise 16: Show how Fact 1 implies Fact 2.

  • We almost always prefer the tightest possible bound:
    • It is true that \(9n^2 = O(n^5)\), but this bound is vastly exaggerated.
    • The tightest possible: \(9n^2 = O(n^2)\).

  • Incorrect usage: \(f(n) < O(n^2)\) or \(f(n) \leq O(n^2)\).

  • Don't provide a constant in the order: e.g., \(f(n) = O(5n^2)\) is incorrect usage.

  • Note: whether we are using the \(n \geq 1\) or \(n \geq N\) definition, the assumption is that \(f(n)\) and \(g(n)\) are strictly positive for all n.
        ⇒ Execution time is non-zero for any value of n.
 
Related definitions:
  • We say that \(f(n) = \Omega(g(n))\) if \(g(n) = O(f(n))\)
        ⇒ Informally, \(f\) is at least as large as \(g\), e.g., \(f(n) = 4n^3\) and \(g(n) = 6n^2\)

  • We say that \(f(n) = \Theta(g(n))\) if both \(f(n) = O(g(n))\) and \(f(n) = \Omega(g(n))\)
        ⇒ Informally, \(f\) and \(g\) are of the same order, e.g. \(f(n)=4n^3+8n^2\) and \(g(n)=n^3+n\)
 

The two ways in which we use Big-Oh:

  • Mathematical way:
    • In this case, we are given two functions, like \(f(n) = 9n^2 + 1000n\) and \(g(n) = n^2\) and are asked:
      • Is \(f(n) = O(g(n))\)?
      • Is \(g(n) = O(f(n))\)?
    • What needs to be done for this type of problem?
      • You need to find a constant \(c\) such that \(f(n) \leq c g(n)\) for all \(n > 1\).
      • Or: \(g(n) \leq c f(n)\) for all \(n > 1\).

  • Algorithmic way:
    • In this case, we have an algorithm such as the nested-for:
      
                 // ... stuff ... 
                 for i=0 to length-1
                     for j=i+1 to length
                     // ... more stuff ...
             
    • Suppose \(n\) is the length.
    • Suppose the true running time of the algorithm is \(R(n)\)
    • Then, we want to find the slowest growing function \(f(n)\) such that $$ R(n) \eql O(f(n)) $$
    • In this example, it would be technically correct to say $$ R(n) \eql O(n^3) $$
    • But the tighter bound is $$ R(n) \eql O(n^2) $$ because $$ R(n) \eql \frac{n(n-1)}{2} \eql \frac{n^2}{2} - \frac{n}{2} \eql O(n^2) $$

  • In the last step above, how do we know $$ \frac{n^2}{2} - \frac{n}{2} \eql O(n^2)? $$
    • We need to find \(c\) such that $$ \frac{n^2}{2} - \frac{n}{2} \leq c n^2 $$
    • In this example, \(c=1\) is enough.

  • Later in the course, we will not have to find \(c\) but will reason more or less directly from the algorithm.
 

In-Class Exercise 17: Implement and analyze the Selection Sort algorithm:

  • SelectionSort works like this: Suppose the array is in data[0], data[1] .... First find the smallest element and put that in data[0]. Then find the smallest element in the range data[1],...,data[data.length-1] and put that in data[1], etc.
  • Download SelectionSort.java and UniformRandom.java
  • Write your code in the file SelectionSort, to sort a data set of integers.
  • Then test by executing SelectionSort.
  • If \(f(n)\) denotes the number of comparisons made by SelectionSort for a data set of size \(n\), what is the smallest value of \(k\) for which \(f(n) = O(n^k)\)?
  • Count the number of comparisons - does your experimental evidence match the analysis?
  • For a given array size, does the number of comparisons depend on the input data?
 


1.9   Worst-case and other comparisons

 

Consider comparing two algorithms A and B on the sorting problem:

  • On some data sets A might perform better than B. On others, B might perform better.
    How to compare them?

  • The standard approach: a worst-case order analysis:

  • Other approaches:
    • Best-case (theoretical)
    • Average-case (theoretical)
    • Average-case (experimental)

  • Other kinds of issues:
    • Complexity of implementation.
    • Practical data sets, problem sizes.
    • Relation to application, relative importance of algorithm.
    • Exploiting architecture, operating systems.
 

Typical worst-case analysis:

  • Assume the worst possible input for the algorithm.

  • Identify key operations that take unit time.

  • Count number of operations using order notation.
 

Theoretical average-case analysis:

  • Assume a probability distribution over possible inputs.

  • Derive expected running time using distribution.

  • Express expected running time using order notation.

  • Typically very difficult even for simple problems/algorithms.

  • Not often practical, even if feasible.
 

Experimental average-case analysis:

  • Compare algorithms on randomly generated data sets.

  • Running a single algorithm sometimes makes little sense.
    (Depends on processor, system architecture, OS).

  • Create different types of data sets (uniform, non-uniform).

  • Test over wide range of problem sizes.

  • Accompany with worst-case analysis.
 

A comprehensive analysis of a problem:

  • Prove a lower-bound on the problem.
    (e.g., comparison-based sorting requires \(\Omega(n \log n)\) comparisons).

  • Devise a fast algorithm.

  • Analyse worst-case complexity of algorithm.

  • Experimentally evaluate algorithm: compare with others and lower-bound.

  • See if special cases can be solved faster.
 


1.10   Simple timing of a Java program

 

Use the System.currentTimeMillis() method:


      long startTime = System.currentTimeMillis();

      // ... algorithm runs here ...
      
      long timeTaken = System.currentTimeMillis() - startTime;
  
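For example, here is the pattern applied to the library sort (a usage sketch; note that millisecond resolution is coarse, so for very short runs System.nanoTime() is the finer-grained alternative):

```java
import java.util.Arrays;
import java.util.Random;

public class TimingExample {

    // Sort A in place with the library sort and return the elapsed time in ms.
    static long timeSort (double[] A)
    {
        long startTime = System.currentTimeMillis ();
        Arrays.sort (A);
        return System.currentTimeMillis () - startTime;
    }

    public static void main (String[] args)
    {
        // Generate a large random array.
        Random rand = new Random ();
        double[] A = new double[1000000];
        for (int i = 0; i < A.length; i++)
            A[i] = rand.nextDouble ();

        System.out.println ("Sorting took " + timeSort (A) + " ms");
    }
}
```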
 

In-Class Exercise 18: Compare your selection sort implementation with the sort algorithm in the Java library. Download SelectionSortTiming.java and insert your code from the previous exercise.
 


1.11   Famous algorithms

 

A brief mention of some "famous" algorithms and data structures:

  • Hashing:
    • Probably the "most used" data structure.
    • Useful for "equality" search, sets.
    • Except for pathological cases, extremely fast for insertion and retrieval.

  • FFT:
    • Fast Fourier Transform.
    • Simple to implement, needs math background to understand.
    • Used as a key building-block in just about every signal-processing software.
    • Considered the "most used" algorithm of all time.

  • Dijkstra's Shortest-Path algorithm:
    • Currently the algorithm used for routing in the Internet.
    • Simple to implement, using a binary heap.
    • Interesting theoretical improvements exist.
    • Since tens of thousands of internet routers run this algorithm every few seconds, perhaps it should be considered the "most executed" algorithm.

  • Quicksort:
    • Exemplar of the "divide and conquer" technique.
    • Easy to implement, hard to beat.
    • Poor worst-case performance, excellent average-case performance.

  • B-tree:
    • Devised for accessing external files.
    • Variations are used in almost every commercial database system.
    • Simple, yet powerful.
    • Block-oriented.

  • Binary tree (a whole class of data structures):
    • Various versions: plain, AVL, red-black etc.
    • Some are self-balancing.
    • Opened a new line of analysis: amortized analysis.
    • Useful as both a "dictionary" and "priority-queue".

  • Binary heap:
    • Simple, efficient priority queue.
    • Used in various graph algorithms.
    • Efficient (and unusual) array implementation.
    • Can be (but rarely) used for sorting.

  • Depth/breadth-first:
    • The key idea: how to "explore" a graph like structure systematically.
    • Breadth-first is used in other systematic searches.

  • Dynamic "programming":
    • More an idea than an actual algorithm.
    • Produces optimal results in a variety of problems.
    • A bit hard to understand.

  • Simulated Annealing:
    • A simple algorithm, based on an interesting metaphor.
    • Used for solving combinatorial problems.
    • Not very efficient, but very easy to implement.

  • The RSA algorithm:
    • Named after Rivest-Shamir-Adleman.
    • The first practical public-key (asymmetric) cryptographic algorithm.
    • Easy to implement (but not as easy to implement efficiently).
    • Was patented (royalties applied) until the patent expired in 2000.
    • Very popular, has made authors very rich.
    • Long-term effectiveness called into question by Shor's quantum factoring algorithm (1994).

  • The Simplex algorithm:
    • Used in linear programming, a class of optimization problems.
    • Part of almost every optimization package.
    • Needs some math background to understand.
    • NOT an efficient algorithm (provably faster algorithms exist), but "undefeated" until recently (by Karmarkar's algorithm).
 

In-Class Exercise 19: Do a web search on "top ten algorithms". What do you see?


© 2001, Rahul Simha