Assignment 2


 

  1. PART I: Pen-and-paper (PDF) exercises:
    1. Analyze the following code and provide a "Big-Oh" estimate of its running time in terms of n. Explain your analysis.
      
          sum = 0;
          for (int i=1; i <= n; i++) {
            for (int j=1; j <= n/i; j++)
              sum = sum + j;
          }
          
    2. Simplify O (n * (n2 + log(n) * (3*n + (3*n2)/log(n) ) ) ).
    3. Module exercise 4.5.
    4. Module exercise 5.7. In particular, first write recursive pseudocode for a non-wildcard search. Then write recursive pseudocode for wildcard search as directed in the exercise.
    5. Array section problem, part 2. Review the array section problem described in Homework-2. Now assume that the elements in the arrays A and B are unique: there are no duplicates in A, and none in B. (Of course, there are elements in A that are in B, otherwise we would not find a common section). In this case, show that there is a faster implementation of findSection() that runs in O(n). [Hint: think about first solving this problem: for every element in A, what is its location in B.] Assume both arrays are of similar size (n elements).

    Note: Credit will not be given only for answers - show all your work: your reasoning, steps you took to get your answer etc.

  2. Part I Implementation: Implement the first (naive) algorithm for the closest pair problem below.

  3. PART II: In this part of the assignment you will implement algorithms for finding the distance between the closest pair of points in a given set of points. This problem is similar to the maximum-distance problem in Module 1, but the algorithms are quite different. You will implement and compare three algorithms for this problem, in increasing order of difficulty and speed. All three algorithms must implement the ClosestPairAlgorithm interface.
    1. The first, called NaiveClosestPair, simply scans all possible pairs, computes the distance for each pair and tracks the least.
    2. The second, called Grid, tries to avoid some comparisons by categorizing the data:
      • First, impose a grid of cells on the region. (Scan the points to identify the region). For example, suppose we impose a 10 x 10 grid on the region.
      • Scan the points and for each point, identify the cell it lies in.
      • Now, create a matrix of linked-lists, one list per cell. The list for a cell should contain the points that lie in that cell.
      • For each cell, find the minimum distance of points in the cell. In all likelihood, the minimum distance will be among these values.
      • Now perform pairwise cell comparisons. A naive approach is to compare every pair of cells - this is tantamount to an all-pairs points comparison. Instead, you should be clever in avoiding comparisons between cells that are too far apart.
    3. The third, called DivideAndConquer, uses the technique described in Section 33.4 of the Cormen textbook. This is more complicated to understand than to implement. So, read it a couple of times, and try out a small example by hand. Then, follow these steps:
      • First, we'll need to create point-arrays sorted by X and sorted by Y. To do this, create a class called, say, SortablePointd, that implements the Comparable interface and that holds a point. The idea is, once any array of such objects is created, you can call java.util.Arrays.sort () to efficiently sort:
            SortablePointd[] pointsSortedByX = new SortablePointd [n];
            // ... insert data ...
            java.util.Arrays.sort (pointsSortedByX);
            // Now pointsSortedByX is sorted by X.
            
        What should SortablePointd look like? Something like this:
        class SortablePointd implements Comparable {
        
          // Constructor:
          public SortablePointd (Pointd p, boolean sortByX)
          {
          }
        
          public int compareTo (Object obj)
          {
            // ... Do the comparison based on whether it's by X or Y ...
          }
         
          // ...
        }
            
        Efficient sorting is an important part of the algorithm.
      • The overall structure looks like this (pseudocode):
                1. Sort points by X into array pointsSortedByX;
                2. Sort points by Y into array pointsSortedByY;
                3. dist = recursiveClosestPair (pointsSortedByX, 0, n-1, pointsSortedByY);
                4. return dist;
             
      • All the work is done in recursiveClosestPair():
                // Handle bottom-out cases for recursion ... not shown ...
        
                1. Identify median point in pointsSortedByX.
                2. Move all points in pointsSortedByY whose X value is less or equal
                   than median's X into leftYList;
                3. Move all points in pointsSortedByY whose X value is greater
                   than median's X into rightYList;
        
                   // Recurse:           
                4. leftDist = recursiveClosestPair (pointsSortedByX, left, median, leftYList);
                5. rightDist = recursiveClosestPair (pointsSortedByX, median+1, right, rightYList);
                6. minDist = min (leftDist, rightDist);
        
                   // Now worry about the band. These are the points 
                   // whose x is within median-minDist to median+minDist
                7. Go through pointsSortedByY and add to listOfPointsInBand;
                8. if (no new points in band)
                     return minDist;
        
                   // Otherwise, need to check band points against each other.
                9. For each band point, compute distance to next 7 points
                   and compare with minDist.
        
                10. return minDist;
             
      What to deliver:
      • Implementations of the three algorithms.
      • Write your own test code to test your algorithms. Provide evidence that you tested your code.
      • Performance comparison between your algorithms. For this purpose, you can use this properties file. Note that you can turn off performance-testing while testing.
      • Time-saving hint: do not write your own linked-list code. Use java.util.LinkedList instead.
      Relevant links:
      Algorithm interface
      ClosestPairAlgorithm interface
      Pointd class

    Submission:

    • Part I is due one week before Part II in the appropriate place in Blackboard. However, you would be well-advised to get started with Part II early.
    • For this assignment and others, you will need to follow the usual submission instructions carefully.