Analyze the following pseudocode and provide a "Big-Oh" estimate of
its running time in terms of n. Explain your analysis.
s = 0
i = 1
while s < n
s = s + i
i = i + 1
endwhile
Explain (with a detailed argument)
that a full trie with O(n) leaves has at most O(n)
interior nodes.
This part gives added weight to completing in-class exercise 5.7.
First, write recursive pseudocode for
a non-wildcard search. Then write recursive pseudocode for
wildcard search as directed in the exercise.
Note: Credit will not be given only for answers - show all your
work: your reasoning, steps you took to get your answer etc.
PART I: Implementation:
In this part, you will implement the naive algorithm for
the line-segment intersection problem. In the
line-segment intersection problem, the input is a collection of
line-segments some of which may intersect. You are to report all such
intersections (if any exist). The naive algorithm
works by examining each possible pair
of line-segments and determining whether the pair has an
intersection point, similar to the naive algorithm for
the largest-distance problem in Module 1.
Now, determining whether two segments intersect requires some
clarification,
for example:
Segments AB and CD properly intersect (no common endpoints, no
collinearity, and no endpoint of one segment is on the other
segment).
Segments AB and EF are also said to properly intersect (no
collinearity, but E lies on AB).
All other cases result in invalid intersections (by our
definition).
Clearly, GH and IJ don't intersect.
KL and LN share an endpoint only - we do not consider this a
proper intersection.
Also, LN and LM are collinear segments. Again, not a proper intersection.
The goal is to report all proper intersections. To make implementation
easy, you do not have to write code to determine whether an
intersection is proper. Simply call the properIntersection
method in the Geometry class: it returns the intersection
point if a proper intersection was found.
To implement:
Write your code in a class called
NaiveIntersection.java. This class will need to implement
the LineSegmentIntersectionAlgorithm interface.
The classes/interfaces involved are: (you won't need most of it)
Algorithm interface.
The interface for all algorithms that use the test-environment.
Geometry class.
A library of useful methods in geometry, of which you will only
need the properIntersection() method.
Pointd class.
A class that represents a point with x and y values (as double's).
LineSegment class.
A class that represents a line segment (two points).
IntersectionInfo class.
Intersection computations result in all sorts of strange
cases. The details are reported in an instance of this
class. This is useful if you are doing your own intersection
computations (not needed for the naive algorithm).
Note: all data will be generated in the unit square.
PART I: Pseudocode:
Read through part II below and provide pseudocode for your
algorithm(s).
Since part II is not trival, this will get you started with thinking
about part II.
Write two levels of pseudocode, one high-level English-like,
and the other a bit more detailed (like we've used in the modules).
PART II: implementation
In this part, you will implement a variation of 2D-hashing to solve
the line-segment intersection problem and investigate its efficiency.
In particular:
Name your algorithm HashIntersection.java.
Your algorithm will need to implement the
LineSegmentIntersectionAlgorithm
interface that itself extends the
Algorithm interface
you've already used. Also see the other javadocs referenced above.
The key idea: build a 2D hash table for line-segments and use
it to cut down the number of intersection computations. Here's how
it works:
Recall the 2D hash table for points discussed in class: the table
is a grid, each cell of which is a linked list of points that get
hashed to the cell.
A 2D hashtable for line-segments is similar, except that a
line-segment may traverse several cells:
Here:
Segment CD traverses cells (1,1), (1,2) and (2,3) whereas
segment EF traverses (1,2) and (2,1).
We place segment CD in the lists for ALL cells that it
traverses. Likewise for every segment.
Now suppose we are processing segment CD: (1) simply examine
the cells that it traverses; (2) look at all the segments in
these cells; (3) perform intersection checks with these
segments. If an intersection occurs, report it.
Clearly, segment AB will never be examined when
processing segment CD, thereby reducing the number of
intersection computations.
Note that just because segments share a cell, they need
not intersect: see segments CD and GH above.
Now, the above case is somewhat exaggerated. Hopefully, most
line segments will lie entirely in one (or at most two) cells and
thus, we would cut down the number of intersection computations
quite drastically. At the same time, there is all this overhead of
building the hashtable (which is done from scratch for every call
to the algorithm). This is the tradeoff you are going to explore.
Geometric algorithms always have to deal with "ugly special
cases" and 2D hashing is no exception. What do we do about segment
CD which appears to "go through" a cell corner? Should it be
inserted into the lists of cells (1,3) and (2,2)as well?
You will need to deal with these types of special cases.
How many cells should the hash table have? The more cells,
the more refined the grid and therefore the better the chance of
separating out the intersections. However, the more refined the
grid the more cells are likely to be traversed. This is another
issue for you to explore. The example above uses 4 intervals on
each axis. The number of intervals to use will be specified in the
properties file.
Suggested steps for implementation:
Start with some pen-and-paper work: write down pseudocode for
various parts such as, determining the interval size on each axis,
how to compute cell coordinates, how to determine the cells that a
segment traverses etc.
Submit this pseudocode in Part I.
You will find it useful to write code that identifies and
enumerates all cells traversed by a line-segment.
Note: you will have to handle several special cases.
Get started by using the following template which reads the "number
of intervals" from the properties file.
Then, develop pseudocode for building the hashtable,
inserting a line segment.
Implement a single cell and make sure its working.
Then, implement a 3 x 3 table (set
alg1.numIntervals=3 in the properties file) and test it.
Next, debug your code on a 2 x 2 table (which, paradoxically
is harder using our test data - you'll see why).
Test with larger sizes.
Important: if you create additional classes (you are
encouraged to), name then uniquely by putting your username in the
name of the class. For example, if you implement a "Cell" class
and your username is batman, call the class BatmanCell.
This is because grading is done with a single classpath variable
and therefore might result in name conflicts.
You can create really small segments by adjusting the range of
segment-sizes in the properties file. It should work better with smaller segment sizes.
Compare the performance of your 2D-hashing algorithm with the
naive method. Submit some plots for various data sizes, grid sizes
and segment lengths.
You may use a2.props2 with
the test environment. After your code is working, test
for performance by setting
testPerformance=true.
Submit your pseudocode as part of your Part II submission as well.
Submission:
Part I is due one week before Part II in the appropriate
place in Blackboard.
However, you would be
well-advised to get started with Part II early.
For this assignment and others, you will need to follow
the usual submission instructions
carefully.