The Naive Algorithm
Chapter: Exact Pattern Matching
Now let's return to the exact pattern matching problem.
Let's say the target sequence is an array t[n] and the pattern
sequence is the array p[m]. A naive approach to the problem
would be:
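    for (int i = 0; i <= n-m; i++) {        // try every alignment of p against t
        int j = 0;
        while (t[i+j] == p[j]) {            // compare pattern and target at offset i
            if (j == m-1) return(true);     // all m characters matched
            else j++;
        }
    }
    return(false);                          // no alignment matched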
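
The loop above can be wrapped into a small self-contained Java program, sketched here under a few assumptions: the class name NaiveSketch, the method name contains, and the sample strings are illustrative and do not come from the lab. Like the pseudocode, it only reports whether the pattern occurs at all, and treating an empty pattern as a trivial match is an added assumption.

```java
public class NaiveSketch {
    // Returns true if pattern p occurs somewhere in target t.
    // Hypothetical helper: the name and signature are illustrative only.
    static boolean contains(char[] t, char[] p) {
        int n = t.length;
        int m = p.length;
        if (m == 0) return true;              // assumption: empty pattern matches trivially
        for (int i = 0; i <= n - m; i++) {    // try every alignment of p against t
            int j = 0;
            while (t[i + j] == p[j]) {        // compare one character at a time
                if (j == m - 1) return true;  // all m characters matched
                j++;
            }
        }
        return false;                         // no alignment matched
    }

    public static void main(String[] args) {
        System.out.println(contains("GATTACA".toCharArray(), "TAC".toCharArray()));
        System.out.println(contains("GATTACA".toCharArray(), "CAT".toCharArray()));
    }
}
```

The inner loop never reads past the end of t because i stops at n-m and j stops at m-1, so i+j is at most n-1.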

- Q. 1
- What is the running time of the naive algorithm in big-Oh notation?

How many comparisons are made when the naive algorithm attempts to
find the pattern 001 in the target 00001000101001? I'd like a
non-programmer in the team to do this exercise
on paper with a pen or pencil. Repeat for the pattern 0011 in the same target.
Deliverable: Jot down your results in your lab notebook.

Code up the naive algorithm.
Deliverable: A program Naive.java such that java Naive string file
prints the starting location of every occurrence of string in the forward
strand of the genome stored in the file named file. For example,
java Naive ATGATGATGATG ecoli.txt should print every
location in the forward strand of E. coli where the string ATGATGATGATG
begins. Show the running program to Rhys and record their
comments in your running file. I do want your programmer to do this
exercise in Java or C, because Perl has
too much built in for you to properly understand the costs of this naive
algorithm.

For the non-programmers in the team, here is what I would like you to be
doing in the meantime.
Design an experiment to determine the relationship between running times
for the naive algorithm and the size m of the pattern and the size n of the
target string in the genome file. When your programmer is ready, use their
program to conduct the experiment.
Deliverable: The results of your experiment, preferably including
a graph that shows the relationship clearly.
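
One possible shape for such an experiment is sketched below, as a starting point rather than a prescribed design. Everything named here is an assumption: TimingSketch, countMatches and randomDna are hypothetical, countMatches is only a stand-in for your programmer's own Naive program, and the sizes chosen for n and m are arbitrary. The sketch uses random strings over ACGT for convenience, and real genome data may behave differently.

```java
import java.util.Random;

public class TimingSketch {
    // Hypothetical stand-in for the team's Naive program:
    // counts the occurrences of pattern p in target t.
    static int countMatches(char[] t, char[] p) {
        int n = t.length, m = p.length, count = 0;
        for (int i = 0; i <= n - m; i++) {
            int j = 0;
            while (j < m && t[i + j] == p[j]) j++;
            if (j == m) count++;              // all m characters matched at offset i
        }
        return count;
    }

    // Builds a pseudo-random string over ACGT; a fixed seed makes runs repeatable.
    static char[] randomDna(int length, long seed) {
        Random rng = new Random(seed);
        char[] bases = {'A', 'C', 'G', 'T'};
        char[] s = new char[length];
        for (int k = 0; k < length; k++) s[k] = bases[rng.nextInt(4)];
        return s;
    }

    public static void main(String[] args) {
        int[] targetSizes = {100_000, 200_000, 400_000};   // values of n (arbitrary)
        int[] patternSizes = {4, 16, 64};                   // values of m (arbitrary)
        System.out.println("n\tm\tmatches\tmillis");
        for (int n : targetSizes) {
            char[] t = randomDna(n, 1L);
            for (int m : patternSizes) {
                char[] p = randomDna(m, 2L);
                long start = System.nanoTime();
                int matches = countMatches(t, p);
                long elapsed = System.nanoTime() - start;
                System.out.printf("%d\t%d\t%d\t%.3f%n", n, m, matches, elapsed / 1e6);
            }
        }
    }
}
```

The tab-separated output can be pasted into a spreadsheet for graphing. Single timings in Java tend to be noisy, so it is probably worth repeating each measurement several times and discarding the first few runs before drawing conclusions.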
rhyspj@gwu.edu