Biologically-Inspired Algorithms


Preliminaries: What is a Graph?

Informal definition: a collection of vertices (points), some pairs of which are connected by edges (links).

Formally: a graph G = (V, E) consists of a vertex set V and an edge set E, where each edge is a pair of vertices (unordered pairs for undirected graphs, ordered pairs for directed graphs).

Depicting a graph: draw each vertex as a labeled circle and each edge as a line (or an arrow, in the directed case) joining its two endpoints.

Exercise: What is the maximum number of edges in an undirected graph with n vertices? What is this number in order-notation?

Paths: a path is a sequence of vertices in which each consecutive pair of vertices is joined by an edge.

Why are graphs important? They model an enormous range of problems (road and communication networks, scheduling, the TSP and MST problems below), so a single graph algorithm can apply to many different-looking problems.


Optimization Problems

What are they? Problems in which we must select, from a space of candidate solutions, one that minimizes (or maximizes) a given cost function.

Example: the Travelling Salesman Problem (TSP)

  • Input: a collection of points (representing cities).

  • Goal: find a tour of minimal length.
    Length of tour = sum of inter-point distances along tour

  • Details:
    • Input will be a list of n points, e.g., (x0, y0), (x1, y1), ..., (xn-1, yn-1).
    • Solution space: all possible tours.
    • "Cost" of a tour: total length of tour.
      => sum of distances between points along tour
    • Goal: find the tour with minimal cost (length).

  • Note: strictly speaking, we have defined the Euclidean TSP.
    => There is also a graph version that we will not consider

  • Applications:
    • Logistics: trucking and delivery problems.
    • Machine operation (e.g., drilling a collection of holes)

Exercise: For an n-point TSP problem, what is the size of the solution space (i.e., how many possible tours are there)?

Example: the Bin Packing Problem (BPP)

  • Input: a collection of items and unlimited bins.

  • Goal: pack the items into as few bins as possible.

  • Details:
    • Input is a list of n integer item sizes s0, s1, ..., sn-1 and an integer bin size B.
    • Solution space: all feasible packings
      => all assignments of items to bins such that no bin overflows. (A constraint).
      => sum of sizes of items assigned to each bin is at most B.
    • Goal: find the assignment that uses the fewest bins.
    • Note: assume si < B.

  • Alternate description:
    • Item sizes: s0, s1, ..., sn-1 where si < B.
    • Define the assignment function:
      dij = 1 if item i is placed in bin j, and dij = 0 otherwise.
    • B = bin size.
    • Goal: minimize k, the number of bins used,
      such that:
      1. For each bin j:    s0 d0j + s1 d1j + ... + sn-1 dn-1,j <= B    (all items assigned to bin j fit into the bin)
      2. For each item i:   di0 + di1 + ... + di,k-1 = 1    (each item is assigned to exactly one bin)

Exercise: Consider the following Bin Packing problem: there are three items with sizes 1, 2 and 3 respectively, and a bin size of 6. Enumerate all possible assignments.

Example: Quadratic Programming Problem

  • Input: n coefficients a0, a1, ..., an-1.

  • Goal: minimize a0 x0^2 + a1 x1^2 + ... + an-1 xn-1^2.

  • Constraint: x0 + ... + xn-1 = K

  • Here, each xi is real-valued.

  • Example:
    Minimize 3 x0^2 + 4 x1^2
    Such that x0 + x1 = 10
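
  • A worked solution for this example (a sketch using a Lagrange multiplier, a standard method for such constrained problems): at the minimum, 6 x0 = lambda and 8 x1 = lambda, so x0 = lambda/6 and x1 = lambda/8. The constraint then gives lambda (1/6 + 1/8) = 10, i.e., lambda = 240/7. Hence x0 = 40/7 ≈ 5.71 and x1 = 30/7 ≈ 4.29, with minimum value 8400/49 ≈ 171.4.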

Types of optimization problems:

  • The Quadratic Programming problem is very different from both TSP and BPP.

  • TSP and BPP are similar in some structural respects.

  • Two fundamental types of problems:
    • Discrete optimization problems: finite or countable solution space.
      => usually finite number of potential solutions.
    • Continuous optimization problems: uncountable solution space
      => variables are real-valued.

  • Examples:
    • Discrete: TSP, BPP
    • Continuous: Quadratic Programming problem.

  • We will focus on discrete optimization problems.


Problem Size and Execution Time

Problem size:

  • Typically, the "size" of a problem is the space required to specify the input:

  • Example: TSP
    => For an n-point problem, space required is: O(n).

  • Example: BPP
    => For an n-item BPP, O(n) space is required.

  • Non-Euclidean TSP
    • Recall: a graph is given
      => required to use only edges in the graph.
    • A graph with n vertices can have O(n^2) edges
      => O(n^2) space required for input.

  • Consider this TSP example:
    • Input: "The points are n equally spaced points on the circumference of the unit circle".
    • How much space is required for an input of 10 points? For 1000 points?
      => O(1)!

  • The above example is not general
    => we need O(n) space (worst-case) to describe any TSP problem.

  • Terminology: instance
    • A problem instance is a particular problem (with the data particular to that problem).
    • Example: TSP on the 4 points (0, 1), (0.5, 0.6), (2.5, 3.7) and (1, 4).
      => an instance of size 4.
    • Another instance: TSP on the 5 points (0, 1), (0.5, 0.6), (2.5, 3.7), (1, 4) and (6.8, 9.1).
      => an instance of size 5.

Execution time of an algorithm:

  • What we expect of an algorithm:
    • An algorithm is given its input and then executed.
    • The algorithm should produce a candidate solution or report that no solutions are possible.

  • Example: TSP
    • An algorithm is given the set of input points.
    • The algorithm should output a tour after execution.

  • Example: BPP
    • An algorithm is given the item sizes and bin size as input.
    • The algorithm should output an assignment
      (Or report that some items are too large to fit into a bin).

  • Output size:
    • Example: TSP
      • Output is a tour
        => O(n) output for n points.
    • Example: BPP
      • Output is an assignment.
      • If the assignment is specified as the function
        dij = 1 if item i is placed in bin j, and dij = 0 otherwise,
        then the output could be as large as O(n^2).

  • Execution time:
    => Total execution time includes processing input and writing output.

Consider these two algorithms for TSP:

  • Algorithm 1:
    1. Initially the set P = {0, ..., n-1 } and the set Q is empty.
    2. Move 0 from P to Q.
    3. Repeat the following until P is empty:
      • Suppose k was the point most recently added to Q.
      • Find the point in P closest to k and move that to Q.
    4. Output points in the order in which they were added to Q.

  • Algorithm 2:
    1. Generate a list of all possible tours and place in an array (of tours).
    2. Scan array and evaluate the length of each tour.
    3. Output the minimum-length tour.
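
  • A minimal Python sketch of Algorithm 2, assuming the input is a list of (x, y) tuples (the function names are ours; the notes do not prescribe an implementation):

      import itertools
      import math

      def tour_length(points, tour):
          # Sum of inter-point distances along the tour, returning to the start.
          n = len(tour)
          return sum(math.dist(points[tour[i]], points[tour[(i + 1) % n]])
                     for i in range(n))

      def brute_force_tsp(points):
          # Algorithm 2: enumerate all tours and keep the shortest.
          # Fixing point 0 as the start avoids re-counting rotations of the
          # same tour; there are still (n-1)! candidates, hence exponential time.
          best = min(itertools.permutations(range(1, len(points))),
                     key=lambda rest: tour_length(points, (0,) + rest))
          return (0,) + best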

Exercise: Consider the following 4 input points: (0,0), (1,0), (1,1) and (0,-2).

  1. Show the steps in executing each algorithm on this input.
  2. What is the complexity (execution time) of Algorithm 1 on an input of size n?
  3. What is the complexity of Algorithm 2 on an input of size n?

Polynomial vs. exponential complexity:

  • An algorithm has polynomial complexity or runs in polynomial time if its execution time can be bounded by a polynomial function of its input size.

  • Example: An algorithm takes O(n^3) (worst-case) on input of size n
    => algorithm is a polynomial-time algorithm.

  • Requirements for the polynomial:
    • Highest power in the polynomial should be a constant (i.e., not dependent on n).
    • Polynomial should be finite (not an infinite sum of terms).

  • Algorithms that run slower than polynomial are said to have exponential complexity:
    • Typically, the (worst-case) running time is something like O(a^n) for some constant a > 1.
    • Note: factorial running times (like O(n!)) also count as exponential
      => from Stirling's formula: n! = O(sqrt(n) * (cn)^n) for a constant c (here c = 1/e).

Exercise: Which of these are polynomial-time algorithms:

  1. Algorithm 0 runs in time O( (n^2 + 3)^4 ).
  2. Algorithm 1 runs in time O(n log(n)).
  3. Algorithm 2 runs in time O(n^n).
  4. Algorithm 3 runs in time O(n^(log n)).
  5. Algorithm 4 runs in time O( (log n)^3 ).

Summary:

  • Many important optimization problems are discrete optimization problems.

  • For these problems:
    • It's easy to find simple algorithms of polynomial complexity
      => but that are not guaranteed to find optimal solutions.
    • The only known algorithms that guarantee optimality take exponential time (in the worst case).


Combinatorial Optimization Problems

A combinatorial optimization problem is:

  • A set of states or candidate solutions S = { s0, s1, ..., sm }.

  • A cost function C defined on the states
    => C(s) = cost of state s.

  • Goal: find the state with the least cost.

Example: TSP

  • Each instance of TSP is a combinatorial optimization problem.

  • Example: the 4-point TSP problem with points (0,1), (1,0), (2,3) and (3,5)
    • Does this have a set of "states" or "candidate solutions"?
      => Yes: S = { all possible tours } = { [0 1 2 3], [0 1 3 2], [0 2 1 3], [0 2 3 1], [0 3 1 2], [0 3 2 1] }

    • Is there a well-defined cost function on the states?
      => Yes: C(s) = length of tour s
      e.g., C([0 1 2 3]) = dist(0,1) + dist(1,2) + dist(2,3) + dist(3,0).

    • Is the goal to find the least-cost state?
      => Yes: find the tour with minimal length.

Example: BPP

  • States: all possible assignments of items to bins.

  • Cost function: C(s) = number of bins used in state s.

  • Goal: find the state that uses the fewest bins (the minimal-cost state).

Size of a combinatorial optimization problem:

  • The input is usually of size O(n) or O(n^2).
    • TSP: list of n points.
    • BPP: n item sizes and one bin size.
    • Graph-based TSP: n vertices and up to O(n^2) edges.
    • MST: n vertices and up to O(n^2) edges.

  • The state-space is usually exponential in size:
    • TSP: all possible tours.
    • BPP: all possible assignments of items to bins.
    • MST: all possible spanning trees.

  • The output is usually of size O(n) or O(n^2)
    • TSP: a tour of size O(n)
    • BPP: an assignment (matrix) of size O(n^2).


Greedy Algorithms

Key ideas:

  • For many combinatorial optimization problems (but not all!), it is easy to build a candidate solution quickly.

  • Use problem structure to put together a candidate solution step-by-step.

  • At each step: "do the best you can with immediate information"

  • Greedy algorithms are usually O(n) or O(n^2).

Example: TSP

  • Greedy Algorithm:
    1. Initially the set P = {0, ..., n-1 } and the set Q is empty.
    2. Move 0 from P to Q.
    3. Repeat the following until P is empty:
      • Suppose k was the point most recently added to Q.
      • Find the point in P closest to k and move that to Q.
    4. Output points in the order in which they were added to Q.

  • What is "greedy" about this?
    • At each step, we add a new point to the existing tour.
    • The new point is selected based on how close it is to the previous point.
    Greedy => no backtracking.

  • Execution time: O(n^2) (each step requires an O(n) selection).

  • Solution quality: not guaranteed to find optimal solution.
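
  • A minimal Python sketch of this greedy algorithm, again assuming the input is a list of (x, y) tuples (names are ours):

      import math

      def greedy_tsp(points):
          # Nearest-neighbor heuristic: P holds the unvisited points and
          # the list 'tour' plays the role of Q (points in order added).
          P = set(range(1, len(points)))
          tour = [0]
          while P:
              k = tour[-1]                  # point most recently added to Q
              nearest = min(P, key=lambda p: math.dist(points[k], points[p]))
              P.remove(nearest)             # each step is an O(n) selection,
              tour.append(nearest)          # so the whole loop is O(n^2)
          return tour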

Example: BPP

  • Greedy Algorithm:
    1. Let A = { all items }
    2. Sort A in decreasing order.
    3. At each step until A is empty:
      • Remove next item in sort-order from A.
      • Find first-available existing bin to fit item.
      • If no existing bin can fit the item, create a new bin and place item in new bin.

  • Running time: O(n log(n)) for the sort, plus O(n) per item for scanning existing bins, i.e., O(n^2) total (worst-case).
    => O(n^2)

  • Solution quality: not guaranteed to find optimal solution.
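
  • A minimal Python sketch of this first-fit-decreasing heuristic (the function name is ours):

      def first_fit_decreasing(sizes, B):
          # Take items in decreasing size order; put each into the first
          # bin with enough room, opening a new bin when none fits.
          bins = []
          for s in sorted(sizes, reverse=True):
              for b in bins:
                  if sum(b) + s <= B:       # the item fits in this bin
                      b.append(s)
                      break
              else:                         # no existing bin can fit it
                  bins.append([s])
          return bins

    For the earlier exercise instance, first_fit_decreasing([1, 2, 3], 6) packs all three items into a single bin.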

Example: MST (Minimum Spanning Tree)

  • Greedy Algorithm: (Kruskal)
    1. Sort edges in graph in increasing order of weight.
    2. Process edges in sort-order:
      • If adding the edge causes a cycle, discard it.
      • Otherwise, add the edge to the current forest.

  • Complexity: O(E log(E)) for sorting, and O(E log(V)) for processing the edges with union-find.
    => O(E log(E)) overall.

  • Solution quality: finds optimal solution.
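
  • A minimal Python sketch of Kruskal's algorithm with a simple union-find (the edge-list representation is our own choice):

      def kruskal(n, edges):
          # MST of a graph on vertices 0..n-1, where 'edges' is a list of
          # (weight, u, v) tuples.  Union-find tracks the connected
          # components of the growing forest, so an edge creates a cycle
          # exactly when its endpoints already share a root.
          parent = list(range(n))

          def find(x):                      # root of x's component
              while parent[x] != x:
                  parent[x] = parent[parent[x]]   # path halving
                  x = parent[x]
              return x

          mst = []
          for w, u, v in sorted(edges):     # increasing order of weight
              ru, rv = find(u), find(v)
              if ru != rv:                  # no cycle => keep the edge
                  parent[ru] = rv
                  mst.append((w, u, v))
          return mst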

About greedy algorithms:

  • For many problems, it's relatively easy to pose a greedy algorithm.

  • For a few problems (e.g., MST), the greedy algorithm produces the optimal solution.

  • For some problems (e.g. BPP), greedy algorithms produce "reasonably good" solutions (worst-case).

  • For some problems (e.g. BPP), greedy algorithms produce "excellent" solutions (in practice).

  • For some problems (e.g., Hamiltonian tour), greedy algorithms are of no help at all.


Computing with DNA

Yes, with actual DNA.

Key ideas:

  • We will use chemical reactions with DNA to solve the Restricted Hamiltonian Path problem
    => exploit the massive parallelism of millions of DNA molecules.

  • The Restricted Hamiltonian Path problem (directed graph version):
    • Given a directed graph and two vertices s and d, find a path from s to d that visits each vertex exactly once.
    • Result: Restricted Hamiltonian Path problem is NP-complete.
    • Note: for an n-vertex graph, the path is of length n.

  • Overview of process:
    • Represent vertices using strings of DNA bases.
    • Represent edges using combinations of vertex-strings.
    • Use gel-electrophoresis to extract DNA with correct length.
    • Use filtering process in many steps to separate out paths with all vertices.
      => solution (no pun intended) to Hamiltonian path problem.

Exercise: Why is the Restricted Hamiltonian Path problem NP-complete (given that the regular Hamiltonian Path problem is NP-complete)?

Details:

  • Step 1 (on paper): associate unique DNA substrings with vertices, e.g., an 8-base string per vertex.

    • In practice, a larger number of bases is required to help separate out different-length paths.

  • Step 2 (on paper): associate unique DNA substrings with edges, based on substrings for vertices:

    • For edge ( v1, v2) join the latter half of v1's string with the first half of v2's string.

  • Step 3 (on paper): identify the complementary strings for the vertices.

  • Step 4 (on paper): create unique "start" and "end" DNA strings for vertices s and d.

    • The "start" string represents an artifical edge between "start" and s.
    • The "end" string represents an artifical edge between d and "end".

  • Step 5 (lab): synthesize all of the above DNA material (substrings) separately (one beaker corresponding to each different string).

  • Step 6 (lab): mix all the edges and complementary vertices:

    • The complementary vertices will, lego-like, bind edges in sequence.

    • This will produce all possible paths in the graph, including invalid ones (without "start" and "end").

      Exercise: Why does this produce all paths?

  • Step 7 (lab): extract all paths beginning with "start" and ending with "end".
    => use PCR techniques

  • Step 8 (lab): separate out the DNA with the correct length (exactly n substrings)
    => use gel-electrophoresis.

    • Note: this will result in paths of exactly length n.
    • However, it will also contain paths with repetitions.

  • Step 9 (lab):
    • Filter out all paths that don't contain the string for v1
      => use v'1 (complement) to bind.
    • This leaves all paths containing v1.
    • Next, filter out all paths that don't contain v2.
    • ... (repeat for all vertices)
    • What remains: the DNA representation of all paths of length n that contain all vertices
      => Hamiltonian paths!
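
  • A toy in-silico analogue of this pipeline, for intuition only (the chemistry does all of this in parallel; the dictionary-of-neighbors representation is our own choice):

      def hamiltonian_paths(graph, s, d):
          # 'graph' maps each vertex to the set of its out-neighbors.
          # Generate every walk of exactly n vertices starting at s (the
          # role of mixing the edge strands), then keep the walks that
          # end at d (the "start"/"end" extraction) and repeat no vertex
          # (the per-vertex filtering steps).
          n = len(graph)

          def walks(path):
              if len(path) == n:
                  yield path
                  return
              for nxt in graph[path[-1]]:
                  yield from walks(path + [nxt])

          return [p for p in walks([s])
                  if p[-1] == d and len(set(p)) == n]

    For example, hamiltonian_paths({0: {1, 2}, 1: {2}, 2: {3}, 3: set()}, 0, 3) returns [[0, 1, 2, 3]].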

Summary:

  • The purpose was to show that DNA and chemical processes can "compute".

  • Potential efficiency: chemical reactions occur in parallel.

  • It is not yet a practical method:
    • Problems need to be carefully coded.
    • Encoding takes time.
    • Macro-scale experimentation results in errors
      => (a fraction of a teaspoonful of DNA was required for a 7-vertex graph).

  • Related work:
    • Using DNA for building "wetware" (gates, flip-flops).
    • Using proteins for computation.
    • Self-assembling nanotechnology.


Genetic Algorithms and Combinatorial Optimization Problems

Key ideas:

  • Use evolution as a metaphor for "winnowing" solutions.

  • Outline (for TSP):

    • Each candidate TSP tour is a "genome" (an individual in the population).
    • Start with a large number of potential solutions (initial population).
    • At each step generate a new population:
      • Use mutation to "explore"
      • Use mating to preserve "good characteristics".
    • Weak (high-cost) solutions "die".
    • Strong (low-cost) solutions "survive".
    • Eventually, low-cost (high-fitness) solutions should dominate the population.

Details: (TSP example)

  • Input: the n TSP points.

  • Associate tour with genome.

  • A genome's fitness is the inverse of the tour's length
    (the shorter the tour, the fitter the genome).

  • Step 1: create an initial population of m random tours (e.g., m = 1000).
    (They don't have to be unique).

  • Step 2: Compute the "fitness" value of each genome (tour).
    Example with four 5-city tours:

      ID   Genome (tour)   Fitness (inverse tour length)   Fraction of total (PDF)
      1    0-1-2-3-4       27.5                            0.32
      2    4-0-1-3-2       12.95                           0.15
      3    0-2-1-3-4       9.3                             0.11
      4    2-4-3-0-1       36.0                            0.42
                           85.75 (total)                   1.00 (total)

  • Step 3: compute the population PDF (Probability Distribution Function) => fraction based on fitness.
    • Compute the total fitness (the sum of the fitness values).
    • Compute what fraction of the total each fitness value amounts to.
    • The fractions are the PDF.

  • Step 4: generate a new population drawing from the PDF
    => about 32% (on average) of the new population will be copies of genome 1 and 11% will be copies of genome 3.

  • Step 5: Apply crossover rules (mating):
    • The crossover-fraction is an algorithm parameter, e.g., crossover-fraction = 0.3
      => 30% of genome-pairs will engage in crossover.
    • Select 30% of the genome-pairs at random (assuming crossover-fraction = 0.3).
    • Apply a crossover rule to each such pair: exchange parts of genomes between the pair.

  • Step 6: Apply mutation
    • mutation-fraction is an algorithm parameter.
    • mutation-fraction = fraction of genomes to mutate
      (e.g., 0.05)
    • Select 5% of the genomes (randomly) to mutate.
    • Apply mutation to each (make a slight adjustment in the tour).

  • Repeat steps 2-6 until fitness values converge
    => population is dominated by high-fitness genomes (tours).

Crossover and mutation in TSP:

  • How do we "mate" two TSP tours?

    • Take the first part (about half) of Tour 1.
    • Take the remaining points in the order these points are found in Tour 2.

  • How do we mutate a TSP tour?
    => swap 2 points
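
  • A minimal Python sketch of selection, crossover, and mutation for TSP tours (function names are ours; a full genetic algorithm wraps these in the loop of Steps 2-6):

      import random

      def select(population, fitnesses, m):
          # Step 4: draw a new population of size m from the fitness PDF.
          return random.choices(population, weights=fitnesses, k=m)

      def crossover(tour1, tour2):
          # Keep the first half of tour1, then append the remaining points
          # in the order they appear in tour2; the result is always a
          # valid tour (a permutation of the points).
          half = tour1[:len(tour1) // 2]
          kept = set(half)
          return half + [p for p in tour2 if p not in kept]

      def mutate(tour):
          # Make a slight adjustment: swap two randomly chosen points.
          t = list(tour)
          i, j = random.sample(range(len(t)), 2)
          t[i], t[j] = t[j], t[i]
          return t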

Summary:

  • Advantages of genetic algorithms:
    • Genetic algorithms are easy to implement.
    • As with simulated annealing, the problem-specific part can be separated from the generic algorithm.
    • If re-arrangements do in fact impact solution quality, genetic algorithms have a reasonable chance of finding a good solution.
    • By their nature, genetic algorithms try many initial solutions (simultaneously)
      => simulated annealing needs to be re-run with different starting solutions.

  • Disadvantages:
    • Genetic algorithms are slow.
    • It's hard to define meaningful crossovers and mutations for some problems.
    • Getting them to work requires some experimentation.
      => it's easier to automate this part in simulated annealing.
    • Generally, simulated annealing (with appropriate modification) is thought to be a better option.

  • Warning:
    • Their biological origins do not give genetic algorithms any special advantage
      => beware of their mystical appeal!


Neural Networks

Simplified neuron architecture:

Artificial neuron:

  • Structure:

  • Rule: Neuron fires if X1 + X2 + X3 > T.

A simple application to pattern recognition:

McCulloch-Pitts neuron model:

  • Structure:

  • Inputs: X1, X2, ..., Xn
  • An "importance" weight Wi is associated with each input i.
  • The neuron fires (produces an output) only if W1 X1 + ... + Wn Xn > T.
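
  • A one-line Python sketch of this rule (names are ours):

      def fires(inputs, weights, threshold):
          # McCulloch-Pitts rule: fire iff the weighted input sum exceeds
          # the threshold T.  With all weights equal to 1, this reduces to
          # the simple rule X1 + X2 + X3 > T above.
          return sum(w * x for w, x in zip(weights, inputs)) > threshold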

Networks of neurons:

  • Neural networks are used to approximate unknown functions.

  • Key ideas:
    • Each network has a set of parameters (neuron weights).
    • Use a "training set" of samples from unknown function to set parameters.
    • Once parameters are set, the function is approximated.