NP-Complete Theory

I. Introduction

A problem is said to be polynomial if there exists an algorithm that solves the problem in time T(n)=O(n^c), where c is a constant.
Examples of polynomial problems:
- Sorting: O(n log n), which is O(n²)
- All-pairs shortest path: O(n³)
- Minimum spanning tree: O(E log E), which is O(E²)
A problem is said to be exponential if no polynomial-time algorithm can be developed for it and if we can find an algorithm that solves it in O(n^u(n)), where u(n) goes to infinity as n goes to infinity.
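To get a feel for the gap between the two kinds of running time, the following Python sketch tabulates n³ against 2ⁿ for a few input sizes. The choice of n³ and 2ⁿ as representatives and the name growth_table are assumptions made for illustration only.

```python
def growth_table(sizes):
    """Return a list of (n, n**3, 2**n) triples, one per input size."""
    return [(n, n ** 3, 2 ** n) for n in sizes]


# Print the polynomial and the exponential step counts side by side.
for n, poly, expo in growth_table([10, 20, 40, 80]):
    print(f"n={n:3d}  n^3={poly:>8d}  2^n={expo}")
```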
The world of computation can be subdivided into three classes:
1. Polynomial problems (P)
2. Exponential problems (E)
3. Intractable (non-computable) problems (I)
There is a very large and important class of problems that
1. we know how to solve exponentially,
2. we don't know how to solve polynomially, and
3. we don't know if they can be solved polynomially at all.
This class is a gray area between the P-class and the E-class. It will be studied in this chapter.

II. Definition of NP

Definition 1 of NP: A problem is said to be Nondeterministically Polynomial (NP) if we can find a nondeterministic Turing machine that can solve the problem in a polynomial number of nondeterministic moves.
For those who are not familiar with Turing machines, two alternative definitions of NP will be developed.
Definition 2 of NP: A problem is said to be NP if
1. its solution comes from a finite set of possibilities, and
2. it takes polynomial time to verify the correctness of a candidate solution.
Remark: It is much easier and faster to "grade" a solution than to find a solution from scratch.
We use NP to designate the class of all nondeterministically polynomial problems.
Clearly, P is a subset of NP.
A very famous open question in Computer Science: P = NP ?
To give the 3rd alternative definition of NP, we introduce an imaginary, non-implementable instruction, which we call "choose()".
Behavior of "choose()":
1. if a problem has a solution of N components, choose(i) magically returns the i-th component of the CORRECT solution in constant time
2. if a problem has no solution, choose(i) returns an arbitrary ("garbage") value.
An NP algorithm is an algorithm that has 2 stages:
1. The first stage is a guessing stage that uses choose() to find a solution to the problem.
2. The second stage checks the correctness of the solution produced by the first stage. The time of this stage is polynomial in the input size n.

Template for an NP algorithm:



begin
   /* The following for-loop is the guessing stage*/
   for i=1 to N do 
      X[i] := choose(i);
   endfor


   /* Next is the verification stage */
   Write code that does not use "choose" and that 
   verifies if X[1:N] is a correct solution to the
   problem.
end

Remark: For the algorithm above to be polynomial, the solution size N must be polynomial in n, and the verification stage must be polynomial in n.
Definition 3 of NP: A problem is said to be NP if there exists an NP algorithm for it.
Example of an NP problem: The Hamiltonian Cycle (HC) problem
1. Input: A graph G
2. Question: Does G have a Hamiltonian Cycle?

Here is an NP algorithm for the HC problem:


begin
   /* The following for-loop is the guessing stage*/
   for i=1 to n do 
      X[i] := choose(i);
   endfor


   /* Next is the verification stage */
   for i=1 to n do
      for j=i+1 to n do
         if X[i] = X[j] then
            return(no);
         endif
      endfor
   endfor
   for i=1 to n-1 do
      if (X[i],X[i+1]) is not an edge then
         return(no);
      endif
   endfor
   if (X[n],X[1]) is not an edge then
      return(no);
   endif

   return(yes);
end

The solution size of HC is O(n), and the time of the verification stage is O(n²). Therefore, HC is NP.
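The verification stage above can be sketched in Python. Since choose() cannot be implemented, the sketch replaces the guessing stage with a deterministic search over all orderings, which takes exponential time. The graph representation (nodes numbered 0..n-1 plus an edge list) and both function names are assumptions made for this illustration.

```python
from itertools import permutations


def verify_hamiltonian_cycle(n, edges, X):
    """Verification stage: X must list every node 0..n-1 exactly once, and
    consecutive nodes (wrapping around at the end) must be joined by an edge."""
    edge_set = {frozenset(e) for e in edges}
    if sorted(X) != list(range(n)):
        return False
    return all(frozenset((X[i], X[(i + 1) % n])) in edge_set for i in range(n))


def has_hamiltonian_cycle(n, edges):
    """Deterministic stand-in for choose(): try every ordering of the nodes.
    This is exponential in n; only the verification above is polynomial."""
    return any(verify_hamiltonian_cycle(n, edges, list(p))
               for p in permutations(range(n)))


square = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(verify_hamiltonian_cycle(4, square, [0, 1, 2, 3]))
print(has_hamiltonian_cycle(4, [(0, 1), (1, 2), (2, 3)]))
```

Checking a given candidate is fast, while the search that replaces choose() examines up to n! orderings.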
The K-clique problem is NP
1. Input: A graph G and an integer k
2. Question: Does G have a k-clique?

Here is an NP algorithm for the K-clique problem:


begin
   /* The following for-loop is the guessing stage*/
   for i=1 to k do 
      X[i] := choose(i);
   endfor


   /* Next is the verification stage */
   for i=1 to k do 
        for j=i+1 to k do 
           if (X[i] = X[j] or (X[i],X[j]) is not an edge) then
                return(no);
           endif 
        endfor 
   endfor 

   return(yes);
end

The solution size of the k-clique is O(k)=O(n), and the time of the verification stage is O(k²)=O(n²). Therefore, the k-clique problem is NP.
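The verification stage for k-clique can be sketched in Python in the same way. The edge-list representation and the name verify_clique are assumptions made for this illustration.

```python
def verify_clique(edges, X, k):
    """Return True if X names k distinct nodes that are pairwise adjacent."""
    edge_set = {frozenset(e) for e in edges}
    # The candidate must consist of exactly k distinct nodes.
    if len(X) != k or len(set(X)) != k:
        return False
    # Every pair of chosen nodes must be joined by an edge: O(k^2) checks.
    return all(frozenset((X[i], X[j])) in edge_set
               for i in range(k) for j in range(i + 1, k))


edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
print(verify_clique(edges, [0, 1, 2], 3))
print(verify_clique(edges, [1, 2, 3], 3))
```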

III. Focus on Yes-No Problems

Definition: A yes-no problem consists of an instance (or input I) and a yes-no question Q.
The yes-no version of the HC problem was described above, and so was the yes-no version of the k-clique problem.
The following are additional examples of well-known yes-no problems.
The partition problem (a special case of the subset-sum problem):
- Instance: a real array a[1:n]
- Question: Can the array be partitioned into two parts that add up to the same value?
The satisfiability problem (SAT):
- Instance: A Boolean Expression F
- Question: Is there an assignment to the variables in F so that F evaluates to 1?
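A minimal Python sketch of SAT follows, assuming that F is given in conjunctive normal form as a list of clauses. The encoding is an assumption of this sketch: the integer v stands for the variable x_v and -v for its complement. Evaluating F under a given assignment is fast, whereas the search tries all 2ⁿ assignments.

```python
from itertools import product


def evaluate_cnf(clauses, assignment):
    """Evaluate F under an assignment that maps variable index -> bool.
    A literal v > 0 is true when x_v is True; v < 0 is true when x_v is False."""
    return all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses)


def is_satisfiable(clauses):
    """Brute-force search over all assignments (exponential in the number of variables)."""
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    for values in product([False, True], repeat=len(variables)):
        if evaluate_cnf(clauses, dict(zip(variables, values))):
            return True
    return False


F = [[1, 2], [-1, 3], [2, -3]]   # (x1 + x2)(x1' + x3)(x2 + x3')
print(is_satisfiable(F))
```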
The Traveling Salesman Problem
The original formulation:
- Instance: A weighted graph G
- Task: Find a minimum-weight Hamiltonian Cycle in G.
The yes-no formulation:
- Instance: A weighted graph G and a real number d
- Question: Does G have a Hamiltonian cycle of weight <= d?

IV. Reductions and Transforms

Notation: If P stands for a yes-no problem, then
- I_P: denotes an instance of P
- Q_P: denotes the question of P
- Answer(Q_P,I_P): denotes the answer to the question Q_P given input I_P
Let P and R be two yes-no problems
Definition: A transform (that transforms a problem P to a problem R) is an algorithm T such that:
1. The algorithm T takes polynomial time
2. The input of T is I_P, and the output of T is I_R
3. Answer(Q_P,I_P)=Answer(Q_R,I_R)
Definition: We say that problem P reduces to problem R if there exists a transform from P to R.
Observation (Transitivity of reduction): If a problem P reduces to a problem R, and R reduces to a problem S, then P reduces to S. The reason is that by performing the first transform (from P to R) and then the second transform (from R to S), we get a transform from P to S.
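As a concrete example of a transform, here is a Python sketch of a standard textbook reduction from HC to the yes-no TSP. It is not taken from these notes: every edge of G gets weight 1, every non-edge gets weight 2, and the bound is d = n, so G has a Hamiltonian cycle exactly when the weighted complete graph has a tour of weight at most n. The function names and the representation (nodes 0..n-1 with n >= 3, weights stored in a dictionary) are assumptions. The brute-force TSP check is exponential and serves only to confirm the answers on tiny instances.

```python
from itertools import combinations, permutations


def hc_to_tsp(n, edges):
    """Transform T: an HC instance -> a yes-no TSP instance, in O(n^2) time."""
    edge_set = {frozenset(e) for e in edges}
    weights = {frozenset((u, v)): (1 if frozenset((u, v)) in edge_set else 2)
               for u, v in combinations(range(n), 2)}
    return weights, n   # (weighted complete graph, bound d = n)


def tsp_at_most(n, weights, d):
    """Brute-force yes-no TSP: is there a tour of weight <= d? (exponential)"""
    for p in permutations(range(n)):
        cost = sum(weights[frozenset((p[i], p[(i + 1) % n]))] for i in range(n))
        if cost <= d:
            return True
    return False


weights, d = hc_to_tsp(4, [(0, 1), (1, 2), (2, 3), (3, 0)])
print(tsp_at_most(4, weights, d))
```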

V. NP-Completeness

Definition: A problem R is NP-complete if
1. R is NP
2. Every NP problem P reduces to R
An equivalent but casual definition: A problem R is NP-complete if R is the "most difficult" of all NP problems.
Theorem: Let P and R be two problems. If P reduces to R and R is polynomial, then P is polynomial.
Proof:
- Let T be the transform that transforms P to R. T is a polynomial time algorithm that transforms I_P to I_R such that Answer(Q_P,I_P) = Answer(Q_R,I_R)
- Let A_R be the polynomial time algorithm for problem R. Clearly, A_R takes as input I_R, and returns as output Answer(Q_R,I_R)
- Design a new algorithm A_P as follows:
  Algorithm A_P(input: I_P)
  begin
  I_R := T(I_P);
  x := A_R(I_R);
  return x;
  end
- Note that this algorithm A_P returns the correct answer Answer(Q_P,I_P) because x = A_R(I_R) = Answer(Q_R,I_R) = Answer(Q_P,I_P).
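- Note also that the algorithm A_P takes polynomial time because both T and A_R take polynomial time, and the size of I_R is polynomial in the size of I_P (T cannot write more than a polynomial amount of output in polynomial time).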
  Q.E.D.
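The construction of A_P in the proof can be sketched in Python. The example pair of problems is an assumption chosen for illustration: P is the partition problem (can a list of positive integers be split into two parts with equal sums?) and R is the subset-sum problem (is there a subset that adds up to a given target?). The subset-sum solver below is a brute-force stand-in for A_R and is not polynomial; it only shows how A_P is assembled from T and A_R.

```python
from itertools import combinations


def partition_to_subset_sum(a):
    """Transform T: a partition instance (positive ints) -> a subset-sum instance."""
    total = sum(a)
    # An odd total can never be split evenly. The target total + 1 is
    # unreachable for positive integers, so the answer stays "no".
    target = total // 2 if total % 2 == 0 else total + 1
    return a, target


def subset_sum(instance):
    """Stand-in for A_R (brute force, NOT polynomial)."""
    a, target = instance
    return any(sum(c) == target
               for r in range(len(a) + 1) for c in combinations(a, r))


def solve_via_reduction(transform, solver_r, instance_p):
    """A_P: compute I_R := T(I_P), then return A_R(I_R)."""
    return solver_r(transform(instance_p))


print(solve_via_reduction(partition_to_subset_sum, subset_sum, [3, 1, 1, 2, 2, 1]))
```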
The intuition derived from the previous theorem is that if a problem P reduces to problem R, then R is at least as difficult as P.
Theorem: A problem R is NP-complete if
1. R is NP, and
2. There exists an NP-complete problem R₀ that reduces to R
Proof:
- Since R is NP, it remains to show that every arbitrary NP problem P reduces to R.
- Let P be an arbitrary NP problem.
- Since R₀ is NP-complete, it follows that P reduces to R₀
- And since R₀ reduces to R, it follows that P reduces to R (by transitivity of reduction).
Q.E.D.
The previous theorem amounts to a strategy for proving new problems to be NP-complete. Specifically, to prove a new problem R to be NP-complete, the following steps are sufficient:
1. Prove R to be NP
2. Find an already known NP-complete problem R₀, and come up with a transform that reduces R₀ to R.
For this strategy to become effective, we need at least one NP-complete problem. This is provided by Cook's Theorem below.
Cook's Theorem: SAT is NP-complete.

VI. NP-Completeness of the k-Clique Problem

The k-clique problem was already shown to be NP.
It remains to prove that an NP-complete problem reduces to k-clique.
Theorem: SAT reduces to the k-clique problem
Proof:
- Let F be a Boolean expression.
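- We may assume that F is given in conjunctive normal form (CNF-SAT is itself NP-complete): F=F₁F₂...F_r
  where every factor F_i is a sum of literals (a literal is a Boolean variable or its complement)
- Let k=r and G=(V,E) defined as follows:
  V={<x_i,F_j> | x_i is a literal in F_j}
  E={(<x_i,F_j> , <y_s,F_t>) | j != t and x_i != y_s'}
  where y_s' is the complement of y_s
- G has one node per literal occurrence and can be built in time polynomial in the size of F, so this construction is a transform.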
- We prove first that if F is satisfiable, then there is a k-clique.
- Assume F is satisfiable
- This means that there is an assignment that makes F equal to 1
- This implies that F₁=1, F₂=1, ... , F_r=1
- Therefore, in every factor F_i there is (at least) one literal that evaluates to 1. Call that literal z_i
- As a result, <z₁,F₁>, <z₂,F₂>, ... , <z_k,F_k> is a k-clique in G because they are k distinct nodes, and each pair (<z_i,F_i> , <z_j,F_j>) forms an edge since the endpoints come from different factors and z_i != z_j' due to the fact that they are both assigned 1.
- We finally prove that if G has a k-clique, then F is satisfiable
- Assume G has a k-clique <u₁,F₁>, <u₂,F₂>, ... , <u_k,F_k> which are pairwise adjacent
- These k nodes come from the k different factors, one per factor, because no two nodes from the same factor can be adjacent
- Furthermore, no two u_i and u_j are complements because the two nodes <u_i,F_i> and <u_j,F_j> are adjacent, and adjacent nodes have non-complement first-components.
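- As a result, we can consistently assign values to the variables so that every literal u_i has value 1 (variables that do not occur among the u_i can be assigned arbitrarily).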
- This assignment makes each F_i equal to 1 because u_i is one of the additive literals in F_i
- Consequently, F is equal to 1.
Q.E.D.
An illustration of the proof will be carried out in class on
F=(x₁ + x₂)(x₁' + x₃)(x₂ + x₃')
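The construction used in the proof can be sketched in Python and run on this F. The encoding is an assumption of the sketch: a clause is a list of integers, where v stands for x_v and -v for its complement x_v', and a node <literal, factor> is stored as a pair (literal, factor index). The brute-force clique search is exponential and is included only to check the small example.

```python
from itertools import combinations


def sat_to_clique(clauses):
    """Build (nodes, edges, k) from a CNF formula, following the proof:
    one node per literal occurrence; an edge joins two nodes when they come
    from different factors and their literals are not complements."""
    nodes = [(lit, j) for j, clause in enumerate(clauses) for lit in clause]
    edges = {frozenset((a, b)) for a, b in combinations(nodes, 2)
             if a[1] != b[1] and a[0] != -b[0]}
    return nodes, edges, len(clauses)


def has_k_clique(nodes, edges, k):
    """Brute-force search for k pairwise adjacent nodes (exponential)."""
    return any(all(frozenset(pair) in edges for pair in combinations(group, 2))
               for group in combinations(nodes, k))


F = [[1, 2], [-1, 3], [2, -3]]   # (x1 + x2)(x1' + x3)(x2 + x3')
nodes, edges, k = sat_to_clique(F)
print(len(nodes), len(edges), k)
print(has_k_clique(nodes, edges, k))
```

For instance, the nodes <x₂,F₁>, <x₁',F₂>, <x₂,F₃> are pairwise adjacent, which corresponds to the satisfying assignment x₁=0, x₂=1.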
