Module 7: Graphs, Part I

7.1 What is a Graph?

Informal definition:

A graph is a mathematical abstraction used to represent "connectivity information".
A graph consists of vertices and edges that connect them, e.g.,
It shouldn't be confused with the "bar-chart" or "curve" type of graph.

Formally:

A graph G = (V, E) is:
- a set of vertices V
- and a set of edges E = { (u, v): u and v are vertices }.
Two types of graphs:
- Undirected graphs: the edges have no direction.
- Directed graphs: the edges have direction.
Example: undirected graph
- Edges have no direction.
- If an edge connects vertices 1 and 2, either convention can be used:
  - No duplication: only one of (1, 2) or (2, 1) is allowed in E.
  - Full duplication: both (1, 2) and (2, 1) should be in E.
Example: directed graph
- Edges have direction (shown by arrows).
- The edge (3, 6) is not the same as the edge (6, 3) (both exist above).

Depicting a graph:

The picture with circles (vertices) and lines (edges) is only a depiction
⇒ a graph is purely a mathematical abstraction.
Vertex labels:
- Can use letters, numbers or anything else.
- Convention: use integers starting from 0.
  ⇒ useful in programming, e.g. degree[i] = degree of vertex i.
Edges can be drawn "straight" or "curved".
The geometry of drawing has no particular meaning:

Graph conventions:

What's allowed (but unusual) in graphs:
- Self-loops (occasionally used).
- Multiple edges between a pair of vertices (rare).
- Disconnected pieces (frequent in some applications).
Example:
What's not (conventionally) allowed:
- Mixing undirected and directed edges.
- Re-using labels in vertices.
- Bidirectional arrows.
Most common:
- No multiple edges.
- No self-loops.
Other terms used:
- Vertices: nodes, terminals, endpoints.
- Edges: links, arcs.

In-Class Exercise 7.1: If we disallow multiple edges and self-loops, what is the maximum number of edges in an undirected graph with n vertices? What is this number in order-notation? Explain your reasoning.

Definitions:

Degrees:
- Undirected graph: the degree of a vertex is the number of edges incident to it.
- Directed graph: the out-degree is the number of (directed) edges leading out, and the in-degree is the number of (directed) edges terminating at the vertex.
Neighbors:
- Two vertices are neighbors (or are adjacent) if there's an edge between them.
- Two edges are neighbors (or are adjacent) if they share a vertex as an endpoint.
Paths:
- Undirected: a sequence of vertices in which successive vertices are adjacent.
- Directed: a sequence of vertices in which every pair of successive vertices has this property: there's a directed edge from the first to the second.
- A simple path does not repeat any vertices (and therefore edges) in the sequence.
- A cycle is a simple path with the same vertex as the first and last vertex in the sequence.
Connectivity:
- Undirected: Two vertices are connected if there is a path that includes them.
- Directed: Two vertices are strongly-connected if there is a (directed) path from one to the other.
Components:
- A subgraph is a subset of vertices together with the edges from the original graph that connects vertices in the subset.
- Undirected: A connected component is a subgraph in which every pair of vertices is connected.
- Directed: A strongly-connected component is a subgraph in which every pair of vertices is strongly-connected.
- A maximal component is a connected component that is not a proper subset of another connected component.
Digraph: another name for a directed graph.

Example:

More definitions:

Euler tour: A cycle that traverses all edges exactly once (but may repeat vertices).

Known result: Euler tour exists if and only if all vertices have even degree.
Hamiltonian tour: A cycle that traverses all vertices exactly once.

Known result: testing existence of a Hamiltonian tour is (very) difficult.
Euler path: A path that traverses all edges exactly once.
Hamiltonian path: A path that traverses all vertices exactly once.
Trees:
- A tree is a connected graph with no cycles.
- A spanning tree of a graph is a connected subgraph that is a tree.
Weighted graphs:
- Sometimes, we include a "weight" (number) with each edge.
- Weight can signify length (for a geometric application) or "importance".
- Example:

Why are graphs important?

History:
- Maze-searching.
- Euler's crossing problem: the Konigsberg bridges
  How to cross each bridge just once and return to starting point?
Applications:
- Fundamental mathematical construct to represent "connectivity".
- Appears in thousands of problems.
- Source of many classic problems: traveling salesman, routing, spanning trees.
- Many "graph-structured" applications: networks, transportation-systems, electronic circuits, molecules.
Source of theory:
- Many important algorithms.
- Key to understanding algorithm design and analysis.
- Simple to describe, yet perplexing:
  - Euler tour: easy problem.
  - Hamiltonian tour: hard problem.

The field of graph theory:

Large area of mathematics:
- Analysis of general graphs.
- Analysis of special types of graphs.
- Many classic problems
  e.g., the four-color theorem.
- Optimization problems based on graphs,
  e.g., shortest-paths.
Graph algorithms: an area in computer science.
- Rich source of algorithms, theory, insight.
- Useful algorithms used in many applications
  (e.g., in a compiler).

In this course:

Fundamental algorithms for exploring a graph: breadth-first and depth-first.
Finding shortest paths and minimum spanning tree.
Convention about undirected vs. directed:
- When not specified, assume undirected.
- Unless otherwise mentioned, an algorithm or definition about undirected graphs usually can be modified to apply to directed graphs.
  (although, directed graphs are usually more complicated).

In-Class Exercise 7.2: Suppose d_i is the degree of vertex i in a connected undirected graph with n vertices and m edges. Let D = d₁ + ... + d_n. What is the relation between D and m?

7.2 Graph Data Structures

First, an idea that doesn't work:

We have already represented trees (like binary trees) with node instances and pointers between instances.
Idea: use a node instance for each vertex, and a pointer from one vertex to another if an edge exists between them.

In-Class Exercise 7.3: Draw an example graph and the corresponding pointer-based data structure. What are the pros and cons of this data structure?

The two fundamental data structures:

Adjacency matrix.

Key idea: use a 2D matrix.
Row i has "neighbor" information about vertex i.
Undirected: adjMatrix[i][j] = 1 if and only if there's an edge between vertices i and j.
adjMatrix[i][j] = 0 otherwise.
Directed: adjMatrix[i][j] = 1 if and only if there's an edge from i to j.
adjMatrix[i][j] = 0 otherwise.

Example: undirected


      0 1 1 0 0 0 0 0
      1 0 1 0 0 0 0 0
      1 1 0 1 0 1 0 0
      0 0 1 0 1 0 1 0
      0 0 0 1 0 0 1 0 
      0 0 1 0 0 0 0 1
      0 0 0 1 1 0 0 0
      0 0 0 0 0 1 0 0

Note: adjMatrix[i][j] == adjMatrix[j][i] (convention for undirected graphs).

Example: directed


      0 1 0 0 0 0 0 0
      0 0 1 0 0 0 0 0
      1 0 0 1 0 1 0 0
      0 0 0 0 1 0 1 0
      0 0 0 0 0 0 1 0 
      0 0 0 0 0 0 0 1
      0 0 0 1 0 0 0 0
      0 0 0 0 0 0 0 0

Adjacency list.
- Key idea: use an array of vertex-lists.
- Each vertex list is a list of neighbors.
- Example: undirected
- Example: directed
- Convention: in each list, keep vertices in order of insertion
  ⇒ add to rear of list
Both representations allow complete construction of the graph.
Advantages of matrix:
- Simple to program.
- Some matrix operations (multiplication) are useful in some applications (connectivity).
- Efficient for dense (lots of edges) graphs.
Advantages of adjacency list:
- Less storage for sparse (few edges) graphs.
- Easy to store additional information in the data structure.
  (e.g., vertex degree, edge weight)

In-Class Exercise 7.4: Suppose a graph has V vertices and E edges. In order-notation (in terms of V and E), what is the size of the adjacency matrix representation? What is the size of the adjacency list representation?

In-Class Exercise 7.5: Write (parts of) a Java class to implement the adjacency list representation. Do not write all the details, but only the code for initializing the data structure. Assume the number of nodes is an integer given as parameter to the constructor.

Graph ADT:

ADT = Abstract Data Type.
What operations should a graph ADT support?
- Insertion of vertices, edges.
- Is the graph connected?
- What are the connected components?
- Does the graph have a cycle?

Example: consider this interface for an undirected graph ADT:


public interface UndirectedGraphSearchAlgorithm {

  // Call this once to set up data structures. 

  public void initialize (int numVertices, boolean isWeighted);


  // Call this repeatedly with each edge to be inserted: 

  public void insertUndirectedEdge (int startVertex, int endVertex, double weight);


  // Return the number of connected components. If "1", then the whole 
  // graph is connected. 
  
  public int numConnectedComponents ();


  // Return the component to which each vertex belongs. 

  public int[] componentLabels ();


  // Is there a cycle in the graph? 

  public boolean existsCycle();

}

In-Class Exercise 7.6: How would you use this ADT to tell whether a graph is a connected tree?

Next, let's look at some sample code using the matrix representation:

Convention for weighted graphs: use weight=0 to represent the lack of an edge.


public class UndirectedGraphSearch implements UndirectedGraphSearchAlgorithm {

    int numVertices;                 // Number of vertices, given as input. 
    int numEdges;                    // We keep track of the number of edges. 

    boolean isWeighted;              // Is this a weighted graph? 

    double [][] adjMatrix;           // The matrix. Note: we use "double" to store 
                                   // "double" weights, if the graph is weighted. 


    public void initialize (int numVertices, boolean isWeighted)
    {
        // Store: 
        this.numVertices = numVertices;
        this.isWeighted = isWeighted;

        // Create the adjacency matrix. 
        adjMatrix = new double [numVertices][];
        for (int i=0; i < numVertices; i++) {
            adjMatrix[i] = new double [numVertices];
            for (int j=0; j < numVertices; j++)
                adjMatrix[i][j] = 0;
        }
        numEdges = 0;
    }


    // Insert a given input edge. 

    public void insertUndirectedEdge (int startVertex, int endVertex, double weight)
    {
        // Unweighted graph: use weight 1.0. 
        if (! isWeighted) {
            weight = 1.0;
        }

        // Insert in both places: 
        adjMatrix[startVertex][endVertex] = weight;
        adjMatrix[endVertex][startVertex] = weight;  

        numEdges ++;
    }

}

The same with an adjacency list:


public class UndirectedGraphSearch implements UndirectedGraphSearchAlgorithm {

    int numVertices;                 // Number of vertices, given as input. 
    int numEdges;                    // We keep track of the number of edges. 

    boolean isWeighted;              // Is this a weighted graph? 

    LinkedList[] adjList;            // The list (of lists): one list for each vertex. 
                                   // We are using java.util.LinkedList as our linked list. 
                                   // This will store instances of GraphEdge. 

    // We will test new insertions to see if they exist. Because LinkedList performs 
    // ".equals()" testing for containment, we need an instance (and only one) of GraphEdge 
    // for such testing. 
    GraphEdge testEdge = new GraphEdge (-1, -1, 0);


    public void initialize (int numVertices, boolean isWeighted)
    {

        // Store: 
        this.numVertices = numVertices;
        this.isWeighted = isWeighted;

        // Adjacency-list representation 
        adjList = new LinkedList [numVertices];
        for (int i=0; i < numVertices; i++)
            adjList[i] = new LinkedList ();
            // Note: we are using java.util.LinkedList. 

        numEdges = 0;

    }


    // Insert a given input edge. 

    public void insertUndirectedEdge (int startVertex, int endVertex, double weight)
    {
        if (! isWeighted) {
           weight = 1.0;
        }

        // Adj-list representation: see if the edge is already there. 
        testEdge.startVertex = startVertex;
        testEdge.endVertex = endVertex;

        // Exploit the methods in java.util.LinkedList. 
        if (adjList[startVertex].contains (testEdge)) {
            // ... report error ... 
            return;
        }

        // It's undirected, so add to both vertex lists. 
        GraphEdge e = new GraphEdge (startVertex, endVertex, 1.0);
        adjList[startVertex].addLast (e);

        // We wouldn't have this for directed graphs: 
        GraphEdge e2 = new GraphEdge (endVertex, startVertex, 1.0);
        adjList[endVertex].addLast (e2);

        numEdges ++;

    }

}

7.3 Breadth-First Search

About graph search:

"Searching" here means "exploring" a particular graph.
Searching will help reveal properties of the graph
e.g., is the graph connected?
Usually, the input is: vertex set and edges (in no particular order).

Key ideas in breadth-first search: (undirected)

Mark all vertices as "unvisited".
Initialize a queue (to empty).
Find an unvisited vertex and apply breadth-first search to it.
Add the vertex's neighbors to the queue.
Repeat: extract a vertex from the queue, and add its "unvisited" neighbors to the queue.

Example:

Initially, place vertex 0 in the queue.
Dequeue 0
⇒ mark it as visited, and add its unvisited neighbors to queue:
Dequeue 1
⇒ mark it as visited, and add its unvisited neighbors to queue:
Dequeue 2
⇒ mark it as visited, and add its unvisited neighbors to queue:
Dequeue 2
⇒ it's already visited, so ignore.
Continuing ...
Breadth-first search tree, and visit order:
- Exploring an edge: examining an unvisited neighbor.
- If an unvisited neighbor gets on the queue for the first time, the corresponding edge is called a "tree edge".
- Putting the tree edges and all vertices together results in: the breadth-first search tree.
- For a particular graph and its implementation, the tree produced is unique.
- However, starting from another vertex will result in another tree, that may be just as useful.

In-Class Exercise 7.7: What is the visit-order and breadth-first search tree for this graph:

Searching an unconnected graph:

The connected components are explored in order:
Example:

The tree, and visit order:

Implementation:

There are two varieties:
- Allow a vertex to have two states: "unvisited" and "visited"
  - An "unvisited" vertex can get placed in the queue multiple times.
    It's possible to dequeue a "visited" vertex.
  - When dequeueing, check whether vertex was visited.
- Allow three states: "unvisited", "in-queue", and "visited":
  - When placing a neighbor in the queue for the first time, mark it as "in-queue".
  - Place only "unvisited" neighbors in queue.
    ⇒ Queue has only unique, unvisited vertices.
  - Extra space required.
We'll use the former approach, with an adjacency matrix:

Pseudocode:


Algorithm: breadthFirstMatrix (adjMatrix, n)
Input: A graph's adjacency matrix, number of vertices n.

     // Visit order will start with "0", so initialize to -1. 
1.   for i=0 to n-1
2.       visitOrder[i] = -1
3.   endfor

     // A counter for the order: 
4.   visitCount = -1

     // Standard queue data structure. 
5.   Create queue;

     // Look for an unvisited vertex and explore its tree. 
     // We need this because the graph may have multiple components. 

6.   for i=0 to n-1
7.       if visitOrder[i] < 0 
           // We call this "iterative" because other searches are recursive. 
8.         breadthFirstMatrixIterative (i)
9.       endif
10.  endfor


Algorithm: breadthFirstMatrixIterative (u)
Input: vertex u, adjMatrix is assumed to be global.

     // Queue needs to be reset for each tree. 
1.   Clear queue;

     // Place root of tree on the queue. 
2.   queue.addToRear (u);

     // Continue processing vertices until no more can be added. 
3.   while queue not empty

         // Remove a vertex. 
4.       v = remove item at front of queue;
         // If it hasn't been visited ... 
5.       if visitOrder[v] < 0
           // Visit the vertex. 
6.         visitCount = visitCount + 1
7.         visitOrder[v] = visitCount
           // Look for neighbors to visit. 
8.         for i=0 to n-1
9.           if adjMatrix[v][i] = 1 and i != v  // Check self-loop: i != v 
                and visitOrder[i] < 0           // unvisited
10.            queue.addToRear (i)
11.          endif
12.        endfor
13.      endif

14.  endwhile

In-Class Exercise 7.8: Does the above code identify components? If so, where? If not, how can you modify the code to identify components?

A sample Java implementation: (source file)


import java.util.*;

public class UndirectedBreadthFirstMatrix {
  
  int numVertices;                 // Number of vertices, given as input. 
  int numEdges;                    // We keep track of the number of edges. 
  boolean isWeighted;              // Is this a weighted graph? 
  boolean useMatrix = true;        // Adjacency-matrix or list? 
  double [][] adjMatrix;           // The matrix. Note: we use "double" to store 
                                   // "double" weights, if the graph is weighted. 

  int[] visitOrder;                // visitOrder[i] = the i-th vertex to be visited in order. 
  int visitCount;                  // We will track visits with this counter. 
  LinkedList queue;                // The queue for breadth-first, a java.util.LinkedList instance. 


  public void initialize (int numVertices, boolean isWeighted)
  {
    // ... We've seen this before ... 
  }


  public void insertUndirectedEdge (int startVertex, int endVertex, double weight)
  {
    // ... 
  }
  


  //------------------------------------------------------------------------------- 
  // BREADTH-FIRST SEARCH 


  // Initialize visit information before search. 

  void initSearch ()
  {
    // IMPORTANT: all initializations will use "-1". We will test 
    // equality with "-1", so it's important not to change this cavalierly. 
    visitCount = -1;
    for (int i=0; i < numVertices; i++) {
      visitOrder[i] = -1;
    }
  }

  
  // Matrix implementation of breadth-first search. 

  void breadthFirstMatrix ()
  {
    // 1. Initialize visit variables. 
    initSearch ();

    // 2. Create a queue. 
    queue = new LinkedList();

    // 3. Find an unvisited vertex and apply breadth-first search to it. 
    //    Note: if the graph is connected, the call with i=0 will result 
    //          in visiting all vertices. Nonetheless, we don't know this  
    //          in advance, so we need to march through all the vertices. 

    for (int i=0; i < numVertices; i++) {
      if (visitOrder[i] == -1) {
        // We call it "iterative" because depthFirst search is "recursive". 
        breadthFirstMatrixIterative (i);
      }
    } 
    
  }
  

  // Apply breadthfirst search to a particular vertex (root of a tree)

  void breadthFirstMatrixIterative (int u)
  {
    // 1. A fresh queue is used for a new breadth-first search. 
    queue.clear();

    // 2. Start with the first vertex: add to REAR of queue using java.util.LinkedList.addLast(). 
    //    Because LinkedList stores objects, we "objectify" the vertex. A more efficient 
    //    implementation would use a separate queue of int's. 
    queue.addLast (new Integer(u));

    // 3. As long as the queue has vertices to process... 
    while (! queue.isEmpty()) {

      // 3.1 Dequeue and extract vertex using java.util.LinkedList.removeFirst(). 
      Integer V = (Integer) queue.removeFirst();
      int v = V.intValue();

      // 3.2 If the vertex has been visited, skip. 
      if (visitOrder[v] != -1)
        continue;

      // 3.3 Otherwise, set its visit order. 
      visitCount++;
      visitOrder[v] = visitCount;

      // 3.4 Next, place its neighbors on the queue. 
      for (int i=0; i < numVertices; i++) {
        // 3.4.1 First, check whether vertex i is a neighbor.
        if ( (adjMatrix[v][i] > 0) && (i != v) ) {

          // 3.4.1.1 If i hasn't been visited, place in queue. 
          if (visitOrder[i] == -1) {
            queue.addLast (new Integer(i));
          }
        } 

      } // end-for 

    } // end-while 

  }
  
} // end-class

Analysis: adjacency matrix

Assume V vertices and E edges.
Each vertex is processed V times (worst-case) in starting a tree.
⇒ O(V).
Searching for a neighbor: O(V) (scan through matrix).
⇒ all scans take O(V²).
Each queue operation is O(1).
Each edge is processed once:
⇒ O(E) queue operations and O(E) vertex manipulations.
Total: O(V² + E) = O(V²).

Analysis: adjacency list

Searching for a neighbor: O(# neighbors)
⇒ total neighbor searches is O(E) (Why?)
Other operations are the same
Total: O(V + E) = O(E).

In-Class Exercise 7.9: Why is O(V² + E) = O(V²)? Which is better: adjacency matrix or adjacency list?

About O(V + E):

Note that O(V + E) = O(E)
⇒ it is written as O(V + E) just for emphasis.
O(V + E) is optimal:
- Every vertex and every edge must be examined.
  ⇒ O(V + E)
  ⇒ not possible to do better than O(V + E)
BFS (with adjacency list) is an example of an optimal algorithm.

Directed graphs:

BFS in directed graphs is similar, except that a vertex's "neighbor" is one that's reachable by an edge going out from the vertex.
⇒ search code is identical (but insertion code is different).

Applications:

Connectivity:
- Breadth-first search identifies connected components.
- However, depth-first search is preferred (required for directed graphs).
Shortest paths:
- A path between two vertices in the tree is the shortest path in the graph.
Optimization algorithms:
- Various problems result in "graph search space".
- BFS together with "exploration rules" is often used to search for solutions (e.g., branch-and-bound exploration).

Note: BFS works on a weighted graph by ignoring the weights and only using connectivity information (i.e., is there an edge or not?).

7.4 Depth-First Search on Undirected Graphs

Key ideas:

Mark all vertices as "unvisited".
Visit first vertex.
Recursively visit its "unvisited" neighbors.

Example:

Start with vertex 0 and mark it visited.
Visit the first unvisited neighbor (node 1), mark it visited and recurse from there.
Explore 1's first unvisited neighbor, 2.
Continuing until all vertices are visited ...
- Vertices are marked in order of visit.
- An edge to an unvisited neighbor that gets visited next is in the depth-first search tree.

In-Class Exercise 7.10: What is the visit-order and depth-first search tree for this graph:

Then, apply breadth-first search, but using a stack instead of a queue. Show the stack contents.

Implementation:

Easier than breadth-first search.

Pseudocode: adjacency matrix


Algorithm: depthFirstMatrix (adjMatrix, n)
Input: A graph's adjacency matrix, number of vertices n.

     // Visit order will start with "0", so initialize to -1. 
1.   for i=0 to n-1
2.       visitOrder[i] = -1
3.   endfor

     // A counter for the order: 
4.   visitCount = -1

     // Look for an unvisited vertex and explore its tree. 
     // We need this because the graph may have multiple components. 
5.   for i=0 to n-1
6.       if visitOrder[i] < 0 
7.         depthFirstMatrixRecursive (i)
8.       endif
9.   endfor


Algorithm: depthFirstMatrixRecursive (v)
Input: vertex v, adjMatrix is assumed to be global.

     // Mark vertex v as visited. 
1.   visitCount = visitCount + 1
2.   visitOrder[v] = visitCount

     // Look for first unvisited neighbor. 
3.   for i=0 to n-1
4.       if adjMatrix[v][i] > 0 and i != v  
5.           if visitOrder[i] < 0
                 // If unvisited visit recursively. 
6.               depthFirstMatrixRecursive (i)
7.           endif
8.       endif
9.   endfor

Completion order:

In breadth-first search, once a vertex is processed, it is never processed again.
In depth-first, we also encounter a vertex after returning from the recursive call.
⇒ we can record a completion order.

To record the completion order, simply increment the completion-counter after the recursion:


Algorithm: depthFirstMatrixRecursive (v)
Input: vertex v, adjMatrix is assumed to be global.

     // Mark vertex v as visited. 
1.   visitCount = visitCount + 1
2.   visitOrder[v] = visitCount

     // Look for first unvisited neighbor. 
3.   for i=0 to n-1
4.       if adjMatrix[v][i] > 0 and i != v  // Check self-loop. 
5.           if visitOrder[i] < 0
                 // If unvisited visit recursively. 
6.               depthFirstMatrixRecursive (i)
7.           endif
8.       endif
9.   endfor

     // After returning from recursion, set completion order:     
10.  completionCount = completionCount + 1
11.  completionOrder[v] = completionCount

Note: completionCount and completionOrder need to be initialized (not shown) in depthFirstMatrix.

Sample Java code: (source file)


public class UndirectedDepthFirstMatrix {

  int numVertices;                 // Number of vertices, given as input. 
  int numEdges;                    // We keep track of the number of edges. 
  boolean isWeighted;              // Is this a weighted graph? 
  boolean useMatrix = true;        // Adjacency-matrix or list? 
  double [][] adjMatrix;           // The matrix. Note: we use "double" to store 
                                   // "double" weights, if the graph is weighted. 

  int[] visitOrder;                // visitOrder[i] = the i-th vertex to be visited in order. 
  int visitCount;                  // We will track visits with this counter. 
  int[] completionOrder;           // completionOrder[i] = the i-th vertex that completed. 
  int completionCount;             // For tracking. 


  public void initialize (int numVertices, boolean isWeighted)
  {
    // ... 
  }


  public void insertUndirectedEdge (int startVertex, int endVertex, double weight)
  {
    // ... 
  }
  

  //------------------------------------------------------------------------------- 
  // DEPTH-FIRST SEARCH 

  // Initialize visit information before search. 

  void initSearch ()
  {
    // IMPORTANT: all initializations will use "-1". We will test 
    // equality with "-1", so it's important not to change this cavalierly. 
    visitCount = -1;
    completionCount = -1;
    for (int i=0; i < numVertices; i++) {
      visitOrder[i] = -1;
      completionOrder[i] = -1;
    }
  }


  // Matrix implementation of depth-first search. 

  void depthFirstMatrix ()
  {
    // 1. Initialize visit variables (same initialization for breadth-first search). 
    initSearch ();

    // 2. Find an unvisited vertex and apply depth-first search to it. 
    //    Note: if the graph is connected, the call with i=0 will result 
    //          in visiting all vertices. Nonetheless, we don't know this  
    //          in advance, so we need to march through all the vertices. 
    for (int i=0; i < numVertices; i++) {
      if (visitOrder[i] < 0) {
        depthFirstMatrixRecursive (i);
      }
    }
  }


  // Recursive visiting of vertices starting from a vertex. 

  void depthFirstMatrixRecursive (int v)
  {
    // 1. First, visit the given vertex. Note: visitCount is a global. 
    visitCount++;
    visitOrder[v] = visitCount;

    // 2. Now find unvisited children and visit them recursively. 
    for (int i=0; i < numVertices; i++) {

      // 2.1  Check whether vertex i is a neighbor and avoid self-loops. 
      if ( (adjMatrix[v][i] > 0) && (i != v) ) {   
        if (visitOrder[i] == -1) {  
          // i is an unvisited neighbor. 
          depthFirstMatrixRecursive (i);
        }
      } 
    }

    // 3. After returning from recursion(s), set the post-order or "completion" order number. 
    completionCount++;
    completionOrder[v] = completionCount;
  }

} // end-class

"Back" and "Down" edges:

Consider the previous example:
Back edge
- When vertex 2 probes neighbor 0, 0 has already been visited.
- Vertex 0 is higher up in the tree (ancestor of 2)
  ⇒ 0 has a lower visitOrder
  ⇒ visitOrder[0] < visitOrder[2].
- The exploration from 2 to 0 is called a back edge.
Down edge:
- When vertex 3 probes neighbor 6, 6 has already been visited.
- Vertex 6 is lower down in the tree (descendant of 3)
  visitOrder[6] > visitOrder[3]
- The exploration from 3 to 6 is called a down edge.

Let's modify the code to detect back and down edges:


Algorithm: depthFirstMatrixRecursive (v)
Input: vertex v to be explored, adjMatrix is global.

     // Mark vertex v as visited. 
1.   visitCount = visitCount + 1
2.   visitOrder[v] = visitCount

     // Look for first unvisited neighbor. 
3.   for i=0 to n-1
4.       if adjMatrix[v][i] > 0 and i != v  // Check self-loop. 

5.           if visitOrder[i] < 0
                 // If unvisited visit recursively. 
6.               depthFirstMatrixRecursive (i)
7.           else if (i != u)  // Avoid trivial case. 
8.               if visitOrder[i] < visitOrder[v] 
                     // Found a back edge. 
9.                   numBackEdges = numBackEdges + 1
10.              else
                     // Found a down edge. 
11.                  numDownEdges = numDownEdges + 1
12.              endif
13.          endif  // visitOrder < 0 

14.       endif  // adjMatrix[v][i] > 0 
15.   endfor


     // After returning from recursion, set completion order:     
16.  completionCount = completionCount + 1
17.  completionOrder[v] = completionCount

Why identify back and down edges?
- A graph with no back edges revealed by DFS is an acyclic graph.
- Back and down edges are also useful in identifying so-called articulation edges (see below)

Identifying connected components:

Depth-first search is most often used for identifying connected components in an undirected graph.
Key ideas:
- Every time depth-first is re-started (in depthFirstMatrix)
  a new component has been found.
  create a new component label.
- In the recursion (depthFirstMatrixRecursive), simply identify each vertex with the current component label.

Let's re-write (part of) the pseudocode to identify components:


Algorithm: depthFirstMatrix (adjMatrix, n)
Input: A graph's adjacency matrix, number of vertices n.

     // Visit order will start with "0", so initialize to -1. 
1.   for i=0 to n-1
2.       visitOrder[i] = -1
3.   endfor

     // A counter for the order: 
4.   visitCount = -1
5.   completionCount = -1
6.   currentComponentLabel = -1

     // Look for an unvisited vertex and explore its tree. 
     // We need this because the graph may have multiple components. 
7.   for i=0 to n-1
8.       if visitOrder[i] < 0 
9.           currentComponentLabel = currentComponentLabel + 1
10.          depthFirstMatrixRecursive (i)
11.      endif
12.  endfor


Algorithm: depthFirstMatrixRecursive (v)
Input: vertex v, adjMatrix is assumed to be global.

     // Mark vertex v as visited. 
1.   visitCount = visitCount + 1
2.   visitOrder[v] = visitCount

     // Mark the component label 
3.   componentLabel[v] = currentComponentLabel

     // Look for first unvisited neighbor...  
4.   // ... etc (same as earlier version)

Sample Java code: (source file).

In-Class Exercise 7.11: Trace the execution of DFS on the graph below to identify exactly when the components are identified and labeled.

Analysis: adjacency matrix

Same as breadth-first search: O(V²)
Why?
- O(1) work for processing each vertex (except for identifying neighbors).
- O(V) work for identifying neighbors.
  ⇒ O(V²) overall.

Analysis: adjacency list

Similar analysis (to breadth-first search) gives: O(V + E).
DFS with adjacency list is optimal.

Applications:

Connectivity: identifying connected components.
⇒ which earlier-stated problem would this solve?
Cycle existence.
Others: finding articulation edges, vertices, "bipartiteness".
Identifying equivalence classes

7.5 Finding Articulation Vertices using DFS

An articulation vertex is a vertex whose removal disconnects a connected graph:

Recall visit order: the order in which DFS visits a node.

A back edge is an edge to a node that was visited earlier.

Key observation: if all nodes visited after node 6 (in this example) have no back edges to a node visited before 6, then 6 is an articulation vertex.

If, on the other hand, there were a back edge to some predecessor (in visit order) of 6, there'd be an alternative path if 6 were removed.

How do we track whether visit-order successors of 6 have no back edge to its predecessors?
⇒ Track the lowest visit order amongst back edges

Thus, we'll track the minimal visit-order among back edges we encounter as we descend in recursion and back out of recursion
⇒ Anytime the minimal is less than a node's own visit order, we've found a back edge to a predecessor

In pseudocode:


Algorithm: depthFirstMatrixRecursive (u, v)
//Note: u is the vertex from which we reach v
1.   visitOrder[v] = ++ visitCount
2.   minBackVisitOrder = visitOrder[v]

3.   for i=0 to n-1

4.       if i is a neighbor

5.           if visitOrder[i] < 0
6.               m = depthFirstMatrixRecursive (v, i)
7.               if m ≥ visitOrder[v]
8.                   print v                       // v is an articulation vertex
9.               endif
10.              if m < minBackVisitOrder
11.                  minBackVisitOrder = m
12.              endif

13.          else if (i != u)
14.              if visitOrder[i] < visitOrder[v]  // Found a back edge. 
15.                  if visitOrder[i] < minBackVisitOrder
                         minBackVisitOrder = visitOrder[i]
                     endif
16.              endif
17.          endif  // visitOrder < 0 

18.       endif  // i is a neighbor 

19.   endfor

20.  return minBackVisitOrder

Note:

In this variation of DFS, we pass the "from" vertex (vertex u) along in the recursive call.
This is useful in some DFS applications, in this case, to avoid using the "from" vertex in assessing visit order.
Some additional code (not shown to avoid clutter) is needed to address what happens with the first call.

In-Class Exercise 7.12: Trace through the algorithm above, showing the values of visitOrder[v], m and minBackVisitOrder just after line 12. What happens with vertex 0? What is a solution to the issue with vertex 0?

7.6 Depth-First Search in Directed Graphs

Key ideas:

A straightforward depth-first search is similar to the undirected version
⇒ only explore edges going outward from a vertex in a directed graph.
In addition to "back" and "down" edges, it is useful to identify "cross" edges.

Example:

Consider: (slightly different from previous example)
Applying DFS gives:

Let's modify the pseudocode to detect cross edges as well:

Key ideas:
- What used to be a back edge in the undirected version is now either a back edge or cross edge.
- Back or down edge: when the vertex on the other side is "visited" but not yet "completed".
- Cross edge: when the vertex on the other side is "completed".

Pseudocode:


Algorithm: depthFirstMatrixRecursive (u, v)
Input: the vertex u from which this is being called, vertex v
       to be explored, adjMatrix is assumed to be global.

     // Mark vertex v as visited. 
1.   visitCount = visitCount + 1
2.   visitOrder[v] = visitCount

     // Look for first unvisited neighbor. 
3.   for i=0 to n-1
4.       if adjMatrix[v][i] > 0 and i != v  // Check self-loop. 

5.           if visitOrder[i] < 0
                 // If unvisited visit recursively. 
6.               depthFirstMatrixRecursive (v, i)
7.           else if (i != u)  // Avoid trivial case. 
8.               if completionOrder[i] > 0    // Finished processing i
                     // Found a cross edge. 
9.                   numCrossEdges = numCrosEdges + 1
10.              else if visitOrder[i] > visitOrder[v] 
                     // Found a down edge. 
11.                  numDownEdges = numDownEdges + 1
12.              else
                     // Found a back edge 
13.                  numBackEdges = numBackEdges + 1
14.              endif
15.          endif  // visitOrder < 0 

16.       endif  // adjMatrix[v][i] > 0 
17.   endfor

     // After returning from recursion, set completion order:     
18.  completionCount = completionCount + 1
19.  completionOrder[v] = completionCount

Analysis: (same as undirected case)

Adjacency matrix: O(V²).
Adjacency list: O(V+E).

7.7 Strongly-Connected Components in Directed Graphs

Finding components:

Recall, in a directed graph a strongly-connected component is a set of vertices, along with edges associated with those vertices, such that there is a path from every vertex in that set to every other in that set.
Example: vertices 0,1,2 form a strongly-connected component.

In-Class Exercise 7.13: Describe in pseudocode a simple algorithm to find strongly-connected components that uses, as a part of the algorithm, the standard DFS algorithm but slightly modified for directed graphs. Hint: vertices i and j are in the same strongly-connected component if i is reachable from j and j is reachable from i. How long does your algorithm take?

We will look at two different algorithms for finding strongly-connected components:

Kosaraju's Algorithm: this applies DFS twice but in an unusual way.
Tarjan's Algorithm: computes the components directly but is harder to understand.

Kosaraju's algorithm:

First, we need to understand the reverse of a directed graph.
- Suppose G = (V,E) is a graph.
- G' = (V,E') is the reverse of G if E'=E with the direction of edges reversed.
Next, recall completion-order
later completion ⇒ closer to root
The algorithm has two phases:
- First phase: find completion order on reverse graph using standard DFS.
- Second phase: use a modified DFS on original graph in which completion order is used as a priority.

Let's describe the pseudocode first:


Algorithm: stronglyConnectedComponents (adjMatrix, n)
Input: A graph's adjacency matrix, number of vertices n.

1.    G' = reverse (G)
2.    completionOrder = depthFirstSearch (G')
3.    sortOrder = sort vertices in reverse order of completion // Last to first.
4.    componentLabels = modifiedDepthFirstSearch (G, sortOrder)
5.    return componentLabels


Algorithm: modifiedDepthFirstSearch (adjMatrix, sortOrder)
Input: A graph's adjacency matrix, a sort (priority) order of vertices

1.   for i=0 to n-1
2.       visitOrder[i] = -1
3.   endfor
4.   visitCount = -1
5.   currentComponentLabel = -1

     // Look for an unvisited vertex and explore its tree. 
6.   for i=0 to n-1
         // Last-to-first of completion in reverse search. 
7.       v = sortOrder[i]    
8.       if visitOrder[v] < 0 
9.           currentComponentLabel = currentComponentLabel + 1
10.          modifiedDepthFirstMatrixRecursive (v)
11.      endif
12.  endfor
13.  return  componentLabels


Algorithm: modifiedDepthFirstMatrixRecursive (v)

     // Mark vertex v as visited and record component label. 
1.   visitCount = visitCount + 1
2.   visitOrder[v] = visitCount
3.   componentLabel[v] = currentComponentLabel

     // Look for first unvisited neighbor. 
4.   for i=0 to n-1
         u = sortOrder[i]   // Prioritize in sort-order 
5.       if adjMatrix[v][u] > 0 and u != v  
6.           if visitOrder[u] < 0
7.               depthFirstMatrixRecursive (u)
8.           endif 
9.       endif  
10.  endfor

With our example, let's find the reverse and its completion order:
Then, run DFS in the forward (regular graph) direction but using vertices in "opposite order of completion in the reverse-graph"
Why this works for node 5:
- Look at forward DFS at node 5.
- We want to have 7 marked as "visited" so that it does not get into 5's component.
- Thus, we need to start 7's exploration before.
- This comes from 7's later completion time in reverse-DFS.
Why this works for node 0:
- When starting forward-DFS at node 0, node 1 should NOT have been visited.
- This happens when 1 has a earlier completion time in the reverse-DFS.
Why this works for the edge (2,3):
- In forward-DFS at node 2, we want that 3 should have been visited
- This means 3 should have completed later in reverse DFS.
- In other words, it's closer to its reverse-DFS root (which is 4).
The proof is in two parts.
Consider two vertices u and v.
Part (1): Suppose u and v are mutually reachable: does the algorithm report that they are in the same component?
- Suppose u is visited before v in the second-phase.
- Then, DFS will certainly visit v.
  ⇒ The algorithm reports both in the same component.
- The same holds if v is visited before u.
Part (2): Suppose u and v are reported in the same component by the algorithm: is it true that they are mutually reachable?
- Because u and v are in the same reported component, they are part of a single tree
  ⇒ Let w be the root of this tree.
- Then w was visited before u and v.
  ⇒ w has a later phase-1 completion time than u and v.
  (Store this fact for now).
- Next, observe that we reached u from w in the second phase
  ⇒ there's a directed path from w to u in G.
  ⇒ there's a directed path from u to w in G'.
- Suppose there is no path from w to u in G' (the reverse).
  
  ⇒ Then, the path from u to w in G' would result in u having a later completion time
  ⇒ Contradiction.
- Thus, there is a path from w to u in G'.
  ⇒ There is a path in G from u to w,
- Which means, there are paths in G from w to u and back.
The same argument shows that there is path from w to v and back.
⇒ u and v are mutually reachable in G.

Tarjan's Algorithm:

Some intuition:
- Consider what happens when we complete processing "3" in
- How can we identify the strong component "3, 4, 6"?
  - First, there is no back edge that goes from "4, 6" to any node "above" 3.
  - Second, there is a back edge to "3".
  - Third, both "4" and "6" were visited in the descent from "3".
- But nodes "above" 3 will have a lower visitOrder.
  ⇒ keep track of lowest visitOrder reachable by a back edge from a "potential component".
- How to mark vertices in a "component"?
  - Build a stack.
  - Place vertices on the stack as you recurse down.
  - When coming back up the recursion, if you've identified a component, all vertices in the component are going to be at the top of the stack.
  - Example above: "3, 4, 6" will be on the stack when returning to "3".
  - We will place these on the stack when visiting them.
  - Another example: "0, 1, 2" will be at the top when returning to "0".
    (It won't be removed along the way back from "3" to "0").

Pseudocode:


Algorithm: depthFirstMatrixRecursive (u, v)
Input: the vertex u from which this is being called, vertex v
       to be explored, adjMatrix is assumed to be global.

     // Mark vertex v as visited. 
1.   visitCount = visitCount + 1
2.   visitOrder[v] = visitCount

     // Current lowest reachable: 
3.   lowestReachable[v] = visitOrder[v];

     // Important: minLowestReachable is a local variable so that a  
     // fresh version is used in each recursive call. 
4.   minLowestReachable = lowestReachable[v];

     // Initialize stackTop to 0 outside. stackTop points to next available. 
5.   stack[stackTop] = v;
6.   stackTop ++;

     // Look for first unvisited neighbor. 
7.   for i=0 to n-1
8.       if adjMatrix[v][i] > 0 and i != v  // Check self-loop. 

9.           if visitOrder[i] < 0
                 // If unvisited visit recursively. 
10.              depthFirstMatrixRecursive (v, i)
11.          else if (i != u)  // Avoid trivial case. 
                 // Identify back, down and cross edges ... (not shown) 
12.          endif  // visitOrder < 0 

13.          if lowestReachable[i] < minLowestReachable
14.              minLowestReachable = lowestReachable[i]
15.          endif  

16.      endif  // adjMatrix[v][i] > 0 
17.   endfor

     // We reach here after all neighbors have been explored. 
     // Check whether a strong component has been found. 
18.  if minLowestReachable < lowestReachable[v]
         // This is not a component, but we need to update the 
         // lowestReachable for this vertex since its ancestors will 
         // use this value upon returning. 
19.      lowestReachable[v] = minLowestReachable
20.      return
21.  endif
 
     // We've found a component. 
22.  do
         // Get the stack top. 
23.      stackTop = stackTop - 1
24.      u = stack[stackTop];
         // We start currentStrongComponentLabel from "0". 
25.      strongComponentLabels[u] = currentStrongComponentLabel;
         // Set this to a high value so that it does not affect 
         // further comparisons up the DFS tree. 
26.      lowestReachable[u] = numVertices;
27.  while stack[stackTop] != v
    
28.  currentStrongComponentLabel = currentStrongComponentLabel + 1
     // After returning from recursion, set completion order... (not shown)

7.8 Directed Acyclic Graphs (DAG's)

What are they?

A DAG is a directed graph without a cycle
⇒ no path can cause you to revisit a vertex.
Example:
Example application: task scheduling
- A large activity consists of many related tasks.
- Some tasks need to occur before others.
  ⇒ typically a precedence relation among tasks.
- Represent task precedence using a DAG.

Sequential scheduling:

Suppose the tasks represent programs that must be executed on a sequential processor.
Goal: find an execution sequence that does not violate precedence constraints.
Example: 3 0 1 2 5 4 6 7
Note: 4 0 1 2 5 3 6 7 violates precedence requirements.

Topological sort:

A topological sort of a DAG is a sequence of vertices that:
- does not violate precedence;
- contains all the vertices.
A simple (the "classic") algorithm:
- Find a vertex that has no predecessors (zero in-degree)
  (there must be one, or it's not a DAG).
- Add this to the sequence.
- Remove it from the graph.
- In removing, adjust the in-degree of every neighbor of the removed vertex.
- Repeat.

In-Class Exercise 7.14: Apply the classic algorithm to the above example.

Using depth-first search:

Key observation: the vertex whose completion occurs first in depth-first search can be placed last in the sequence.
Why? It has no successors (descendants in the DFS tree).
Steps:
- Examine completionOrder in depth-first search.
- Place first vertex to complete at end of sequence.
- Remove it from the graph.
- Place next vertex to complete in next-to-last in sequence.
- ... and so on ...
Note: after DFS "completes" a vertex, the vertex is never seen again
⇒ no need to remove it from graph!
Example:

Pseudocode:

We only need to record topological sort order every time a completion occurs.

Partial pseudocode:


Algorithm: depthFirstMatrixRecursive (u, v)
Input: the vertex u from which this is being called, vertex v
       to be explored, adjMatrix is assumed to be global.

     // Mark vertex v as visited. 
1.   visitCount = visitCount + 1
2.   visitOrder[v] = visitCount

     // Look for first unvisited neighbor and recurse  
3.   for i=0 to n-1
          // ... (same as before, not shown) ... 
4.   endfor

     // After returning from recursion, set completion order:     
5 .  completionCount = completionCount + 1
6.   completionOrder[v] = completionCount

     // Set topological sort order. 
     // Initialize topSortCount to -1 outside this method 
7.   topSortCount = topSortCount + 1
8.   topSortOrder[topSortCount] = v