Module 10: Supplemental Material


Expression trees

What is an expression tree?

Exercise 1: Draw an expression tree for the expression 10/4 + 2*(5+1). Compute the values at each non-leaf node.

We will now look at a variety of issues related to expression trees:

Let's start with building the tree:

Exercise 2: Convert the expression 10/4 + 2*(5+1) into the form above, with complete parentheses.

We will build the tree as we go along in the parsing:

Here's the program:
// We need this for the Stack class.
import java.util.*;

class ExprTreeNode {
    
    ExprTreeNode left, right;   // The usual pointers.
    boolean isLeaf;             // Is this a leaf?
    int value;                  // If so, we'll store the number here.
    char op;                    // If not, we need to know which operator.

}


class ExpressionTree {

    ExprTreeNode root;


    public void parseExpression (String expr)
    {
        root = parse (expr);
    }
    
    
    // This is the recursive method that does the parsing.

    ExprTreeNode parse (String expr)
    {
        ExprTreeNode node = new ExprTreeNode ();
        
	// Note: expr.charAt(0) is a left paren. 
        // First, find the matching right paren.
        int m = findMatchingRightParen (expr, 1);
        String leftExpr = expr.substring (1, m+1);

        // Bottom-out condition:
        if (m == expr.length()-1) {
            // It's at the other end ⇒ this is a leaf.
            String operandStr = expr.substring (1, expr.length()-1);
            node.isLeaf = true;
            node.value = getValue (operandStr);
            return node;
        }
        
        // Otherwise, there's a second operand and an operator.

	// Find the left paren to match the rightmost right paren.
        int n = findMatchingLeftParen (expr, expr.length()-2);
        String rightExpr = expr.substring (n, expr.length()-1);

	// Recursively parse the left and right substrings.
        node.left = parse (leftExpr);
        node.right = parse (rightExpr);
        node.op = expr.charAt (m+1);

        return node;
    }
    

    int findMatchingRightParen (String s, int leftPos)
    {
        // Given the position of a left-paren in String s,
	// find the matching right paren.

	// Recognize the code?

        Stack<Character> stack = new Stack<Character>();
        stack.push (s.charAt(leftPos));
        for (int i=leftPos+1; i < s.length(); i++) {
            char ch = s.charAt (i);
            if (ch == '(') {
                stack.push (ch);
            }
            else if (ch == ')') {
                stack.pop ();
                if ( stack.isEmpty() ) {
                    // This is the one.
                    return i;
                }
            }
        }
        // If we reach here, there's an error.
        System.out.println ("ERROR: findRight: s=" + s + " left=" + leftPos);
        return -1;
    }


    int findMatchingLeftParen (String s, int rightPos)
    {
        // ... similar ...
    }


    int getValue (String s)
    {
        try {
            int k = Integer.parseInt (s);
            return k;
        }
        catch (NumberFormatException e) {
            return -1;
        }
    }

} // end-ExpressionTree
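
The listing above leaves findMatchingLeftParen() as "similar". Here is a minimal sketch of how it might look, mirroring findMatchingRightParen() but scanning from right to left. The class name ParenMatchSketch is made up for this sketch, and this is not necessarily the version used in the course code.

```java
import java.util.*;

class ParenMatchSketch {

    // Given the position of a right paren in String s, scan leftwards
    // to find the matching left paren. Returns -1 if there is none.
    static int findMatchingLeftParen (String s, int rightPos)
    {
        Stack<Character> stack = new Stack<Character>();
        stack.push (s.charAt(rightPos));
        for (int i=rightPos-1; i >= 0; i--) {
            char ch = s.charAt (i);
            if (ch == ')') {
                // A nested right paren: it must be matched first.
                stack.push (ch);
            }
            else if (ch == '(') {
                stack.pop ();
                if ( stack.isEmpty() ) {
                    // This is the one.
                    return i;
                }
            }
        }
        return -1;
    }

    public static void main (String[] argv)
    {
        String expr = "((35)-((3)*(2)))";
        // The right operand ends at position expr.length()-2.
        int n = findMatchingLeftParen (expr, expr.length()-2);
        System.out.println (n + ": " + expr.substring (n, expr.length()-1));
    }

}
```

For the string in main(), the right operand starts at position 6, so the substring handed to the recursive call would be ((3)*(2)).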

Exercise 3: Show how the recursion works for the expression (((35)-((3)*((3)+(2))))/(4)). That is, draw the sequence of "stack snapshots" showing recursive calls to the parse() method.

Next, let's write code for computing the value of the expression after it's been parsed:
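
A minimal sketch of such a computation, assuming the same node fields as ExprTreeNode above: a leaf returns its number, and any other node combines the values of its two subtrees using its operator. The names ExprEvalSketch, computeValue(), leaf() and opNode() are made up for this sketch, and the tree is built by hand rather than by parse() so that the example stands alone.

```java
// Same fields as the ExprTreeNode class in the parsing program.
class ExprTreeNode {
    ExprTreeNode left, right;
    boolean isLeaf;
    int value;
    char op;
}

class ExprEvalSketch {

    // Recursively compute the value of the subtree rooted at node.
    static int computeValue (ExprTreeNode node)
    {
        if (node.isLeaf) {
            return node.value;
        }
        int leftValue = computeValue (node.left);
        int rightValue = computeValue (node.right);
        switch (node.op) {
            case '+': return leftValue + rightValue;
            case '-': return leftValue - rightValue;
            case '*': return leftValue * rightValue;
            case '/': return leftValue / rightValue;   // Integer division.
            default:
                throw new IllegalArgumentException ("Unknown operator: " + node.op);
        }
    }

    // Hypothetical helpers so that we can build a small tree by hand.
    static ExprTreeNode leaf (int value)
    {
        ExprTreeNode node = new ExprTreeNode ();
        node.isLeaf = true;
        node.value = value;
        return node;
    }

    static ExprTreeNode opNode (char op, ExprTreeNode left, ExprTreeNode right)
    {
        ExprTreeNode node = new ExprTreeNode ();
        node.op = op;
        node.left = left;
        node.right = right;
        return node;
    }

    public static void main (String[] argv)
    {
        // The tree for (((35)-((3)*((3)+(2))))/(4)).
        ExprTreeNode sum = opNode ('+', leaf(3), leaf(2));
        ExprTreeNode product = opNode ('*', leaf(3), sum);
        ExprTreeNode difference = opNode ('-', leaf(35), product);
        ExprTreeNode root = opNode ('/', difference, leaf(4));
        System.out.println (computeValue (root));    // Prints 5
    }

}
```

Note that the values are int's, so division here is integer division.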

Why did we do all this?


Postfix expressions

What are they?

Exercise 4: Convert the expression 10/4 + 2*(5+1) into postfix form.

Postfix expressions are evaluated using a stack:

Example: let's evaluate the expression 35 3 3 2 + * - 4 /
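
The stack-based evaluation can be sketched in code as follows. This is a minimal version that assumes tokens are separated by spaces and that only the operators + - * / on integers occur. The names PostfixEvalSketch, evaluate(), isOperator() and apply() are made up for this sketch.

```java
import java.util.*;

class PostfixEvalSketch {

    static int evaluate (String postfix)
    {
        Stack<Integer> stack = new Stack<Integer>();
        for (String token : postfix.trim().split ("\\s+")) {
            if (isOperator (token)) {
                // The right operand was pushed last, so it comes off first.
                int right = stack.pop ();
                int left = stack.pop ();
                stack.push (apply (token.charAt(0), left, right));
            }
            else {
                // An operand: just push it.
                stack.push (Integer.parseInt (token));
            }
        }
        // What remains on the stack is the value of the expression.
        return stack.pop ();
    }

    static boolean isOperator (String token)
    {
        return token.length() == 1 && "+-*/".indexOf (token.charAt(0)) >= 0;
    }

    static int apply (char op, int left, int right)
    {
        switch (op) {
            case '+': return left + right;
            case '-': return left - right;
            case '*': return left * right;
            default:  return left / right;    // Must be '/': integer division.
        }
    }

    public static void main (String[] argv)
    {
        System.out.println (evaluate ("35 3 3 2 + * - 4 /"));    // Prints 5
    }

}
```

For the example expression, the stack holds 35 3 3 2 just before the first operator, then 35 3 5, then 35 15, then 20, then 20 4, and finally 5.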

Exercise 5: Convert the expression 10/4 + 2*(5+1) into postfix form and show the steps in evaluating the expression using a stack. Draw the stack at each step.

Why is this useful?

Exercise 6: Convert the expression 10/4 + 2*(5+1) into postfix form and write down the push/arithmetic "instructions" corresponding to this expression.

Now let's write code to create postfix from an expression tree and then to evaluate it using a stack:
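
For the first half, a post-order traversal of the expression tree (left subtree, right subtree, then the node itself) produces the postfix form. Below is a minimal sketch, assuming the same node fields as ExprTreeNode above. The names PostfixFromTreeSketch, toPostfix(), postOrder(), appendToken(), leaf() and opNode() are made up for this sketch.

```java
// Same fields as the ExprTreeNode class in the parsing program.
class ExprTreeNode {
    ExprTreeNode left, right;
    boolean isLeaf;
    int value;
    char op;
}

class PostfixFromTreeSketch {

    static String toPostfix (ExprTreeNode root)
    {
        StringBuilder out = new StringBuilder ();
        postOrder (root, out);
        return out.toString ();
    }

    static void postOrder (ExprTreeNode node, StringBuilder out)
    {
        if (node.isLeaf) {
            appendToken (out, Integer.toString (node.value));
            return;
        }
        // Both operands first, then the operator.
        postOrder (node.left, out);
        postOrder (node.right, out);
        appendToken (out, Character.toString (node.op));
    }

    // Tokens are separated by single spaces.
    static void appendToken (StringBuilder out, String token)
    {
        if (out.length() > 0) {
            out.append (' ');
        }
        out.append (token);
    }

    // Hypothetical helpers so that we can build a small tree by hand.
    static ExprTreeNode leaf (int value)
    {
        ExprTreeNode node = new ExprTreeNode ();
        node.isLeaf = true;
        node.value = value;
        return node;
    }

    static ExprTreeNode opNode (char op, ExprTreeNode left, ExprTreeNode right)
    {
        ExprTreeNode node = new ExprTreeNode ();
        node.op = op;
        node.left = left;
        node.right = right;
        return node;
    }

    public static void main (String[] argv)
    {
        // The tree for (((35)-((3)*((3)+(2))))/(4)).
        ExprTreeNode sum = opNode ('+', leaf(3), leaf(2));
        ExprTreeNode product = opNode ('*', leaf(3), sum);
        ExprTreeNode root = opNode ('/', opNode ('-', leaf(35), product), leaf(4));
        System.out.println (toPostfix (root));    // Prints 35 3 3 2 + * - 4 /
    }

}
```

The resulting string can then be evaluated token by token with a stack, as described earlier in this section.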


Enumerating the elements of a data structure

Let's revisit our binary tree map:

How do we implement the method getAllKeyValuePairs()?

  • We will build an array and use an in-order traversal to place elements in the array.

  • Here's the code:
    public class BinaryTreeMap3 {
    
        TreeNode root = null;
        int numItems = 0;
    
        KeyValuePair[] allPairs;  // We will place the elements in this array.
        int currentIndex = 0;
    
        // ... other methods like add(), contains() ... etc
    
    
        public KeyValuePair[] getAllKeyValuePairs ()
        {
            if (root == null) {
                return null;
            }
            
            allPairs = new KeyValuePair [numItems];
            currentIndex = 0;    // Start filling from the front on every call.
            inOrderTraversal (root);
            return allPairs;
        }
        
    
        void inOrderTraversal (TreeNode node)
        {
            // Visit left subtree first.
            if (node.left != null) {
                inOrderTraversal (node.left);
            }
    
            // Then current node.
            allPairs[currentIndex] = node.kvp;
            currentIndex ++;
    
            // Now right subtree.
            if (node.right != null) {
                inOrderTraversal (node.right);
            }
        }
    
    } //end-BinaryTreeMap3
    
         

Exercise 7: Download BinaryTreeMap3.java and BinaryTreeMapExample3.java, compile and execute. You will notice that the output is sorted, whereas the input (the order in which the strings were added) was not. How/where did the sorting occur?

Using special purpose iterator classes:

  • Sometimes it is more convenient to use iterator classes.

  • For example:
    import java.util.*;
    
    public class StringExample {
    
        public static void main (String[] argv)
        {
            // Make a tree (of strings) and add stuff to it.
            TreeSet<String> tribes = new TreeSet<String>();
            tribes.add ("Ewok");
            tribes.add ("Aqualish");
            tribes.add ("Gungan");
            tribes.add ("Amanin");
            tribes.add ("Jawa");
            tribes.add ("Hutt");
            tribes.add ("Cerean");
    
            // Get the data structure to return an object that does the iteration.
            Iterator iter = tribes.iterator ();
    
            // Now use the iterator object to perform iteration.
            while (iter.hasNext()) {
                String name = (String) iter.next ();    // Note: a cast is required.
                System.out.println (name);
            }
        }
    
    }
        

Exercise 8: Why is a cast required above?

It turns out that we can use an iterator specialized to strings:

	Iterator<String> iter = tribes.iterator ();
	while (iter.hasNext()) {
	    String name = iter.next ();      // No cast needed.
	    System.out.println (name);
	}
Note:
  • How do we build this feature into our own data structures?
         ⇒ This is a somewhat complicated topic
         ⇒ See Modules 6-7 of the advanced Java material

  • It is even more complicated to write our own iterable versions of data structures that can be specialized to particular types (e.g., strings or integers).
         ⇒ This involves the murky details of generic types in Java.
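
To give a flavor of what is involved, here is a minimal sketch of a string-only data structure that supports iteration. It is only an illustration under simple assumptions (an array-backed list, no removal), not the approach taken in the advanced material; the class name SimpleStringList and its fields are made up for this sketch.

```java
import java.util.*;

class SimpleStringList implements Iterable<String> {

    String[] items = new String [4];
    int numItems = 0;

    void add (String s)
    {
        if (numItems == items.length) {
            // Out of space: double the array.
            items = Arrays.copyOf (items, 2 * items.length);
        }
        items[numItems] = s;
        numItems ++;
    }

    // The one method required by Iterable: return an object that
    // knows how to walk through our elements.
    public Iterator<String> iterator ()
    {
        return new Iterator<String> () {
            int position = 0;

            public boolean hasNext ()
            {
                return position < numItems;
            }

            public String next ()
            {
                if (! hasNext()) {
                    throw new NoSuchElementException ();
                }
                return items[position++];
            }
        };
    }

    public static void main (String[] argv)
    {
        SimpleStringList tribes = new SimpleStringList ();
        tribes.add ("Ewok");
        tribes.add ("Gungan");
        tribes.add ("Jawa");

        // Any Iterable can be used in a for-each loop.
        for (String name : tribes) {
            System.out.println (name);
        }
    }

}
```

Unlike the TreeSet example, this list returns its elements in the order in which they were added.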


An application: word counts

Let us develop a simple application of maps: counting the number of occurrences of each word in a body of text:

  • For example, consider this body of text:
              The quick brown fox jumped over the lazy dog, after which 
              the dog jumped on the fox and bit the fox, 
              after which their friendship ended rather abruptly.
           
    For this example, we want the output to read something like:
              4 the         // 4 occurrences of "the" in the text
              3 fox         // 3 occurrences of "fox" in the text
              2 jumped      // ... etc
              2 dog
              2 after
              2 which
              1 quick
              1 brown
              ...
           

  • Our algorithm for this application is quite simple:
    Algorithm: wordCount 
    Output: word counts
      1.    while more words
      2.      w = getNextWord()
      3.      if w is not in dictionary
      4.        add w to dictionary
      5.        set w's count to 1
      6.      else
      7.        increment w's count
      8.      endif
      9.    endwhile
      10.   Print counts for each word
         

  • We will compare a tree data structure with a list.
Here's the program:
import java.util.*;

public class WordCount {

    public static void main (String[] argv)
    {
        countWordsInFileUsingTree ("testfile");
	countWordsInFileUsingList ("testfile");
    }


    static void countWordsInFileUsingTree (String fileName)
    {
	// Create an instance of the data structure.
        BinaryTreeMap3 tree = new BinaryTreeMap3 ();

	// Read a text file and extract the words into an array.
        String[] words = WordTool.readWords (fileName);
        System.out.println ("Read in " + words.length + " words");

	// Put words into data structure. If a word repeats, increment its count.
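        for (int i=0; i < words.length; i++) {
            if ( ! tree.contains (words[i]) ) {
                // First occurrence: add the word with a count of 1.
                // (The exact add() signature is assumed here.)
                tree.add (words[i], Integer.valueOf (1));
            }
            else {
                // Increment count.
                KeyValuePair kvp = tree.getKeyValuePair (words[i]);
                Integer count = (Integer) kvp.value;
                kvp.value = Integer.valueOf (count+1);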
            }
        }

	// Note use of array:
        KeyValuePair[] uniqueWords = tree.getAllKeyValuePairs ();
        System.out.println ("Found " + uniqueWords.length + " unique words");

	// Sort the words by count.
        Arrays.sort (uniqueWords, new KeyValueComparator());
        for (int i=0; i < uniqueWords.length; i++) {
            System.out.println (uniqueWords[i].value + "  " + uniqueWords[i].key);
        }
    }


    static void countWordsInFileUsingList (String fileName)
    {
	// Our data structure is now a linked list.
        OurLinkedListMap list = new OurLinkedListMap ();

	// ... everything else is the same ...

    }

} //end-WordCount
Note:
  • We have hidden text-parsing and word-extraction in WordTool.

  • The relevant data structure methods we've used are:
    • add()
    • contains()
    • getAllKeyValuePairs() (to get all the key-value pairs as an array)

  • Notice why we get all the key-value pairs as an array
         ⇒ So that we can sort by value
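
Since WordTool and BinaryTreeMap3 are not shown here, the following is a self-contained sketch of the counting step of the algorithm, using Java's own TreeMap as a stand-in for our tree map. The class name WordCountSketch and the method countWords() are made up for this sketch. Words are taken to be runs of letters, and upper/lower case is treated as significant (so "The" and "the" are counted separately); both are assumptions, and WordTool may behave differently.

```java
import java.util.*;

class WordCountSketch {

    static Map<String,Integer> countWords (String text)
    {
        // TreeMap keeps its keys in sorted order, like our binary tree map.
        Map<String,Integer> counts = new TreeMap<String,Integer>();
        for (String w : text.split ("[^A-Za-z]+")) {
            if (w.isEmpty()) {
                continue;
            }
            if ( ! counts.containsKey (w) ) {
                // First occurrence.
                counts.put (w, 1);
            }
            else {
                // Increment count.
                counts.put (w, counts.get (w) + 1);
            }
        }
        return counts;
    }

    public static void main (String[] argv)
    {
        String text =
            "The quick brown fox jumped over the lazy dog, after which "
            + "the dog jumped on the fox and bit the fox, "
            + "after which their friendship ended rather abruptly.";
        Map<String,Integer> counts = countWords (text);
        for (Map.Entry<String,Integer> entry : counts.entrySet()) {
            System.out.println (entry.getValue() + "  " + entry.getKey());
        }
    }

}
```

The output here is ordered by word, not by count, which is why the full program extracts the pairs into an array and sorts them.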

Exercise 9: Download BinaryTreeMap3.java, OurLinkedListMap.java, WordCount.java and testfile. Compile and execute WordCount to make sure it works. Then, perform word counts for this classic book and this one. You can comment out the linked-list version (since it does the same thing and does it slower). Find another free book on-line and apply WordCount to the book.

Exercise 10: Compare the performance between a tree, a hashtable and a linked list for the two large texts above. You will need to add code to use a hashtable for counting. Use OurHashMap.java as the hashtable.

Finally, let us explain how we used Java's sorting algorithm above:

  • First, let's examine the usage:
            KeyValuePair[] uniqueWords = tree.getAllKeyValuePairs ();
    
            // Sort the words by count.
            Arrays.sort (uniqueWords, new KeyValueComparator());
         
    • Thus, we start with an unsorted array of objects (uniqueWords)
    • In this case, each object is a KeyValuePair.
    • We call the method sort() in the Arrays class in the library.
    • We pass an instance of KeyValueComparator to the sort algorithm.

  • Obviously, Java's sort algorithm does not know how to sort arbitrary objects that we've created ourselves (such as KeyValuePair)
         ⇒ Recall: we wrote KeyValuePair ourselves.

  • Every sort algorithm needs to be able to compare any two elements
         ⇒ Thus, a sort algorithm needs to be able to compare two KeyValuePair instances.

  • The idea is: we will write a method to do such a comparison and pass that to Java's sort algorithm.

  • But how do we pass a method?
    • We can't pass a bare method in this style. Instead, we pass an object whose class contains our method.
    • In our example, that class is KeyValueComparator, a class that we will write.

  • How does Java's sort method know which method to call inside KeyValueComparator?
         ⇒ Since Java's sort method is already written and compiled, it already calls some method
         ⇒ It is this method that we need to override.

Let's look at some of the details:

  • The signature of Java's sort method (the one that we want to use) is:
        public static void sort (Object[] inputArray, Comparator comp)
        
    (Note: we have simplified it slightly to remove references to generic types).

  • If we look for Comparator in the library, we see that it is an interface:
        public interface Comparator {
    
            public int compare (Object o1, Object o2);
    
            public boolean equals (Object obj);
        
        }
        
    (Again, we've simplified this by removing generic types).

  • Thus, Java's sort algorithm will call the compare() method of whatever object is passed in as the Comparator in order to compare the objects in inputArray.

  • What we are going to do is pass an array of KeyValuePair objects as the first argument.

  • For the second argument, we will create an instance of:
    class KeyValueComparator implements Comparator {
    
        // We decide how to compare two key-value pairs. Java's sort
        // algorithm will repeatedly call this as it compares elements.
    
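        public int compare (Object o1, Object o2) 
        {
            // With the simplified (non-generic) Comparator, the parameters
            // are Objects, so we first cast them to KeyValuePair.
            KeyValuePair kvp1 = (KeyValuePair) o1;
            KeyValuePair kvp2 = (KeyValuePair) o2;

            // Note: the .value variable is of type Object. That's why we need the cast.
            Integer count1 = (Integer) kvp1.value;
            Integer count2 = (Integer) kvp2.value;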
    
            if (count1 > count2) {        
                return -1;
            }
            else if (count1 < count2) {
                return 1;
            }
            else {
                return 0;
            }
        }
    
    
        // Comparator also declares equals(). Every class already inherits
        // an implementation from Object, so writing our own is optional.
    
        public boolean equals (Object obj)
        {
            return (this == obj);
        }
    
    } //end-KeyValueComparator
         
    Note:
    • Notice that our KeyValueComparator class implements the interface Comparator.
    • Here, we have provided an implementation of compare() (and of equals(), which the interface also declares but which the sort algorithm does not use).
    • We write whatever code we like inside compare().
    • Of course, we really do want to properly compare two KeyValuePair objects.
           ⇒ Which is why we extract the integers from the objects and compare them.

  • When Java's sort gets this object, it calls compare() whenever it needs to (which is very often).

  • Finally, we note that Comparator is really intended to be written for specific types, which we removed to simplify the description.
         ⇒ In our code, we specified the type as KeyValuePair:
    class KeyValueComparator implements Comparator<KeyValuePair> {
    
        public int compare (KeyValuePair kvp1, KeyValuePair kvp2) 
        {
            // ...
        }
    
    }
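
To see the pieces working together, here is a small self-contained sketch. The KeyValuePair class below is only a stand-in with the two fields used in this section (the course's own class may differ), and ComparatorSketch and sortedByCount() are made-up names. The comparison uses Integer.compare() with the arguments reversed, which yields the same ordering as the if-else chain shown earlier.

```java
import java.util.*;

// Stand-in for our KeyValuePair: only the two fields used here.
class KeyValuePair {
    String key;
    Object value;

    KeyValuePair (String key, Object value)
    {
        this.key = key;
        this.value = value;
    }
}

class KeyValueComparator implements Comparator<KeyValuePair> {

    public int compare (KeyValuePair kvp1, KeyValuePair kvp2)
    {
        Integer count1 = (Integer) kvp1.value;
        Integer count2 = (Integer) kvp2.value;
        // Negative when count1 > count2, positive when count1 < count2.
        return Integer.compare (count2, count1);
    }

}

class ComparatorSketch {

    // Returns a sorted copy, leaving the original array untouched.
    static KeyValuePair[] sortedByCount (KeyValuePair[] pairs)
    {
        KeyValuePair[] copy = pairs.clone ();
        Arrays.sort (copy, new KeyValueComparator());
        return copy;
    }

    public static void main (String[] argv)
    {
        KeyValuePair[] uniqueWords = {
            new KeyValuePair ("quick", 1),
            new KeyValuePair ("the", 4),
            new KeyValuePair ("fox", 3)
        };
        KeyValuePair[] sorted = sortedByCount (uniqueWords);
        for (int i=0; i < sorted.length; i++) {
            System.out.println (sorted[i].value + "  " + sorted[i].key);
        }
    }

}
```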
            

Exercise 11: Add a counter to KeyValueComparator above to count how often the compare() method is called. What is the relation between this count and the size of the array being sorted?

Exercise 12: Examine the logic in compare(). What would happen if we switch the first two return statements to read:

        if (count1 > count2) {        
            return 1;
        }
        else if (count1 < count2) {
            return -1;
        }
        else {
            return 0;
        }
  
Try it out and see what happens. Then explain what happened.


© 2006-2020, Rahul Simha & James Taylor (revised 2020)