Module 13: Characters and Strings


Objectives

 

By the end of this module, for simple programs, you will be able to:

 


A simple example

 

Consider this simple program:

 

More examples:

 

Operators for char variables:

 


The (useful) relationship between int and char

 

Consider this program:

 

In-Class Exercise 1: What does the above program print?
 

In-Class Exercise 2: Now let's go the other way. Assign the value 100 to an int variable and then from there to char variable. Print the char variable. What is the output?
 

In-Class Exercise 3: Write a for-loop to print the letters a to z (all lowercase) using the above idea.
 

About the conversion:

  • Under the hood, char's are represented by integer codes.

  • It is this code that we see as the number, after casting.
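
The conversion can be sketched in both directions as follows. This is a minimal illustration; the class name CharCodes is hypothetical, and the codes shown assume the usual ASCII/Unicode values.

```java
public class CharCodes {

    public static void main (String[] argv)
    {
        // Going from char to int: the cast exposes the underlying code.
        char c = 't';
        int code = (int) c;
        System.out.println (code);     // Prints 116

        // Going from int to char: the number is interpreted as a code.
        int k = 65;
        char d = (char) k;
        System.out.println (d);        // Prints A
    }

}
```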
 


char arrays

 

Here is a simple use of a char array:

  • Thus, a char array is declared in much the same way as an int or double array:

  • In this case, the array is initialized using actual values, again similar to other basic types:

  • Arrays can also be initialized using assignment, e.g.,
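
A minimal sketch of the declaration styles described above. The class name CharArrayExample and the variable names are hypothetical, and "initialized using assignment" is assumed here to mean element-by-element assignment after creating space with new.

```java
public class CharArrayExample {

    public static void main (String[] argv)
    {
        // Declare and initialize using actual values.
        char[] word = {'j', 'a', 'v', 'a'};

        // Declare, create space, then assign each element.
        char[] letters = new char [3];
        letters[0] = 'a';
        letters[1] = 'b';
        letters[2] = 'c';

        // Print the first array one char at a time.
        for (int i=0; i<word.length; i++) {
            System.out.print (word[i]);
        }
        System.out.println ();                 // Completes the line: java

        System.out.println (letters.length);   // Prints 3
    }

}
```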

 

Next, let's write a program to count character occurrences in text:

  • Notice how we used a char value as an array index:

  • This is really a shorter version of:

  • Thus, sentence[0] is cast to the integer 116, which results in accessing count[116].
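
One way to sketch the counting idea, under some assumptions: the class name CharCount, the method name countChars, the sample sentence and the array size 128 (enough for ASCII codes only) are all choices made for this illustration.

```java
public class CharCount {

    // Returns an array in which count[k] is the number of times the
    // char with integer code k occurs. Assumes all codes are below 128.
    static int[] countChars (char[] sentence)
    {
        int[] count = new int [128];
        for (int i=0; i<sentence.length; i++) {
            // The char is used directly as an index.
            // This is short for: count[(int) sentence[i]] ++
            count[sentence[i]] ++;
        }
        return count;
    }

    public static void main (String[] argv)
    {
        char[] sentence = {'t','o',' ','b','e',' ','o','r',' ','n','o','t'};
        int[] count = countChars (sentence);

        // Print the counts for the codes 97 ('a') through 122 ('z').
        for (int k=97; k<=122; k++) {
            if (count[k] > 0) {
                System.out.println ((char) k + ": " + count[k]);
            }
        }
    }

}
```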
 

In-Class Exercise 4: Trace through the first five iterations of the first for loop. What array index corresponds to the blank (space)?
 

Let's improve the above program a little:

  • Instead of assuming "hard" numbers like 97, let's write code that's independent of the underlying numbers.

 

In-Class Exercise 5: Modify the second for-loop to use start and end.
 

In-Class Exercise 6: Rewrite the above program so that you use a counts array of length exactly 26.
 


More examples

 

We'll now get more practice with problem solving and programming by working on palindromes.
 

In-Class Exercise 7: What is your favorite palindrome?
 

Here's a program to check whether a word is a palindrome:

  • Here, we have defined n for readability (instead of using word.length everywhere).
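
A minimal sketch of such a check, not necessarily the exact listing: the class name Palindrome and the method name isPalindrome are hypothetical, and the guard for an empty array is an addition made for this sketch.

```java
public class Palindrome {

    static boolean isPalindrome (char[] word)
    {
        int n = word.length;        // Defined for readability.
        if (n == 0) {
            return true;            // Assumption: an empty word counts as a palindrome.
        }
        // Compare the i-th char from the left with the i-th from the right.
        for (int i=0; i<=(n-1)/2; i++) {
            if (word[i] != word[n-1-i]) {
                return false;       // A mismatch: not a palindrome.
            }
        }
        return true;
    }

    public static void main (String[] argv)
    {
        char[] word = {'r','a','d','a','r'};
        System.out.println (isPalindrome (word));     // Prints true

        char[] word2 = {'r','i','v','e','r'};
        System.out.println (isPalindrome (word2));    // Prints false
    }

}
```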
 

In-Class Exercise 8: Trace through the above program. Why does the for-loop stop at (n-1)/2?
 

Next, suppose we want to check whether sentences like

  char[] sentence = {'n','e','v','e','r',' ','o','d','d',' ','o','r',' ','e','v','e','n'};
are palindromes.

We need to remove blanks before checking whether it's a palindrome.

If we wrote a method that removed blanks, we could use it as follows:

  char[] sentence = {'n','e','v','e','r',' ','o','d','d',' ','o','r',' ','e','v','e','n'};
  char[] withoutBlanks = removeBlanks (sentence);
  checkPalindrome (withoutBlanks);

Note:

  • The method removeBlanks returns a char array.

  • That array is likely to be shorter (with the blanks removed).

  • We will use this idea:
       1.  Count the number of blanks in the array.
       2.  Use this to determine the size of the new array.
       3.  Make the new array.
       4.  Copy over the non-blank characters from the old to the new.
       5.  Return the new array.
    

Here's the program:
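
The five steps above can be sketched as follows. The method name removeBlanks comes from the text; the class name RemoveBlanks and the local variable names are assumptions made for this sketch.

```java
public class RemoveBlanks {

    static char[] removeBlanks (char[] A)
    {
        // Step 1: count the number of blanks.
        int numBlanks = 0;
        for (int i=0; i<A.length; i++) {
            if (A[i] == ' ') {
                numBlanks ++;
            }
        }

        // Steps 2 and 3: determine the size and make the new array.
        char[] B = new char [A.length - numBlanks];

        // Step 4: copy over the non-blank characters.
        int j = 0;                  // Next free position in B.
        for (int i=0; i<A.length; i++) {
            if (A[i] != ' ') {
                B[j] = A[i];
                j ++;
            }
        }

        // Step 5: return the new array.
        return B;
    }

    public static void main (String[] argv)
    {
        char[] sentence = {'n','e','v','e','r',' ','o','d','d',' ','o','r',' ','e','v','e','n'};
        char[] withoutBlanks = removeBlanks (sentence);
        System.out.println (withoutBlanks);     // Prints neveroddoreven
    }

}
```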

 

In-Class Exercise 9: Put the pieces together so that you have a program that works with

  char[] sentence = {'n','e','v','e','r',' ','o','d','d',' ','o','r',' ','e','v','e','n'};
 


Word matching examples

 

We'll now look at some programs to:

  • Determine if two words are equal.
  • Determine if two words are approximately equal.
  • Find a word in a sentence.
  • Identify the common prefix of two words.
 

Let's start by testing whether two char arrays are equal:

  • Notice that the return value of isEqual is boolean:

  • Next, before even comparing array contents, we check that the lengths are equal:

    Clearly, if the lengths aren't equal, we can return false immediately.
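
A minimal sketch of the equality test in two parts (the length check, then the element-by-element comparison). The method name isEqual comes from the text; the class name EqualityTest and the sample words are assumptions, and the second part may differ from the original listing.

```java
public class EqualityTest {

    static boolean isEqual (char[] A, char[] B)
    {
        // First part: arrays of different lengths cannot be equal.
        if (A.length != B.length) {
            return false;
        }

        // Second part: compare position by position.
        for (int i=0; i<A.length; i++) {
            if (A[i] != B[i]) {
                return false;
            }
        }

        // No mismatch was found.
        return true;
    }

    public static void main (String[] argv)
    {
        char[] word = {'r','i','v','e','r'};
        char[] word2 = {'r','i','v','e','t'};
        System.out.println (isEqual (word, word));     // Prints true
        System.out.println (isEqual (word, word2));    // Prints false
    }

}
```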

 

In-Class Exercise 10: Consider this replacement for the second part of isEqual:

Argue that this achieves the same result. Which approach is better and why?
 

In-Class Exercise 11: For numbers, we could say that 3.14 is approximately equal to 3.15. What about words? Propose a definition (or two) for when two words might be approximately equal.
 

We will use wildcard matching for approximate comparison:

  • Suppose an asterisk * represents "match any character".

  • Then, "rive*" matches "river" and "rivet".

  • Similarly, "*i*er" matches "river" and "tiger".

  • Thus, the problem we wish to solve: given two words with asterisks in them, establish whether they are equal.

  • Here's the program:
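
A sketch of wildcard comparison under the definition above, in which an asterisk stands for exactly one character. The class name WildcardMatch and the method name isWildcardEqual are hypothetical; the nested-if structure is an assumption based on the exercise that follows.

```java
public class WildcardMatch {

    static boolean isWildcardEqual (char[] A, char[] B)
    {
        // An asterisk matches a single char, so the lengths must agree.
        if (A.length != B.length) {
            return false;
        }

        for (int i=0; i<A.length; i++) {
            // Only compare when neither position holds an asterisk.
            if (A[i] != '*') {
                if (B[i] != '*') {
                    if (A[i] != B[i]) {
                        return false;
                    }
                }
            }
        }
        return true;
    }

    public static void main (String[] argv)
    {
        char[] pattern = {'r','i','v','e','*'};
        char[] word = {'r','i','v','e','r'};
        char[] word2 = {'r','o','v','e','r'};
        System.out.println (isWildcardEqual (pattern, word));     // Prints true
        System.out.println (isWildcardEqual (pattern, word2));    // Prints false
    }

}
```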

 

In-Class Exercise 12: Replace the nested-if with a single if-statement that combines the conditions. Why is the condition B[i]!='*' needed? If you remove it, does the comparison fail for the above examples? Can you find examples where it would fail?
 

Next, let's look for a word within a sentence and identify the position where it occurs (if it does):

  • Thus, if the sentence is "never odd or even" and the word is "odd", we want to return 6.

    For the same sentence and word "prime", we want to return -1 (not found).

  • Let's write a method called wordsearch that has the following signature:

  • If that's the case, we can use it as follows:

  • Here's wordsearch:
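
One possible sketch of the search. The signature used here (two char arrays in, an int position out) is an assumption, as is the class name WordSearch; the approach simply tries every starting position in the sentence.

```java
public class WordSearch {

    // Returns the first position at which word occurs in sentence, or -1.
    static int wordsearch (char[] sentence, char[] word)
    {
        // Try each starting position that leaves room for the word.
        for (int i=0; i+word.length<=sentence.length; i++) {
            boolean match = true;
            for (int j=0; j<word.length; j++) {
                if (sentence[i+j] != word[j]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                return i;
            }
        }
        return -1;      // Not found.
    }

    public static void main (String[] argv)
    {
        char[] sentence = {'n','e','v','e','r',' ','o','d','d',' ','o','r',' ','e','v','e','n'};
        char[] word = {'o','d','d'};
        char[] word2 = {'p','r','i','m','e'};
        System.out.println (wordsearch (sentence, word));     // Prints 6
        System.out.println (wordsearch (sentence, word2));    // Prints -1
    }

}
```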

 

In-Class Exercise 13: What is the return value when the sentence is "never odd or even" and the word is "eve"? What does this say about our wordsearch solution?
 

In-Class Exercise 14: Modify wordsearch so that it returns a proper index only if a full word is found to match. That is, a search for "eve" in "never odd or even" should return -1, whereas searches for "never", "odd", "or", or "even" should return the correct position.
 

Next, let's consider the prefix problem: given two words, find their common prefix:

  • Example: rive is the common prefix shared by river and rivet.

  • Our goal is to write a method with this signature:
            static char[] commonPrefix (char[] A, char[] B)
    
    so that it takes two char arrays and returns the common prefix in a char array.

  • Then, we can use this method as follows:
        char[] word = {'r','i','v','e','r'};
        char[] word2 = {'r','i','v','e','t'};
        char[] prefix = commonPrefix (word, word2);
        // Should return "rive" in array prefix.
    

  • Let's start with this idea:
       1.  Suppose the two arrays are A and B.
       2.  Start with i=0 at the leftmost position of each array.
       3.  As long as A[i]==B[i], advance along the array.
       4.  The moment A[i] != B[i], stop.
    

Consider this code:

 

In-Class Exercise 15: Why are we computing the minimum of the lengths of the two arrays? What would go wrong if we used A.length instead as the for-loop limit?
 

In-Class Exercise 16: Implement the method printArray that takes a char array as parameter and prints the contents on a single line. Then, add that method to the above two and see if the whole program works.
 

In-Class Exercise 17: Change the words to "river" and "rover". What do you notice? Trace through the program. Why doesn't it work?
 

Let's address the problem as follows:

 

In-Class Exercise 18: Trace through the program with the above fix. Why does this work?
 

There is another, more compact and "slick" way to do the same thing:

  • Observe: we can use a more complex condition to test continued execution of the for-loop:

  • Here, both conditions (because of the &&) need to be true to continue execution.
           => Execution stops as soon as the first mismatch occurs.
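
A sketch of the compact approach with the combined for-loop condition. The signature of commonPrefix comes from the text; the class name Prefix, the variable names and the copying step at the end are assumptions made for this sketch.

```java
public class Prefix {

    static char[] commonPrefix (char[] A, char[] B)
    {
        // Never look past the end of the shorter array.
        int min = Math.min (A.length, B.length);

        // Advance i as long as both conditions hold. The loop body is
        // empty: all the work is in the condition and the increment.
        int i;
        for (i=0; (i < min) && (A[i] == B[i]); i++) {
        }

        // Now i is the length of the common prefix. Copy it over.
        char[] prefix = new char [i];
        for (int k=0; k<i; k++) {
            prefix[k] = A[k];
        }
        return prefix;
    }

    public static void main (String[] argv)
    {
        char[] word = {'r','i','v','e','r'};
        char[] word2 = {'r','i','v','e','t'};
        char[] prefix = commonPrefix (word, word2);
        System.out.println (prefix);      // Prints rive
    }

}
```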
 


Strings

 

Consider this example:

  • A string is a sequence of characters.

  • Notice how a string is declared:

  • Important: String is NOT a reserved word.
           =>
  • String is a special kind of object in the Java language.

  • The (zero or more) characters in a string are enclosed in double-quotes:

 

Java defines a special operator for concatenating strings:

  • This is the same + used for number arithmetic.

  • What's unusual is that one can combine numbers and strings to create longer strings:

  • There is a shortcut operator += to combine assignment and concatenation to the same variable:

    The above is equivalent to: z = z + 2.718.

  • The concatenation operator is particularly useful for System.out.println:
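
A short sketch of the concatenation operator in use. The class name Concat and the particular strings are chosen only for illustration.

```java
public class Concat {

    public static void main (String[] argv)
    {
        // Concatenating strings with +.
        String x = "Hello";
        String y = "World";
        String s = x + " " + y;
        System.out.println (s);           // Prints Hello World

        // Combining a string and a number with the shortcut operator.
        String z = "e is about ";
        z += 2.718;                       // Same as: z = z + 2.718
        System.out.println (z);           // Prints e is about 2.718

        // Building the output of println in one expression.
        int n = 3;
        System.out.println ("n = " + n + ", n+1 = " + (n+1));   // Prints n = 3, n+1 = 4
    }

}
```

Note the parentheses around n+1: without them, the + would be treated as concatenation and the digits 3 and 1 would be appended one after the other.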

 

In-Class Exercise 19: Modify the code below so that only a single System.out.println is used.

public class ArrayPrint {

    public static void main (String[] argv)
    {
        int n = 5;
        for (int i=1; i<=n; i++) {
            int j = i * i;
            System.out.print ("The square of ");
            System.out.print (i);
            System.out.print (" is ");
            System.out.println (j);
        }
    }

}
 

String methods:

  • Because strings are objects, it is possible to define methods inside them.

  • The Java String object has many such useful methods:

  • Note that length is a method and that it is called using the dot operator with the variable:

  • The method takes no parameters but returns an integer:

  • One can extract a sub-string from a given string using the substring method:

    • Think of the characters in a string being indexed, as in an array, starting with 0.
    • The first parameter to substring is the index of the start of the substring.
    • The second parameter is one more than the index of the last char of the substring.
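
A small standalone sketch of length and substring, separate from the program discussed above. The class name SubstringExample and the sample string are assumptions made for illustration.

```java
public class SubstringExample {

    public static void main (String[] argv)
    {
        String s = "river";

        // length is a method: note the parentheses.
        int k = s.length ();
        System.out.println (k);           // Prints 5

        // Indices:  r=0  i=1  v=2  e=3  r=4
        // Start at index 1; stop just before index 4.
        String t = s.substring (1, 4);
        System.out.println (t);           // Prints ive
    }

}
```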
 

In-Class Exercise 20: Type up the above program and execute. What goes wrong?
 

To see if two strings are the same, use the equals method in one of the strings to be compared:

 

To compare two strings alphabetically, use the compareTo method:

The compareTo method returns an integer whose sign gives the result:

  • A negative number, if the parameter string is greater.
  • A positive number, if the parameter string is less.
  • 0, if the two strings are equal.
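
A minimal sketch of equals and compareTo in use. The class name StringCompare and the helper method order are hypothetical; the code tests only the sign of the compareTo result, which is the safe way to use it.

```java
public class StringCompare {

    // Describes the alphabetical order of a and b.
    static String order (String a, String b)
    {
        int result = a.compareTo (b);
        if (result < 0) {
            return a + " comes before " + b;
        }
        else if (result > 0) {
            return a + " comes after " + b;
        }
        else {
            return a + " is the same as " + b;
        }
    }

    public static void main (String[] argv)
    {
        String x = "river";
        String y = "rivet";
        String z = "river";

        System.out.println (x.equals (z));    // Prints true
        System.out.println (x.equals (y));    // Prints false

        System.out.println (order (x, y));    // Prints river comes before rivet
        System.out.println (order (y, x));    // Prints rivet comes after river
    }

}
```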
 

There are other useful methods for strings. Here are three more:

  • To obtain the character at the i-th position of the string, use charAt(i).

  • Use indexOf with either a char or String parameter to see whether the parameter exists in the string. The first position found is returned (or -1, if not found).

  • One can extract the chars of a string into a char array using the toCharArray method.
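
The three methods can be sketched together as follows. The class name MoreStringMethods and the sample string are assumptions made for illustration.

```java
public class MoreStringMethods {

    public static void main (String[] argv)
    {
        String s = "river";

        // The char at position 2 (positions start at 0).
        char c = s.charAt (2);
        System.out.println (c);                 // Prints v

        // indexOf with a char parameter.
        System.out.println (s.indexOf ('v'));   // Prints 2

        // indexOf with a String parameter.
        System.out.println (s.indexOf ("er"));  // Prints 3

        // Not found.
        System.out.println (s.indexOf ('z'));   // Prints -1

        // All the chars of the string, as a char array.
        char[] letters = s.toCharArray ();
        System.out.println (letters.length);    // Prints 5
    }

}
```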
 


Arrays of strings

 

Arrays of strings aren't very different from arrays of numbers or chars, as the following example shows:

  • A variable that represents a string array is declared with the square brackets, as with integers:

  • Space is created using the new operator:

    or by direct initialization with actual strings:

  • String arrays can be passed as parameters, like other kinds of arrays:

  • Notice that each array element is a string, and so String methods such as length() can be used with each element:

  • Recall that an array's length is not a method:
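
A sketch that pulls these points together. It assumes a mnemonic for the digits of pi in which each word's length is one digit; the class name Mnemonic, the method name printLengths and the choice of mnemonic are assumptions made for this illustration.

```java
public class Mnemonic {

    // A String array passed as a parameter.
    static void printLengths (String[] words)
    {
        // words.length: the array's length, no parentheses.
        for (int i=0; i<words.length; i++) {
            // words[i].length(): a String method, with parentheses.
            System.out.print (words[i].length());
        }
        System.out.println ();
    }

    public static void main (String[] argv)
    {
        // Direct initialization with actual strings.
        String[] words = {"How", "I", "wish", "I", "could", "calculate", "pi"};
        printLengths (words);       // Prints 3141592

        // Space created with the new operator, then filled by assignment.
        String[] others = new String [2];
        others[0] = "never";
        others[1] = "odd";
        printLengths (others);      // Prints 53
    }

}
```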

 

In-Class Exercise 21: Devise a mnemonic for 5 digits of the square root of 2, and change the above program accordingly.
 

In-Class Exercise 22: Sometimes mnemonics are used to remember hard-to-spell words. For example, here's a mnemonic to remember how to spell "Rhythm": "Rhythm helps your two hips move". Write a program that takes an array of strings and puts together all the first letters of the strings as a string, and then prints it out. Use two examples to demonstrate that your program is working. (Find another one.)
 


Reading and writing

 

Since char's are similar to other basic types, much of what we said earlier applies.

Let's point out a few new things:

  • When seeing a cast such as:

    say to yourself "i is assigned x cast into its integer representation".

  • Remember the complex for-loop condition? When seeing such a condition:

    say to yourself "as long as i is less than length and A[i] equals B[i]".

  • When you see a method called with the "dot" operator, as in

    say "k is assigned x dot length"

  • Important: an array's length is NOT a method:

    (There are no parentheses following length.)

 

Since we are more experienced by now, writing style can be gleaned by looking at the examples.

Instead, we will point out a couple of useful ways of working with arrays and characters:

  • One can print out an array using the Java library method Arrays.toString.

  • For this purpose we need the right import statement at the top of the file, outside the class:

  • Notice how we used casting back and forth to print the letters 'a' to 'z':
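
A minimal sketch of these points, with the import statement placed at the top of the file, outside the class. The class name LetterArray and the variable names are assumptions made for this sketch.

```java
import java.util.Arrays;

public class LetterArray {

    public static void main (String[] argv)
    {
        char[] letters = new char [26];
        for (int i=0; i<26; i++) {
            // Cast 'a' to its integer code, add i, and cast back to char.
            letters[i] = (char) ((int) 'a' + i);
        }

        // Prints the whole array on one line, in the form [a, b, c, ...]
        System.out.println (Arrays.toString (letters));
    }

}
```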

 


When things go wrong

 

 

In-Class Exercise 23: Identify the errors in the following program and fix them.

 

In-Class Exercise 24: Identify the errors in the following program and fix them.




© 2011, Rahul Simha