GWU

CS 1111

Introduction to Software Development

GWU Computer Science


Lecture Notes 07: Chars (ASCII), Strings, and the String API


Objectives

By the end of this module, for simple HelloWorld-like programs, you will be able to:




Why are strings useful in computation?

Characters vs Strings

About characters:

For example:


public class CharExample {

  public static void main (String[] argv)
  {
    char first = 'o';
    char second = 'y';
    char third = '!';
    System.out.print (first);
    System.out.print (second);
    System.out.println (third);
  }

}

    
In this example:

A String, on the other hand, has zero or more characters.

About strings:

We'll look at some string examples further below.
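As a quick preview, here is a minimal sketch contrasting the two. The class name CharVsString and its variable names are made up for this illustration. A char always holds exactly one character, while a String can hold none, one, or many.

```java
public class CharVsString {

  public static void main (String[] argv)
  {
    char letter = 'o';          // Exactly one character, in single quotes.
    String empty = "";          // A string with zero characters.
    String one = "o";           // A string that happens to have one character.
    String many = "oy!";        // A string with three characters.

    System.out.println (letter);            // Prints o
    System.out.println (empty.length ());   // Prints 0
    System.out.println (one.length ());     // Prints 1
    System.out.println (many.length ());    // Prints 3
  }

}
```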




Character examples

A char, unlike a String, is a primitive type, just like int and double. In fact, characters are stored in memory as integers, where each integer corresponds to one character. You may have heard of an ASCII lookup table before, which shows this mapping:

Therefore, because they are actually stored as integers, char variables can be used in a for-loop, as in:

public class CharExample2 {

  public static void main (String[] argv)
  {
    for (char ch='a'; ch <= 'z'; ch++) {
      System.out.print (ch);
    }
    System.out.println ();
  }

}

    
Characters and ASCII
Characters are represented as integers in the machine.

It is often useful to work with the integer representation, and to go back and forth between representations.

For example:


public class CharExample3 {

  public static void main (String[] argv)
  {
    // A typical char:
    char first = 'a';

    // Now extract the int representation:
    int i = (int) first;
    System.out.println (i);

    // Perform integer arithmetic:
    int j = i+2;

    // Convert (or cast) back to char:
    char third = (char) j;

    // Print as char:
    System.out.println (third);
  }

}
Activity 3: Write up, compile and execute the above program in the Java Visualizer.

Note:
  • The conversion from one type of variable (say, int) to another type (say, char) is called casting.

  • You would say "I'm casting a char into an int".

  • The syntax is to have the target type in parens on the right side of the assignment symbol:
      char third = (char) j;
        

  • In some cases, we don't explicitly need to cast. We could have written
      // Going from char to int:
      int i = first;
        

  • As a general stylistic rule, it's best to specify a cast even if not explicitly required.

  • Why, then, does the compiler allow some casts without programmer intention stated explicitly?
    • It's considered safe to go from more specific to more general.
    • Every char value is also a valid int value.
    • But not the other way around.
    • Thus, there are some integers that have no char equivalent.
    • The compiler then wants you to say "I know what I'm doing" by explicitly stating the cast.
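
The casting rules in the note above can be sketched in one small program. The class CastDemo and its helper method shift are hypothetical names used only for this illustration. Going from char to int needs no cast, while going from int back to char does.

```java
public class CastDemo {

  // Hypothetical helper: returns the character that is 'offset' codes after 'start'.
  public static char shift (char start, int offset)
  {
    int code = start;               // char to int: no explicit cast needed.
    int shifted = code + offset;    // Plain integer arithmetic.
    return (char) shifted;          // int to char: the explicit cast is required.
  }

  public static void main (String[] argv)
  {
    System.out.println (shift ('a', 2));    // Prints c
    System.out.println ((int) 'A');         // Prints 65
    // char bad = 70000;    // Would not compile: 70000 does not fit in a char.
  }

}
```

Removing the (char) cast inside shift makes the compiler reject the program, which is the "I know what I'm doing" check described above.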
Important digression:
  • We've learned a really important concept: type.

  • That is, variables have a property called type.

  • So far, we've seen integer, double, boolean, String, and character types.

  • As it will turn out, every variable must have a type.

  • There are strict rules in going back and forth between types.

  • These rules are very useful in helping programmers avoid mistakes.

  • For example, without "types" you could accidentally use an int for 3.141 and have that incorrectly represented as 3.

  • In a later module we will encounter a deeper understanding of "types".
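
To make the 3.141 example concrete, here is a minimal sketch (the class name TypeDemo is made up for this illustration). The compiler refuses to silently store a double in an int, and only truncates when we ask for it with an explicit cast.

```java
public class TypeDemo {

  public static void main (String[] argv)
  {
    double pi = 3.141;
    // int wrong = pi;          // Would not compile: the type rules protect us here.
    int truncated = (int) pi;   // Explicit cast: we accept losing the fraction.

    System.out.println (pi);          // Prints 3.141
    System.out.println (truncated);   // Prints 3
  }

}
```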




The String API

The String API is the list of methods that every String object provides.
Today, we'll show you how to find and use some of the typical String methods.

First, open this page (link opens in new tab):
Java 11 - String API.

You'll notice several things:

Today, we'll use the following String methods:

Because Strings are a class in Java, they come with a bunch of useful features that are built in to the language. Much like your homework problems, developers have written several methods and packaged them under the String class so that we're able to access them and use them. Let's look at an illustrative example:


public class StringExample {

  public static void main (String[] argv)
  {
    // Declare a string variable and assign it an actual string:
    String s = "The quick brown fox jumps over the lazy dog";
    System.out.println (s);

    // Extract its length:
    int k = s.length ();
    System.out.println (k);     // 43. Includes spaces.

    // You can identify any character in the string, such as:
    char c = s.charAt(0);
    System.out.println (c);     // Prints 'T'.

    // The very useful concatenation operator +
    String s1 = "The";
    String s2 = "quick";
    String s3 = s1 + s2;        // s3 is "Thequick"

    // You can concatenate any number of strings in an expression:
    String s4 = "brown";
    String s5 = s1 + " " + s2 + " " + s4;
    System.out.println (s5);

    // You can concatenate any other type:
    String allLetters = "";
    for (char ch='a'; ch<='z'; ch++) {
      allLetters = allLetters + ch;
    }
    System.out.println ("The alphabet: " + allLetters);

    // Example with concatenating integers onto a string:
    String num1to10 = "";
    for (int i=1; i<=10; i++) {
      num1to10 = num1to10 + " " + i;
    }
    System.out.println ("Numbers 1 to 10: " + num1to10);

    // Example with a BUG when concatenating integers onto a string:
    // the parentheses make Java add the numbers first, so only their sum is appended.
    String num11to20 = "";
    num11to20 = num11to20 + (11 + 12 + 13 + 14 + 15 + 16 + 17 + 18 + 19 + 20);
    System.out.println ("Numbers 11 to 20: " + num11to20);
  }

}
 
Activity 5: Write up the above program, compile and execute in the Java Visualizer.
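
The bug in the last part of the program comes from the order in which + is evaluated. The following is a minimal sketch of that rule, with a made-up class name ConcatOrder: the + operator works left to right, and it only concatenates once one of its two operands is a String.

```java
public class ConcatOrder {

  public static void main (String[] argv)
  {
    String a = "" + 1 + 2;       // "" + 1 is "1", then "1" + 2 is "12".
    String b = "" + (1 + 2);     // Parentheses first: 1 + 2 is 3, then "" + 3 is "3".
    String c = 1 + 2 + "";       // Left to right: 1 + 2 is 3, then 3 + "" is "3".

    System.out.println (a);      // Prints 12
    System.out.println (b);      // Prints 3
    System.out.println (c);      // Prints 3
  }

}
```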

Now let's understand the code: Strings can contain the escape characters we saw earlier in Lecture Notes 02.

Example:


public class StringExample2 {

  public static void main (String[] argv)
  {
    String s = "";

    int last = 5;
    for (int i=1; i<=last; i++) {
      for (int j=1; j<=i; j++) {
        s = s + "*";
      }
      s = s + "\n";
    }
    System.out.println ("A triangle with base=" + last + ":\n" + s);
  }

}
  
Activity 6: Write up the above program, compile and execute. Change the base of the triangle to 10.

Activity 7: Trace the execution of the above program, showing how the string s changes.

Note:

Activity 8: Consider this snippet of code:
    int q = 5;

    System.out.println (q);
    System.out.println ("q");
    System.out.println ('q');

    String s = "q";
    char c = 'q';
    
What gets printed? Think about the differences among the three println statements, and between the variables s and c.




The Two Main Categories of Types

Strings have an API (Application Programming Interface), which describes the set of rules for using Strings when programming an application.

But why do we need to know the rules for using Strings?

Because Strings are different from the other types we've seen, like int, double, float, boolean, and char.

Before we move forward, let's look at how memory works in a computer and how we will imagine it for the duration of the class:


Primitive types:

The following are called "Primitive Types": int, float, double, boolean, and char.
They are called "Primitive" because the function they perform is extremely simple:

They are locations in memory inside of which you save the actual value you need.



How are these values saved?
The computer remembers where we are placing a value (the address) and the type of that value. Then, it interprets the zeroes and ones as a value for one of these types.

  • The simplest is the boolean, which conceptually needs only a single bit to denote true or false (in practice, Java usually stores it in at least one Byte).

  • Then, we have the byte, which needs a single Byte. 8 bits allow you to cover 2^8 = 256 different values.

  • Then, we have the char, which in Java uses two Bytes. This allows it to cover a large set of Unicode characters. ASCII actually needs only 7 bits.

  • The int uses 32 bits (4 Bytes). That means that, when saving a variable of type int, the computer uses 4 Bytes (or 4 boxes) in a row. That also means that the range of values for an int is -2^{31} to 2^{31}-1, which covers -2,147,483,648 to 2,147,483,647. Roughly speaking, 1 bit indicates the sign and the other 31 bits determine the value.

  • We also have the type short, which is for small integers. It uses 2 Bytes, covering the range -2^{15} to 2^{15}-1, which is -32,768 to 32,767.

  • For larger integers than usual, we have the type long. It uses 8 Bytes, covering the range -2^{63} to 2^{63}-1, which is -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807.

  • For real-valued numbers, we have the type float, which uses 4 Bytes in floating-point format, covering magnitudes from ±1.40129846 x 10^{-45} to ±3.40282347 x 10^{38}.

  • The standard type for real-valued numbers is the double, which uses 8 Bytes in floating-point format, covering magnitudes from ±4.9406564584124654 x 10^{-324} to ±1.7976931348623157 x 10^{308}.
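
You do not need to memorize these ranges. As a minimal sketch (the class name RangeDemo is made up for this illustration), the standard classes Byte, Short, Integer, Long, Character, Float, and Double each provide MIN_VALUE and MAX_VALUE constants that a program can print.

```java
public class RangeDemo {

  public static void main (String[] argv)
  {
    // Integer types: smallest and largest value of each.
    System.out.println ("byte:  " + Byte.MIN_VALUE + " to " + Byte.MAX_VALUE);
    System.out.println ("short: " + Short.MIN_VALUE + " to " + Short.MAX_VALUE);
    System.out.println ("int:   " + Integer.MIN_VALUE + " to " + Integer.MAX_VALUE);
    System.out.println ("long:  " + Long.MIN_VALUE + " to " + Long.MAX_VALUE);

    // For char, cast to int to see the numeric codes instead of the characters.
    System.out.println ("char:  " + (int) Character.MIN_VALUE + " to " + (int) Character.MAX_VALUE);

    // For float and double, MIN_VALUE is the smallest POSITIVE value, not the most negative one.
    System.out.println ("float:  " + Float.MIN_VALUE + " to " + Float.MAX_VALUE);
    System.out.println ("double: " + Double.MIN_VALUE + " to " + Double.MAX_VALUE);
  }

}
```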

Reference types:

As we will see later in the semester, we will be creating objects that contain so much information that it does not make sense to save them inside the box or space assigned to a variable.
Instead, we use a clever mechanism: we leave a forwarding number, or in other words, the address of where we are keeping the object's information.

When we make a variable of a reference type, like String, the variable does not hold the String value itself. Instead, it holds the address of the starting location where the value (or values) of that type are kept.

An Example using our Abstraction:



Look at the example shown below:


Note:

  • The computer saves the address inside the variable.
  • The address is actually 32 or 64 bits in size, which is why I show it inside a variable with a large box.
  • The particular strings that were declared and initialized are inside Objects that have, in addition to the series of characters that compose the String, Methods that we may interact with to perform actions with these objects.
  • Objects are stored in another section of memory (enough said).
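
The difference between saving a value and saving an address can be sketched in code. The example below is only an illustration with a made-up class name, ReferenceDemo. It also uses the == comparison and the equals method on Strings, which go slightly beyond what these notes have covered so far: on reference variables, == compares the saved addresses, while equals compares the characters.

```java
public class ReferenceDemo {

  public static void main (String[] argv)
  {
    // Primitive: the box holds the value itself, so copying copies the value.
    int x = 5;
    int y = x;
    y = y + 1;
    System.out.println (x);                 // Prints 5: changing y did not affect x.

    // Reference: the box holds an address, so copying copies the address.
    String s1 = new String ("fox");
    String s2 = s1;                         // Same address as s1.
    String s3 = new String ("fox");         // A separate object with the same characters.

    System.out.println (s1 == s2);          // Prints true: same address.
    System.out.println (s1 == s3);          // Prints false: different addresses.
    System.out.println (s1.equals (s3));    // Prints true: same characters.
  }

}
```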




String indices

Each character in a string has its own "location" or "index".

The following is a visualization of the indices for the string: I like Squirrels!


In this example, you can see that:
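
A minimal sketch that prints each index next to its character for the same string (the class name IndexDemo is made up for this illustration). Running it shows that indices start at 0, that spaces count as characters, and that the last index is length() - 1.

```java
public class IndexDemo {

  public static void main (String[] argv)
  {
    String s = "I like Squirrels!";

    // Walk over every valid index, from 0 up to length() - 1.
    for (int i=0; i<s.length (); i++) {
      System.out.println (i + ": " + s.charAt (i));
    }
    System.out.println ("Length: " + s.length ());   // Prints Length: 17
  }

}
```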




Substrings

We can extract parts of a string.

Observe the following piece of code:



      int beginIndex = 7;
      // Extract everything after index beginIndex. Input is now a variable.
      String sub1 = s3.substring(beginIndex);   // s3.substring(beginIndex) resolves into "Squirrels!"
      System.out.println ("The substring from index " + beginIndex + " to the end in s3 is " + sub1);
Activity 8: In StringMethods.java add these instructions to the program and see what gets printed.
Note that the beginIndex is inclusive (includes the character at the beginIndex).

We can also indicate a final index for the substring.

Observe the following piece of code:



      int endIndex = 11;
      // This method takes two inputs
      // below, s3.substring(beginIndex, endIndex) resolves into "Squi"
      String sub2 = s3.substring(beginIndex, endIndex);
      System.out.println ("The substring from index " + beginIndex + // continues below
        " to index " + endIndex + " in s3 is " + sub2);
Activity 9: In StringMethods.java add these instructions to the program and see what gets printed.
Note that the final index is non-inclusive or exclusive (does not include the character at the endIndex).
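
Both forms of substring can be tried together in one self-contained program. This is a sketch under one assumption: that s3 holds the string I like Squirrels! from the index visualization. The class name SubstringDemo is made up for this illustration.

```java
public class SubstringDemo {

  public static void main (String[] argv)
  {
    // Assumption: s3 is the string used in the index visualization.
    String s3 = "I like Squirrels!";
    int beginIndex = 7;
    int endIndex = 11;

    String sub1 = s3.substring (beginIndex);             // From index 7 to the end.
    String sub2 = s3.substring (beginIndex, endIndex);   // Indices 7, 8, 9, and 10.

    System.out.println (sub1);                     // Prints Squirrels!
    System.out.println (sub2);                     // Prints Squi

    // The length of a two-index substring is endIndex - beginIndex.
    System.out.println (sub2.length ());           // Prints 4

    // The character at endIndex itself is left out of sub2:
    System.out.println (s3.charAt (endIndex));     // Prints r
  }

}
```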

Once we have worked with Conditionals, we will see examples of some of the characteristics of objects that make them very different from primitive variables.




Next class:

We'll go through some more in-class exercises on strings as part of Homework5.

Assignments for next lecture:

Get as far as you can on Homework5 problems on your own.