GWU

CS 1111

Introduction to Software Development

GWU Computer Science


Lecture Notes 07: Chars (ASCII), Strings, and the String API


Objectives

By the end of this module, for simple HelloWorld-like programs, you will be able to:




Why are strings useful in computation?

Characters vs Strings

About characters:

For example:


public class CharExample {

  public static void main (String[] argv)
  {
    char first = 'o';
    char second = 'y';
    char third = '!';
    System.out.print (first);
    System.out.print (second);
    System.out.println (third);
  }

}

    
In this example:

A String, on the other hand, has zero or more characters.

About strings:

We'll look at some string examples further below.




Character examples

A char, unlike a String, is a primitive type, just like int and double. In fact, characters are stored in memory as integers, where each integer corresponds to one character. You may have heard of the ASCII lookup table, which shows this mapping:

Characters and ASCII
Characters are represented as integers in the machine.

It is often useful to work with the integer representation, and to go back and forth between representations.

For example:


public class CharExample3 {

  public static void main (String[] argv)
  {
    // A typical char:
    char first = 'a';

    // Now extract the int representation:
    int i = (int) first;
    System.out.println (i);

    // Perform integer arithmetic:
    int j = i+2;

    // Convert (or cast) back to char:
    char third = (char) j;

    // Print as char:
    System.out.println (third);
  }

}
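The same back-and-forth between char and int can be used to shift a letter's case. The following is a minimal sketch, not part of the original notes: the class name CaseShift and the helper toUpper are hypothetical, and the code assumes plain ASCII letters, where 'a' to 'z' and 'A' to 'Z' are each stored as consecutive integers.

```java
public class CaseShift {

  // Hypothetical helper (an assumption for illustration): converts a
  // lowercase ASCII letter to uppercase using integer arithmetic.
  static char toUpper (char ch)
  {
    if (ch >= 'a' && ch <= 'z') {
      int offset = ch - 'a';          // 0 for 'a', 1 for 'b', and so on.
      return (char) ('A' + offset);   // Cast the int result back to char.
    }
    return ch;                        // Leave every other character alone.
  }

  public static void main (String[] argv)
  {
    System.out.println ((int) 'a');       // Prints 97.
    System.out.println ((int) 'A');       // Prints 65.
    System.out.println (toUpper ('c'));   // Prints C.
    System.out.println (toUpper ('!'));   // Prints ! (unchanged).
  }

}
```

Note that the cast back to char is required: the expression 'A' + offset has type int, and Java will not silently narrow an int variable into a char.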




The String API

The String API is the list of methods that every String object provides.
Today, we'll show you how to find and use some of the typical String methods.

First, open the Java 11 String API documentation page.

You'll notice several things:

Today, we'll use the following String methods:

Because String is a class in Java, strings come with a bunch of useful features that are built into the language. Much like the methods in your homework problems, developers have written many methods and packaged them under the String class so that we can access and use them. Let's look at an illustrative example:


public class StringExample {

  public static void main (String[] argv)
  {
    // Declare a string variable and assign it an actual string:
    String s = "The quick brown fox jumps over the lazy dog";
    System.out.println (s);

    // Extract its length:
    int k = s.length ();
    System.out.println (k);     // 43. Includes spaces.

    // You can identify any character in the string, such as:
    char c = s.charAt(0);
    System.out.println (c);     // Prints 'T'.

    // The very useful concatenation operator +
    String s1 = "The";
    String s2 = "quick";
    String s3 = s1 + s2;        // s3 is "Thequick"

    // You can concatenate any number of strings in an expression:
    String s4 = "brown";
    String s5 = s1 + " " + s2 + " " + s4;
    System.out.println (s5);

    // You can concatenate any other type:
    String allLetters = "";
    for (char ch='a'; ch<='z'; ch++) {
      allLetters = allLetters + ch;
    }
    System.out.println ("The alphabet: " + allLetters);

    // Example with concatenating integers onto a string:
    String num1to10 = "";
    for (int i=1; i<=10; i++) {
      num1to10 = num1to10 + " " + i;
    }
    System.out.println ("Numbers 1 to 10: " + num1to10);

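    // Example with a BUG when concatenating integers onto a string:
    // the parentheses make Java add the integers first, so a single sum
    // (not the individual numbers) is appended to the string.
    String num11to20 = "";
    num11to20 = num11to20 + (11 + 12 + 13 + 14 + 15 + 16 + 17 + 18 + 19 + 20);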
    System.out.println ("Numbers 11 to 20: " + num11to20);
  }

}
 
Activity 1: Type up the above program, then compile and execute it in the Java Visualizer.

Now let's understand the code: Strings can contain the escape characters we saw earlier in Lecture Notes 02.
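As a small illustration of escape characters inside strings, here is a hedged sketch; the class name EscapeExample and the helper sample are made up for this example. It shows that each escape sequence, although typed as two symbols in the source code, is stored as a single character.

```java
public class EscapeExample {

  // Hypothetical helper (not from the original notes): a string that
  // contains an escaped double quote, a newline, and a tab.
  static String sample ()
  {
    return "He said \"hi\"\n\tBye";
  }

  public static void main (String[] argv)
  {
    String s = sample ();
    System.out.println (s);     // Printed on two lines; the second starts with a tab.

    // Each escape sequence counts as ONE character:
    System.out.println ("a\tb".length ());      // Prints 3.
    System.out.println ("\"q\"".length ());     // Prints 3.
    System.out.println (s.charAt (8) == '"');   // Prints true.
  }

}
```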

Activity 2: Consider this snippet of code:
    int q = 5;

    System.out.println (q);
    System.out.println ("q");
    System.out.println ('q');

    String s = "q";
    char c = 'q';
    
What gets printed? Consider the differences between the three.




The Two Main Categories of Types

Primitive types:

The following are called "Primitive Types": byte, short, int, long, float, double, boolean, and char.
They are called "Primitive" because the function they perform is extremely simple:

They are locations in memory inside of which you save the actual value you need.



How are these values saved?
The computer remembers where we are placing a value (the address) and the type of that value. Then, it interprets the zeroes and ones as a value for one of these types.

  • The simplest is the boolean, which conceptually needs only a single bit to denote true or false (the Java language does not specify how much memory it actually occupies).

  • Then, we have the byte, which needs a single Byte. 8 bits allow 2^8 = 256 different values, so a byte covers the range -128 to 127.

  • Then, we have the char, which in Java uses two Bytes. This allows the coverage of the large set of Unicode characters. ASCII actually needs only 7 bits.

  • The int uses 32 bits (4 Bytes). That means that, when saving a variable of type int, the computer uses 4 Bytes (or 4 boxes) in a row. That also means that the range of values for an int is: -2^{31} to 2^{31}-1, which covers -2,147,483,648 to 2,147,483,647. Java stores an int in two's complement: the leading bit indicates the sign and the remaining 31 bits determine the value, which is why there is one more negative value than there are positive ones.

  • We also have the type short, which is for small integers, and it uses 2 Bytes, covering the range: -2^{15} to 2^{15}-1, which covers -32,768 to 32,767.

  • For larger integers than usual, we have the type long, and it uses 8 Bytes, covering the range: -2^{63} to 2^{63}-1, which covers -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807.

  • For real-valued numbers, we have the type float, which uses 4 Bytes in the floating-point representation, covering magnitudes in the range: ±1.40129846 x 10^{-45} to ±3.40282347 x 10^{38}

  • The standard type for real-valued numbers is the double, which uses 8 Bytes in the floating-point representation, covering magnitudes in the range: ±4.9406564584124654 x 10^{-324} to ±1.7976931348623157 x 10^{308}
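The integer limits do not need to be memorized, since Java exposes them as constants. The sketch below, which is an illustration and not part of the original notes, prints them and shows what happens when an int goes past its largest value. The class name RangeExample and the helper addOne are hypothetical; the constants such as Integer.MAX_VALUE are part of the standard Java library.

```java
public class RangeExample {

  // Hypothetical helper (an assumption for illustration): adds 1 to an int.
  // If n is already the largest int, the result wraps around.
  static int addOne (int n)
  {
    return n + 1;
  }

  public static void main (String[] argv)
  {
    // Smallest and largest values of the integer types:
    System.out.println (Byte.MIN_VALUE + " to " + Byte.MAX_VALUE);        // -128 to 127
    System.out.println (Short.MIN_VALUE + " to " + Short.MAX_VALUE);      // -32768 to 32767
    System.out.println (Integer.MIN_VALUE + " to " + Integer.MAX_VALUE);  // -2147483648 to 2147483647
    System.out.println (Long.MIN_VALUE + " to " + Long.MAX_VALUE);

    // Number of bits used by an int and by a char:
    System.out.println (Integer.SIZE);      // Prints 32.
    System.out.println (Character.SIZE);    // Prints 16.

    // Overflow: one past the largest int wraps to the smallest int.
    System.out.println (addOne (Integer.MAX_VALUE));   // Prints -2147483648.
  }

}
```

Java does not report an error on integer overflow, so a long is the safer choice when values may exceed roughly two billion.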

Reference types:

As we will see later in the semester, we will be creating objects that contain so much information that it does not make sense to save them inside the box or space assigned to a variable.
Instead, we use a clever mechanism: we leave a forwarding address, in other words, the address of where we are keeping the object's information.

When we make a variable of a reference type, like String, the variable does not hold the String value itself. Instead, it holds the address of the starting location in memory where the value or values of that type are kept.
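One visible consequence of storing addresses is that == on two String variables compares the addresses, not the characters. The following is a hedged sketch: the class name ReferenceExample and the helper sameAddress are hypothetical, and the code uses new String(...), which may not have been covered yet, only to force Java to create two separate objects with the same contents.

```java
public class ReferenceExample {

  // Hypothetical helper (an assumption for illustration): true only when
  // both variables hold the address of the very same object.
  static boolean sameAddress (String x, String y)
  {
    return x == y;
  }

  public static void main (String[] argv)
  {
    // Primitive type: the variable holds the value, so b gets its own copy.
    int a = 5;
    int b = a;
    b = b + 1;
    System.out.println (a);     // Prints 5; changing b did not affect a.

    // Reference type: the variable holds an address.
    String s1 = new String ("fox");
    String s2 = new String ("fox");   // Same characters, different object.
    String s3 = s1;                   // Copies the address, not the characters.

    System.out.println (sameAddress (s1, s2));   // Prints false.
    System.out.println (sameAddress (s1, s3));   // Prints true.
    System.out.println (s1.equals (s2));         // Prints true: same characters.
  }

}
```

To compare the contents of two strings, use the equals method rather than ==.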

Activity 3: Let's trace through some primitive types and reference types in memory.




String indices

Each character in a string has its own "location" or "index".

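The following is a visualization of the indices for the string: I like Squirrels!

Index:  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
Char:   I     l  i  k  e     S  q  u  i  r  r  e  l  s  !

In this example, you can see that:

  • The first character, 'I', is at index 0, not 1.

  • Spaces count as characters: they occupy indices 1 and 6.

  • The last character, '!', is at index 16, which is the length of the string (17) minus 1.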
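A quick way to check the indices yourself is to loop over the string and print each position next to its character. This is a minimal sketch using the length, charAt, and indexOf methods from the String API; the class name IndexExample and the helper lastIndex are hypothetical.

```java
public class IndexExample {

  // Hypothetical helper (an assumption for illustration): the last valid
  // index of a non-empty string is always its length minus 1.
  static int lastIndex (String s)
  {
    return s.length () - 1;
  }

  public static void main (String[] argv)
  {
    String s = "I like Squirrels!";

    // Print every index with its character. Indices start at 0.
    for (int i=0; i<s.length(); i++) {
      System.out.println (i + ": " + s.charAt (i));
    }

    System.out.println (s.indexOf ('S'));    // Prints 7.
    System.out.println (lastIndex (s));      // Prints 16.
  }

}
```

Calling charAt with an index outside 0 to length minus 1 stops the program with a StringIndexOutOfBoundsException.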




Substrings

We can extract parts of a string.

Observe the following piece of code:


      int beginIndex = 7;
      // Extract everything after index beginIndex. Input is now a variable.
      String sub1 = s3.substring(beginIndex);// s3.substring(beginIndex) resolves into "Squirrels!"
      System.out.println ("The substring from index " + beginIndex + " to the end in s3 is " + sub1);

      int endIndex = 11;
      // This method takes two inputs. The character at endIndex is NOT included.
      // Below, s3.substring(beginIndex, endIndex) resolves into "Squi"
      String sub2 = s3.substring(beginIndex, endIndex);
      System.out.println ("The substring from index " + beginIndex + // continues below
        " to index " + endIndex + " in s3 is " + sub2);
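The fragments above come from a larger program, so here is a self-contained sketch that can be compiled on its own. It assumes that s3 holds the string "I like Squirrels!", which is consistent with the results quoted in the comments; the class name SubstringExample and the helper firstWord are hypothetical.

```java
public class SubstringExample {

  // Hypothetical helper (an assumption for illustration): returns everything
  // before the first space, or the whole string if it has no space.
  static String firstWord (String s)
  {
    int space = s.indexOf (' ');      // -1 when there is no space.
    if (space == -1) {
      return s;
    }
    return s.substring (0, space);    // Index 'space' itself is excluded.
  }

  public static void main (String[] argv)
  {
    String s3 = "I like Squirrels!";  // Assumed value of s3.
    int beginIndex = 7;
    int endIndex = 11;

    // One input: from beginIndex to the end of the string.
    System.out.println (s3.substring (beginIndex));             // Prints Squirrels!

    // Two inputs: from beginIndex up to, but not including, endIndex.
    String sub2 = s3.substring (beginIndex, endIndex);
    System.out.println (sub2);                                  // Prints Squi
    System.out.println (sub2.length () == endIndex - beginIndex);  // Prints true.

    System.out.println (firstWord (s3));                        // Prints I
  }

}
```

Because the end index is excluded, the length of the result is always endIndex minus beginIndex.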




Next class:

We'll go through some more in-class exercises on strings as part of Homework5.

Assignments for next lecture:

Get as far as you can on Homework5 problems on your own.