A = ["'twas",'brillig','and','the','slithy','toves']
print(A[2:5])    # prints ['and', 'the', 'slithy']
By the end of this module you will be able to:
Recall: An integer variable takes on values like 5 and -33:
A floating-point variable stores real numbers like:
A string variable stores strings or chars, as in:
A boolean variable stores one of two values: True or False.
For example:
- True and False are used in their usual sense. A boolean variable can store only one of these values.
- True and False start with capitals. These aren’t the same as the quote-delimited strings 'True' and 'False'.
First, consider not: if a has the value True, then not a will have the value False. So, when not a is stored in c, c will have the value False. Likewise, printing the not of a False value will print True.
Shown in table form, not is simple:

a | not a |
---|---|
True | False |
False | True |
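The table can be checked directly in code. This is a minimal sketch; the variable names a and c are illustrative choices rather than names fixed by the module.

```python
a = True
c = not a          # not flips True to False
print(c)           # prints False
print(not c)       # prints True

# 'True' in quotes is a string, not a Boolean value.
print(True == 'True')   # prints False
```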
Next, consider or. How or works:
- a or b will be True when one or more of them is True.
- a or b will be False only when a and b are both False.

This table shows all the combinations:

a | b | a or b |
---|---|---|
True | True | True |
True | False | True |
False | True | True |
False | False | False |
Next, consider and. How and works: a and b will be True only when both are True.

a | b | a and b |
---|---|---|
True | True | True |
True | False | False |
False | True | False |
False | False | False |
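Both tables can be reproduced with a pair of loops. The sketch below is only an illustration; the list name rows is an assumption made here so that the results can be inspected afterwards.

```python
# Try every combination of a and b, recording the results of or and and.
rows = []
for a in [True, False]:
    for b in [True, False]:
        rows.append([a, b, a or b, a and b])
        print(a, b, a or b, a and b)
```

Each printed line shows a, b, a or b, and a and b, in the same row order as the tables.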
Let’s look at another example:
Look at the first statements: they store Boolean values (True or False) in the variables.
Next, look at a = not a.
Here, the value in a before this executes is True. So, not a is False. This gets stored in a. So, after the statement executes, a will have the value False.
Next, look at x = a and b.
a has the value False in it, while b has the value True. Thus, the and operator is applied to the values False and True. You can picture this as: False and True. Applying and to False and True gives False. Thus, the value False is assigned to the variable x.
The next statement is:
Because a is now False and b is True, the result in y will be:
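The walkthrough can be replayed in code. This is a sketch under stated assumptions: the starting values come from the text (a is True before the not statement, b is True), and the statement that assigns y is assumed here to use or, which the text does not confirm.

```python
# Starting values described in the text.
a = True
b = True

a = not a        # not True is False, so a becomes False
x = a and b      # False and True gives False
y = a or b       # assumed statement: False or True gives True

print(a, b, x, y)   # prints False True False True
```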
Consider
Let’s draw an expression diagram to help us understand what happens with the first expression:
Boolean expressions can be constructed with numeric variables and their comparison operators:
k = 5
m = 3
n = 8
a = True
b = False
first = (m < k) and (n > k)
second = ( (k+m == n) or (k-m < 10) )
third = first and (not second)
fourth = first or a
print(first, second, third, fourth)
Note: Since m is 3 and k is 5, the expression (m < k) in first = (m < k) and (n > k) evaluates to True. Similarly, the expression (n > k) also evaluates to True since n = 8. Thus, the resulting expression on the right side becomes True and True, which evaluates to True from the rules (the table) for and.
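The same step-by-step reasoning applies to second and third. The sketch below breaks them into pieces; the names left and right are introduced here only for illustration.

```python
k = 5
m = 3
n = 8

# second = ( (k+m == n) or (k-m < 10) )
left = (k + m == n)       # 8 == 8 is True
right = (k - m < 10)      # 2 < 10 is True
second = left or right    # True or True is True

# third = first and (not second)
first = (m < k) and (n > k)       # True and True is True
third = first and (not second)    # True and False is False

print(left, right, second, third)   # prints True True True False
```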
To see how a Boolean variable is used in practice, we will work through a somewhat elaborate example that will teach us other useful things.
Let’s start with this program:
def print_search_result(A, search_term):
    if search_term in A:
        print('Found ', search_term)

B = [15, 3, 23, 9, 14, 4, 6, 2]
print_search_result(B, 4)
Here, the goal is to create a function that takes a list and a search value (or search term) and looks inside the list to see if the value exists. Does the program work?
Now consider the problem of also printing the position where it’s found:
def print_search_result(A, search_term):
    for k in range(len(A)):
        if A[k] == search_term:
            print('Found', search_term, 'at position', k)

B = [15, 4, 23, 9, 4, 6]
print_search_result(B, 4)
Does this work?
Note: We are now traversing the list ourselves with for k in range(len(A)). Here, k will start at 0 and go up to the last index (one less than the length of the list). At each iteration, we check to see if the search term is equal to the list element at the current position (determined by k), using if A[k] == search_term. If so, we’ve found it.
What we’d like to do is print something when a search term is not found in the list.
Consider this program:
Let’s try another variation:
We’ll now see how a simple Boolean variable is commonly used in these types of problems:
def print_search_result(A, search_term):
    found = False
    pos = -1
    for k in range(len(A)):
        if A[k] == search_term:
            found = True
            pos = k
    if found:
        print('Found', search_term, 'at position', pos)
    else:
        print('Not found:', search_term)

B = [15, 4, 23, 9, 4, 6]
print_search_result(B, 4)
print_search_result(B, 5)
Trace through the above.
Possibly the most common use of Booleans is to write a function that returns True or False.
Suppose we want to determine whether or not a list has a negative number:
def has_negative(A):
    for k in A:
        if k < 0:
            return True
    return False

B = [2, 4, -10]
print(has_negative(B))
C = [1, 3, 5]
print(has_negative(C))
Trace through the above program.
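A function that returns True or False can be used directly as the condition of an if statement. The sketch below repeats has_negative so it runs on its own; the helper describe is a hypothetical name added here for illustration.

```python
def has_negative(A):
    for k in A:
        if k < 0:
            return True
    return False

def describe(A):
    # The returned Boolean is used directly as the if condition.
    if has_negative(A):
        return 'contains a negative number'
    else:
        return 'has no negative numbers'

print(describe([2, 4, -10]))   # prints contains a negative number
print(describe([1, 3, 5]))     # prints has no negative numbers
```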
It is common to want to pull out parts of strings.
For example, if the user in some application types ‘DC 20052’, we may want just the zip code:
Let’s explain:
- The slicing expression 0:2 refers to all the chars of the string from the first (the 0-th) up to just before the one at position 2 (which means up to position 1).
- So, 0:2 refers to characters at positions 0 through 1. Similarly, 3:8 refers to all the chars from position 3 up to 7 (inclusive).
- 3:8 means “starting at and including 3” and “going to but excluding 8”.
- 0:2 means “starting at and including 0” and “going to but excluding 2”.

Slicing expressions work for lists too:
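A short sketch of both kinds of slicing, using the 'DC 20052' string from the text and the Jabberwocky word list. The variable names state, zipcode and middle are chosen here for illustration.

```python
s = 'DC 20052'
state = s[0:2]      # chars at positions 0 and 1
zipcode = s[3:8]    # chars at positions 3 through 7
print(state)        # prints DC
print(zipcode)      # prints 20052

A = ["'twas", 'brillig', 'and', 'the', 'slithy', 'toves']
middle = A[2:5]     # elements at positions 2, 3 and 4
print(middle)       # prints ['and', 'the', 'slithy']
```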
Let’s look at slicing when we don’t know the size.
Consider the zipcode example where the 5-digit zip code may be preceded by all kinds of text, as in:
So, all we know is that the last 5 chars in the string need to be extracted. To do that, we need the length of the string, which we can get once we have the string. Let’s put this in a function:
def extract_zip(s):
    start = len(s) - 5
    end = len(s)
    return s[start:end]

example1 = 'DC 20052'
example2 = 'District of Columbia, 20052'
example3 = '20052'
print(extract_zip(example1))
print(extract_zip(example2))
print(extract_zip(example3))
Note that len(s) gives us the index just past the last index. And len(s) - 5 gives the index 5 positions before the end. So, the slice becomes: s[start:end]
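To see the numbers involved, the sketch below prints the length, start and end for two of the example strings. The function name show_slice_bounds is hypothetical, introduced only for this illustration.

```python
def show_slice_bounds(s):
    start = len(s) - 5    # index 5 positions before the end
    end = len(s)          # index just past the last char
    print(len(s), start, end, s[start:end])
    return s[start:end]

show_slice_bounds('DC 20052')   # prints 8 3 8 20052
show_slice_bounds('20052')      # prints 5 0 5 20052
```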
Suppose we want to determine the longest prefix that two strings have in common, as in:
This should print ‘rive’ but
should find no common prefix.
We will use the following ideas:
Let’s try this:
def find_common_prefix(w1, w2):
    for k in range(len(w1)):
        if w1[k] != w2[k]:
            break
    return w1[0:k]

print(find_common_prefix('river', 'rivet'))
break is used to break out of a loop. Thus, the moment break executes, execution exits the loop to the statement that follows the loop. Notice how we use slicing once we’ve found the char that’s past the common prefix: w1[0:k].
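The version above relies on the loop reaching a mismatch. If the strings have different lengths, or if one string is entirely a prefix of the other, it can fail or return too little. The following is one possible variation, not necessarily how the module goes on to handle it; the name find_common_prefix_safe and the variables shorter and end are assumptions made here.

```python
def find_common_prefix_safe(w1, w2):
    # Only compare positions that exist in both strings.
    shorter = min(len(w1), len(w2))
    end = shorter            # assume the whole shorter string matches
    for k in range(shorter):
        if w1[k] != w2[k]:
            end = k          # first mismatch: the prefix stops just before k
            break
    return w1[0:end]

print(find_common_prefix_safe('river', 'rivet'))   # prints rive
print(find_common_prefix_safe('riv', 'river'))     # prints riv
```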
Python comes with many useful functions for strings.
Here’s a sample:
A = ['to','infinity','and','beyond','and', 'even','further']
s = 'infinity'
# Convert to uppercase:
print(s.upper())
# Count occurrences of the char 'i' in s:
print(s.count('i'))
# Locate which index 'f' first occurs in s:
print(s.find('f'))
# Occurrences of 'and' in list A:
print(A.count('and'))
# Occurrences of 'i' in 2nd string in list A:
print(A[1].count('i'))
if A[3].startswith('be'):
    print('starts with be')
data = '42'
print(data.isnumeric())
There’s a key difference, for example, between the functions len() and upper():
The function len() is like the ones we’ve been writing ourselves. In this case, the string s is given to it as a parameter, as in len(s).
But the function upper() is quite different: it is called as s.upper().
This is, in some sense, attached to the string variable s. The use of the period right after the variable followed by a function is a somewhat advanced topic.
We’ll just use the feature. (The advanced topic is called: objects.)
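The two calling styles side by side, as a minimal sketch. The variable names n and u are chosen here for illustration.

```python
s = 'infinity'
n = len(s)        # s is passed to len() as a parameter
u = s.upper()     # upper() is called on s using the dot
print(n)          # prints 8
print(u)          # prints INFINITY
```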
Let’s emphasize one more feature with this snippet of code:
The string s itself did not change but its uppercase version was returned by the call and stored in t.
Continuing with the earlier example:
- Similar “dot” functions are available for lists too, as in calling count() on the list A to count how many times the string 'and' occurs in the list. The result is stored in k.
- Notice how we can call a “dot” like function in a string when the string itself is an element of a list:
  - Here, A[1] is the second string in the list A.
  - This happens to be 'infinity'. We’re calling its count() function.
  - And giving that function the character 'i' to count.
  - It returns a number, which gets stored above in k.
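The points above can be tried out with a short sketch. It reuses the list A and string s from the earlier sample; the exact statements are a reconstruction based on the description, so treat them as an assumption.

```python
A = ['to', 'infinity', 'and', 'beyond', 'and', 'even', 'further']
s = 'infinity'

t = s.upper()
print(s)    # prints infinity (s itself is unchanged)
print(t)    # prints INFINITY

k = A.count('and')      # count() called on the list A
print(k)                # prints 2
k = A[1].count('i')     # count() called on the string A[1], which is 'infinity'
print(k)                # prints 3
```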
Consider the following partially completed code:
So far, we’ve seen different kinds of variables:
- Integer variables
- Floating-point variables for real numbers
- String variables
- Boolean variables
- Lists

We’ve also seen other kinds of features or “things” in Python such as:
- Functions that we define using def.
- Functions that Python provides, like len() and print().
- Statements like for and if that direct the flow of execution.
- Bringing in other programs with import.

There happen to be other kinds of “things” in Python:
What we want to do here is focus your attention on a concept called type:
x = 1 # The type of x is integer
y = 1.2 # The type of y is floating point
z = 'alarm' # The type of z is string
b = False # The type of b is Boolean
One can print the type of a variable as follows:
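A minimal sketch using Python's built-in type() function on the four variables defined above:

```python
x = 1
y = 1.2
z = 'alarm'
b = False

print(type(x))   # prints <class 'int'>
print(type(y))   # prints <class 'float'>
print(type(z))   # prints <class 'str'>
print(type(b))   # prints <class 'bool'>
```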
The type information is itself represented by a special Python feature called a class (intuitively, as in this “class of item”).
Thus, we see that:
- A variable that holds a string like 'alarm' is just that, a string variable. And a variable that holds True or False is called a Boolean variable.

Converting from one type to another:
- It is often useful to go from one type to another.
- We’ve seen an example of going from single-char string to integer, and converting from int to string (and vice-versa).
To review, let’s look at a few more examples:
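A few conversions using the built-in functions int(), str() and float(). The specific values are illustrative choices made here, not taken from the module.

```python
i = int('42')        # string to integer
s = str(42)          # integer to string
f = float('3.5')     # string to floating point
c = int('7') + 1     # a single-char string converted, then used in arithmetic

print(i + 1)         # prints 43
print(s + '1')       # prints 421
print(f * 2)         # prints 7.0
print(c)             # prints 8
```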
Types and operators: Because our keyboards are limited in the number of symbol keys, we need to use some symbols for multiple purposes. We see this when one operator, like +, has different meanings when used with different types:
a = 3
b = 4
c = a + b # + for arithmetic
d = 'hello'
e = 'world'
f = d + ' ' + e # + for string concatenation
Rather than list all possible uses of all operator symbols, we will introduce additional uses beyond the common case wherever appropriate. Generally, you should be intentional about using operators: you should know what the purpose is.
For example, consider:
a = 4
b = a * 3
print(b) # prints 12
d = 'yes'
e = d * 3 # * for string repetition
print(e) # prints yesyesyes
(The latter is not a frequently used operator with strings.)
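Being intentional about operators also means making sure the types on both sides match. In the sketch below, the names count, label and message are illustrative; joining a string and an integer with + needs an explicit conversion first.

```python
count = 3
label = 'items: '

# label + count would raise a TypeError because + cannot mix str and int.
message = label + str(count)    # convert first, then concatenate
print(message)                  # prints items: 3
```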
Sometimes we need to go beyond what Python has to offer. One way to do this is to find a popular library and use that.
What’s a library? A library is a collection of programs all related for a purpose. For example, there’s a library called NLTK (go look it up) that’s aimed at processing English text:
- It can figure out parts of speech from sentences.
- It can pick out topics (somewhat approximately) in paragraphs.
- It can group so-called stem-related words like “fry”, “fries”, “frying” and separate those from “friar”.

However, installing and learning to use these sophisticated packages requires some work. What we will do instead in this course is to provide you with simple programs that you can download and use directly without any installation.
- This is the purpose in providing programs like wordtool and drawtool.
Let’s use wordtool to find the longest sentence in a book: Wordtool has a feature to break down text and give you one sentence at a time. For example:
import wordtool as wt
sentences = wt.get_sentences_from_textfile('jabberwocky.txt')
count = 0
for s in sentences:
    count += 1
    print('Sentence #', count, ':\n', s, '\n', sep='')
Let’s point out:
- When using functions in another program, one has to import that program, as the first line (import wordtool as wt) does.
- It is convenient to use a shorthand (such as wt) for an imported program. (We could have called it something other than wt.)
- Wordtool asks you to name the file, expecting it to be a plain text file (not a Word file), and to initiate the process.
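The same import-with-shorthand pattern works for any module. Since wordtool is specific to this course, the sketch below uses Python's standard math module instead; the shorthand m is an arbitrary choice.

```python
import math as m    # m is a shorthand chosen here; any valid name works

r = m.sqrt(16)      # functions are called through the shorthand
print(r)            # prints 4.0
```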
Let’s now do something interesting with wordtool: find the longest sentence (by length) in two texts to compare different authors’ writing styles.
import wordtool as wt
def get_longest_sentence(filename):
    sentences = wt.get_sentences_from_textfile(filename)
    maxL = 0
    maxS = ''   # stays empty if the file has no sentences
    for s in sentences:
        if len(s) > maxL:
            maxL = len(s)
            maxS = s
    return maxS

book = 'federalist_papers.txt'
s = get_longest_sentence(book)
print('Longest sentence in', book, 'with', len(s), 'chars:\n', s)
print()
book = 'darwin.txt'
s = get_longest_sentence(book)
print('Longest sentence in', book, 'with', len(s), 'chars:\n', s)
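The same "keep track of the longest so far" pattern can be tried without wordtool or any text files. In this sketch, the function longest_string and the list sample are hypothetical stand-ins for the sentences that wordtool would supply.

```python
def longest_string(strings):
    maxL = 0
    maxS = ''    # stays empty if the list is empty
    for s in strings:
        if len(s) > maxL:
            maxL = len(s)    # remember the new longest length
            maxS = s         # and the string that has it
    return maxS

sample = ['Short one.', 'A somewhat longer sentence.', 'Mid length.']
print(longest_string(sample))   # prints A somewhat longer sentence.
```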
In each of the exercises below, first try to identify the error just by reading. Then type up the program to confirm, and after that, fix the error.
Full credit is 100 pts. There is no extra credit.