Note 5: Dictionaries and Tuples

Reading: Think Python Chapter 11

Notes

Keys and Values

So far, we have seen lists and strings, ordered collections that are indexed based on the position of their content.

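Dictionaries, or dicts, are not indexed by position. They consist of pairs of keys and values, and each value is looked up by its key. (Since Python 3.7 a dict remembers the order in which keys were inserted, but you still cannot index it by position.) Dicts are created with curly braces { and } but indexed with square brackets [ and ].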

years = {'DE': 1787, 'MD': 1788, 'VA': 1788}
years['DE']
1787

Note how the keys and values are separated by a colon :.

New elements can be added to a dictionary via indexing and assignment:

years['NC'] = 1788
years
{'DE': 1787, 'MD': 1788, 'VA': 1788, 'NC': 1788}

Dicts are mutable in the same way that lists are.

years['NC'] += 1
years
{'DE': 1787, 'MD': 1788, 'VA': 1788, 'NC': 1789}

Dictionaries are very useful for counting problems. Here’s an example.

This program counts the occurrences of each letter in a string:
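
A minimal sketch of such a counting program. The sample string and the names text and counts are chosen here for illustration:

```python
# Count how many times each character appears in a string.
text = 'banana'  # sample input, chosen for illustration

counts = {}
for letter in text:
    if letter in counts:
        # Seen before: add one to the existing count.
        counts[letter] += 1
    else:
        # First time seeing this letter: start its count at 1.
        counts[letter] = 1

print(counts)  # {'b': 1, 'a': 3, 'n': 2}
```

Each letter becomes a key, and its value is the number of times it has been seen so far.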

You can check if some value is a dictionary key with the in keyword:

years = {'DE': 1787, 'MD': 1788, 'VA': 1788}
'DE' in years
True
'NJ' in years
False

You can extract the keys and values of a dict with the built-in methods <dict>.keys() and <dict>.values():

years.keys()
dict_keys(['DE', 'MD', 'VA'])
years.values()
dict_values([1787, 1788, 1788])

These return special dict_keys and dict_values objects. You can convert these to lists with the list function:

list(years.values())
[1787, 1788, 1788]

Calling len on a dict will give you the number of key-value pairs:

len(years)
3

What can be a key?

Any immutable value, such as ints, floats, and strings, can be a key.

Mutable data types such as lists and other dicts cannot be keys.
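
The rule above can be checked with a short sketch. The dict d and its contents are made up for illustration, and try/except is used only so the example runs to completion instead of stopping at the error:

```python
# Immutable values work as keys.
d = {}
d[3] = 'int key'
d[2.5] = 'float key'
d['DE'] = 'string key'

# A list is mutable, so using it as a key raises a TypeError.
try:
    d[[1, 2]] = 'list key'
    list_key_worked = True
except TypeError:
    list_key_worked = False

print(len(d))           # 3
print(list_key_worked)  # False
```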

Iterating

Looping through a dict with a while loop is cumbersome; using a for loop, it is straightforward. Content iteration over a dict iterates over the keys:
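
A short sketch of iterating over a dict with a for loop, reusing the years dict from earlier. The loop variable name state is an arbitrary choice:

```python
years = {'DE': 1787, 'MD': 1788, 'VA': 1788}

# The loop variable takes on each key in turn.
for state in years:
    # Index with the key to get the matching value.
    print(state, years[state])
```

This prints one key and its value per line, starting with DE 1787.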

Global Variables

  • You have seen how variables created inside of functions only exist inside of functions
  • We have passed variables into functions as arguments
  • Functions can also read variables from outside of functions
    • Variables outside of functions are called “global” variables
    • By default, functions cannot assign to global variables

It is possible to allow a function to assign to a global variable by using the keyword global:

Without Line 2, this program would result in an error.

In practice, you should not reference global variables from functions. Pass arguments in and use return values.

Notes

Immutable Sequences

  • Tuples are similar to lists, but are delimited with parentheses ( and ) instead of square brackets.
    • These parentheses are usually optional; the commas are what make a tuple.
1, 2, 3
(1, 2, 3)
(2, "horse", "zebra")
(2, 'horse', 'zebra')
  • Tuples are immutable!
X = (3, 2, 1)
X[1] += 1
TypeError: 'tuple' object does not support item assignment

As with strings, which are also immutable, you can concatenate tuples (which produces a new tuple):

X = (3, 2, 1)
X + (4, 0)
(3, 2, 1, 4, 0)

Tuples can be used for assignment, which is called “tuple assignment” or “multiple assignment.”

x, y = 3, 4
x
3
y
4

Tuple Return

To get a function to return multiple values, return a tuple of those values:
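
A minimal sketch of a function that returns two values as a tuple. The function name quotient_remainder and its arguments are hypothetical, chosen for illustration:

```python
def quotient_remainder(a, b):
    # The comma builds a tuple, so both values are returned together.
    return a // b, a % b

# Tuple assignment unpacks the returned tuple into two variables.
q, r = quotient_remainder(17, 5)
print(q)  # 3
print(r)  # 2
```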

Tuple Swap

Sometimes you will need to swap (exchange) the values of two variables.

One way to do this is by creating a temporary variable:

L = ['sea', 'lake', 'ocean', 'pond']
temp = L[0]
L[0] = L[1]
L[1] = temp
L
['lake', 'sea', 'ocean', 'pond']

Using tuple assignment makes this easier:

L[0], L[3] = L[3], L[0]
L
['pond', 'sea', 'ocean', 'lake']

Here’s an example:
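
One possible illustration is a sketch with a hypothetical helper, reverse_in_place, that reverses a list by repeatedly swapping pairs of items with tuple assignment:

```python
def reverse_in_place(items):
    # Walk inward from both ends, swapping pairs with tuple assignment.
    left = 0
    right = len(items) - 1
    while left < right:
        items[left], items[right] = items[right], items[left]
        left += 1
        right -= 1
    return items

L = ['sea', 'lake', 'ocean', 'pond']
print(reverse_in_place(L))  # ['pond', 'ocean', 'lake', 'sea']
```

No temporary variable is needed, because the right-hand side is evaluated in full before either item is reassigned.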

Practice

Practice Problem 5.1

Write a function frequency_dict that takes as argument a string and returns a dictionary of character frequencies.

Practice Problem 5.2

Write a function frequent_char that takes as argument a string and returns whichever character in the string appears most frequently (assume this will be a unique character).

Practice Problem 5.3

Write a function dict_update that takes two arguments: a dictionary and a string.

  • All keys in the dictionary will be strings
  • All values in the dictionary will be ints
  • If the dictionary already contains the string argument, add one to that string’s value
  • If the dictionary does not contain the string argument, add the string to the dictionary as a key, with value 1

Examples:

d = {'JFK': 4, 'DCA': 2}
r = dict_update(d, 'JFK')
print(r)
{'JFK': 5, 'DCA': 2}
d = {'JFK': 4, 'DCA': 2}
r = dict_update(d, 'IAD')
print(r)
{'JFK': 4, 'DCA': 2, 'IAD': 1}

Practice Problem 5.4

Write a function small_value that takes as argument a dictionary and returns the key that has the smallest value. All values will be numbers.

  • small_value({'JFK': 5, 'DCA': 2}) returns 'DCA'
  • small_value({'JFK': 4, 'DCA': 2, 'IAD': 1}) returns 'IAD'

Practice

Practice Problem 5.5

Write a function longest_shortest that takes as argument a list of strings and returns a tuple:

  • The first item is the length of the shortest string.
  • The second item is the length of the longest string.

Practice Problem 5.6

Write a function bubble_sort that takes as argument a list of numbers and sorts the list, largest to smallest:

  • Visit each item in the list
    • Compare the item to the next item
    • If the next item is larger, swap the two items
  • Repeat this until the list is sorted

When done, return the list.

Practice Problem 5.7

Write a function read_walden that reads in walden.txt, counts the number of lines, and returns that number.

You should find 18 lines.

The text is an excerpt from Walden by Henry David Thoreau.

Practice Problem 5.8

Write a function count_file_lines that takes one argument: a string representing the path to a text file. Your function should return a tuple:

  • The first item is the number of lines in the file.

  • The second item is the number of characters in the file.

  • count_file_lines('walden.txt') should return 18, 1194

  • count_file_lines('metamorphosis.txt') should return 27, 1672

Homework

  • Homework problems should always be your individual work. Please review the collaboration policy and ask the course staff if you have questions. Remember: Put comments at the start of each problem to tell us how you worked on it.

  • Double check your file names and return values. These need to be exact matches for you to get credit.

  • For this homework, don’t use any built-in functions that find maximum, find minimum, or sort.

Homework Problem 5.1 (35 pts)

Write a function no_mode that takes as argument a list of ints. Without modifying the original list:

  • Find the int that appears most frequently in the list
  • Create a new list with the same contents
  • Remove all instances of the most-frequently-occurring int
  • Return the new list.

You can assume that one number will appear more than the others.

  • no_mode([1, 3, 1, 6, 2]) returns [3, 6, 2]
  • no_mode([4, 5, 0, 0]) returns [4, 5]
  • no_mode([1, 1, 2, 2, 2]) returns [1, 1]

Submit as no_mode.py.

Homework Problem 5.2 (30 pts)

Write a function common_length that takes as argument a list of words (strings), counts the length of each string, and returns the most frequent string length. If there are multiple lengths that appear with the same frequency, return the smallest one. You can assume that the list of strings contains at least one string.

Type hint: the return value should be an integer.

  • common_length(['flower', 'rind']) returns 4
  • common_length(['bread', 'butter', 'treats']) returns 6
  • common_length(['sun', 'moon', 'rain']) returns 4

Submit as common_length.py.

Homework Problem 5.3 (35 pts)

Write a function unique_words that takes as input a string, which will represent English-language text. Return a list of unique words found in the string. Ignore capitalization (return all lowercase, except the word 'I'). Remove any punctuation: periods, commas, exclamation points, apostrophes, quotation marks, and question marks. Order in the returned list does not matter.

Examples:

a_string = "I know you know."
unique = unique_words(a_string)
print(unique)
['I', 'know', 'you']
a_string = "A good idea, a lot of good."
unique = unique_words(a_string)
print(unique)
['a', 'good', 'idea', 'lot', 'of']
a_string = "To be, or not to be?"
unique = unique_words(a_string)
print(unique)
['to', 'be', 'or', 'not']

Submit as unique_words.py.

Homework Problem 5.4 (30 pts)

Write a function max_elements that takes as input a list of tuples of equal length. All of the elements in the tuples will be integers or floats. Your function should return a tuple where:

  • The first item of the tuple is the maximum value of all of the first elements in the input list of tuples
  • The second item of the tuple is the maximum value of all of the second elements in the input list of tuples
  • …and so forth.

Examples:

  • max_elements([(1,3,5), (4,3,2)]) should return (4,3,5)
  • max_elements([(1.0,2.0), (2.0,3.1), (5.1,7.3), (3.1,9.9), (0.1,2.4)]) should return (5.1,9.9)

Submit as max_elements.py.

Homework Problem 5.5 (25 pts)

Write a function alphabetizer that takes as input a list of strings and alphabetizes (puts in alphabetical order) the strings without any regard to capitalization.

  • alphabetizer(['Banana', 'apple']) returns ['apple', 'Banana']

Submit as alphabetizer.py.

Homework Problem 5.6 (45 pts)

Write a function file_character that takes one argument: a string representing the path to a text file. The function should return a tuple:

  • The first item is the count of the second-most common character in the file.

  • The second item is the character.

  • file_character('walden.txt') should return 126, 'e'

Submit as file_character.py.