Note 10: Recursion

We have used while and for loops to perform operations multiple times, either iterating over a collection (both kinds of loop) or iterating until some termination condition is met (while loops, or for loop using a break or return).

We now introduce another way to perform an operation multiple times: recursion.

A Function That Calls Itself

Let’s write a function to sum all of the positive integers less than some input integer: \(\sum_{i=1}^{n} i\)

def sum_less(n:int) -> int:
    total:int = 0
    for i in range(1, n+1):
        total += i
    return total

This function works, and there’s nothing wrong with it. We could, however, approach the problem a different way, by breaking it down into sub-problems:

We add the next smaller number to our total
If the next smaller number is zero, we stop adding

def sum_less(n: int) -> int:
    if n == 1:
        return 1
    else:
        return n + sum_less(n-1)

Examining the execution:

The function successively calls itself
- This is known as recursion
Line 2 critically checks for the case where the input integer is 0
- After this, the function no longer calls itself, and starts to return values
- This is called the base case or (end case)
Otherwise, we are in the recursive case
When breaking down a problem into sub-problems the base case identifies when the problem cannot be further broken down
The recursive case runs whenever the problem can be broken down further

Successive Halving

While using recursion for a linear problem (adding a sequence of numbers) works, recursion is typically applied to problems that are continuously broken into smaller sub-problems that are identical to the original problem.

Recall binary search:

The problem can be seen as successively performing a binary search on the “remaining” region!

A recursive implementation:

def binary_search(L, val, start, end):
    middle = start + (end-start)//2
    if L[middle] > val:
        return binary_search(L, val, start, middle)
    elif L[middle] < val:
        return binary_search(L, val, middle+1, end)
    else:
        return middle

The else at Line 7 is the base case (it does not result in another call to the binary_search function)
The if and elif at Line 3 and Line 5, respectively, are the recursive cases

Improve this search in the practice problems.

A Recursive Sort

Merge sort depends on the merge_lists function, which is the same in the recursive and non-recursive implementations. We’ve added some print statements to illustrate how the recursive version works:

def merge_lists(L1, L2):
    print(f"merging: {L1=} {L2=}", end=" --> ")
    L3 = []
    while L1 and L2:
        if L1[0] < L2[0]:
            L3.append(L1[0])
            L1 = L1[1:]
        else:
            L3.append(L2[0])
            L2 = L2[1:]
    while L1 and not L2:
        L3.append(L1[0])
        L1 = L1[1:]
    while L2 and not L1:
        L3.append(L2[0])
        L2 = L2[1:]
    print(L3)
    return L3

The recursive mergesort function ends up being extremely simple:

The base case is a list of length 1, which is simply returned
The recursive case splits the existing list into two smaller sub-lists
- mergesort is then called on each of these sub-lists
- The result is merged and returned

def mergesort(L):
    # base case
    if len(L) == 1:
        return L

    # recursive case
    L1 = L[:len(L)//2]
    L2 = L[len(L)//2:]
    print(f"splitting: {L} --> {L1=} {L2=}")
    return merge_lists(mergesort(L1), mergesort(L2))

Let’s see what happens for various calls. With an even, power-of-two number of elements (even splits each halving):

mergesort([4, 2, 3, 1])

splitting: [4, 2, 3, 1] --> L1=[4, 2] L2=[3, 1]
splitting: [4, 2] --> L1=[4] L2=[2]
merging: L1=[4] L2=[2] --> [2, 4]
splitting: [3, 1] --> L1=[3] L2=[1]
merging: L1=[3] L2=[1] --> [1, 3]
merging: L1=[2, 4] L2=[1, 3] --> [1, 2, 3, 4]

[1, 2, 3, 4]

With an odd number of elements:

mergesort([4, 2, 5, 3, 1])

splitting: [4, 2, 5, 3, 1] --> L1=[4, 2] L2=[5, 3, 1]
splitting: [4, 2] --> L1=[4] L2=[2]
merging: L1=[4] L2=[2] --> [2, 4]
splitting: [5, 3, 1] --> L1=[5] L2=[3, 1]
splitting: [3, 1] --> L1=[3] L2=[1]
merging: L1=[3] L2=[1] --> [1, 3]
merging: L1=[5] L2=[1, 3] --> [1, 3, 5]
merging: L1=[2, 4] L2=[1, 3, 5] --> [1, 2, 3, 4, 5]

[1, 2, 3, 4, 5]

With an even, non-power-of-two, number of elements:

mergesort([4, 2, 5, 3, 1, -1])

splitting: [4, 2, 5, 3, 1, -1] --> L1=[4, 2, 5] L2=[3, 1, -1]
splitting: [4, 2, 5] --> L1=[4] L2=[2, 5]
splitting: [2, 5] --> L1=[2] L2=[5]
merging: L1=[2] L2=[5] --> [2, 5]
merging: L1=[4] L2=[2, 5] --> [2, 4, 5]
splitting: [3, 1, -1] --> L1=[3] L2=[1, -1]
splitting: [1, -1] --> L1=[1] L2=[-1]
merging: L1=[1] L2=[-1] --> [-1, 1]
merging: L1=[3] L2=[-1, 1] --> [-1, 1, 3]
merging: L1=[2, 4, 5] L2=[-1, 1, 3] --> [-1, 1, 2, 3, 4, 5]

[-1, 1, 2, 3, 4, 5]

File System Operations

A common operation in a file system is to search for a specific file inside a directory structure.

From the root directory:

List all contents
If the content is a file, check if it is the file we are looking for
- If so, return
If the content is a directory:
- Search inside that directory

Python can interact with the file system using the os library:

Import the library using import os
os.listdir will list the contents of a directory
- It returns a list of strings
- When called with no arguments, it lists the contents of the directory the .py file is in
- When called with an argument, the argument should be a string representing the path to a subdirectory
os.path.isdir checks if the argument, a path (represented by a string) is a directory
- This function returns a bool

Implement this search in the practice problems.

Visualizer File System Operations

The visualizer supports only a limited set of file system operations, and cannot reliably be used for any code that interacts with the file system.

Practice

Practice Problem 10.1

Implement a recursive binary search that can be called on a list and a value (and does not need to be given an initial start and end: it will work on the entire list by default).

Contrast our example, which would be called:

binary_search(L, val, 0, len(L))

With your function:

binary_search(L, val)

Hint: use keyword arguments.

Practice Problem 10.2

Write a function that recursively searches for a file named target. The target file will be either in the current directory, or in a subdirectory (of any depth) of the current subdirectory.

Your function should return the complete relative path to the target file.

Homework

Homework problems should always be your individual work. Please review the collaboration policy and ask the course staff if you have questions. Remember: Put comments at the start of each problem to tell us how you worked on it.
Double check your file names and return values. These need to be exact matches for you to get credit.
For this homework, don’t use any built-in functions that find maximum, find minimum, or sort.
These problems are intended to be completed in sequence. Each builds on the previous.

Homework Problem 10.1

Homework Problem 10.1 (40 pts)

Write a function censor that takes one string argument.

Your function should, without any loops or string methods, replace all vowels in the string with the asterisk (*) character, and then return the new string.

censor("Charles") returns "Ch*rl*s"
censor("abcd") returns "*bcd"
censor("hello world") returns "h*ll* w*rld"

Submit as censor.py

Homework Problem 10.2

Homework Problem 10.2 (40 pts)

Write a function underliner that takes two string arguments. The second argument will be a substring of the first argument. Assume the substring appears only once in the original string.

Return a multi-line string that “underlines” the substring (second argument) with carat (^) characters:

underliner("hello world!", "world") returns:

hello world!
      ^^^^^

Do this without use of any string methods or loops.

Submit as underliner.py

Homework Problem 10.3

Homework Problem 10.3 (20 pts)

Write a function mutli_underliner with nearly identical behavior to your underliner function, with one exception: the substring can appear multiple times in the original string:

multi_underliner("entertainment", "e") returns:

entertainment
^  ^      ^

Once again, do this with no use of any string methods or loops.

Submit as multi_underliner.py