Note 10: Recursion

We have used while and for loops to perform operations multiple times, either iterating over a collection (both kinds of loop) or iterating until some termination condition is met (while loops, or for loop using a break or return).

We now introduce another way to perform an operation multiple times: recursion.

A Function That Calls Itself

Let’s write a function to sum all of the positive integers less than some input integer: \(\sum_{i=1}^{n} i\)

def sum_less(n:int) -> int:
    total:int = 0
    for i in range(1, n+1):
        total += i
    return total

This function works, and there’s nothing wrong with it. We could, however, approach the problem a different way, by breaking it down into sub-problems:

  • We add the next smaller number to our total
  • If the next smaller number is zero, we stop adding
def sum_less(n: int) -> int:
    if n == 1:
        return 1
    else:
        return n + sum_less(n-1)

Examining the execution:

  • The function successively calls itself
    • This is known as recursion
  • Line 2 critically checks for the case where the input integer is 0
    • After this, the function no longer calls itself, and starts to return values
    • This is called the base case or (end case)
  • Otherwise, we are in the recursive case
  • When breaking down a problem into sub-problems the base case identifies when the problem cannot be further broken down
  • The recursive case runs whenever the problem can be broken down further

Successive Halving

While using recursion for a linear problem (adding a sequence of numbers) works, recursion is typically applied to problems that are continuously broken into smaller sub-problems that are identical to the original problem.

Recall binary search:

The problem can be seen as successively performing a binary search on the “remaining” region!

A recursive implementation:

def binary_search(L, val, start, end):
    middle = start + (end-start)//2
    if L[middle] > val:
        return binary_search(L, val, start, middle)
    elif L[middle] < val:
        return binary_search(L, val, middle+1, end)
    else:
        return middle
  • The else at Line 7 is the base case (it does not result in another call to the binary_search function)
  • The if and elif at Line 3 and Line 5, respectively, are the recursive cases

Improve this search in the practice problems.

A Recursive Sort

Merge sort depends on the merge_lists function, which is the same in the recursive and non-recursive implementations. We’ve added some print statements to illustrate how the recursive version works:

def merge_lists(L1, L2):
    print(f"merging: {L1=} {L2=}", end=" --> ")
    L3 = []
    while L1 and L2:
        if L1[0] < L2[0]:
            L3.append(L1[0])
            L1 = L1[1:]
        else:
            L3.append(L2[0])
            L2 = L2[1:]
    while L1 and not L2:
        L3.append(L1[0])
        L1 = L1[1:]
    while L2 and not L1:
        L3.append(L2[0])
        L2 = L2[1:]
    print(L3)
    return L3

The recursive mergesort function ends up being extremely simple:

  • The base case is a list of length 1, which is simply returned
  • The recursive case splits the existing list into two smaller sub-lists
    • mergesort is then called on each of these sub-lists
    • The result is merged and returned
def mergesort(L):
    # base case
    if len(L) == 1:
        return L

    # recursive case
    L1 = L[:len(L)//2]
    L2 = L[len(L)//2:]
    print(f"splitting: {L} --> {L1=} {L2=}")
    return merge_lists(mergesort(L1), mergesort(L2))

Let’s see what happens for various calls. With an even, power-of-two number of elements (even splits each halving):

mergesort([4, 2, 3, 1])
splitting: [4, 2, 3, 1] --> L1=[4, 2] L2=[3, 1]
splitting: [4, 2] --> L1=[4] L2=[2]
merging: L1=[4] L2=[2] --> [2, 4]
splitting: [3, 1] --> L1=[3] L2=[1]
merging: L1=[3] L2=[1] --> [1, 3]
merging: L1=[2, 4] L2=[1, 3] --> [1, 2, 3, 4]
[1, 2, 3, 4]

With an odd number of elements:

mergesort([4, 2, 5, 3, 1])
splitting: [4, 2, 5, 3, 1] --> L1=[4, 2] L2=[5, 3, 1]
splitting: [4, 2] --> L1=[4] L2=[2]
merging: L1=[4] L2=[2] --> [2, 4]
splitting: [5, 3, 1] --> L1=[5] L2=[3, 1]
splitting: [3, 1] --> L1=[3] L2=[1]
merging: L1=[3] L2=[1] --> [1, 3]
merging: L1=[5] L2=[1, 3] --> [1, 3, 5]
merging: L1=[2, 4] L2=[1, 3, 5] --> [1, 2, 3, 4, 5]
[1, 2, 3, 4, 5]

With an even, non-power-of-two, number of elements:

mergesort([4, 2, 5, 3, 1, -1])
splitting: [4, 2, 5, 3, 1, -1] --> L1=[4, 2, 5] L2=[3, 1, -1]
splitting: [4, 2, 5] --> L1=[4] L2=[2, 5]
splitting: [2, 5] --> L1=[2] L2=[5]
merging: L1=[2] L2=[5] --> [2, 5]
merging: L1=[4] L2=[2, 5] --> [2, 4, 5]
splitting: [3, 1, -1] --> L1=[3] L2=[1, -1]
splitting: [1, -1] --> L1=[1] L2=[-1]
merging: L1=[1] L2=[-1] --> [-1, 1]
merging: L1=[3] L2=[-1, 1] --> [-1, 1, 3]
merging: L1=[2, 4, 5] L2=[-1, 1, 3] --> [-1, 1, 2, 3, 4, 5]
[-1, 1, 2, 3, 4, 5]

File System Operations

A common operation in a file system is to search for a specific file inside a directory structure.

From the root directory:

  • List all contents
  • If the content is a file, check if it is the file we are looking for
    • If so, return
  • If the content is a directory:
    • Search inside that directory

Python can interact with the file system using the os library:

  • Import the library using import os
  • os.listdir will list the contents of a directory
    • It returns a list of strings
    • When called with no arguments, it lists the contents of the directory the .py file is in
    • When called with an argument, the argument should be a string representing the path to a subdirectory
  • os.path.isdir checks if the argument, a path (represented by a string) is a directory
    • This function returns a bool

Implement this search in the practice problems.

Visualizer File System Operations

The visualizer supports only a limited set of file system operations, and cannot reliably be used for any code that interacts with the file system.

Practice

Practice Problem 10.1

Practice Problem 10.1

Implement a recursive binary search that can be called on a list and a value (and does not need to be given an initial start and end: it will work on the entire list by default).

Contrast our example, which would be called:

  • binary_search(L, val, 0, len(L))

With your function:

  • binary_search(L, val)

Hint: use keyword arguments.

Practice Problem 10.2

Practice Problem 10.2

Write a function that recursively searches for a file named target. The target file will be either in the current directory, or in a subdirectory (of any depth) of the current subdirectory.

Your function should return the complete relative path to the target file.