You must be authenticated to submit your files

CSE 102 – Tutorial 1 – Recursion, iteration, and generators

General instructions
Merge sort
Mandelbrot set
Sieve of Eratosthenes (optional)

General instructions

Sections tagged with a green marginal line are mandatory.

Sections tagged with an orange marginal line are either more advanced or supplementary, and are therefore considered optional. They will help you deepen your understanding of the material in the course. You may skip these sections on a first pass through the tutorial, and come back to them after finishing the mandatory exercises.

After you are finished with the tutorial, please upload your submission to Moodle by completing “Tutorial 1 assignment” before the indicated deadline.

When submitting a solution, the system will do some basic validation of your files and run some automated tests on your solutions. A green tick tells you that your file compiles and has passed all of the automated tests. A red tick tells you either that your file is badly broken (e.g., a syntax error) or that your solutions fail one or more of the automated tests. You can see the result of the tests by clicking on “See my uploads” in the upper-left menu, and then clicking on the right arrow button next to each file you submitted.

Although passing all of our tests provides evidence that your solutions are correct, it might not guarantee it, as typically we would need to run infinitely many tests! You are encouraged to write your own tests, and more generally to think about why your code works and not just rely on the automated tests.

Discussion of the tutorials with friends and classmates is allowed, with acknowledgment, but please read the Moodle page on “Collaboration, plagiarism, and AI”.

Merge sort

We expect you to write your solution to these problems in a file named msort.py and to upload it using the form below. You can resubmit your solution as many times as you like.

Upload form is only available when connected

A sorting algorithm is a procedure for taking a list of elements and putting them in order. The merge sort algorithm, invented by John von Neumann in 1945, is a simple and efficient recursive sorting algorithm that works by splitting a list into roughly equal halves, recursively sorting those two pieces, and then merging them back together, as depicted in the example below:

(example of recursive merge sort procedure)

To implement merge sort, we will begin by implementing the merge operation, which takes two lists that are assumed to be already sorted and returns the combination of all their elements as a sorted list. For example, the merge of the lists [5,8,12,13] and [3,7,14] is the list [3,5,7,8,12,13,14].

Write a recursive function merge(xs,ys) implementing the merge operation based on the following logic (we refer to the first element of a list as its head):
- (base case:) if either list is empty, the merge is the concatenation of the two lists;
- (recursive case:) if both lists are non-empty, the merge is a list comprising the smaller head followed by a merge of the remaining elements (including the larger head).
We have already written the base case below, you should complete the definition by replacing pass with an appropriate expression for the recursive case, making use of the <= operation to compare the heads of the two lists.
```
def merge(xs,ys):
    if not xs or not ys:
        return xs + ys
    pass
```

Next we need to implement the split operation.

Write a (not-necessarily-recursive) function split(xs), which takes as input a list xs and returns a pair of lists (ys,zs) of roughly equal length containing all the elements of the original between them.
```
def split(xs):
    pass
```
There are many ways to split a list in two roughly equal pieces, but for this exercise you can use the simple approach of splitting it down the middle. For example, calling split([7, 0, 4, 9, 1, 2, 6, 5, 3, 8]) can return ([7, 0, 4, 9, 1], [2, 6, 5, 3, 8]).

Finally we are ready to implement merge sort.

Write a recursive function msort(xs), which takes as input a list xs and returns a sorted permutation of xs based on the following logic:
- (base case:) if the list has length 0 or 1, return it since it is already sorted;
- (recursive case:) otherwise, split the list into two pieces of roughly equal length, sort those two pieces, and return their merge.
```
def msort(xs):
    pass
```
(You should replace pass above by appropriate code dealing with both the base case and the recursive case.)

Optional theory exercise: For the purpose of implementing merge sort correctly and with the right asymptotic complexity, formally we require that split satisfies the following specification: for any list xs, calling split(xs) should always return a pair of lists (ys, zs) such that abs(len(ys) - len(zs)) <= 1, and such that xs is either equal to ys ++ zs or is a permutation thereof. Prove that your implementation of split above satisfies this specification. Then come up with at least two different implementations of split that also satisfy this specification. Modify your msort function to call these alternative implementations of split, and compare the running times on large lists. For example, you can try sorting a reverse-sorted listed of 1000 elements by running msort(list(range(1000,0,-1))). Finally, come up with an implementation of split that satisfies the second requirement but not the requirement that abs(len(ys) - len(zs)) <= 1, and again try testing msort with this fourth implementation and compare the running time.

You can write down your answers to these optional questions as comments in your file. You can name your additional implementations of the split function split_v2, split_v3, and split_v4.

Mandelbrot set

We expect you to write your solution to these problems in a file named fractal.py and to upload it using the form below. You can resubmit your solution as many times as you like.

Upload form is only available when connected

The goal of this sequence of exercises is to calculate and visualize the Mandelbrot set.¹ Mathematically, the Mandelbrot set \(\mathcal{M} \subset \mathbb{C}\) can be defined as the subset of the complex plane consisting of just those points \(c \in \mathbb{C}\) such that the infinite sequence of complex numbers

\[ z_0 = 0, \quad z_1 = z_0^2+c, \quad z_2 = z_1^2+c_, \quad \dots \]

remains bounded in absolute value. For example, the imaginary number \(i\) is in the Mandelbrot set, since it produces a sequence of numbers that eventually alternates between \(-1+i\) and \(-i\):

\(c\)	\(z_0\)	\(z_1\)	\(z_2\)	\(z_3\)	\(z_4\)	\(z_5\)	\(\dots\)
\(i\)	\(0\)	\(i\)	\(-1+i\)	\(-i\)	\(-1+i\)	\(-i\)	\(\dots\)

Since all of the numbers in the sequence have bounded absolute value (\(|z_n| \le \sqrt{2}\) for all \(n\)), \(i \in \mathcal{M}\).

On the other hand \(1\) is not in the Mandelbrot set, since it produces a sequence of numbers that quickly grows off to infinity:

\(c\)	\(z_0\)	\(z_1\)	\(z_2\)	\(z_3\)	\(z_4\)	\(z_5\)	\(\dots\)
\(1\)	\(0\)	\(1\)	\(2\)	\(5\)	\(26\)	\(677\)	\(\dots\)

Python has a built-in type of complex numbers, but in this tutorial we will define them for ourselves, representing complex numbers as objects containing a real and an imaginary part.

Complete the definition of a class of complex numbers below supporting addition, multiplication, exponentiation by a natural number, and absolute value. (You should not modify the __init__ method.)
```
class Complex:
    def __init__(self, re, im):
        """Define a complex number with real part re and imaginary part im."""
        self.re = re
        self.im = im
    def __repr__(self):
        return f'Complex({self.re},{self.im})'

    def __add__(self, x):
        """Given a complex number x, return a complex number representing self + x."""
        pass
    def __mul__(self, x):
        """Given a complex number x, return a complex number representing self * x."""
        pass
    def __pow__(self, n):
        """Given a non-negative integer n, return a complex number representing self ** n."""
        pass
    def __abs__(self):
        """Return the absolute value of a complex number."""
        pass
```
For addition, multiplication, and absolute value, it may be helpful to recall the following formulas:

\[\begin{eqnarray*} (a+bi) + (c+di) &=& (a+c) + (b+d)i \\ (a+bi) \cdot (c+di) &=& (ac - bd) + (ad + bc)i \\ |a+bi| &=& \sqrt{a^2 + b^2} \end{eqnarray*}\]

For exponentiation, you should use either recursion or iteration to implement the “fast exponentiation” algorithm (as seen in CSE101), equivalently defined by the following recurrence:

\[\begin{eqnarray*} a^0 &=& 1 \\ a^{2n} &=& a^n\cdot a^n \\ a^{2n+1} &=& a^n \cdot a^n \cdot a \end{eqnarray*}\]

Test your implementation by defining some complex numbers and performing some operations on them. For example, you should be able to reproduce the behavior below in the Spyder console (modulo the precise input/output line numbering):
```
In [1]: z = Complex(1,1)

In [2]: z + z ** 8
Out[2]: Complex(17,1)
```

Our next goal is to represent the infinite sequence of values \(z_0, z_1, z_2, \dots\) described above, which formally is defined by the following recurrence:

\[\begin{eqnarray*} z_0 &=& 0 \\ z_{n+1} &=& z_n^2 + c \end{eqnarray*}\]

Infinite sequences can be represented in at least two different ways in Python: as functions taking a natural number index \(n\) as argument and returning the \(n\)th value of the sequence, or as generators that continually yield successive values. The following exercise asks you to represent the sequence \((z_n)_{n \in \mathbb{N}}\) in both ways.

Use the recurrence above to define a recursive function z_fun(c,n) as well as a non-recursive generator function z_gen(c), with the following specifications:
- z_fun(c,n) should return the complex value \(z_n\)
- z_gen(c) should successively yield the values \(z_0, z_1, \dots\).
where moreover you should assume that \(c\) is a complex number represented as a value of Complex type, and that \(n\) is a non-negative integer. You can write these functions by replacing None and pass by appropriate expressions below:
```
def z_fun(c,n):
    if n == 0:
        return None
    return None

def z_gen(c):
    pass
    while True:
        yield None
        pass
```
(z_fun and z_gen should be defined independently, i.e., they should not refer to each other.)

As already explained, a complex value \(c\) is in the Mandelbrot set \(\mathcal{M}\) just in case the associated sequence \(z_0, z_1, z_2, \dots\) remains bounded in absolute value. In general it is impossible to decide whether an infinite sequence of numbers is bounded, but in this case, as a conservative approximation, we will simply ask that the first few values of the sequence have absolute value \(|z_n| \le 2\), for all \(0 \le n < 16\).

Write a function in_mset(c, timeout=16) which decides if \(c \in \mathcal{M}\) by the aforementioned criterion.
```
def in_mset(c, timeout=16):
    pass
```
Hint: you can implement this function using either z_fun or z_gen, but try to avoid doing unnecessary computations.

Optional theory exercise: Prove that \(c \in \mathcal{M}\) if and only if \(|z_n| \le 2\) for all \(n \ge 0\). This explains why it suffices to reach an iteration \(n\) such that \(|z_n| > 2\) to establish with certainty that \(c \notin \mathcal{M}\). However, for some \(c\) it may require more than 16 iterations to reach a value such that \(|z_n| > 2\). Give an explicit example of a number \(c\) which is not in the Mandelbrot set, but for which \(|z_n| \le 2\) for all \(0 \le n < 16\). In other words, find a number \(c\) such that calling in_mset(c) with the default timeout value returns True, but calling in_mset(c, timeout) with a larger value of timeout returns False. (Again, you can include your answers to this exercise as a comment in your file.)

Now we’re almost ready to visualize! First we’ll define a function that constructs an array of complex values of a given density within a specified sector:

def csector(density, xmin, xmax, ymin, ymax):
    xn = int((xmax-xmin) * density)
    yn = int((ymax-ymin) * density)
    xaxis = [xmin + i / density  for i in range(xn)]
    yaxis = [ymin + i / density  for i in range(yn)]
    return [Complex(a,b) for a in xaxis for b in yaxis]

Then we can use the Matplotlib library to write a function that visualizes the Mandelbrot set \(\mathcal{M}\) by calling your function in_mset above.

import matplotlib.pyplot as plt

def visualize_mset(density=100, xmin=-2, xmax=0.5, ymin=-1.5, ymax=1.5):
    mset = [c for c in csector(density, xmin, xmax, ymin, ymax) if in_mset(c)]
    plt.scatter([c.re for c in mset], [c.im for c in mset], color="black", marker=",", s=1)
    plt.gca().set_aspect("equal")
    plt.axis("off")
    plt.show()

Test your implementation of the Mandelbrot set by calling visualize_mset() and confirming that it produces something resembling the below:

Then try adjusting the density and/or the bounding box of the sector and see what you get.

Optional programming exercise: We can produce a more interesting visualization by keeping track of the number of iterations needed to reach a value \(|z_n| > 2\), for points \(c\) in the complement of the Mandelbrot set \(\mathbb{C} \setminus \mathcal{M}\). To that end, write a function escape_time(c, bound=2, timeout=16), which should compute the number of iterations to reach a value \(|z_n| > bound\), with an upper bound on the number of iterations tried. (In other words, it should return \(\min(timeout, \mathrm{argmin}_n |z_n| > bound)\).) You can then use the function below to produce another visualization, where \(\mathcal{M}\) is rendered in green, while points in \(\mathbb{C}\setminus \mathcal{M}\) are rendered in grayscale depending on their escape time. (You can feel free to modify the coloring scheme!)

def visualize_mset_escape(density=100, xmin=-2, xmax=0.5, ymin=-1.5, ymax=1.5):
    def to_color(t):
        return '#00ff00' if t >= 16 else '#{0:06x}'.format(t * 0x111111)
    cs = csector(density, xmin, xmax, ymin, ymax)
    plt.scatter([c.re for c in cs], [c.im for c in cs], c=[to_color(escape_time(c)) for c in cs], marker=",", s=1)
    plt.gca().set_aspect("equal")
    plt.axis("off")
    plt.show()

Sieve of Eratosthenes (optional)

We expect you to write your solution to these problems in a file named sieve.py and to upload it using the form below. You can resubmit your solution as many times as you like.

Upload form is only available when connected

The goal of the following series of exercises is to use generators to implement the ancient sieve of Eratosthenes algorithm for enumerating primes. To illustrate the sieve algorithm, imagine that we have all of the positive natural numbers bigger than 1 lined up in a row:

2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 …

Recall that a natural number is prime just in case it is bigger than 1 and its only factors are 1 and itself. Hence the first number on this list (2) is necessarily prime, while all of its multiples are necessarily composite. So we apply one step of the “sieve”, marking the first number as prime and crossing off all of its multiples as composites:

2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 …

The same logic applies to the next number on the list that has not been crossed off. So we apply another step of the sieve, marking this number (3) as prime, and crossing off all of its multiples as composites:

2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 …

And another step (this time we skip over 4, which has already been crossed off as composite, to reach the prime number 5):

2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25…

And so on, eventually (after infinitely many steps) obtaining an enumeration of all the primes…

2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25…

Our strategy for representing the Sieve in Python will be to define a generator sieve() producing an infinite sequence of booleans, such that the \(n\)th boolean is True just in case the number \(n\) is prime. This generator will be defined in terms of more basic generators repeat(x) and cross(n,x,g), which we’ll use to implement the sieving algorithm.

Step 1: Write a generator repeat(x) which repeatedly yields the value x.

Example:

In [3]: g = repeat(42)

In [4]: [next(g) for _ in range(10)]
Out[4]: [42, 42, 42, 42, 42, 42, 42, 42, 42, 42]

Step 2: Write a generator cross(n,x,g) which crosses out every \(n\)-th element of the sequence generated by g and replaces it with the value x.

Example:

In [5]: g = cross(2, False, repeat(True))

In [6]: [next(g) for _ in range(10)]
Out[6]: [True, False, True, False, True, False, True, False, True, False]

Step 3: Write a generator sieve() generating an infinite stream of booleans \(S_1, S_2, \dots\), such that \(S_n\) is True iff the natural number \(n\) is prime.

Example:

In [7]: g = sieve()

In [8]: [next(g) for _ in range(10)]
Out[8]: [False, True, True, False, True, False, True, False, False, False]

Hint: After yielding False once (to capture the convention that 1 is not prime), you should use the sieve of Eratosthenes algorithm, beginning by assuming that everything is prime, and then making repeated calls to cross to cross out the multiples of prime numbers.

Finally, as a simple application of the sieve() generator you can implement the prime-counting function.

Write a function prime_pi(n) which computes \(\pi(n) = \# \{ p \mid p \le n,\; p\ prime\}\). One way that you can test that your implementation is correct by comparing the values to OEIS sequence A000720. How high in the sequence can you go? By computing prime_pi(n) for large enough n you can try to derive experimental evidence for the Prime Number Theorem, which states that the prime-counting function is asymptotically approximated by \(\pi(n) \sim \frac{n}{\log n}\). Write down your observations as a comment.

Named after an Ecole Polytechnique alum, Benoit Mandelbrot, who produced the first high-quality computer visualizations of this fractal while working at IBM in the 1980s. For more on the history of the Mandelbrot set as well as some recent developments, you might enjoy reading this Quanta article.↩︎