
CSE 102 – Midterm Exam

Wednesday, 10 April 2024

General instructions (original)

This exam consists of three independent sections. You do not need to complete every section to pass the exam. Your solutions will be marked automatically based on their correctness, but you can receive partial marks for a partially correct solution.

Unless otherwise stated, you are not allowed to import arbitrary libraries. You are allowed to import the random module from the Python Standard Library (it is required for some questions on this exam). You can do this by adding the line import random to the top of your file.

We expect you to write all of your solutions to all of the sections in a file named midterm.py, and to upload it via Moodle. You must ensure that this is a working Python file with no syntax errors, which you can verify by loading and running it in Spyder. In the problems below we provide you some examples to check that your solutions are correct, but you are responsible for testing your own code on these examples, and for ensuring that your code is otherwise correct (we will test it on other examples).

All problems on this exam are tagged with a green marginal line, interspersed between the explanatory text. Please make sure that you give your functions the correct names.

Post-exam instructions

Now that the exam is over, you can upload your solution file below to the submission server and get feedback from the automated tests. Each problem of the exam was evaluated using several partial correctness tests, and a green checkmark will appear under the “Status” field exactly when you have passed all of them. You can look at the Log to see the results of the individual tests.


Section 1: generalized Fibonacci numbers

The Tribonacci numbers \(T_n\) are defined by the recurrence

\[ T_{n} = T_{n-3} + T_{n-2} + T_{n-1} \]

for all \(n \geq 3\), with initial values \(T_0=0, T_1=0, T_2=1\).

Thus, the Tribonacci sequence starts with

\[ 0, 0, 1, 1, 2, 4, 7, 13, \dots \]

  1. Write a recursive function tribonacci(n) which takes the integer \(n\) as input and returns \(T_n\). This function should be fast enough to compute Tribonacci numbers up to \(T_{24}\) in a fraction of a second. However, since it implements the recursion directly (without dynamic programming), it will be noticeably slow for larger \(n\), e.g. \(n=30\).
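One possible sketch of such a directly recursive solution, following the recurrence and initial values literally (no memoization, hence the exponential running time mentioned above):

```python
def tribonacci(n):
    # Initial values: T_0 = T_1 = 0, T_2 = 1.
    if n < 2:
        return 0
    if n == 2:
        return 1
    # Direct recursion on T_n = T_{n-3} + T_{n-2} + T_{n-1};
    # each call fans out into three more, so the cost is exponential.
    return tribonacci(n - 3) + tribonacci(n - 2) + tribonacci(n - 1)
```

For instance, tribonacci(7) returns 13, matching the sequence above.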

We generalize Tribonacci numbers further to k-bonacci numbers \((k\geq 2)\), which are recursively defined by taking \(K^k_n = K^k_{n-k} + \dots + K^k_{n-1}\) to be the sum of the \(k\) preceding k-bonacci numbers for \(n \ge k\), together with the initial values \(K^k_0 = K^k_1 = \dots = K^k_{k-2} = 0\) and \(K^k_{k-1} = 1\).

  2. Write a function kbonacci(n,k) that takes the integers \(n\) and \(k\) and returns \(K^k_n\). This function must use dynamic programming (either top-down or bottom-up) to compute the result efficiently. In particular, it should compute Tribonacci numbers much more efficiently than your function tribonacci. For example, kbonacci(30,3) should return instantaneously, whereas tribonacci(30) can require seconds of computation time.

Here are some more examples to test your implementation:

kbonacci(30,2) = 832040
kbonacci(30,3) = 15902591
kbonacci(30,4) = 28074040
kbonacci(100,5) = 8196759338261258264777004033
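A bottom-up dynamic-programming sketch, filling a table of values from the initial conditions up to index \(n\), could look as follows:

```python
def kbonacci(n, k):
    # Table of K^k_0 .. K^k_n, padded to length k so the initial
    # values always fit: K^k_{k-1} = 1, everything before it is 0.
    vals = [0] * max(k, n + 1)
    vals[k - 1] = 1
    # Bottom-up: each entry is the sum of the k preceding entries.
    for i in range(k, n + 1):
        vals[i] = sum(vals[i - k:i])
    return vals[n]
```

With k = 2 this is the usual Fibonacci sequence, so kbonacci(30,2) agrees with \(F_{30} = 832040\) from the examples above.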

Section 2: random graphs

In this section, we are going to study the connectedness of random graphs.

Recall the adjacency list representation of graphs. We consider only symmetric graphs with node labels \(0..n-1\). (As usual, symmetric means that if there is an edge from \(i\) to \(j\) then there is also an edge from \(j\) to \(i\).)

For example, here are two symmetric graphs with \(10\) nodes in this representation:

example_graph1 = {0: [3], 1: [], 2: [7], 3: [0, 7], 4: [8, 9], 5: [9], 6: [], 7: [2, 3], 8: [4], 9: [4, 5]}
example_graph2 = {0: [2, 5], 1: [8], 2: [0, 3, 7, 8], 3: [2, 9], 4: [7], 5: [0, 6], 6: [5], 7: [2, 4], 8: [1, 2, 9], 9: [3, 8]}

  1. Write a function is_connected(g) that takes a symmetric graph in the described adjacency list representation and returns a boolean indicating whether it is connected. Hint: a symmetric graph is connected if and only if all nodes can be reached following a path from node 0. Thus, this property can be tested by a suitable traversal of the graph starting at node 0.

    Examples:

    In [1]: is_connected(example_graph1)
    Out[1]: False
    
    In [2]: is_connected(example_graph2)
    Out[2]: True
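Following the hint, a traversal-based sketch (here an iterative depth-first search from node 0) might look like this:

```python
def is_connected(g):
    if not g:
        return True  # the empty graph is trivially connected
    # Traverse from node 0, recording every node we can reach;
    # since g is symmetric, it is connected iff we reach all nodes.
    seen = {0}
    stack = [0]
    while stack:
        node = stack.pop()
        for nbr in g[node]:
            if nbr not in seen:
                seen.add(nbr)
                stack.append(nbr)
    return len(seen) == len(g)
```

Any complete traversal (breadth-first search works equally well) gives the same answer; only the set of reached nodes matters.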

We consider the random graph model proposed by Edgar Gilbert. In this model, given an integer \(n\) and a probability \(p\), a symmetric random graph with \(n\) nodes is produced by deciding independently, with probability \(p\), whether to include an edge \(i \leftrightarrow j\) between each pair of distinct nodes \(i \ne j\). (Note that the resulting graph is simple, meaning that it has no loops or parallel edges.)

  2. Write a function gilbert_random_graph(n,p) that produces a random graph in the described way. It must return the graph in adjacency list representation. Our two example graphs above were generated by calling gilbert_random_graph(10,0.15).

    (Remember to add an import random at the top of your Python file to get access to the random number generator.)
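One way to implement this is sketched below: flip one biased coin per unordered pair of nodes, and record each accepted edge in both adjacency lists so the graph stays symmetric.

```python
import random

def gilbert_random_graph(n, p):
    # Start from n isolated nodes.
    g = {i: [] for i in range(n)}
    # One coin flip per unordered pair {i, j} with i < j.
    for i in range(n):
        for j in range(i + 1, n):
            if random.random() < p:
                g[i].append(j)
                g[j].append(i)
    return g
```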

As a plausibility check for your implementation, we can compute the expected number of edges in such a random graph: for \(n=100\) and \(p=0.01\) it is \(\binom{100}{2} \cdot 0.01 = 49.5\), so the following test function should return about \(50\).

def random_graph_test():
    edges=0
    for _ in range(10):
        g = gilbert_random_graph(100,0.01)
        edges += sum(len(g[k]) for k in g)//2  # divide by 2 since g is symmetric
    return edges/10

More generally, we can use the random graph model to test the expected properties of graphs of a given size \(n\), for a given edge probability \(p\), such as whether or not they are connected.

  3. Write a function connected_probability(n, p, trials) to estimate the probability that a random graph of \(n\) nodes and edge probability \(p\) is connected. This function should estimate the probability by testing the connectedness of trials random graphs.
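A sketch of this estimator is below. It assumes is_connected and gilbert_random_graph from the previous problems; compact versions of both are repeated here only so that the snippet runs on its own.

```python
import random

def connected_probability(n, p, trials):
    # Monte Carlo estimate: the fraction of `trials` random graphs
    # that turn out to be connected (True counts as 1 in the sum).
    return sum(is_connected(gilbert_random_graph(n, p))
               for _ in range(trials)) / trials

# Compact stand-ins for the earlier problems, included so this
# sketch is self-contained (your own solutions would be used).
def gilbert_random_graph(n, p):
    g = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if random.random() < p:
                g[i].append(j)
                g[j].append(i)
    return g

def is_connected(g):
    seen, stack = {0}, [0]
    while stack:
        for nbr in g[stack.pop()]:
            if nbr not in seen:
                seen.add(nbr)
                stack.append(nbr)
    return len(seen) == len(g)
```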

Finally, we can visualize the dependency between edge probability and the probability of being connected using pyplot.

steps = 25
pmax = 0.5
xs = [ ((x+1)/steps)*pmax for x in range(steps) ]
ys = [connected_probability(20, p, 500) for p in xs]

from matplotlib import pyplot as plt
plt.plot(xs,ys)
plt.show()

The resulting plot shows an interesting phase transition!

Section 3: self-avoiding lattice walks

In this section, we are going to generate self-avoiding walks on \(\mathbb{Z}^2\), also called walks over a square lattice.

We define an \(n\)-step walk \(w\) as a sequence of \(n\) moves going right (R), left (L), up (U), or down (D), corresponding to a choice of vectors in \(\mathcal{M} = \{(1,0),(-1,0),(0,1),(0,-1)\}\). We can represent an \(n\)-step walk by a list of \(n+1\) integer coordinates \((x_0,y_0),\dots,(x_{n},y_{n})\), where the displacement vectors \((x_{i+1},y_{i+1}) - (x_i,y_i) = (x_{i+1}-x_i,y_{i+1}-y_i)\) are in \(\mathcal{M}\) for all \(0\leq i < n\).

We call a walk self-avoiding if all of its points \((x_i,y_i)\) are distinct.

As an example, consider the 10-step self-avoiding walk (SAW) that starts at the origin \((0,0)\) and then takes the series of steps R, U, R, U, L, L, L, D, D, D.

And here is its Python representation as a list of distinct coordinates:

example_saw = [(0,0),(1,0),(1,1),(2,1),(2,2),(1,2),(0,2),(-1,2),(-1,1),(-1,0),(-1,-1)]

  1. Write a function is_saw(w) that takes a list of points and determines whether they form a self-avoiding walk.

    Examples:

    In [3]: is_saw([(0,0),(1,0),(1,-1),(2,-1),(3,-1),(3,0),(3,1)])
    Out[3]: True
    
    In [4]: is_saw([(0,0),(1,0),(1,1),(2,0),(2,-1),(2,-2),(2,-3)]) 
    Out[4]: False 
    
    In [5]: is_saw([(1,1),(1,0),(1,-1),(2,-1),(2,0),(1,0),(0,0)])  
    Out[5]: False
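A sketch that checks both conditions from the definition, namely that consecutive points differ by a move in \(\mathcal{M}\) and that all points are distinct:

```python
def is_saw(w):
    moves = {(1, 0), (-1, 0), (0, 1), (0, -1)}
    # Every displacement between consecutive points must be a
    # unit move in one of the four lattice directions.
    for (x0, y0), (x1, y1) in zip(w, w[1:]):
        if (x1 - x0, y1 - y0) not in moves:
            return False
    # Self-avoiding: no point may occur twice.
    return len(set(w)) == len(w)
```

Note that the second example above fails the first check (a diagonal displacement), while the third fails the second (a repeated point).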

We are interested in generating random self-avoiding walks, for a given number of steps \(n\) and starting point \((x_0,y_0)\).

First, we consider a strategy that repeatedly selects moves in \(\mathcal{M}\) at random, each with the same probability, in order to determine the next coordinate \((x_{i+1},y_{i+1})\) from its predecessor \((x_i, y_i)\). In this strategy the newly selected point can clash with a previous point in the walk, in the sense that \((x_{i+1},y_{i+1}) = (x_j,y_j)\) for some \(j\) between \(0\) and \(i-1\). In this case the whole procedure simply restarts from the initial point \((x_0,y_0)\), which is why we call it the restart strategy.

  2. Write a function random_saw_restart(x0, y0, n) that takes starting coordinates \(x_0,y_0\) and an integer \(n \ge 0\), and applies the restart strategy to return a randomly generated \(n\)-step SAW starting at \((x_0,y_0)\).

    For example, you should see something like the following output (of course the actual SAW returned will vary!):

    In [6]: random_saw_restart(0, 0, 10)
    Out[6]: [(0, 0), (0, 1), (1, 1), (2, 1), (2, 0), (2, -1), (1, -1), (1, -2), (1, -3), (1, -4), (0, -4)]

    Your code should generate \(n\)-step SAWs uniformly at random, and should return within a fraction of a second for \(n=20\). Longer walks will take noticeably more time, since typically many restarts are required to find a self-avoiding walk by chance.
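One way to sketch the restart strategy: rebuild the walk from scratch whenever a randomly chosen move lands on a point already in the walk.

```python
import random

MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def random_saw_restart(x0, y0, n):
    while True:
        walk = [(x0, y0)]
        visited = {(x0, y0)}  # set of points, for O(1) clash checks
        for _ in range(n):
            dx, dy = random.choice(MOVES)
            nxt = (walk[-1][0] + dx, walk[-1][1] + dy)
            if nxt in visited:
                break  # clash: restart from the initial point
            walk.append(nxt)
            visited.add(nxt)
        else:
            return walk  # all n steps succeeded without a clash
```

Each completed walk is drawn with probability \((1/4)^n\), so conditioning on success (i.e. restarting on failure) yields the uniform distribution over \(n\)-step SAWs.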

Since the restart strategy is inefficient for longer walks, it is attractive to avoid restarting from scratch and instead to find a random SAW by backtracking.

Like the restart strategy, this strategy extends the walk step-by-step. However, in case of a clash the backtracking strategy systematically tries extending the walk by a different step in \(\mathcal{M}\), and if none of those work backtracks to a previous point, until an \(n\)-step self-avoiding walk is produced. The randomness comes from trying the four possible moves in \(\mathcal{M}\) in a random order.

  3. Write a function random_saw_backtrack(x0, y0, n) that takes starting coordinates \(x_0,y_0\) and an integer \(n \ge 0\), and applies the backtracking strategy to return a randomly generated \(n\)-step SAW starting at \((x_0,y_0)\). (We emphasize you should write a function that returns a single SAW, rather than a generator function yielding SAWs.)

    Hint: instead of selecting random moves one-by-one, select a random shuffle of the moves in \(\mathcal{M}\) in order to explore all possible extensions of a walk systematically. You can obtain a random shuffle of a list xs either by calling random.shuffle(xs) (which shuffles the list in place), or by calling random.sample(xs, len(xs)) (which returns a random permutation of the list while leaving xs unchanged).
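A recursive sketch of the backtracking strategy, using random.sample as suggested in the hint to try the four moves in a random order at each position (the helper name extend is this sketch's own choice):

```python
import random

MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def random_saw_backtrack(x0, y0, n):
    walk = [(x0, y0)]
    visited = {(x0, y0)}

    def extend():
        # Try to grow the walk to n steps; return True on success.
        if len(walk) == n + 1:
            return True
        x, y = walk[-1]
        for dx, dy in random.sample(MOVES, len(MOVES)):
            nxt = (x + dx, y + dy)
            if nxt not in visited:
                walk.append(nxt)
                visited.add(nxt)
                if extend():
                    return True
                # Dead end below this point: undo the step (backtrack).
                walk.pop()
                visited.remove(nxt)
        return False

    extend()  # on the square lattice this always succeeds eventually
    return walk
```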

Interestingly, the backtracking strategy does not generate \(n\)-step SAWs uniformly at random. To observe this (and test your implementations) let us define the displacement of a walk \(w = [(x_0,y_0),\dots,(x_n,y_n)]\) as the end-to-end Euclidean distance between the first and last point: \[d(w) = \sqrt{(x_n-x_0)^2 + (y_n-y_0)^2}\] The function expected_displacement(random_saw, n, trials) below takes as first argument a function for generating random SAWs under some strategy, and then repeatedly calls it to generate \(n\)-step SAWs for some number of trials to get an estimate of the average displacement:

def expected_displacement(random_saw, n, trials):
    def d(walk):
        return ((walk[-1][0] - walk[0][0])**2 + (walk[-1][1] - walk[0][1])**2)**0.5
    return sum(d(random_saw(0,0,n)) for _ in range(trials)) / trials

By instantiating expected_displacement appropriately, you can get an estimate of the expected displacement under the two different strategies. For example, assuming you have implemented the restart strategy correctly, calling

expected_displacement(random_saw_restart, 21, 1000)

should give a value slightly higher than 8 most of the time (but perhaps sometimes slightly lower), whereas assuming you have implemented the backtracking strategy correctly, calling

expected_displacement(random_saw_backtrack, 21, 10000)

should give a value around 6.7 (again with slight variation). This shows that the backtracking strategy preferentially produces self-avoiding walks that are more compact!

Finally, the backtracking strategy can be easily adapted to perform exhaustive enumeration of all SAWs of length \(n\).

  4. Write a generator function all_saw_backtrack(x0, y0, n) that takes starting coordinates \(x_0,y_0\) and an integer \(n \ge 0\), and applies backtracking to yield all possible \(n\)-step SAWs starting at \((x_0,y_0)\). The \(n\)-step SAWs can be generated in an arbitrary order.

    For example, calling list(all_saw_backtrack(0,0,2)) should produce a list of lists of coordinates equal to the following, up to the order of the SAWs:

    saws2 = [[(0, 0), (1, 0), (2, 0)], [(0, 0), (1, 0), (1, 1)], [(0, 0), (1, 0), (1, -1)],
             [(0, 0), (-1, 0), (-2, 0)], [(0, 0), (-1, 0), (-1, 1)], [(0, 0), (-1, 0), (-1, -1)],
             [(0, 0), (0, 1), (1, 1)], [(0, 0), (0, 1), (-1, 1)], [(0, 0), (0, 1), (0, 2)],
             [(0, 0), (0, -1), (1, -1)], [(0, 0), (0, -1), (-1, -1)], [(0, 0), (0, -1), (0, -2)]]

    Hint: the sequence counting the number of \(n\)-step SAWs begins \[ 1, 4, 12, 36, 100, 284, 780, 2172, \dots \] Once you have implemented all_saw_backtrack correctly, you should be able to reconstruct this sequence with [sum(1 for _ in all_saw_backtrack(0,0, n)) for n in range(10)].
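Adapting the backtracking recursion so that, instead of stopping at the first complete walk, it yields every one and then undoes the last step gives a sketch like:

```python
def all_saw_backtrack(x0, y0, n):
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    walk = [(x0, y0)]
    visited = {(x0, y0)}

    def extend():
        if len(walk) == n + 1:
            yield list(walk)  # copy, since walk is mutated below
            return
        x, y = walk[-1]
        for dx, dy in moves:
            nxt = (x + dx, y + dy)
            if nxt not in visited:
                walk.append(nxt)
                visited.add(nxt)
                yield from extend()
                walk.pop()          # backtrack and try the next move
                visited.remove(nxt)

    yield from extend()
```

Since every complete walk is yielded before backtracking, this enumerates each \(n\)-step SAW exactly once, reproducing the counts \(1, 4, 12, 36, 100, \dots\) from the hint.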

(One may be tempted to use exhaustive backtracking to perform uniform random generation by exhaustively enumerating all \(n\)-step SAWs and then selecting one uniformly at random, but this strategy is even more inefficient than the restart strategy. The problem is that the number of SAWs grows too quickly!)