CSE 102 – Tutorial 6 – Programming with graphs
General instructions
Sections tagged with a green marginal line are mandatory.
Sections tagged with an orange marginal line are either more advanced or supplementary, and are therefore considered optional. They will help you deepen your understanding of the material in the course. You may skip these sections on a first pass through the tutorial, and come back to them after finishing the mandatory exercises.
After you are finished with the tutorial, please upload your submission to Moodle by completing “Tutorial 6 assignment” before the indicated deadline.
When submitting a solution, the system will do some basic validation of your files and run some automated tests on your solutions. A green tick tells you that your file compiles and has passed all of the automated tests. A red tick tells you either that your file is badly broken (e.g., a syntax error) or that your solutions fail one or more of the automated tests. You can see the result of the tests by clicking on “See my uploads” in the upper-left menu, and then clicking on the right arrow button next to each file you submitted.
Although passing all of our tests provides evidence that your solutions are correct, it might not guarantee it, as typically we would need to run infinitely many tests! You are encouraged to write your own tests, and more generally to think about why your code works and not just rely on the automated tests.
Discussion of the tutorials with friends and classmates is allowed, with acknowledgment, but please read the Moodle page on “Collaboration, plagiarism, and AI”.
Warmup: graph representations
We expect you to write your solution to this problem in a file named
graphs.py
and to upload it using the form below. You can
resubmit your solution as many times as you like.
In this tutorial we will be mainly using the adjacency list representation of directed graphs, but for the first problem we also consider adjacency matrix form. Recall that for any graph \(G\) with \(n\) nodes, given some ordering of the nodes we can represent the graph by a \(\{0,1\}\)-valued \(n\times n\) matrix \(M\), where \(M_{i,j} = 1\) just in case there is an edge from the \(i\)th node to the \(j\)th node of \(G\).
For example, consider the following graph \(G\) with four nodes and five edges:

If we order the nodes alphabetically (i.e., consider a as the first node, b as the second, etc.) then we get the following adjacency matrix:
\[ M_G = \left(\begin{array}{cccc}0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0\end{array}\right) \]
We’ll represent matrices in Python as lists as lists, so that this matrix is represented by the value below:
= [[0, 1, 0, 0],
MG 0, 0, 1, 0],
[1, 0, 0, 1],
[0, 1, 0, 0]] [
In the adjacency list representation of a directed graph, we just list the outgoing neighbors of each node. For this example, those are
\[ A_G = \left\{\begin{align*}a &: [b] \\ b &: [c] \\ c &: [a,d] \\ d &: [b] \end{align*}\right. \]
We’ll represent adjacency lists in Python using dictionaries:
= {'a': ['b'], 'b': ['c'], 'c': ['a', 'd'], 'd': ['b']} AG
Write a function
matrix_to_adjlist(nodes, MG)
that takes a list of node labels of length \(n\) and the \(n \times n\) adjacency matrix of a directed graph and returns the adjacency list representation of the same graph.For example, running
matrix_to_adjlist(['a','b','c','d'], MG)
forMG
as defined above should return{'a': ['b'], 'b': ['c'], 'c': ['a', 'd'], 'd': ['b']}
.Hint: you may find it helpful to use the built-in function
enumerate(xs)
, which takes a listxs
as input and generates all the pairs(i,x)
consisting of each element ofxs
paired with its corresponding index (x = xs[i]
).
A graph \(G\) is said to be symmetric if for every edge \(x \to y\) there is an opposite edge \(y \to x\). This is easy to check using the adjacency matrix representation by checking that \(M_{i,j} = M_{j,i}\) for all indices \(i,j\). Likewise, it’s easy to compute the opposite of a graph, which has an edge \(y \to x\) in \(G^{op}\) just in case there is an edge \(x \to y\) in \(G\): the adjacency matrix for \(G^{op}\) is just the matrix transpose \(M^\intercal\). The questions below ask you to implement both of these operations but for the adjacency list representation.
Write a function
is_symmetric(AG)
that takes a directed graph represented in adjacency list form and checks whether it is symmetric.Write a function
revert_edges(AG)
that takes a directed graph in adjacency list form and returns the opposite graph.For example, running
revert_edges(AG)
on the example graph above should return{'a': ['c'], 'b': ['a', 'd'], 'c': ['b'], 'd': ['c']}
.
University life: combining depth and breadth
We expect you to write your solution to this problem in a file named
dfsbfs.py
and to upload it using the form below. You can
resubmit your solution as many times as you like.
A university’s course curriculum may be considered as a graph whose nodes are courses and where there is an edge \(x \to y\) whenever to take course \(x\) you must have first passed course \(y\) (that is, when \(y\) is a prerequisite for \(x\)). For example, the graph below shows a partial rendering of the computer science curriculum at a certain university:

In order to take CSE301, you must have first passed CSE203 and CSE201, which requires having passed CSE102, which requires having passed CSE101. And so on.
Let’s encode this graph in adjacency list form:
= {101 : [], 102 : [101], 103 : [101],
CourseG 201 : [102], 202 : [103], 203 : [], 204 : [102, 201], 205 : [201], 206 : [201, 203],
301 : [201, 203], 304 : [201], 305 : [201, 202], 308 : [203, 206]}
The following exercices can be solved using DFS traversals!
Write a function
required(G, c)
that takes a course graphG
in adjacency list form and the name of a coursec
, and returns the total number of courses that have to be passed in order to passc
, includingc
itself.For example,
required(CourseG, 301)
should evaluate to5
.Hint: you can use a DFS traversal from
c
and count the number of visited nodes during the traversal.Write a function
required_list(G, c)
that takes a coursec
and returns a list of courses that have to be passed in order to passc
(includingc
itself).For example,
required_list(CourseG, 301)
should evaluate to[101, 102, 201, 203, 301]
(or some permutation thereof).Hint: what can you say about the set of visited nodes when doing a DFS traversal from
c
?Write a function
needed_for(G, c)
that takes a coursec
and returns the number of courses that cannot be passed without first passingc
(includingc
itself).For example,
needed_for(CourseG, 102)
should evaluate to9
.Hint: try to express
needed_for
usingrequired
.
When designing a university curriculum, it is important to ensure that there are no cyclic dependencies, i.e., a chain of prerequisite relationships of the form \(c_1 \to c_2 \to \dots \to c_k \to c_1\), since that would make it impossible to enroll in any of the courses along the chain!
Write a function
cyclic_dependence(G)
that checks whether there is a cyclic dependence in the course graph. If there is such a dependence \(c_1 \to c_2 \to \dots \to c_k \to c_1\), the function should return the list of courses along the chain[c1, ..., ck, c1]
(where the first and last elements of the list coincide). If there is no dependence, the function should return an empty list.Notice that, in general, there might be multiple cyclic dependencies. You should return any one of them.
Choosing the program of study that’s right for you can at times feel like navigating a maze. You already saw how to escape from mazes using backtracking in Tutorial 3, but that approach runs into trouble when there are exponentially many paths, and a better approach is to view maze navigation as a graph traversal problem. Indeed, by using the right kind of graph traversal we can always find the shortest escape from a maze.
In the following problems, a maze is defined by a rectangular 2D array where each value is either 0 or 1. Cells marked by 1 are empty space, and cells marked by 0 are walls. For example, the maze

is represented by the following 2D array:
= [ [1, 1, 1, 1, 1, 1, 1],
example_maze 1, 0, 0, 0, 0, 0, 1],
[1, 0, 1, 1, 1, 0, 1],
[1, 1, 1, 0, 1, 1, 1]
[ ]
You are allowed to go from an empty cell to any other empty cell that shares a side with it. For example, here are two different paths through the maze:


The following exercices can be solved using BFS traversals!
Write a function
shortest_escape_len(maze, start, finish)
that takes as arguments a maze and the start and finish cells (given as pairs of numbers). The function should return the length of the shortest path fromstart
tofinish
, ormath.inf
if no such path exist. The length of a path is the number of cells along the path minus one. It is guaranteed that the start and finish cells are empty.For example,
shortest_escape_len(example_maze, (0, 0), (3, 6))
should evaluate to 9, corresponding to the length of the red path in the example.Write a function
shortest_escape(maze, start, finish)
with the same input data as in the previous problem that returns the shortest path from start to finish represented as a list of cell coordinates starting withstart
and ending withfinish
. If there is no path, returnNone
.
Fish coloring (optional)
Note that there are no automated tests for this sequence of optional exercises, it is up to you to ensure that your code is correct!
We expect you to write your solution to this problem in a file named
fish.py
and to upload it using the form below. You can
resubmit your solution as many times as you like.
Imagine you are responsible for setting up a series of fish tanks in a marine biology laboratory. Each tank can accommodate only fish which are compatible.
There are \(N\) fish, labeled from \(0\) to \(N-1\). For each fish, you are given a list of fish which are incompatible with it. Your job is to determine a distribution of the fish into the fish tanks, ensuring that only compatible fish can share the same tank, while trying to minimize the number of fish tanks.
For instance, you are given the following Python dictionary, where each key represents a fish, and each value is its list of incompatibility:
= { 0 : [1,2],
fish 1 : [0,2,3],
2 : [0,1,4],
3 : [1,4,5],
4 : [2,3,5],
5 : [3,4] }
In that case, you can for example use four fish tanks with a
distribution represented by the array [1, 2, 3, 1, 2, 4]
,
meaning that Fish 0 is assigned to Tank 1, Fish 1 assigned to Tank 2,
and so on. But, you could also take the distribution
[1, 2, 3, 3, 2, 1]
using only three fish tanks, which is
better.
Write first a “greedy algorithm”
greedy_tank(fish,n)
, wherefish
is a dictionary describing the incompatibilities between fish, andn
is the number of fish, that returns an array giving a valid assignment of fish. Your algorithm should proceed as follows:- Start with an empty list or array to store the tanks assigned to each fish.
- Begin with the first fish. Assign Tank 1 to it.
- Move to the next fish that hasn’t been assigned yet.
- Look up the fish tanks of its incompatible fish, and from the remaining tanks assign the first one that is available.
- Repeat steps 4 and 5 until all fish are assigned.
Try your algorithm on the example given above, and on the following bigger example:
= { 0 : [1,2,3,7], fish_big 1 : [0,2,3,7], 2 : [0,1,3,8], 3 : [0,1,2,8], 4 : [6,8], 5 : [6,7,8], 6 : [4,5,7,8], 7 : [0,1,5,6,8], 8 : [2,3,4,5,6,7] }
You should get the following behavior:
1]: greedy_tank(fish,len(fish)) In [1]: [1, 2, 3, 1, 2, 3] Out[ 2]: greedy_tank(fish_big,len(fish_big)) In [2]: [1, 2, 3, 4, 1, 1, 2, 3, 5] Out[
Theoretical question: Can you give a bound (in terms of the length of the longest list of incompatibilities) on the number of fish tanks used? Can you give some examples where this algorithm uses a lot of fish tanks, whereas only two fish tanks would be enough?
Theoretical question: The fish tank problem can be translated to a graph coloring problem, where we are given a graph and we want to color its nodes in a way that no two nodes sharing an edge have the same color. Can you elaborate how this translation works? What are the vertices and the edges of the graph?
To try to get an assignment of the fish using fewer fish tanks, consider the following refined algorithm: Sort the list of fish by the length of their incompatibility list, starting with the one with the longest list.
- Begin with the first fish. Assign Tank 1 to it.
- Traverse the (sorted) list of fish, and assign each fish to Tank 1 if it does not create some incompatibilities. You might want to create a “conflicted” set in which you store the union of the incompatibilities of the fish assigned to Tank 1.
- Repeat this procedure with Tank 2, etc.
Implement this algorithm in a function
wp_tank(fish,n)
, where as beforefish
is a dictionary giving the incompatibilities, andn
is the number of fish.Try your algorithm on the examples given above. Check that you get a tank assignment which satisfies the contraints. You can use the function:
def check_tanks(fish,tanks): = len(fish) n if min(tanks) <= 0 : return False for i in range(n): = tanks[i] col for nf in fish.get(i): if tanks[nf] == col: return False return True
The number of tanks should be 3 on
fish
(as for the first algorithm), but should be 4 onbig_fish
(which is an improvement compared to the first algorithm !).Theoretical question: This algorithm is called the Welsh-Powell algorithm. Can you give some examples where this algorithm uses a lot of fish tanks, whereas only two fish tanks would be enough?