This assignment has been closed on January 26, 2026.

You must be authenticated to submit your files

Tutorial 13: Recursion II

G. Pogudin, S.Berkemer apres S. Mengel

Log in to enable code uploads
Setup: before you start
- Files to download
- Context and Objectives
Exercise 8 (optional): More efficient Sudoku

Log in to enable code uploads

You need to log in to be able to upload your work as you go. Use your Polytechnique LDAP login here.

Setup: before you start

Create a new empty project in Spyder, and call it “Tutorial_13”.

Now create a new program file, parsing.py. All the functions you write have to be added to this file.

Files to download

You will need to download the following files, right-clicking on the links and selecting “Save link as…” to save each in your Tutorial_13 project directory.

Context and Objectives

In this tutorial, we will dive deeper into recursive functions namely writing a parser for arithmetic expressions. The aim is to practice recursion, but the exercises could also be solved iteratively.

Skills we will practice today: recursion, arithmetic expressions, sudoku.

Exercise 1: Finding the first digit

In this exercise, we want to locate the position of the first digit in a string, starting from a given position.

Write a recursive function first_digit that scans the string character by character until it finds a digit. Note that your function should return the position of the first digit and not the digit itself (see examples below).

Your function should start as follows:

def first_digit(text, position):
    """Return the position of the first digit in text starting from the given
    position. If there is no digit starting from this position, return None.
    """
    pass  # remove this line and complete the function

Hint: You can use the isdigit() method for strings that returns True if all characters are digits and False otherwise.

Test your function as follows:

In [0]: first_digit('abc123', 0)
Out[0]: 3

In [1]: first_digit('abc123', 2)
Out[1]: 3

In [2]: first_digit('abc123', 3)
Out[2]: 3

In [3]: first_digit('abc123', 4)
Out[3]: 4

In [4]: first_digit('abcdef', 0) is None
Out[4]: True

In [5]: first_digit('', 0) is None
Out[5]: True

In [6]: first_digit('9abc', 0)
Out[6]: 0

In [7]: first_digit('abc9', 4) is None
Out[7]: True

Upload your file parsing.py:

Upload form is only available when connected

Exercise 2: Reading numbers

In this exercise, we will want to read and evaluate arithmetic expressions given as a string. To do so, we will first write a recursive function read_positive_integer to read substrings representing non-negative integer values.

You do not need to use the first_digit function from the previous exercise, but you can use it as inspiration.

Your function should start as follows:

def read_positive_integer(text, position):
    """Return the longest substring of text representing a natural integer
    starting from the given position, together with the first position after
    that substring.  If there is no positive integer value starting at the
    given position, then return ('', position).
    """
    pass  # remove this line and complete the function

Test your function as follows:

In [8]: read_positive_integer('1234', 0)
Out[8]: ('1234', 4)

In [9]: read_positive_integer('1234', 1)
Out[9]: ('234', 4)

In [10]: read_positive_integer('1234', 4)
Out[10]: ('', 4)

In [11]: read_positive_integer('1234q', 0)
Out[11]: ('1234', 4)

In [12]: read_positive_integer('q1234q', 1)
Out[12]: ('1234', 5)

In [13]: read_positive_integer('q1234q', 0)
Out[14]: ('', 0)

Upload your file parsing.py:

Upload form is only available when connected

Exercise 3: Balanced parentheses

Writing arithmetic expressions (and also programming code) requires well-paired parentheses. Write a recursive function is_balanced in parsing.py that will get a string as an input and that will tell if the string contains well paired parentheses. The input contains additionally the position in the expression to tell the algorithm where to start. The output is then a tuple with either True or False at the first position and the position of the last position that has been checked. If the expression is well-formed, the output will be (True, X). Otherwise, the output will be (False,X) where X is the position where parsing stopped, thus at the end of the string for a succesfull parse or at the position of the first wrong parenthesis that has been found.

Here is a high-level idea of how the function is_balanced works:

If position exceeds the length of expression, everything went well when pairing the parenthesis, so the function returns (True,position).
When the function sees (, it:
- recursively checks what comes inside,
- expects to eventually find a matching ),
- then continues parsing after that closing parenthesis.
When it sees ) unexpectedly, it immediately fails.
All other characters (a, b, c, digits, operators, etc.) are ignored and skipped.

It is important to notice that recursion naturally follows the nesting structure of parentheses: each opening parenthesis ( delegates the search for its matching closing one. Reaching the end of the string too early reveals a missing ).

We can also develop step-by-step the behavior of the function, for example is_balanced("((a+2)*12", 0). There are two opening parentheses (( but only one closing parenthesis ).

Step-by-step execution

Step 1 — Initial call

is_balanced("((a+2)*12", 0)

expression[0] == '('
We enter the opening parenthesis case
Recursive call:

(res, pos) = is_balanced(expression, 1)

Step 2 — Second opening parenthesis

is_balanced("((a+2)*12", 1)

expression[1] == '('
Recursive call:

(res, pos) = is_balanced(expression, 2)

Step 3 — Skipping non-parenthesis characters

is_balanced("((a+2)*12", 2)   # 'a'
is_balanced("((a+2)*12", 3)   # '+'
is_balanced("((a+2)*12", 4)   # '2'

None of these characters are parentheses, so the function keeps advancing.

Step 4 — Encountering a closing parenthesis

is_balanced("((a+2)*12", 5)

expression[5] == ')'
According to the code:

return (False, 5)

This signals a closing parenthesis that must be matched by a previous '('.

Step 5 — Returning to the call at position 1

We resume the call:

is_balanced("((a+2)*12", 1)

with:

res = False
pos = 5

So the inner parentheses (a+2) are considered correctly matched.

The function continues after this closing parenthesis:

return is_balanced(expression, 6)

Step 6 — After the inner parentheses

is_balanced("((a+2)*12", 6)   # '*'
is_balanced("((a+2)*12", 7)   # '1'
is_balanced("((a+2)*12", 8)   # '2'

Again, no parentheses are found, so the function keeps advancing.

Step 7 — End of the string

is_balanced("((a+2)*12", 10)

Since the function is called with position >= len(expression), the function returns:

(True, 10)

Meaning: “Everything from here onward is balanced.”

Step 8 — Returning to the initial call

Back to:

is_balanced("((a+2)*12", 0)

We now have:

res = True
pos = 10

The function finally returns:

(False, 0)

We see on the detailed execution of the function that the inner parentheses (a+2) are correctly matched, the outer opening parenthesis at position 0 is never closed. The expression is therefore not well-balanced and the error is reported at position 0, corresponding to the unmatched '('.

Your function should start as follows:

def is_balanced(expression, position):
    """
    Read the expression starting from the given position,
    recursively check if the parentheses in the expression are well balanced.
    """

Test your function on the console as follows:

In [15]: is_balanced("a",0)
Out[15]: (True, 1)

In [16]: is_balanced("()",0)
Out[16]: (True, 2)

In [17]: is_balanced("(abc)",0)
Out[17]: (True, 5)

In [18]: is_balanced("(abc",0)
Out[18]: (False, 0)

In [19]: is_balanced("((a+2)*12", 0)
Out[19]: (False, 0)

In [20]: is_balanced("abc)",0)
Out[20]: (False, 3)

In [21]: is_balanced("((abc)(eee))(123))",0)
Out[21]: (False, 17)

In [22]: is_balanced("((abc)(eee))((123))",1)
Out[22]: (False, 11)

In [23]: is_balanced("((abc)(e(e)e))()((123))", 0)
Out[23]: (True, 23)

In [24]: is_balanced("((abc)(e(e)e))()((123))", 14)
Out[24]: (True, 23)

Upload your file parsing.py:

Upload form is only available when connected

Exercise 4: Balanced parentheses extended

Now that we are able to check if a string has balanced parentheses ‘(’ and ‘)’, we will now write an extended version is_totally_balanced that checks for all different kinds of parentheses (or pairs of symbols) if they are balanced in the input string.

Crossing parentheses is not allowed in well-formed expressions: for instance '([)]' is not allowed even if this expression restricted to ‘(’ and ‘)’ (resp. ‘[’ and ’]’) is balanced.

You can copy the code from is_balanced, change the name (also in the recursive call!) and add additional lines for the extension.

Hint: You can use two lists that contain the symbols to be matched in the same order.

Your function should start as follows:

def is_totally_balanced(expression, position):
    """
    Read the expression starting from the given position,
    recursively check if all the parentheses in the expression are well balanced.
    """
    opening = ['(', '[', '{', '<' ]
    closing = [')', ']', '}', '>' ]

Test your function on the console as follows:

In [25]: is_totally_balanced("a", 0)
Out[25]: (True, 1)

In [26]: is_totally_balanced("(a)", 0)
Out[26]: (True, 3)

In [27]: is_totally_balanced("(a]", 0)
Out[27]: (False, 0)

In [28]: is_totally_balanced("([a{}])", 0)
Out[28]: (True, 7)

In [29]: is_totally_balanced("([a]){", 0)
Out[29]: (False, 5)

In [30]: is_totally_balanced("((abc)(e{e}e))<>[(123)])", 0)
Out[30]: (False, 23)

In [31]: is_totally_balanced("((abc)(e{e}e))<>[(123)]", 0)
Out[31]: (True, 23)

In [32]: is_totally_balanced("((abc)(e{e}e))<>[(123)]", 1)
Out[32]: (False, 13)

In [33]: is_totally_balanced("((abc)(e{e}e))<>[(123)]", 14)
Out[33]: (True, 23)

In [34]: is_totally_balanced("((abc)(e{e}e))<>[(123)]", 13)
Out[34]: (False, 13)

Upload your file parsing.py:

Upload form is only available when connected

Exercise 5: Arithmetic expressions

Now that we can read numbers, let us define what an expression is. We define them as strings having the following properties:

A number as read in the last exercise is an arithmetic expression.
If a and b are arithmetic expressions, then so are (a+b), (a-b) and (a*b).
No other string is an arithmetic expression, i.e. we don’t allow spaces (” “) and only parentheses ‘(’ and ‘)’.
Each arithmetic expression is framed by parentheses e.g. (a) or (a+b), however, single numbers can be read without parentheses (see examples).

For example '10' is an arithmetic expression as well as '(10)', '(1+1)', '((2+3)*10)' and '((1+1)*(1+1))'. The above definition is what is generally called an inductive definition. This kind of definition is pervasive throughout computer science, discrete mathematics, logic and many other fields (so now is a good time to get used to them). Note that to avoid divisions by zero we do not allow division in arithmetic expressions.

We now define the semantics of arithmetic expressions in the obvious way by giving an evaluation function evaluate() that maps every arithmetic expression to a number:

If a is a number, then evaluate(a) is that number.
If a and b are arithmetic expressions, then evaluate((a+b)) = evaluate(a)+evaluate(b), evaluate((a-b)) = evaluate(a)-evaluate(b), and evaluate((a*b)) = evaluate(a)*evaluate(b).

Write the function evaluate(). To simplify the implementation, let evaluate return a second value that gives the first position after the currently read arithmetic sub-expression.

def evaluate(expression, position):
    """Evaluate the expression starting from the given position.
    Return the value and the first position after the read
    sub-expression. If the string starting at the given expression
    is not an arithmetic expression, return (-1,None) as the second value of the
    output tuple.
    """

Hint: you can define an additional function that evaluates the basic expressions, e.g. eval_expression("+", "2", "3") will return 5.

Test your function on the console as follows:


In [35]: evaluate("10", 0)
Out[35]: (10, 2)

In [36]: evaluate("(1+1)", 0)
Out[36]: (2, 5)

In [37]: evaluate("(1*1)", 0)
Out[37]: (1, 5)

In [38]: evaluate("(1-1)", 0)
Out[38]: (0, 5)

In [39]: evaluate("((1-1)+(2-1))", 0)
Out[39]: (1, 13)

In [40]: evaluate("q((1-1)+(2-1))", 1)
Out[40]: (1, 14)

In [41]: evaluate("((1-1)+(2*1))", 0)
Out[41]: (2, 13)

In [42]: evaluate("q((1-1)+(2*1))", 0)
Out[42]: (-1, None)

Upload your file parsing.py:

Upload form is only available when connected

Exercise 6: Collecting the parts

The function evaluate that we have written above does not necessarily find expressions with unbalanced parentheses, e.g.

In [43]: evaluate("1)", 0)
Out[43]: (1, 1)

Thus write another function check_and_evaluate that receives an arithmetic expression as input and first checks if its parentheses are balanced using the function is_balanced. If the parentheses are not balanced, the output should be None. Otherwise, it calls evaluate as a second step whereas the output is either None or the value received after evaluation of the expression.

Test your function on the console as follows:

In [44]: check_and_evaluate("1")
Out[44]: 1

In [45]: check_and_evaluate("(1)")
Out[45]: 1

In [46]: check_and_evaluate("(1+1)")
Out[46]: 2

In [47]: check_and_evaluate("(1+1")

In [47]: check_and_evaluate("(1")

In [47]: check_and_evaluate("1)")

In [47]: check_and_evaluate("((1-1)+(2*1))")
Out[47]: 2

In [48]: check_and_evaluate("((10-1)+(2*1))")
Out[48]: 11

In [49]: check_and_evaluate("((10-1)+(2*1)")

In [49]: check_and_evaluate("q((10-1)+(2*1))")

Note that the output for some expressions (the invalid ones) is None which is not printed on the console.

Upload your file parsing.py:

Upload form is only available when connected

Exercise 7: Sudoku

We will now see how a simple recursive search algorithm can be used to solve Sudokus. We assume that you know Sudoku; if this is not the case, please read the rules before continuing.

We will write a very simple program to solve Sudokus: we choose a cell on the board that is not yet filled, then we fill in a number that does not appear in the row, column or box of that cell and solve recursively (the box of the cell, also called region or block is the relevant 3x3-subgrid that contains the cell). This algorithms is guaranteed to find all solutions if for every empty cell we try all possible values one after the other.

To concentrate on the recursive aspect of the algorithm, we give you a class Sudoku in soduku.py that already implements many methods for you to use:

an __init__() method that initializes a board from a file,
an __str__() method to give a string representation where empty cells are encoded by ‘.’,
methods put() and delete() to set or delete values in a cell,
a method check() that checks if a value appears already in a row, column or box, and
a method find_first_empty() that finds the first empty cell on the board.

The only thing that remains to do is to write the method solve() that begins as follows:

    def solve(self):
        """Solve the given Sudoku. When a solution is found, print 'Found solution!'
        followed by the solution in the next line.
        """

Our strategy will be to use find_first_empty() to find an empty cell, put() a value into that cell, and run solve() recursively. If solve does not find a solution, then we delete the value we have put before before trying the next value.

Test your solver!

In [49]: s = Sudoku('sudoku1.txt')

In [50]: print(s)
┌───┬───┬───┐
│6..│...│..7│
│...│.9.│.2.│
│3.1│..2│59.│
├───┼───┼───┤
│8..│..7│.13│
│...│.8.│...│
│76.│3..│..8│
├───┼───┼───┤
│.78│2..│1.6│
│.5.│.3.│...│
│2..│...│..9│
└───┴───┴───┘

In [51]: s.solve()
Found solution!
┌───┬───┬───┐
│629│415│387│
│547│893│621│
│381│672│594│
├───┼───┼───┤
│894│567│213│
│132│984│765│
│765│321│948│
├───┼───┼───┤
│478│259│136│
│956│138│472│
│213│746│859│
└───┴───┴───┘

You can also create your own Sudokus by editing more files.

Upload your file sudoku.py:

Upload form is only available when connected

Exercise 8 (optional): More efficient Sudoku

If you try your Sudoku solver on the file sudoku-hard.txt, you should notice that this instance is, as the name suggests, in fact hard for your solver. Can you explain why that is?

Come up with ways to speed up the search for a solution in Sudoku puzzles. There are several ways of approaching this, for example:

using a better variable choice heuristic
using smarter reasoning techniques when before filling in values, for example some of those found here. Note that for some of the techniques it might be helpful to represent the board differently.

Which approach is more efficient?

Sudoku is a special case of the Constraint Satisfaction Problem (CSP) which is a very active research field in artificial intelligence. A general finding of that research is that there is a trade-off between the efficiency of search (i.e. plugging in values and recursing) and reasoning (i.e. thinking about which values can lead to solutions before plugging them in).

The two extreme cases are:

just exhaustively searching through all combinations of values which for non-trivial problems leads to a combinatorial explosion and thus bad runtime, and
making very complex reasoning to filter the values that are tried, which leads to less search but often slows down the program because the reasoning is slow.

Generally, good algorithms lie between these two extremes. The reasoning is limited to techniques that shrink the number of combinations that have to be studied effectively while making sure that this reasoning is not too costly. Of course, what the best trade-off is depends heavily on the problem. Can you find a good trade-off for Sudoku?

Upload your file sudoku.py:

Upload form is only available when connected