Percolation Threshold
An application of Union-Find
Grading
This class spans over two weeks. There are 12 exercise in total, after a warm-up if you need to get used to Java. Reaching question 6 after one week is good. Th grading is:
- A : exercises 1-12 (ie. everything)
- B : exercises 1-9
- C : exercises 1-7
Introduction
This class starts with a short introduction to the working environment and to the use of Java through some simple programs in Java.
Then, the objective of the class, which will take place over two weeks, is to determine whether or not there is percolation through a grid of the plane paved with black and white cells, i.e. whether or not there is a path of black cells going from the top to the bottom of such a grid. Then, we estimate the proportion of black cells necessary for percolation to occur. This proportion is called the percolation threshold.
To do this, you will have to create a program, divided into small elementary functions, which you will then optimize step by step, guided by the subject. Don’t forget to test the functions you write whenever possible, and as you go along!
Here you will manipulate the classic Java functions and control structures, and make extensive use of arrays (integers and booleans) and fundamental control statements. In particular, you will program a data structure called Union-Find that allows you to efficiently represent equivalence classes using their canonical representatives. It allows to optimize the solution of some problems in various domains, for example in graph theory or automatic proof.
Starting with exercise 3, you will have to submit your solutions via the form at the end of each exercise. It is possible to submit several files. It is also possible to resubmit a solution if you have made changes to it, and it is not a problem if the file contains more functions than requested, for example because you have already done the next question. The site will give you automatic feedback on your submissions, in the form of a traffic light at the top left of the screen. Click on it to learn more. The green banner Last submission: date only means that you have submitted a file, it does not say anything about the tests. It is important to understand that a positive feedback does not guarantee that your code is good. The tests only detect some simple errors. So if you have a problem with a question, you should also consider the possibility that it was caused by code from a previous question, even if the tests did not detect anything in that question.
Using Java
The version of Java best fitted for the classes is 1.8. You can write your programs using Eclipse, VScode, or some other editor. Come back to us if you have questions or problems.
Warm-up
The following is to check that you master the three steps necessary to execute a program. One does not need to understand Java for that, just follow the instructions.
- Starting.
Once Eclipse or your other environment launched, create a new project
you will call warmup
by clicking on
File > new > Java project
. If
Java project
is not suggested, first chose
Project
, and the option Java Project
should
appear in the next window. Select as JRE runtime environment
JavaSE-1.8
, this will reduce the risk of problems when
submitting files.
This project now appears in the Package explorer
on the
left side of your Eclipse window (if Eclipse is back to the home screen,
you can click on the click on the workbench
arrow at the
top right). It contains the standard Java library
(JRE System Library
but the name may vary depending on your
installation) and an empty src
directory. It is the latter
that will contain the Java source files for your project.
We advise you to create one project per class topic (but not per question).
Empty Program
Create a Java class that you will name
Empty
. To do this, right-click on thesrc
directory and selectNew > Class
. In the opened dialog box, besides indicating the class name, removewarmup
from thePackage
line and check thepublic static void main(String[] args)
box to automatically create the main function of your program that will be called at the beginning of its execution.Eclipse has automatically generated the
Empty.java
file in thesrc
directory while adding an indication(default package)
(the notion of package will be explained later). This file contains the following program:public class Empty { public static void main(String[] args) { // TODO Auto-generated method stub } }
The exact meaning of the different lines of the program and, in particular, of the keywords
public
,class
andstatic
, will be introduced in the coming weeks when the concepts of object-oriented programming are taught to you. Until their mysteries are unveiled, consider them as mandatory syntactic conventions.
In particular, the main function of your program must have the
prototype public static void main(String[] args)
to be
recognized as such by the Java compiler.
The string array args
passed as an argument to this
function corresponds to the arguments communicated to the program by the
user via the command line.
Compiling and running the
Empty.java
fileThe compilation and execution phases, theoretically distinct, are combined in Eclipse by clicking on the
Run
button. The results of the execution appear at the bottom of the window, in theConsole
tab. However, since this program does nothing (the body of themain
function is empty except for a comment line), no result appears here.To make sure that your program has been compiled and executed correctly, modify it to introduce an error, for example by removing the
static
keyword. If you try to compile and run this program again, you should get the following error in the console:Error: Main method not found in class Empty, please define the main method as: public static void main(String[] args)
Correct this error by putting back the previously removed keyword and verify that the error has disappeared.
- Program
Hello
.
We now propose to modify this program to display the message “Hello”. For this, we will create a new class
Hello
in the same project. You can create it as before or copy the empty class.After opening the
Hello.java
file by double-clicking on it, insert the following statement between the braces of itsmain
function:System.out.println("Hello");
We should immediately get into the habit of putting only one instruction per line, avoiding lines that are too long, and indenting properly our programs. Eclipse helps you with the latter.
Compile and execute this program, then verify that the welcome message appears in the console.
- Program
File vs Program.
The name of the program, here
Hello
, which appears in the first lineclass Hello
is not necessarily identical to the name of the file. Thus,class Hello
can be replaced byclass HelloProgram
.However, for Eclipse to know which program to run when it no longer has the same name as the file it is contained in, we need to open the run configuration dialog box via
Run > Run Configurations
(also accessible by clicking on the small downward pointing triangle next to theRun
button). In this dialog box, we need to replaceHello
withHelloProgram
as the main class.In addition, we need to remove the
public
keyword to the left of the class. This point will be explained in a future lesson.Verify the compilation and execution of your
HelloProgram
program.Case of a program that does not terminate.
If we insert:
while (true)
before
System.out.println("Hello");
we obtain a program that displays
Hello
endlessly. To interrupt it, you need to click on the red square on the right side of the console. ## Basic exercises
The following exercises aim to point out a number of common errors (or misunderstandings). Their purpose is to help you avoid them in the future.
For these exercises, continue to use the warmup
project
by adding new classes to it.
Exercise 1: primitive types
Calculate and display
20!
, using a variablei
of typeint
and then a variablen
of typelong
. Can you explain the difference in behavior?What is the result of the following instructions placed in a
main
function:double d = 0; = 1 + 1/2 + 1/3; d System.out.println(d);
What can you do to obtain a result that is more consistent with intuition?
Write a program that calculates and displays the quantities:
\[\begin{gather} 2 + 2 \\ e + \pi \end{gather}\]
The mathematical constants involved here are static fields of the
java.lang.Math
class, whose documentation is provided. For example, to use the constant \(\pi\), simply writeMath.PI
in your program.
Exercise 2: debugging a program
One rarely programs without making mistakes. It is therefore important to be able to detect errors in a program and correct them quickly. To do this, the programmer has several tools at their disposal that make the task considerably easier:
errors indicated on the fly by Eclipse, via red crosses in the margin when you edit your program;
compiler error messages to be looked at in the order of issuance: they are relatively explicit in Java; in particular, the line number where the first error is detected is often relevant;
runtime error messages;
displaying intermediate values when the program does not produce the correct result;
using a debugger such as the one integrated into Eclipse, although we will not delve into this in INF371.
It is very important to check for the absence of red crosses and to compile and test your program very regularly, for example after writing each function.
To practice debugging quickly, you are provided with several
erroneous versions of the same program supposed to calculate the value
of n!
, where n
is an integer entered by the
user from the keyboard.
To do this, download the zip archive Bug.zip. It contains eight Java files
named from Bug1.java
to Bug8.java
. To add them
to your warmup
project, you can right-click on
src
, then select Import
. In the dialog box,
choose General
then Archive File
, click
Next
, then Browse
on the first line and select
the downloaded file Bug.zip
. Finally, click
Finish
.
The objective of this exercise is to find the error in each of these eight programs using the different error messages returned by the compiler or at runtime.
For the future, you can find a summary of common errors here, also accessible directly from the course page.
Percolation Threshold
We will compute an approximate value of the percolation threshold of a top-down path from top to bottom in a matrix containing black and white cells. To do this, we will randomly blacken cells and, after each step, we will detect if there is a path made of black cells going from any top cell (first row) to any bottom cell (last row) any cell at the bottom (last row) of the matrix (without diagonal jumps). If so, the percolation will be reached. The percolation threshold is the proportion of black cells that it is probabilistically necessary to blacken to reach the percolation. This threshold is the same same for any matrix not too small.
To represent a square matrix M
of size N
,
we will use a one-dimensional array T
of size
NxN
: the cell in the row i
and the row
i
and column j
in M
(with
i
and j
between 0
and
N-1
) will correspond to the element of index
i.N + j
in T
. For example, here is the
encoding of a matrix of size 3 by a one-dimensional array of size 9
(each cell of the array contains here the index (row, column) of the
cell of the matrix that it encodes).
(0,0) | (0,1) | (0,2) | (1,0) | (1,1) | (1,2) | (2,0) | (2,1) | (2,2) |
Exercise 3 : defining the class
In a new project, define a Percolation
class containing
the constants size
, initially equal to 10,
and corresponding to the size of a matrix, and length
, of
value size * size
. This class will also contain a static
array of booleans grid
of size length
corresponding to the matrix: a box containing true
(resp.
false
) will be black (resp. white).
Submit your files here.
Exercise 4 : function
init
Define a function void init()
with no arguments
initializing the matrix with only white cells. This function will also
be used to reset the matrix between two percolation experiments.
Submit your files here.
Exercice 5 : function
print
Define an void print()
function with no arguments that
displays the matrix on the standard output. White cells will be
represented by the character '-'
(dash) and black cells by
'*'
(star). Make that the rows correspond to the first
dimension of the matrix and the and the columns to the second dimension
of the matrix.
For example, the table
true |
false |
false |
true |
true |
false |
false |
true |
false |
(encoding a 3x3 matrix) will be displayed like this:
*--
**-
-*-
To test your two functions, add a main
function
initializing the matrix with only white cells except for the cell in row
2 and column 4, then row 2 and column 4, then displaying it.
Submit your files here.
Exercise 6 : percolation
Define a function
int randomShadow()
that randomly darkens one of the of the white cells of the matrix (assuming that there is one) and returning the and returning the corresponding index. To do this, you use the functionMath.random
which returns a floating point number number between0.0
(included) and1.0
(excluded).Algorithmic note: We can assume that the array is not too full, contains at least around 40% of white cells and that we will therefore find a white cell after a few draws. This is the simplest solution. Another solution that guarantees a low computation time even in the worst case is the following: first determine a random order of the squares, then blacken them in this order. If you want to do this, you can use the Fisher-Yates shuffle algorithm to find a random permutation. Experience shows that this does not provide much practical gain.
Define a function
boolean isNaivePercolation(int n)
returning the Booleantrue
if and only if there is a black path path exists in the matrix, between one of its cells in row0
(the top of the matrix) and one of its cells in the rowsize-1
(the bottom of the (the bottom of the matrix), and passing throughn
.For this, it is possible to proceed in two similar steps:
- detecting a (half)path between the
n
cell and the top of the matrix; and - the detection of a (half)path between this same cell and the bottom of the matrix.
Nevertheless, one must be careful not to visit twice the same cell** on the same (half)path on the same (half)path and to factorize the code for the detection of the both (half)paths. To do this, you can define and use an auxiliary recursive function
boolean detectPath(boolean[] seen, int n, boolean up)
detecting a (half)path from the cell of indexn
to the top (resp. bottom) of the matrix ifup
is true (resp. false), considering the arrayseen
of the cells already visited for this (half)path.- detecting a (half)path between the
Add a function
boolean isPercolation(int n)
that just callsisNaivePercolation
. This will be used to easily test different versions of different versions ofisPercolation
without having to modify your whole program.Define a function
double percolation()
that blackens cells of the matrix at random the matrix at random, as long as its percolation has not taken place (the name of the function is just function is justpercolation
, anddouble
is the return type). A once percolation has been reached, it will return the proportion of blackened blackened cells.Submit your files here.
Upload form is only available when connected
Exercise 7 : Monte-Carlo method
Define a function
double monteCarlo(int n)
computing an estimate (between0
and1
) of the percolation threshold by performingn
percolation simulations. Performing such an approximate calculation by successive simulations is called the Monte-Carlo method (hence the name (hence the name of the function). It is used to solve efficiently many practical problems for which an exact calculation is impossible or is impossible or too expensive.Modify the function
main
so as to display the percolation threshold of a white matrix aftern
simulations,n
being provided by the user the user via theArguments
tab of the dialog box accessible fromRun > Run Configurations
. You can use Integer.parseInt to read an integer from a string. Using the functionSystem.currentTimeMillis()
function, also display the execution time required to reach the percolation threshold.Try your program with different matrix sizes and different numbers of different number of simulations.
Submit your files here.
Union-Find
The previous solution is not satisfactory because the percolation test is too inefficient. We are going to use a data structure called called Union-Find to optimize our search. It efficiently computes an equivalence relation via the canonical canonical representative of each equivalence class. Here, two indices of the matrix are equivalent if they are on the same black path path, so that detecting percolation is like detecting the fact that a top that a top element is equivalent to a bottom element of the matrix.
Exercise 8 : defining Union-Find
We will first program a first version of the structure of Union-Find, simple but not very efficient (quick-find).
In a new
UnionFind.java
file, create aUnionFind
class class containing anequiv
array and avoid init(int len)
function initializing this array with a sizelen
and associatingi
to each indexi
(initiallyi
is its own representative in its singleton equivalence class).Add to this new class a function
int naiveFind(int x)
returning the canonical representative associated tox
.Add to this new class a function
int naiveUnion(int x, int y)
realizing the union of the equivalence classes ofx
andy
: the members of equivalence class ofx
will have as new canonical representative that ofy
. This new representative will be in fine returned by the function. The size of an arrayt
is available in the variablet.length
variable.Add functions
int find(int x)
andint union(int x, int y)
that simply callnaiveFind
andnaiveUnion
respectively. just callingnaiveFind
andnaiveUnion
respectively. They will be used to easily test the different versions offind
andunion
without having to andunion
without having to modify your whole program.
Submit your files here.
Exercice 9 : using Union-Find
We will now use our Union-Find structure to optimize our optimize our percolation detection program.
Modify the
init
function of thePercolation
class to also initialize the UnionFind structure (via a call toUnionFind.init
) toUnionFind.init
).Define a function
void propagateUnion(int x)
uniting the class of thex
cell (assumed to be black) to all its black neighbors neighbors: at the end of the execution of this function, all the black black neighbors andx
must belong to the same equivalence class. equivalence class.Modify the
randomShadow
function to call the propagateUnion` function in the appropriate place.Write the function
boolean isFastPercolation(int n)
with a behavior similar to similar toisNaivePercolation
but more efficient due to the use of the the Union-Find structure.Modify the
boolean isPercolation(int n)
function so that it uses to useisFastPercolation
.Submit your files here.
Upload form is only available when connected
Optimizations
Exercice 10 : lazy Union-Find
The previous solution is still not efficient because the union
operation is too expensive: its complexity is linear in the size of the
matrix. We will perform a first optimization at the cost of a
degradation of the complexity of the find
function (from
constant time to logarithmic). The practical efficiency will already be
globally much better.
C1
and C2
, not to modify
all the members of an equivalence class (say C1
), but just
its canonical representative. canonical representative. The latter is
then associated with that of C2
, so that successive unions
define a forest where each tree having root r
represents an
equivalence class having for canonical representative r
.
The array UnionFind.equiv
contains thus not necessarily the
canonical representatives of the different equivalence classes but
rather the different fathers of these of these trees. This principle is
illustrated by the following diagram. The latter shows the result of
three successive unions on a table of size 9. In addition to the
contents of the UnionFind.equiv
array, the trees not
reduced to a root are also represented.
![](resources/images/union.png)
fastFind
and
fastUnion
. Compare then the efficiency of this
implementation with the other versions by modifying find
and union
.
Submit your files here.
Exercise 11 : logarithmic complexity
In the case of a union
union(x, y)
, another optimization is possible. Indeed, for the moment, we have arbitrarily chosen to modify the equivalence class ofx
so that its canonical for canonical representative that of the equivalence class ofy
. To limit the height of the trees, we can choose to keep as canonical representative the one whose height of the tree of the tree corresponding to its equivalence class is the largest by integrating a strictly smaller tree at its root, its height will not change. This optimization allows to limit the height of each tree by the height of each tree by the logarithm of the number of its elements (see poly). The time complexity of each step is then also logarithmic. then also logarithmic.Code this modification in a new function
logUnion
by adding an array of integersheight
of the same length as equivto the
UnionFindclass. The index element
iof this array will contain the height of the tree with
ias root. This array will be initialized in
UnionFind.init`.Another optimization, called path compression, consists in bringing the elements to bring closer to the root the elements encountered during a search. This way, the trees will be less high. We can simplify this optimization by simply pointing to each element encountered in a each element encountered during a search only to its grandfather in the tree (rather than to its father as before, or to the to the root for the unsimplified path compression).
Code the simplified path compression into a
logFind
function and compare the efficiency of this implementation with that of other versions by modifying versions by modifyingfind
andunion
.When we compress the paths, it can change the height of the trees (that’s the point!). Calculate the exact height of the trees, to the optimized version of
union
, would be very expensive. The usual strategy is to ignore the problem, keep the same same function for calculating heights as without compression (we will then talk about rank* rather than height, but for simplicity we will keep the name nameheight
in the code), and it has been proven that the quantity thus calculated calculated in this way brings the same advantages as the real height.
Submit your files here.
Exercice 12 : optimizing the number of searches
Now that the union function is logarithmic, it becomes advantageous to reduce the number of searches, even if it means increasing the number of unions. To do this, we will to do this, we will introduce two distinct elements at the end of the of Union-Find (whose sizes will be equal to that of the matrix + 2) to represent the whole of the top and bottom edges. Thus, it becomes possible to to perform the percolation test by simply comparing the equivalence classes equivalence classes of these two distinguished elements with a single test, without even the last blackened cell.
Note that the definition of the equivalence classes is slightly modified for the first and last modified for the first and the last line: an equivalence class does not group only only the cells on the same black path, but contains the whole first (resp. last) the first (resp. last) line and the corresponding distinguished element as soon as a cell of the when a cell of the first (resp. last) line is on this path.
Nevertheless, it becomes necessary to modify the
propagateUnion
function to propagate the equivalence
classes to these special elements as soon as a cell of the top or bottom
top or bottom box is blacked out.
Code this last optimization into a
boolean isLogPercolation()
function. function, also
modifying the propagateUnion
function appropriately.
Submit your files here.