CSC43042EP - TD7: Support Vector Machines
Before you begin
Download and unzip the archive CSC43042EP-td7-1-handin.zip
.
Overview and Instructions
TD Objectives:
- This TD corresponds to Lecture 7: Linear Models for Classification.
- We will focus on Support Vector Machines (SVM).
- You will implement Soft Margin Kernel SVMs by solving the dual
problem using quadratic programming (via the
qpsolvers
package).
Datasets: The CSV files (e.g.,
hilly-train.csv
,ebike-train.csv
, etc.) are located in thecsv/
directory. These datasets include:- Scooter data
- Ebike data
- Hilly data
Introduction and Objectives
This TD corresponds to Lecture 7: Linear Models for Classification.
We will implement a Soft Margin Kernel SVM by
solving its dual problem using quadratic
programming (via the qpsolvers
package).
- You will code kernels, an SVM trainer and predictor.
- You will test the SVM on multiple datasets.
- You will classify spam emails using your own SVM.
Environment Setup
You will need Python with the following packages:
pip install --user numpy scipy scikit-learn qpsolvers[cvxopt]
Please not that we are using “cvxopt” as the quadratic solver, and it will influence in the results.
1. Quiz
Complete Questions 1-5 in the
quiz on Moodle. You should use the provided Python script
svm_with_sklearn.py
in the quiz/
folder which
can read from standard CSV files. It uses scikit-learn (with SVM based).
Typing python svm_with_sklearn.py
should give you an idea
of how to use this script:
python svm_with_sklearn.py <dataset_train> <dataset_test> [<id_label> <rest_of_cols> ['<char_sep>']]
Supervised Classification
In supervised learning (classification): - Input: \(n\) samples \((x_i, y_i) \in \mathcal{X} \times \mathcal{Y}\). - Goal: Find a function \(f: \mathcal{X} \to \mathcal{Y}\) minimizing the prediction error. - Loss: Use the 0-1 loss \(L(y, \hat{y}) = \mathbb{1}_{y \neq \hat{y}}\). - Risk minimization: Minimize expected risk \(\mathbb{E}[L(Y, f(X))]\).
For SVMs, we seek a predictor separating the data with a maximum margin hyperplane.
Theory: Linear Models and SVMs
Linear Classifiers
- Decision boundary is linear: a hyperplane \(\boldsymbol{x}^T \beta - \beta_0 = 0\).
- For binary classification: labels \(y_i \in \{+1, -1\}\).
Support Vector Machines
- Linearly separable case: maximize the margin while classifying all points correctly.
- Non-separable case: allow some misclassification with slack variables \(\xi_i\).
Soft-margin SVM optimization:
\[ \min_{\beta, \beta_0} \quad \frac{1}{2} \|\beta\|^2 + C \sum_{i=1}^n \xi_i \] subject to:
\[ y_i (\boldsymbol{x}_i^T \beta - \beta_0) \geq 1 - \xi_i, \quad \xi_i \geq 0 \]
Dual problem (which we will solve):
\[ \max_{\alpha} \sum_{i=1}^n \alpha_i - \frac{1}{2} \sum_{i,j=1}^n \alpha_i \alpha_j y_i y_j K(\boldsymbol{x}_i, \boldsymbol{x}_j) \]
subject to:
\[ 0 \leq \alpha_i \leq C, \quad \sum_{i=1}^n \alpha_i y_i = 0 \]
2. Exercises TD
2.1 Implementing Kernels
Theory reminder:
Kernels implicitly map data to a higher-dimensional space.
Common kernels:
- Linear: \(K(\boldsymbol{x}, \boldsymbol{y}) = \boldsymbol{x}^T \boldsymbol{y}\)
- Polynomial: \(K(\boldsymbol{x}, \boldsymbol{y}) = (\gamma \boldsymbol{x}^T \boldsymbol{y} + \text{coef0})^{\text{degree}}\)
- RBF: \(K(\boldsymbol{x}, \boldsymbol{y}) = \exp(-\gamma \|\boldsymbol{x} - \boldsymbol{y}\|^2)\)
- Sigmoid: \(K(\boldsymbol{x}, \boldsymbol{y}) = \tanh(\gamma \boldsymbol{x}^T \boldsymbol{y} + \text{coef0})\)
- Rational Quadratic: \(K(\boldsymbol{x}, \boldsymbol{y}) = \frac{\text{coef0}}{\|\boldsymbol{x} - \boldsymbol{y}\|^2 + \text{coef0}}\)
Task: Implement the different kernels in
kernel.py
.
Test with:
python grader.py 1
Upload kernel.py
.
2.2 SVM Constructor
Theory reminder:
- Extract features and labels.
- Labels must be transformed to \(\pm1\) (: use +1 for label 1, and -1 otherwise.).
- Precompute the kernel matrix \(K_{ij} = K(\boldsymbol{x}_i, \boldsymbol{x}_j)\).
Task: Complete the constructor of SVM
class in svm.py
.
Test with:
python grader.py 2
Upload svm.py
.
2.3 Training the SVM
Theory reminder:
- The bias term is computed using the support vectors.
- Formula:
\[ \beta_0 = \text{average}_{i \in S} \left( y_i - \sum_{j=1}^n \alpha_j y_j K(\boldsymbol{x}_j, \boldsymbol{x}_i) \right) \]
Numerical Stability Reminder
In practice, when building the \(P\)
matrix defined by \(P_{ij} = y_i y_j K(x_i,
x_j)\), numerical issues can arise if \(P\) is not perfectly symmetric or not
strictly positive definite.
To ensure stability for the quadratic solver, it is recommended to:
Symmetrize \(P\) by doing:
\[ P \leftarrow \frac{1}{2}(P + P^T) \]
Slightly regularize \(P\) by adding a small value to the diagonal:
\[ P \leftarrow P + \epsilon I \]
where \(\epsilon\) is a small positive value like \(10^{-8}\).
Thus, before solving the QP, you can do in your code:
= (P + P.T) / 2
P += np.eye(P.shape[0]) * 1e-8 P
This helps avoid solver errors or inaccurate solutions.
Theory reminder: - Solve the dual quadratic
programming (QP) problem. - Use qpsolvers
to solve it.
Task:
- Implement
compute_beta_0
insvm.py
. - Implement the
train
method insvm.py
.
Test with:
python grader.py 3
Upload svm.py
.
2.4 Predicting with the SVM
Theory reminder: - Prediction for a new point:
\[ \hat{f}(\boldsymbol{x}) = \text{sign}\left( \sum_{i=1}^n \alpha_i y_i K(\boldsymbol{x}_i, \boldsymbol{x}) + \beta_0 \right) \]
Task: Implement predict
in
svm.py
.
Test with:
python grader.py 4
Upload svm.py
.
2.5 Testing the SVM
Theory reminder:
- Evaluate model on a test set.
- Build a confusion matrix.
Task: Implement test
method in
svm.py
using ConfusionMatrix
.
Test with:
python grader.py 5
Upload svm.py
.
2.6 Mail Classification
Theory reminder:
Evaluation Metrics:
- Recall = \(\frac{TP}{TP+FN}\)
- False Alarm Rate = \(\frac{FP}{FP+TN}\)
- Precision = \(\frac{TP}{TP+FP}\)
- F-Score = \(\frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}\)
Task: Use your SVM to classify emails into spam and ham. (Remember from TD5 the col_class is the first one). Evaluate the metrics.
Use the datasets:
csv/mail_train_server.csv
csv/mail_test_server.csv
A real dataset is given and you can try for fun:
csv/mail_train.csv
csv/mail_test.csv
Complete mail_selector.py
.
Test with:
python grader.py 6
Upload mail_selector.py
.