Machine Learning

2. Dimension reduction methods

Goal: apply dimension reduction methods to the digits dataset.
Example: code examples are provided for the old pottery (poteries) dataset.

1. Download the digits dataset
# Python
import numpy as np
import matplotlib.pyplot as plt
from urllib.request import urlopen
# `url` must hold the address of the digits CSV file
data = np.loadtxt(urlopen(url), delimiter=',', skiprows=1)  # read the file once
digits = data[:, 1:785]  # the 784 pixel columns (28 x 28)
labels = data[:, 0]      # assumption: the digit label is in the first column
plt.imshow(np.reshape(digits[0, :], [28, 28]))  # plot of the 1st image
plt.show()
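# R
# `url` must hold the address of the digits CSV file
data = read.csv(url, header=TRUE, sep=",")  # read the file once
digits = data[,-1]  # drop the first column, keep the 784 pixel columns
labels = data[,1]   # assumption: the digit label is in the first column
image(matrix(as.matrix(digits[1,]),28,28,byrow=TRUE)) # plot of the 1st image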

2. Visualization with PCA
Run PCA and plot the observations on the first principal plane (the first two components), with one color per digit.
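
As a starting point, here is a minimal sketch of this step that computes the PCA with a NumPy SVD. It assumes stand-in random data with the same layout as the digits (one row of 784 pixels per image); the stand-in arrays and the helper name plot_first_plane are illustrative and not part of the course code. Replace them with the digits and labels arrays loaded in step 1.

```python
import numpy as np

# Stand-in data (assumption): 300 fake "images" of 784 pixels in 10 groups.
# Replace with the `digits` and `labels` arrays loaded in step 1.
rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=300)
centers = rng.normal(size=(10, 784))
digits = centers[labels] + 0.5 * rng.normal(size=(300, 784))

# PCA through the SVD of the centered data matrix.
X = digits - digits.mean(axis=0)
U, s, Vt = np.linalg.svd(X, full_matrices=False)
scores = U[:, :2] * s[:2]          # coordinates on the first two components
explained = s**2 / np.sum(s**2)    # share of variance carried by each component


def plot_first_plane(scores, labels):
    """Scatter plot of the first principal plane, one color per digit."""
    # Imported here so that the numerical part runs without matplotlib.
    import matplotlib.pyplot as plt
    sc = plt.scatter(scores[:, 0], scores[:, 1], c=labels, cmap='tab10', s=10)
    plt.colorbar(sc, label='digit')
    plt.xlabel('PC1')
    plt.ylabel('PC2')
    plt.show()
```

Calling plot_first_plane(scores, labels) displays the plot. The explained array gives a quick check of how much variance the first plane captures.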

3. Reconstruction with PCA
Run PCA, use the barplot of the eigenvalues to select a suitable number of components, and reconstruct the images from this low-dimensional representation.
Choose one image in the dataset and plot the original image next to its approximation.
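
The reconstruction can be sketched as follows, again with NumPy only. The stand-in data, the 90% cumulative variance threshold and the helper name plot_original_and_approximation are assumptions made for illustration; the threshold in particular is arbitrary and should be replaced by whatever the eigenvalue barplot suggests.

```python
import numpy as np

# Stand-in data (assumption): replace with the `digits` array loaded in step 1.
rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=300)
centers = rng.normal(size=(10, 784))
digits = centers[labels] + 0.5 * rng.normal(size=(300, 784))

# PCA through the SVD of the centered data matrix.
mean = digits.mean(axis=0)
X = digits - mean
U, s, Vt = np.linalg.svd(X, full_matrices=False)
eigenvalues = s**2 / (X.shape[0] - 1)   # covariance eigenvalues (the barplot values)

# Smallest number of components reaching 90% of the variance (arbitrary threshold).
cumulative = np.cumsum(eigenvalues) / np.sum(eigenvalues)
k = int(np.searchsorted(cumulative, 0.90) + 1)

# Project on the first k components, then map back to pixel space.
scores = X @ Vt[:k].T
approx = scores @ Vt[:k] + mean

# Compare one image with its approximation.
i = 0
original_image = digits[i].reshape(28, 28)
approx_image = approx[i].reshape(28, 28)
rel_error = np.linalg.norm(digits[i] - approx[i]) / np.linalg.norm(digits[i])


def plot_original_and_approximation(original_image, approx_image):
    """Show the original image and its PCA approximation side by side."""
    # Imported here so that the numerical part runs without matplotlib.
    import matplotlib.pyplot as plt
    fig, axes = plt.subplots(1, 2)
    axes[0].imshow(original_image, cmap='gray')
    axes[0].set_title('original')
    axes[1].imshow(approx_image, cmap='gray')
    axes[1].set_title('approximation')
    plt.show()
```

Calling plot_original_and_approximation(original_image, approx_image) displays both images. Varying k shows how the approximation degrades as components are dropped.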

4. Visualization with MDS
Compare the plot obtained in step 2 with the one obtained using MDS.
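
One way to make this comparison concrete is classical (Torgerson) MDS, sketched below in NumPy on stand-in data. This is an assumption about which MDS variant to use: with Euclidean distances, classical MDS gives the same configuration as PCA up to the sign of each axis, whereas a stress-minimizing implementation such as sklearn.manifold.MDS will generally give a somewhat different plot. The function name classical_mds is illustrative.

```python
import numpy as np

# Stand-in data (assumption): replace with the `digits` array loaded in step 1.
rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=200)
centers = rng.normal(size=(10, 784))
digits = centers[labels] + 0.5 * rng.normal(size=(200, 784))


def classical_mds(data, n_components=2):
    """Classical MDS computed from the squared Euclidean distance matrix."""
    sq_norms = np.sum(data**2, axis=1)
    D2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * data @ data.T
    n = D2.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ D2 @ J                    # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)     # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, order] * np.sqrt(np.maximum(eigvals[order], 0.0))


mds_coords = classical_mds(digits)

# PCA scores on the same data, for comparison.
X = digits - digits.mean(axis=0)
U, s, Vt = np.linalg.svd(X, full_matrices=False)
pca_scores = U[:, :2] * s[:2]

# Largest difference between the two configurations, ignoring axis signs.
max_gap = np.max(np.abs(np.abs(mds_coords) - np.abs(pca_scores)))
```

The distance matrix has one row and one column per image, so on the full dataset it may be worth working on a random subsample.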

5. Visualization with t-SNE
Run t-SNE on the digits data for several perplexity values.
Plot the results with one color per digit.
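
A possible sketch of this step uses the TSNE class of scikit-learn, which is an assumption about the library (the course code may use another implementation). The stand-in data, the perplexity values 5, 30 and 50, and the helper name plot_embeddings are illustrative choices only.

```python
import numpy as np
from sklearn.manifold import TSNE

# Stand-in data (assumption): replace with the `digits` and `labels` arrays
# loaded in step 1, possibly subsampled to keep the running time reasonable.
rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=150)
centers = rng.normal(size=(10, 784))
digits = centers[labels] + 0.5 * rng.normal(size=(150, 784))

# One 2D embedding per perplexity value (each must be below the sample count).
perplexities = [5, 30, 50]
embeddings = {}
for p in perplexities:
    tsne = TSNE(n_components=2, perplexity=p, random_state=0)
    embeddings[p] = tsne.fit_transform(digits)


def plot_embeddings(embeddings, labels):
    """One scatter plot per perplexity value, one color per digit."""
    # Imported here so that the numerical part runs without matplotlib.
    import matplotlib.pyplot as plt
    fig, axes = plt.subplots(1, len(embeddings), figsize=(5 * len(embeddings), 4))
    for ax, (p, coords) in zip(np.atleast_1d(axes), embeddings.items()):
        ax.scatter(coords[:, 0], coords[:, 1], c=labels, cmap='tab10', s=10)
        ax.set_title('perplexity = %d' % p)
    plt.show()
```

Calling plot_embeddings(embeddings, labels) shows the embeddings side by side. Fixing random_state makes the runs repeatable, which helps when comparing perplexity values.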

Codes: Poteries_DimensionReduction_FC.py, Poteries_DimensionReduction.R, digits_DimensionReduction_ToStart.py, digits_DimensionReduction_ToStart.R