Part I. Latent Space
Question 1. A scatter plot of the faces in the two-dimensional (r = 2) latent space, together with the two faces that represent the dimensions/features of this representation.
Question 2. A visualization of the faces which define the features of the latent space for r = 5.
Question 3. The reconstruction of the faces with indices i ∈ {0, 10, 20} when using a rank of r ∈ {5, 25, 50, 100}. For example, the reconstructions of a picture when using 25, 50 and 100 (centered) principal components should look like those in Fig. 1.
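The rank-r reconstruction from centered principal components mentioned in Question 3 can be sketched as follows. This is only a minimal illustration under stated assumptions: the faces dataset is not part of this text, so a random matrix with 200 rows of 144 "pixels" stands in for it, and the helper name pca_reconstruct is hypothetical. It covers PCA only, not the other factorization methods.

```python
import numpy as np

def pca_reconstruct(X, r):
    """Rank-r reconstruction of the rows of X from centered principal components."""
    mean = X.mean(axis=0)                  # the "mean face"
    Xc = X - mean                          # center the data
    # Thin SVD: Xc = U @ diag(s) @ Vt; the rows of Vt are the principal directions.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:r]                    # r "feature faces", shape (r, d)
    Z = Xc @ components.T                  # latent coordinates, shape (n, r)
    X_hat = Z @ components + mean          # project back and add the mean
    return X_hat, Z, components

# Stand-in data (assumption): 200 samples with 12 x 12 = 144 features each.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 144)) @ rng.normal(size=(144, 144))

errors = {}
for r in (5, 25, 50, 100):
    X_hat, Z, components = pca_reconstruct(X, r)
    errors[r] = np.linalg.norm(X - X_hat)  # Frobenius reconstruction error
    print(f"r = {r:3d}: reconstruction error = {errors[r]:.2f}")
```

With real image data, row i of X_hat can be reshaped to the image dimensions to display the reconstructed face, and the rows of components can be displayed the same way to visualize the latent features.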
Question 4. Point out differences and similarities of the three methods and describe how the differences in the matrix factorization objective functions are expressed in the latent space.
Part II. Classification methods and metrics
Question 1. Using the Naive Bayes method (you are free to use an implementation of your choice), perform binary classification on the HW3 dataset. Try to maximize the accuracy obtained on the validation set, while training the model on the training set. Report the results of the worst and best performing models using the confusion matrix and the accuracy in percent. Draw the decision boundaries of the worst and best models obtained. Discuss qualitatively the process followed and the results obtained.
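The two requested metrics, the confusion matrix and the accuracy in percent, can be computed as in the sketch below. The label vectors are made-up illustrative values, not results on the HW3 dataset, and the function names are hypothetical. Rows of the matrix index the true class and columns the predicted class, which is one common convention.

```python
import numpy as np

def confusion_matrix_binary(y_true, y_pred):
    """2 x 2 confusion matrix: rows are true labels, columns are predictions."""
    cm = np.zeros((2, 2), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def accuracy_percent(cm):
    """Share of correct predictions (the diagonal of cm) in percent."""
    return 100.0 * np.trace(cm) / cm.sum()

# Made-up validation labels and predictions (assumption, for illustration only).
y_val = [0, 0, 1, 1, 1, 0, 1, 0]
y_hat = [0, 1, 1, 1, 0, 0, 1, 0]

cm = confusion_matrix_binary(y_val, y_hat)
print(cm)
print(f"accuracy = {accuracy_percent(cm):.1f}%")
```

The same two helpers apply unchanged to the predictions of any of the classifiers in this part.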
Question 2. Using the CART algorithm (you are free to use an implementation of your choice), perform binary classification on the HW3 dataset. Try to maximize the accuracy obtained on the validation set, while training the model on the training set. Report the results of the worst and best performing models using the confusion matrix and the accuracy in percent. Draw the decision boundaries of the worst and best models obtained. Draw the best and the worst CART models obtained. Discuss qualitatively the process followed and the results obtained.
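One possible way to search for the worst and best trees is to vary the maximum depth and compare validation accuracies, as sketched below. The sketch assumes scikit-learn, whose DecisionTreeClassifier is documented as an optimized version of CART. The HW3 dataset is not included in this text, so make_moons serves as a stand-in, and the depth grid is an arbitrary choice.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Stand-in for the HW3 dataset (assumption): a noisy two-class 2-D problem.
X, y = make_moons(n_samples=400, noise=0.25, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

results = {}
for depth in (1, 2, 3, 5, 8, None):        # None lets the tree grow fully
    tree = DecisionTreeClassifier(criterion="gini", max_depth=depth, random_state=0)
    tree.fit(X_tr, y_tr)                   # train on the training set only
    results[depth] = (100.0 * tree.score(X_val, y_val), tree)

best_depth = max(results, key=lambda d: results[d][0])
worst_depth = min(results, key=lambda d: results[d][0])
print(f"best:  max_depth={best_depth}, accuracy={results[best_depth][0]:.1f}%")
print(f"worst: max_depth={worst_depth}, accuracy={results[worst_depth][0]:.1f}%")

# Text rendering of the best tree; sklearn.tree.plot_tree gives a graphical one.
print(export_text(results[best_depth][1], feature_names=["x1", "x2"]))
```

Other hyperparameters, such as min_samples_leaf or the split criterion, could be varied in the same loop.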
Question 3. Using a Support Vector Machine (SVM; you are free to use an implementation of your choice), perform binary classification on the HW3 dataset. Use two kernels for the SVM: (a) linear and (b) radial basis function. For both kernels, try to maximize the accuracy obtained on the validation set, while training the model on the training set. For both kernels, report the results of the worst and best performing models using the confusion matrix and the accuracy in percent. For both kernels, draw the decision boundaries of the worst and best models obtained. Discuss qualitatively the process followed and the results obtained.
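A small grid over C (and gamma for the radial basis function kernel) is one way to find the worst and best SVM per kernel. The sketch below assumes scikit-learn's SVC and again uses make_moons as a stand-in for the HW3 dataset; the grid values are arbitrary. The last lines evaluate one model on a dense grid, which is the input needed for a decision-boundary plot.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Stand-in for the HW3 dataset (assumption).
X, y = make_moons(n_samples=400, noise=0.25, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

models = {}
for C in (0.01, 1.0, 100.0):
    models[("linear", C, None)] = SVC(kernel="linear", C=C).fit(X_tr, y_tr)
    for gamma in (0.01, 1.0, 100.0):
        models[("rbf", C, gamma)] = SVC(kernel="rbf", C=C, gamma=gamma).fit(X_tr, y_tr)

# Validation accuracy in percent for every configuration.
scores = {cfg: 100.0 * m.score(X_val, y_val) for cfg, m in models.items()}

summary = {}
for kernel in ("linear", "rbf"):
    cfgs = [cfg for cfg in scores if cfg[0] == kernel]
    best, worst = max(cfgs, key=scores.get), min(cfgs, key=scores.get)
    summary[kernel] = (best, worst)
    print(f"{kernel}: best {best} = {scores[best]:.1f}%, worst {worst} = {scores[worst]:.1f}%")

# Class predictions on a dense grid for the best RBF model.
xx, yy = np.meshgrid(
    np.linspace(X[:, 0].min() - 0.5, X[:, 0].max() + 0.5, 200),
    np.linspace(X[:, 1].min() - 0.5, X[:, 1].max() + 0.5, 200),
)
best_rbf = models[summary["rbf"][0]]
grid_pred = best_rbf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
# With matplotlib, plt.contourf(xx, yy, grid_pred) would draw the regions.
```

The grid evaluation works the same way for any fitted classifier that offers a predict method, so it can be reused for the other questions in this part.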
Question 4. Add the following two activation functions to the neural network code that you implemented yourself from scratch for HW3 Part I:
(a) f(x) = x, ∀x ∈ ℝ
(b) f(x) = x, ∀x ≥ 0, and f(x) = 0.3x, ∀x < 0
Please perform and report the following using the HW3 dataset:
Present in the report the derivation of the derivatives of the two activation functions above, so that they can be used in the optimization process of the MLP.
For each of the above-mentioned activation functions, create one MLP model in which the activation of the hidden neurons is set to the chosen activation function, while the remaining configuration is exactly the same as the one defined in the "Problem" section of HW3.
For both MLP models, try to maximize the accuracy obtained on the validation set, while training the model on the training set.
For both MLP models, draw the decision boundaries of the worst and best models obtained.
Discuss qualitatively the process followed and the results obtained.
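For reference, the two activation functions of Question 4 (the identity, and a leaky rectifier with slope 0.3 for negative inputs) and their derivatives can be written with NumPy as in the sketch below. The function names are hypothetical and not taken from the HW3 code. The leaky function is not differentiable at x = 0; using the value 1 there is a convention assumed here.

```python
import numpy as np

def identity(x):
    """f(x) = x for all real x."""
    return np.asarray(x, dtype=float)

def identity_grad(x):
    """f'(x) = 1 everywhere."""
    return np.ones_like(np.asarray(x, dtype=float))

def leaky_relu(x, slope=0.3):
    """f(x) = x for x >= 0 and slope * x for x < 0."""
    x = np.asarray(x, dtype=float)
    return np.where(x >= 0, x, slope * x)

def leaky_relu_grad(x, slope=0.3):
    """f'(x) = 1 for x >= 0 (convention at 0, an assumption) and slope for x < 0."""
    x = np.asarray(x, dtype=float)
    return np.where(x >= 0, 1.0, slope)

z = np.array([-2.0, -0.5, 0.0, 1.5])       # example pre-activations
print(leaky_relu(z))                        # values: -0.6, -0.15, 0.0, 1.5
print(leaky_relu_grad(z))                   # values: 0.3, 0.3, 1.0, 1.0
```

In backpropagation, the gradient arriving at a hidden layer is multiplied elementwise by the derivative evaluated at that layer's pre-activations.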
Question 5. Discuss the advantages and disadvantages of all of the methods used in Part II (Classification methods and metrics) of this resit exam. Which one do you believe is the best method to use on the HW3 dataset? Why?