CS 675: Introduction to Machine Learning
Fall 2016

Instructor: Usman Roshan
Office: GITC 3802
Ph: 973-596-2872
Office hours: MW 12:00 - 3:00
Email: usman@njit.edu

TA: Xiaoyuan Liang
Office: GITC 3800
Office hours:
Email: xl367@njit.edu

Textbook: Introduction to Machine Learning by Ethem Alpaydin
Grading: 25% mid-term, 30% final exam, 15% course project, 30% programming assignments
Course Overview: This course is a hands-on introduction to machine learning covering both theory and application. We will cover classification and regression algorithms in supervised learning such as naive Bayes, nearest neighbor, decision trees, random forests, hidden Markov models, linear regression, logistic regression, neural networks, and support vector machines. We will also cover dimensionality reduction, unsupervised learning (clustering), feature selection, and kernel methods. We will apply these algorithms to problems on real data such as digit recognition, text document classification, and prediction of cancer and molecular activity.

Course plan:

Topic
Date
Notes
Introduction, Bayesian learning, and Python
09/07/2016
Introduction
Linear algebra and probability background
Bayesian learning
Basic Unix command sheet
Instructions for AFS login
Textbook reading: all of Chapter 1 and sections 2.1, 2.4, 2.5, 2.6, 2.7
Bayesian learning
09/12/2016
Textbook reading: 4.1 to 4.5, 5.1, 5.2, 5.4, 5.5
Python
09/14/2016
Python
Python example 1
Python example 2
Python example 3
Nearest means and naive Bayes
09/19/2016
Nearest mean algorithm
Naive Bayes algorithm
Assignment 1
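A minimal NumPy sketch of the nearest mean classifier, for orientation (the function name and array interface are illustrative, not taken from the course materials):

    import numpy as np

    def nearest_mean_classify(train_X, train_y, test_X):
        # One mean vector per class, computed from the training data.
        classes = np.unique(train_y)
        means = np.array([train_X[train_y == c].mean(axis=0) for c in classes])
        # Assign each test point to the class whose mean is closest.
        dists = ((test_X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        return classes[dists.argmin(axis=1)]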
Kernel nearest means
09/21/2016
Datasets
Balanced error
Balanced error in Perl
Kernels
Kernel nearest means
Script to compute average test error
Textbook reading: 13.5, 13.6, 13.7
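Kernel nearest means replaces Euclidean distance to a class mean with distance in the kernel's feature space, using ||phi(x) - mu_c||^2 = k(x,x) - (2/n_c) sum_i k(x,x_i) + (1/n_c^2) sum_{i,j} k(x_i,x_j). A sketch assuming an RBF kernel and NumPy-array inputs (illustrative, not the course code):

    import numpy as np

    def rbf_kernel(A, B, gamma=0.1):
        # Gaussian kernel matrix between the rows of A and the rows of B.
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-gamma * sq)

    def kernel_nearest_mean(train_X, train_y, test_X, kernel=rbf_kernel):
        classes = np.unique(train_y)
        K_test = kernel(test_X, train_X)          # k(x, x_i)
        K_train = kernel(train_X, train_X)        # k(x_i, x_j)
        k_self = np.diag(kernel(test_X, test_X))  # k(x, x)
        dists = []
        for c in classes:
            idx = (train_y == c)
            n_c = idx.sum()
            cross = -2.0 / n_c * K_test[:, idx].sum(axis=1)
            within = K_train[np.ix_(idx, idx)].sum() / n_c ** 2
            dists.append(k_self + cross + within)
        return classes[np.argmin(np.array(dists), axis=0)]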
Separating hyperplanes
09/26/2016
Mean balanced cross-validation error on real data
Hyperplanes as classifiers
Textbook reading: 10.2, 10.3, 10.6, 11.2, 11.3, 11.5, 11.7
Multi-layer perceptrons
09/28/2016
Multi-layer perceptrons
Assignment 2: Implement gradient descent for least squares
Predicted labels for the ionosphere data (training labels trainlabels.0, eta=.0001)
Least squares in Perl
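Assignment 2 amounts to gradient descent on the least squares objective ||Xw - y||^2, whose gradient is 2 X^T (Xw - y). A minimal Python sketch (the fixed epoch count is an illustrative choice; eta=.0001 matches the posted ionosphere example):

    import numpy as np

    def least_squares_gd(X, y, eta=0.0001, epochs=1000):
        # Gradient descent on ||Xw - y||^2; gradient is 2 X^T (Xw - y).
        w = np.zeros(X.shape[1])
        for _ in range(epochs):
            w -= eta * 2.0 * X.T.dot(X.dot(w) - y)
        return w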
Support vector machines
10/03/2016
Textbook reading: 13.1 to 13.3
Support vector machines
Efficiency of coordinate descent methods on huge-scale optimization problems
Hardness of separating hyperplanes
Learning Linear and Kernel Predictors with the 0-1 Loss Function
More on kernels
10/05/2016
Kernels
Multiple kernel learning by Lanckriet et al.
Multiple kernel learning by Gonen and Alpaydin
Logistic regression
10/10/2016
Logistic regression
Solver for regularized risk minimization
Textbook reading: 10.7
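With labels y in {0,1}, the gradient of the logistic regression negative log-likelihood is X^T (sigma(Xw) - y), where sigma is the sigmoid. A minimal gradient descent sketch (illustrative; not the regularized solver linked above):

    import numpy as np

    def logistic_gd(X, y, eta=0.01, epochs=1000):
        # Labels y in {0,1}; descend the negative log-likelihood.
        w = np.zeros(X.shape[1])
        for _ in range(epochs):
            p = 1.0 / (1.0 + np.exp(-X.dot(w)))  # predicted probabilities
            w -= eta * X.T.dot(p - y)
        return w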
Regularized risk minimization
10/12/2016
Assignment 3: Implement hinge loss gradient descent
Regularized risk minimization
Solver for regularized risk minimization
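Assignment 3's hinge loss sum_i max(0, 1 - y_i w.x_i) has subgradient -y_i x_i at points whose margin is below 1. A sketch assuming labels in {-1,+1} and no regularization term:

    import numpy as np

    def hinge_gd(X, y, eta=0.001, epochs=1000):
        # Subgradient descent on sum_i max(0, 1 - y_i w.x_i).
        w = np.zeros(X.shape[1])
        for _ in range(epochs):
            active = y * X.dot(w) < 1  # points violating the margin
            w += eta * (y[active, None] * X[active]).sum(axis=0)
        return w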
Cross-validation and exam review
10/17/2016
Cross validation
svm_learn
svm_classify
run_svm_light.pl
linear-bmrm-train
linear-bmrm-predict
bmrm.pl
BMRM training config file
BMRM test config file
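For reference, the mechanics of k-fold cross-validation in a minimal Python sketch (train_and_predict is a hypothetical placeholder for any of the classifiers above):

    import numpy as np

    def kfold_error(X, y, train_and_predict, k=5, seed=0):
        # Shuffle, split into k folds, average the held-out error.
        idx = np.random.RandomState(seed).permutation(len(y))
        errors = []
        for fold in np.array_split(idx, k):
            train = np.setdiff1d(idx, fold)
            pred = train_and_predict(X[train], y[train], X[fold])
            errors.append((pred != y[fold]).mean())
        return np.mean(errors)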
Mid-term exam review
10/19/2016
Mid-term exam
10/24/2016
Feature selection
10/26/2016
Assignment 4: Implement the logistic discrimination algorithm
Feature selection
A comparison of univariate and multivariate gene selection techniques for classification of cancer datasets
Feature selection with SVMs and F-score
Ranking genomic causal variants with chi-square and SVM
Dimensionality reduction
10/31/2016
Dimensionality reduction
Textbook reading: Chapter 6 sections 6.1, 6.3, and 6.6
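As a quick reference for PCA: center the data, take the top-k eigenvectors of the covariance matrix, and project. A minimal sketch (illustrative only; the lecture notes are the authoritative treatment):

    import numpy as np

    def pca(X, k):
        # Project centered data onto the top-k principal directions.
        Xc = X - X.mean(axis=0)
        cov = Xc.T.dot(Xc) / len(X)
        vals, vecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
        top = vecs[:, -k:][:, ::-1]        # top-k directions, descending
        return Xc.dot(top)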
Mid-term solution
11/02/2016
Assignment 5: Implement a decision tree in Python
Dimensionality reduction
11/07/2016
Dimensionality reduction
Maximum margin criterion
Laplacian linear discriminant analysis
Decision trees, bagging, and boosting
11/09/2016
Decision trees, bagging, and boosting
Univariate vs. multivariate trees
Survey of decision trees
Gradient boosted trees: Slides by Tianqi Chen
Decision and regression trees: Slides by Patrick Breheny
Regression trees: Slides by Cosma Shalizi
Textbook reading: Chapter 9 section 9.2; Chapter 17 sections 17.4, 17.6, 17.7
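The core step of a decision tree (and of Assignment 5) is choosing the split that minimizes weighted impurity; the full tree then recurses on each side of the chosen split. A sketch of an exhaustive Gini-based split search (illustrative, not the assignment solution):

    import numpy as np

    def gini(y):
        # Gini impurity of a label vector.
        _, counts = np.unique(y, return_counts=True)
        p = counts / counts.sum()
        return 1.0 - (p ** 2).sum()

    def best_split(X, y):
        # Search all (feature, threshold) pairs for the lowest weighted Gini.
        best = (None, None, np.inf)
        for j in range(X.shape[1]):
            for t in np.unique(X[:, j]):
                left = X[:, j] <= t
                if left.all() or not left.any():
                    continue
                score = left.mean() * gini(y[left]) + (~left).mean() * gini(y[~left])
                if score < best[2]:
                    best = (j, t, score)
        return best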
Unsupervised learning - clustering
11/14/2016
Clustering
Assignment 6: Implement k-means clustering in Python
Tutorial on spectral clustering
K-means via PCA
Textbook reading: Chapter 7 sections 7.1, 7.3, 7.7, and 7.8
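Assignment 6's k-means is Lloyd's algorithm: assign each point to its nearest center, recompute each center as its cluster mean, and repeat until the centers stop moving. A minimal NumPy sketch (initialization and stopping rule are illustrative choices):

    import numpy as np

    def kmeans(X, k, iters=100, seed=0):
        rng = np.random.RandomState(seed)
        centers = X[rng.choice(len(X), k, replace=False)]
        for _ in range(iters):
            # Assignment step: nearest center for every point.
            d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            labels = d.argmin(axis=1)
            # Update step: cluster means (keep old center if a cluster empties).
            new = np.array([X[labels == c].mean(axis=0) if (labels == c).any()
                            else centers[c] for c in range(k)])
            if np.allclose(new, centers):
                break
            centers = new
        return labels, centers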
Clustering
11/16/2016
Course project
Training dataset
Training labels
Test dataset
Error bounds, stacking
11/21/2016
Error bounds
Stacking and random hyperplanes
Random projections in dimensionality reduction
Random Bits Regression: a Strong General Predictor for Big Data
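Random projection reduces dimensionality by multiplying the data with a random Gaussian matrix, which approximately preserves pairwise distances (the Johnson-Lindenstrauss lemma). A minimal sketch:

    import numpy as np

    def random_project(X, k, seed=0):
        # Gaussian random matrix, scaled to roughly preserve squared norms.
        rng = np.random.RandomState(seed)
        R = rng.randn(X.shape[1], k) / np.sqrt(k)
        return X.dot(R)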
Regression
11/28/2016
Regression
Textbook reading: Chapter 4 section 4.6, Chapter 10 section 10.8, Chapter 13 section 13.10
Hidden Markov models
11/30/2016
Hidden Markov models
Textbook reading: Chapter 15 (all of it)
Feature learning
12/05/2016
Learning Feature Representations with K-means
Analysis of single-layer networks in unsupervised feature learning
On Random Weights and Unsupervised Feature Learning
Feature learning with k-means
Project submission, comparison of classifiers
12/07/2016
Comparison of classifiers
Do we Need Hundreds of Classifiers to Solve Real World Classification Problems?
An Empirical Comparison of Supervised Learning Algorithms
Statistical Comparisons of Classifiers over Multiple Data Sets
Big data
12/12/2016
Big data
Mini-batch k-means
Stochastic gradient descent
Map-Reduce for machine learning on multicore
Review for final, announcement of project winner
12/14/2016
Assignment 7 (optional)
Random hyperplanes