CS 675: Introduction to Machine Learning
Fall 2014

Instructor: Usman Roshan
Office: GITC 3802
Ph: 973-596-2872
Office hours: Tue 4-5:30, Wed 2-5
Email: usman@cs.njit.edu

Textbook: Introduction to Machine Learning by Ethem Alpaydin
Grading: 25% mid-term, 30% final exam, 15% course project, 30% programming assignments
Course Overview: This course is a hands-on introduction to machine learning that covers both theory and application. We will cover supervised learning algorithms for classification and regression, such as naive Bayes, nearest neighbor, decision trees, random forests, hidden Markov models, linear regression, logistic regression, and support vector machines. We will also cover dimensionality reduction, unsupervised learning (clustering), feature selection, and kernel methods. We will apply these algorithms to problems on real data, such as digit recognition, text document classification, and prediction of cancer and molecular activity.

Course plan:

Topic
Date
Notes
Introduction, Bayesian learning, and Perl/Python
09/02/2014
Introduction
Linear algebra and probability background
Bayesian learning
Basic Unix command sheet
Instructions for AFS login
Bayesian learning and Perl/Python
09/04/2014
Bayesian learning and Perl
09/09/2014
Perl
Python
Perl example 1
Perl example 2
Perl example 3
Python example 1
Python example 2
Python example 3
Bayesian learning and Perl
09/11/2014
Nearest mean algorithm
Naive Bayes algorithm
Assignment 1: Implement the naive Bayes algorithm
Nearest means and naive Bayes
09/16/2014
Datasets
Kernel nearest means
09/18/2014
Balanced error
Balanced error in Perl
Kernels
Kernel nearest means
Linear separators
09/23/2014
Mean balanced cross-validation error on real data
Hyperplanes as classifiers
Linear separators
09/25/2014
Assignment 2: Implement the perceptron algorithm
Perceptron in Python
09/30/2014
Hardness of separating hyperplanes
Support vector machines
10/02/2014
Support vector machines
10/07/2014
Script to compute average test error
Support vector machines and kernels
10/09/2014
Kernels
Multiple kernel learning by Lanckriet et al.
Multiple kernel learning by Gonen and Alpaydin
Logistic regression and regularized risk minimization
10/14/2014
Logistic regression
Regularized risk minimization
Solver for regularized risk minimization
Cross-validation and exam review
10/16/2014
Cross-validation
svm_learn
svm_classify
run_svm_light.pl
linear-bmrm-train
linear-bmrm-predict
bmrm.pl
BMRM training config file
BMRM test config file
Mid-term exam
10/21/2014
Mid-term solution
10/23/2014
Assignment 3: Implement the logistic discrimination algorithm
Feature selection
10/28/2014
Feature selection
A comparison of univariate and multivariate gene selection techniques for classification of cancer datasets
Feature selection with SVMs and F-score
Ranking genomic causal variants with chi-square and SVM
Feature selection
10/30/2014
Dimensionality reduction
11/04/2014
Dimensionality reduction
Dimensionality reduction
11/06/2014
Dimensionality reduction II
Maximum margin criterion
Laplacian linear discriminant analysis
Dimensionality reduction and unsupervised learning
11/11/2014
Dimensionality reduction III
Tutorial on spectral clustering
K-means via PCA

Course project
Training dataset
Training labels
Test dataset
Regression
11/13/2014
Clustering
Assignment 4: Implement cross-validation for support vector machines
Regression
11/18/2014
Regression
Hidden Markov models
11/20/2014
Hidden Markov models
Bagging and boosting
11/25/2014
Bagging and boosting
Feature learning
12/02/2014
Analysis of single-layer networks in unsupervised feature learning
Feature learning with k-means
Project results
12/04/2014
Review for final
12/09/2014