DS 675: Machine Learning
Fall 2025

Instructor: Usman Roshan
Office: GITC 2106
Office Hours: M 4:45-5:25, T 3:15-3:55, W 1:45-2:25, Th 2:30-3:10
Ph: 973-596-2872
Email: usman@njit.edu

Textbooks:
Introduction to Machine Learning by Ethem Alpaydin (Recommended)
Learning with Kernels by Schölkopf and Smola (Recommended)
Foundations of Machine Learning by Mohri, Rostamizadeh, and Talwalkar (Recommended)

Grading: 35% mid-term, 65% course project

Course Overview: This course is a hands-on introduction to machine learning, covering both theory and application. We will cover linear and non-linear classification methods, image classification, neural network training, logistic regression, feature selection, clustering, dimensionality reduction, and decision trees and random forests. We will also implement methods in Python with scikit-learn and Keras and, if time permits, study kernel methods, Bayesian learning, and autoencoders.
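
As a taste of the hands-on side of the course, here is a minimal sketch of training a linear classifier in scikit-learn (the dataset and settings below are illustrative, not assigned course code):

```python
# Minimal sketch: a linear classifier in scikit-learn on a built-in dataset
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

clf = LogisticRegression(max_iter=5000)  # raise max_iter so the solver converges
clf.fit(X_train, y_train)
acc = clf.score(X_test, y_test)  # accuracy on held-out data
```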

Deadlines and exam dates: Project one page: Oct 3rd; Exam: Nov 14th; Final project PowerPoint: Dec 5th

Course material:


Topic
Notes
Linear modeling
Background Linear models
Least squares notes
Least squares gradient descent algorithm
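
The least squares objective can be minimized by gradient descent as in the notes above; a minimal NumPy sketch (data, step size, and iteration count are illustrative):

```python
# Least squares by full-batch gradient descent on synthetic data,
# minimizing (1/2n)||Xw - y||^2
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true  # noiseless targets for the demo

w = np.zeros(3)
eta = 0.1  # step size (illustrative)
for _ in range(500):
    grad = X.T @ (X @ w - y) / len(y)  # gradient of the objective
    w -= eta * grad
# w now approaches w_true
```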

Regularization

Stochastic gradient descent pseudocode
Stochastic gradient descent (original paper)
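
The stochastic variant updates on one randomly chosen example at a time instead of the full batch; a hedged sketch of the idea (not the paper's exact pseudocode):

```python
# Stochastic gradient descent for least squares: one example per update
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
w_true = np.array([0.5, 1.5, -1.0])
y = X @ w_true  # noiseless targets for the demo

w = np.zeros(3)
eta = 0.01  # smaller step than batch GD, since updates are noisy
for epoch in range(50):
    for i in rng.permutation(len(y)):  # shuffle the data each epoch
        grad_i = (X[i] @ w - y[i]) * X[i]  # gradient on a single example
        w -= eta * grad_i
```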
Neural networks
Multilayer perceptrons
Basic single hidden layer neural network
Backpropagation

Approximation by superpositions of a sigmoidal function (Cybenko 1989)
Approximation Capabilities of Multilayer Feedforward Networks (Hornik 1991)
The Power of Depth for Feedforward Neural Networks (Eldan and Shamir 2016)
The expressive power of neural networks: A view from the width (Lu et al. 2017)

Convolution and single layer neural networks objective and optimization
Softmax and cross-entropy loss
ReLU activation single layer neural networks objective and optimization
Multilayer neural network objective and optimization
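
The softmax and cross-entropy loss covered above can be sketched in a few lines of NumPy (variable names are mine):

```python
import numpy as np

def softmax(z):
    # subtract the row max for numerical stability; the result is unchanged
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(probs, labels):
    # mean negative log-probability assigned to the true class
    return -np.mean(np.log(probs[np.arange(len(labels)), labels]))

logits = np.array([[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]])
labels = np.array([0, 2])
p = softmax(logits)        # rows sum to 1
loss = cross_entropy(p, labels)
```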

Image localization and segmentation
Machine learning - running linear models in Python scikit-learn
Scikit learn linear models
Scikit learn support vector machines
SVM in Python scikit-learn
Breast cancer training
Breast cancer test
Linear data
Non linear data
Cross validation and balanced accuracy
Cross validation
Training vs. validation accuracy
Balanced error
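
A hedged sketch of cross validation with balanced accuracy in scikit-learn (the model, fold count, and dataset are illustrative):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

X, y = load_breast_cancer(return_X_y=True)
# balanced accuracy averages per-class recall, so a classifier cannot
# score well on imbalanced data just by predicting the majority class
model = make_pipeline(StandardScaler(), LinearSVC(max_iter=10000))
scores = cross_val_score(model, X, y, cv=5, scoring="balanced_accuracy")
mean_bacc = scores.mean()
```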
Deep learning - running neural networks in Scikit-learn
Scikit-learn MLPClassifier
Scikit-learn MLP code
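
A hedged sketch of a small multilayer perceptron with scikit-learn's MLPClassifier (layer sizes and dataset are illustrative):

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# standardize inputs first; gradient-based training is sensitive to feature scale
net = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0))
net.fit(X_train, y_train)
test_acc = net.score(X_test, y_test)
```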
Multiclass classification - linear models and neural networks
Multiclass classification
Different multiclass methods
One-vs-all method
Tree-based multiclass
Multiclass neural network softmax objective
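
The one-vs-all method above can be sketched with scikit-learn's wrapper (the base model and dataset are illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

X, y = load_iris(return_X_y=True)
# one binary classifier per class; predict the class whose classifier
# gives the highest score
ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, y)
n_classifiers = len(ovr.estimators_)  # equals the number of classes
train_acc = ovr.score(X, y)
```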
Deep learning - running neural networks in Keras on tabular data
Categorical variables
One hot encoding in scikit-learn
Keras multilayer perceptron on tabular data
Keras multilayer perceptron on tabular data with feature spaces
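
One hot encoding of a categorical variable in scikit-learn can be sketched as follows (the toy column is mine):

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

colors = np.array([["red"], ["green"], ["blue"], ["green"]])
enc = OneHotEncoder()
onehot = enc.fit_transform(colors).toarray()  # dense array for readability
# each row has a single 1 in the column for its category;
# columns follow the sorted category order (blue, green, red)
```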
Convolutions and image classification
Image classification code
Convolutions
Popular convolutions in image processing

Convolutions (additional notes)
Convolutions - example 1
Convolutions - example 2
Convolutions - example 3
Convolutions - example 4
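
A small NumPy sketch of the convolution operation using the Sobel kernel, one of the popular image-processing convolutions mentioned above (the helper name conv2d is mine):

```python
import numpy as np

def conv2d(image, kernel):
    # "valid" 2-D convolution: slide the flipped kernel over the image
    k = np.flipud(np.fliplr(kernel))
    h, w = kernel.shape
    out = np.empty((image.shape[0] - h + 1, image.shape[1] - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (image[i:i+h, j:j+w] * k).sum()
    return out

image = np.zeros((6, 6))
image[:, 3:] = 1.0  # a vertical edge down the middle
sobel_x = np.array([[-1., 0., 1.],
                    [-2., 0., 2.],
                    [-1., 0., 1.]])  # horizontal-gradient (edge) kernel
edges = conv2d(image, sobel_x)
# the response is large at the edge columns and zero in flat regions
```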

Convolutional neural network (Additional slides by Yunzhe Xue)
Convolution and single layer neural networks objective and optimization
Training and designing convolutional neural networks

Flower image classification with CNNs code
Neural networks: gradient descent, optimization, batch normalization, common architectures, data augmentation
Optimization in neural networks
Stochastic gradient descent pseudocode
Stochastic gradient descent (original paper)

Image classification code v2

Batch normalization
Batch normalization paper
How does batch normalization help optimization?
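
The batch normalization transform itself is short; a hedged NumPy sketch of the training-time computation (running statistics for inference are omitted):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    mu = x.mean(axis=0)                     # per-feature mean over the batch
    var = x.var(axis=0)                     # per-feature variance over the batch
    x_hat = (x - mu) / np.sqrt(var + eps)   # normalize each feature
    return gamma * x_hat + beta             # learned scale and shift

x = np.random.default_rng(0).normal(loc=5.0, scale=2.0, size=(32, 4))
out = batch_norm(x, gamma=np.ones(4), beta=np.zeros(4))
# with gamma=1, beta=0 each feature is roughly zero-mean, unit-variance
```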

Gradient descent optimization
An overview of gradient descent optimization algorithms

On training deep networks
The Loss Surfaces of Multilayer Networks

Common architectures

Transfer learning by Yunzhe Xue
Transfer learning in Keras
Pre-trained models in Keras

Understanding data augmentation for classification
SMOTE: Synthetic Minority Over-sampling Technique
Dataset Augmentation in Feature Space
Improved Regularization of Convolutional Neural Networks with Cutout
Logistic regression
Logistic regression
Softmax and cross-entropy loss
Feature selection
Feature selection
Feature selection (additional notes)
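
One common feature selection approach, univariate filtering, can be sketched in scikit-learn (the dataset and k are illustrative):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_breast_cancer(return_X_y=True)  # 30 input features
# score each feature independently with the ANOVA F-statistic
selector = SelectKBest(f_classif, k=5).fit(X, y)  # keep the top 5
X_sel = selector.transform(X)
```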
Clustering
Clustering
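
A hedged sketch of k-means clustering in scikit-learn (the synthetic data and choice of k are illustrative):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
# each point is assigned one of k labels; centers are the cluster means
labels = km.labels_
```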

Exam review sheet
Exam review sheet

Dimensionality reduction
Dimensionality reduction
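
A hedged sketch of dimensionality reduction with PCA in scikit-learn (the dataset and number of components are illustrative):

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, _ = load_digits(return_X_y=True)  # 64-dimensional inputs
pca = PCA(n_components=2).fit(X)
X2 = pca.transform(X)  # project onto the top two principal components
# components are ordered by the variance they explain
```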

Decision trees and random forests
Decision trees, bagging, boosting, and stacking
Decision trees (additional notes)
Ensemble methods (additional notes)
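
A hedged sketch comparing a single decision tree with a random forest in scikit-learn (dataset and defaults are illustrative):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
tree_acc = cross_val_score(DecisionTreeClassifier(random_state=0),
                           X, y, cv=5).mean()
forest_acc = cross_val_score(RandomForestClassifier(random_state=0),
                             X, y, cv=5).mean()
# bagging many randomized trees usually improves on a single tree
```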
Kernels (time permitting)
Kernels
More on kernels
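
A hedged sketch of why kernels matter: a linear SVM fails on concentric circles while an RBF-kernel SVM separates them (data parameters are illustrative):

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# concentric circles: not linearly separable in the input space
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)
linear_acc = SVC(kernel="linear").fit(X, y).score(X, y)
rbf_acc = SVC(kernel="rbf").fit(X, y).score(X, y)
# the RBF kernel implicitly maps to a space where the circles separate
```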
Maximum likelihood (time permitting)
Bayesian learning

Autoencoders (time permitting)
Generative models and networks
Autoencoder