Manifold Learning Based Sparse Representation for Image Classification

    1. Introduction

    Image classification is a challenging topic in computer vision and machine learning because of the complexity of the visual elements in images and the difficulty of correctly understanding image semantics.
    The figure below shows a general image classification framework, which consists of three major steps:
    (i) Feature extraction
    (ii) Feature enhancement
    (iii) Classification

    Figure: A general image classification framework.
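    To make the three steps concrete, here is a minimal, purely illustrative Python sketch. The histogram features, l2 normalization, and nearest-class-mean classifier are stand-ins chosen for brevity, not the methods used in this work:

```python
import numpy as np

def extract_features(img):
    """Step (i) Feature extraction: a grayscale intensity histogram
    (a stand-in for CNN or SIFT features used in practice)."""
    hist, _ = np.histogram(img, bins=16, range=(0, 256))
    return hist.astype(float)

def enhance_features(f):
    """Step (ii) Feature enhancement: l2 normalization
    (a sparse-coding step would go here)."""
    return f / (np.linalg.norm(f) + 1e-12)

def classify(f, class_means):
    """Step (iii) Classification: nearest class mean in the enhanced space."""
    dists = {c: np.linalg.norm(f - m) for c, m in class_means.items()}
    return min(dists, key=dists.get)

# Toy run: dark vs. bright 8x8 "images".
rng = np.random.default_rng(0)
dark = [rng.integers(0, 100, (8, 8)) for _ in range(5)]
bright = [rng.integers(156, 256, (8, 8)) for _ in range(5)]
means = {
    "dark": np.mean([enhance_features(extract_features(i)) for i in dark], axis=0),
    "bright": np.mean([enhance_features(extract_features(i)) for i in bright], axis=0),
}
test_img = rng.integers(0, 100, (8, 8))
print(classify(enhance_features(extract_features(test_img)), means))  # dark
```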

    2. Sparse Coding
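    Sparse coding represents a feature vector x as a sparse linear combination of the atoms of a dictionary D, typically by solving min_a 0.5||x - Da||_2^2 + lambda*||a||_1. A minimal NumPy sketch using ISTA follows; the choice of solver is an assumption for illustration, not necessarily the optimizer used here:

```python
import numpy as np

def soft_threshold(v, t):
    """Element-wise soft-thresholding (proximal operator of the l1 norm)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_code_ista(x, D, lam=0.1, n_iter=300):
    """Solve min_a 0.5*||x - D a||_2^2 + lam*||a||_1 by ISTA.
    x: (d,) signal; D: (d, k) dictionary with unit-norm atoms."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the smooth part
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)           # gradient of the quadratic term
        a = soft_threshold(a - grad / L, lam / L)
    return a

# Toy example: a signal built from two dictionary atoms.
rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))
D /= np.linalg.norm(D, axis=0)             # unit-norm atoms
x = 2.0 * D[:, 3] - 1.5 * D[:, 17]
a = sparse_code_ista(x, D, lam=0.05)
print(np.flatnonzero(np.abs(a) > 0.5))     # dominant atoms (should include 3 and 17)
```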


    3. Manifold Learning Based Discriminative Sparse Representation

    Objective
    Capitalize on both REPRESENTATION and DISCRIMINATION.
    Derive the learning method by integrating:
    (i) Complete MFA (Marginal Fisher Analysis)
    Analyzes both the column space and the null space of the within-class scatter matrix.
    (ii) Discriminative Sparse Representation Model
    Fuses the sparse representation criterion with a discriminative criterion to further improve the discriminative ability.
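    As a rough illustration of the "complete" analysis in step (i), the sketch below builds the within-class scatter matrix and splits the feature space into its column space and null space via an eigendecomposition. This is illustrative only and not the exact complete-MFA algorithm:

```python
import numpy as np

def within_class_scatter(X, y):
    """S_w = sum over classes c of sum_{x in c} (x - mu_c)(x - mu_c)^T."""
    d = X.shape[1]
    Sw = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c] - X[y == c].mean(axis=0)   # center each class
        Sw += Xc.T @ Xc
    return Sw

def split_spaces(Sw, rtol=1e-8):
    """Split R^d into the column (range) space and null space of S_w."""
    vals, vecs = np.linalg.eigh(Sw)
    mask = vals > rtol * vals.max()
    col = vecs[:, mask]     # directions carrying within-class variance
    nul = vecs[:, ~mask]    # zero within-class variance: highly discriminative
    return col, nul

# Toy data: 3 classes, 3 samples each, 12 dimensions, so S_w has
# rank 3*(3-1) = 6 and a non-trivial 6-dimensional null space.
rng = np.random.default_rng(1)
X = rng.standard_normal((9, 12))
y = np.repeat([0, 1, 2], 3)
col, nul = split_spaces(within_class_scatter(X, y))
print(col.shape[1], nul.shape[1])  # 6 6
```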

    4. Experiments

    Datasets Used
    Paintings-91 (painting categorization); 15 Scenes and MIT-67 (scene recognition); Caltech 101 and Caltech 256 (object recognition); AR Face and Extended Yale B (face recognition)

    Figure: Example images from the datasets used in the experiments.


    (i) Computational Art Painting Categorization (Artist and Scene Classification)


    Table: Comparison between the proposed method and other popular learning methods
    on the Paintings-91 dataset.


    (ii) Scene Recognition


    Table: Comparison between the proposed method and other popular learning
    methods on the 15 Scenes and MIT-67 datasets (scene recognition).


    (iii) Object Recognition


    Table: Comparison between the proposed method and other popular learning methods on the Caltech 101 and Caltech 256 datasets (object recognition).


    (iv) Face Recognition


    Table: Comparison between the proposed method and other popular learning
    methods on the AR Face and Extended Yale B datasets (face recognition).


    t-SNE Visualization


    Image: The t-SNE visualization of the initial input features and of the features extracted by the proposed CMFA-SR method for different datasets.
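    A visualization of this kind can be produced with scikit-learn's TSNE on any feature matrix. The clustered Gaussian features below are a hypothetical stand-in for the extracted image features:

```python
import numpy as np
from sklearn.manifold import TSNE

# Toy stand-in for extracted image features: three well-separated clusters.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 0.5, (20, 10)) for c in (0.0, 3.0, 6.0)])
labels = np.repeat([0, 1, 2], 20)

# Embed to 2-D for plotting; perplexity must stay below the sample count.
emb = TSNE(n_components=2, perplexity=15, init="pca",
           random_state=0).fit_transform(X)
print(emb.shape)  # (60, 2)
```

Each row of `emb` is a 2-D coordinate that can be scatter-plotted and colored by `labels` to inspect class separation before and after feature extraction.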




    Evaluation of the Size of the Training Data


    Image: The performance of the proposed method as the size of the training data varies on (a) the Caltech 101 dataset and (b) the 15 Scenes dataset.
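    A learning curve like this is produced by training on growing subsets and evaluating on a fixed held-out set. The sketch below uses synthetic features and a nearest-class-mean classifier as hypothetical stand-ins for the actual features and classifier:

```python
import numpy as np

def nearest_mean_accuracy(Xtr, ytr, Xte, yte):
    """Train a nearest-class-mean classifier and return test accuracy."""
    means = {c: Xtr[ytr == c].mean(axis=0) for c in np.unique(ytr)}
    pred = [min(means, key=lambda c: np.linalg.norm(x - means[c])) for x in Xte]
    return np.mean(np.array(pred) == yte)

rng = np.random.default_rng(0)
# Two well-separated Gaussian classes as a stand-in for image features.
X = np.vstack([rng.normal(0, 1, (100, 5)), rng.normal(4, 1, (100, 5))])
y = np.repeat([0, 1], 100)
idx = rng.permutation(200)
X, y = X[idx], y[idx]
Xte, yte = X[150:], y[150:]        # fixed held-out test set
for n in (10, 50, 150):            # growing training-set sizes
    acc = nearest_mean_accuracy(X[:n], y[:n], Xte, yte)
    print(n, round(acc, 2))
```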


    References

    [1] MSCNN-1, MSCNN-2 - Multi Scale CNN
    K.-C. Peng and T. Chen, "A framework of extracting multi-scale features using multiple convolutional neural networks," in ICME, June 2015, pp. 1-6.
    [2] CNN-F3, CNN-F4
    K.-C. Peng and T. Chen, "Cross-layer features in convolutional neural networks for generic classification tasks," in ICIP, Sept 2015, pp. 3057-3061.
    [3] SRC
    J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma, "Robust face recognition via sparse representation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 2, pp. 210-227, 2009.
    [4] D-KSVD
    Q. Zhang and B. Li, "Discriminative K-SVD for dictionary learning in face recognition," in Proc. IEEE Conf. CVPR, 2010, pp. 2691-2698.
    [5] LC-KSVD
    Z. Jiang, Z. Lin, and L. S. Davis, "Label consistent K-SVD: Learning a discriminative dictionary for recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 11, pp. 2651-2664, 2013.
    [6] Laplacian-SC
    S. Gao, I. W. H. Tsang, and L. T. Chia, "Laplacian sparse coding, hypergraph Laplacian sparse coding, and applications," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 92-104, 2013.
    [7] LLC
    J. Wang, J. Yang, K. Yu, F. Lv, T. Huang, and Y. Gong, "Locality-constrained linear coding for image classification," in Proc. IEEE Conf. CVPR, June 2010, pp. 3360-3367.
    [8] SPM
    S. Lazebnik, C. Schmid, and J. Ponce, "Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories," in Proc. IEEE Conf. CVPR, vol. 2, 2006, pp. 2169-2178.
    [9] Places-CNN, Hybrid-CNN
    B. Zhou, A. Lapedriza, J. Xiao, A. Torralba, and A. Oliva, "Learning deep features for scene recognition using places database," in Proc. NIPS, 2014, pp. 487-495.
    [10] DAG-CNN
    S. Yang and D. Ramanan, "Multi-scale recognition with DAG-CNNs," in Proc. IEEE International Conference on Computer Vision (ICCV), December 2015.
    [11] VGGNet CNN
    K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," CoRR, vol. abs/1409.1556, 2014.
    [12] CNN-M
    K. Chatfield, K. Simonyan, A. Vedaldi, and A. Zisserman, "Return of the devil in the details: Delving deep into convolutional nets," in BMVC, 2014.