IS392 - Web Mining and Information Retrieval (Spring 2022)
Course Number: IS392-002
Classroom: Tiernan Hall 113 (after Jan. 30)
Class Meets: 11:30 am - 12:50 pm, Monday & Wednesday
Faculty Instructor: Hua Wei, Ph.D.
E-mail: hua.wei AT njit.edu
Office: GITC 3803H
Office Hours: Monday 1-2 pm, or by appointment
Overview
This course introduces the design, implementation, and evaluation of web mining applications. Topics include automatic indexing, natural language processing, retrieval algorithms, basic machine learning techniques, and their applications to web data. Students will gain hands-on experience applying these theories in case studies.
Prerequisites
- IS218 OR IT114 OR CS114
- Background in programming, linear algebra, probability, algorithm analysis, and data structures.
- Note: Assignments and projects should be implemented in Python.
Textbook
There is no required textbook. Below are some recommended reference books.
- Search Engines: Information Retrieval in Practice, by Croft, Metzler, and Strohman. Publisher: Addison-Wesley
- Paper 1: What Do People Want from Information Retrieval?, W. Bruce Croft
- Paper 2: Search Engine Optimization Starter Guide, Google
- Jiawei Han, Micheline Kamber and Jian Pei, Data Mining: Concepts and Techniques, 3rd ed.
- Kevin Patrick Murphy, Machine Learning: A Probabilistic Perspective, 2012.
Assignments and Grading
Assignments and quiz | 45% (5% quiz, 14% assignment 1, 13% assignment 2, 13% assignment 3)
Project | 45% (10% report 1, 10% report 2, 10% report 3, 15% final report)
Class attendance | 10%
- Assignments: All assignments are done individually.
- Project: The course project is carried out as a team.
- Class attendance: Attending class is required. Excused absences must be approved by the instructor BEFORE the class.