Final status: May 5, 2006 (no further updates of this web page are planned)

NEW! Thanks for taking the course!


 

CIS 610: »Data Structures and Algorithms«

Spring 2006

Section 004 (3 credits)
 


Instructor:

Dr. Artur Czumaj

Office:  4413 (GITC Building, 4th floor)
Tel:  (973) 596-3369
Fax:  (973) 596-5777
EMail:  czumaj@cis.njit.edu
WWW:  http://web.njit.edu/~czumaj

Schedule:

Class hours:   Monday    1:00pm -   2:25pm   Room: Faculty Memorial Hall 306
Wednesday   1:00pm -   2:25pm   Room: Faculty Memorial Hall 306


Office hours:
  Wednesday  11:30am - 12:50pm  
Thursday    4:00pm -   5:25pm            No Office hours after May 1


Contents:

The main aim of the course is to teach students to understand algorithms and data structures. We will discuss algorithms and data structures for most fundamental problems. For those problems we shall focus on the design of efficient algorithms and the use of proper data structures. Some issues of the analysis of the algorithms (like correctness, running-time analysis, etc.) will be also covered.

From the NJIT graduate course catalog: Intensive study of the fundamentals of data structures and algorithms. Presents the definitions, representations, processing algorithms for data structures, data structures, algorithms and their applications including linked lists, various tree organizations, hash tables, strings, storage allocation, algorithms for searching and sorting, and a selected collection of other algorithms. Programs are assigned to give students experience in algorithms, data structure design and implementation.


Prerequisite:

  1. CIS 114 (Introduction of Computer Science II), or CIS 505, or equivalent
  2. Math 226 (Discrete Math)

Contact the instructor immediately if you didn't take any of these courses.

Additionally, students must be able to program in either C++ or Java and to understand well programs written in any of these two programming languages.


Grading and Policies:

  • Grading will be based on homework assignments, two large programming assignments, an in-class midterm, and a final.

  • Grading policy:
    • Write Assignments 15%
    • Programming Assignments 15%
    • Midterm 30%
    • Final 40%

  • Programming assignments must be done either in C++ or in Java.

  • You must do all assignments individually. No teamwork.

  • No late submissions. Programming assignments are for several weeks.

  • Hand in all assignments at the beginning of the class period on the due date.

  • Two large programming assignments.

  • The work you turn in must be your own personal work, composed and written by you (unless stated very clearly otherwise). If you discuss a problem with a fellow student, you must mention this clearly in your homework (name the fellow student before the solution of the problem in question). Your work will then be compared to the other student's work to verify that your solution was written by you and reflect your own personal effort. If you don't report it, it will be considered a violation of the course rules, as described in the General Collaboration Policy section. (Collaboration of this kind is not allowed in the midterm and final exams.)

  • Cheating: it is surprisingly easy to detect when students are not doing their own work. Working with others on the assignments does not mean dividing the problems among you, and then swapping the answers.

  • The NJIT Academic Honor Code (http://www.njit.edu/academics/honorcode.php) will be upheld.
    In particular, copying homework solutions or programs, in full or in part, is forbidden. You may discuss ideas and concepts with your fellow students, but you may NOT copy it. Any violations will be brought to the immediate attention of the Dean of Students.

  • Link to the Departmental course web page with syllabus.


Course outline / topics to be covered (tentative):

  • General introduction:
    • what is an "algorithm design"
    • what are "data structures"
    • what does it mean to "analyze an algorithm"; worst-case and average-case analysis
    • we shall learn about techniques, tools, tricks plus a necessary information about the complexity issues
  • Analysis Techniques
    • Induction, Recursion
    • Recurrence Relations
  • Basic Data Structures
    • Abstract Data Types
    • Linked List
    • Stacks
    • Queues
  • Trees
    • Tree Traversals
    • Huffman Codes
    • Priority Queues, Heaps
    • Search Trees
      • Binary Search Trees
  • Hashing
  • Searching Algorithm
    • Sequential Search
    • Binary Search
  • Sorting Algorithms
    • Basic Algorithms
      • Insertion Sort
      • Bubble Sort
      • Selection Sort
    • Integer Sorting
      • Bucket Sort
      • Radix Sort
    • Advanced Algorithms
      • Merge Sort
      • Heap Sort
      • Quick Sort
  • Algorithm Design Techniques
    • Divide and Conquer
    • Greedy Method
    • Dynamic Programming
  • Graph Algorithms
    • Graph Traversals
    • Shortest Path Problems
    • Minimum Spanning Tree


References:

Textbook:

  • Michael T. Goodrich and Roberto Tamassia »Algorithm Design Foundations, Analysis, and Internet Examples«, John Wiley & Sons, Inc. 2002, ISBN 0-471-38365-1.

Additional references:

  • Cormen, Leiserson, Rivest, and Stein, Introduction to Algorithms, McGraw-Hill (2nd edition).
  • Aho, Hopcroft, and Ullman, Data Structures & Algorithms, Addison-Wesley, 1987.
  • S. Sahni, Data Structures, Algorithms and Applications in C++, McGraw-Hill, 1998.
  • R. Sedgewick, Algorithms in C++, Addison-Wesley. (Algorithms in Java is also OK.)


Homework Assignments, Programming Assignments, and Exams:

  • Homework assignment 1 was due February 8, 2006.
  • Homework assignment 2 was due March 10, 2006.
  • Homework assignment 3 was due April 26.
  • Programming assignment 1 was due April 10.

  • Programming assignment 2 was due May 3.
  • To discuss grading of Programming Assignments and Homework Assignments please contact the TA Li Lu (Email: ll72@njit.edu)

  • Midterm Exam took place at 4:00-5:30pm on Wednesday, March 8, 2006.
    • Midterm exam covered all material discussed so far, including all material from the textbook from Chapters 1-3.
    • Midterm was 1.5 hours long and had 4 questions.
    • In order to prepare to the Midterm exam, instructor recommended to work with Exercises from the textbook available at the end of Chapters 1-3 (including Creativity questions).
    • To discuss grading please contact the TA Li Lu
      • Office hours: Tuesday and Thursday, 2:30-4:00pm, Room GITC 4220
      • Email: ll72@njit.edu

  • Final Exam took place on Friday, May 5, 8:30-11:00am (Room: KUPF 106).
    • Exam covered ALL material discussed in the class and presented in the textbook (Chapters 1-7)
    • It was open textbook (but you're not supposed to bring other materials, like notes, laptops, computers, calculators, etc.)
    • Suggested reading: Chapters 1-7 in the textbook (don't forget to solve all exercises from Reinforcement and Creativity sections)
    • NEW! Grades: No grades for the final exam will be posted here. The final grades for the course will be posted directly to the registrar on around Thursday/Friday, May 11/12. From there, students should be able to access the grades via Pipeline.
      Every student can always ask the instructor per email (czumaj@cis.njit.edu) about her/his performance on the final and individual scores for every problem.


Specific contents:

  1. (1/18/06) First class
    • Discussion of the syllabus, requirements, topics to be covered, textbooks.
    • General introduction:
      • what is "algorithm design"
      • what are "data structures"
      • what does it mean to "analyze an algorithm"; worst-case and average-case analysis.
    • Algorithms complexity and how to measure the running time of an algorithm.
    • Suggested reading: Chapter 1 in the textbook.

  2. (1/23/06)
    • Algorithms complexity and how to measure the running time of an algorithm.
    • Basic algorithms and its running-time analysis:
      • Finding the smallest element in an array of size n can be done using n-1 comparisons (actually, n-1 comparisons are necessary)
      • Finding the smallest and the second smallest elements in an array of size n can be easily done using 2n-3 comparisons
      • ... but we can do better: finding the smallest and the second smallest elements in an array of size n can be done using n + log2n - 2 comparisons (and it's impossible to do it better)
      • Task to think at home: how could we find the smallest and the largest elements in an array of size n with at most 1.5 n - 2 comparisons?
    • Introduction to "Big-Oh" notation Big-Oh:
      • f(n) = O(g(n)) iff there are two positive constants n0 and c such that for every n > n0 we have f(n) c g(n)
    • Suggested reading: Chapter 1 in the textbook.

  3. (1/25/06)
    • Brief discussion of how to find the smallest and the largest elements in an n-element array with at most 1.5 n - 2 comparisons.
    • "Big-Oh" notation and asymptotic growth of functions.
      • Big-Oh:
        • f(n) = O(g(n)) iff there are two positive constants n0 and c such that for every n > n0 we have f(n) ≤ c g(n).
      • Big-Omega:
        • f(n) = W(g(n)) iff there are two positive constants n0 and c such that for every n > n0 we have f(n) ≥ c g(n).
      • Big-Theta:
        • f(n) = Q(g(n)) iff f(n) = O(g(n)) and f(n) = W(g(n)).
    • Few basic algorithms and their running-time analysis
      • minimum-finding,
      • SelectionSort,
      • matrix multiplication,
      • prefix-minima.
    • Four important classes of functions (used to describe the running time of algorithms):
      • constant (typical example are functions of the form Q(1), e.g., 5, 15.5, ½, or so)
      • polylogarithmic (typical example are functions of the form (log(n))Q(1), e.g., log2n, (log2n)3, log2n log2log2n)
      • polynomial (typical example are functions of the form nQ(1), e.g., n2, n log2n, n, n7, n0.1, √n)
      • exponential (typical example are functions of the form 2Q(n) or (Q(n))!, e.g., (n/2)!, 2n, en, 23n-11, 19n/19)
    • Suggested reading: Chapter 1 in the textbook.

  4. (1/30/06)
    • Why finding a minimum in an n-element array requires at least n-1 comparisons.
    • Analysis/comparison of asymptotic growth of functions.
    • Searching a set for a given element
      • Problem definition.
      • Basic algorithm using array representation of the input set; running-time is O(n).
      • Basic algorithm using a linked list as input representation; running-time is O(n).
      • Binary search: what we can gain if the input is sorted:
        • Basic iterative algorithm.
        • Recursive version of binary search algorithm.
        • Recursion for the running time of the recursive binary search: T(n) = O(1) + T(n/2).
        • Running time of the iterative binary search algorithm: O(log n).

  5. (2/1/06)
    • Priority queues:
      • Definition.
      • Basic implementations of heaps using arrays:
        • one implementation requires O(1) time for insert and (n) time for deleteMin,
        • another implementation requires O(n) time for insert and only O(1) time for deleteMin.
      • Using priority queues for sorting (n times insert + n times deleteMin).
      • Selection Sort - an implicit use of priority queues (sorting algorithm that requires O(n2) time).
    • Heaps and priority queues.
    • Definition of a heap and its basic properties
      • Efficient implementations of basic operations:
        • insert (insert a new element in the heap) can be performed in O(log n) time
        • deleteMin (removing the smallest element from the heap) can be performed in O(log n) time
      • Using an array to implement a heap
        • Elegant, fast, and space-efficient implementation
        • O(n)-time algorithm for building a heap for any set of n elements
    • HeapSort - an optimal sorting algorithm that uses a heap and sorts n numbers in time O(n log(n)).
    • Suggested reading: Chapter 2 in the textbook (in the class we will not discuss details of linked lists, queues, stacks, and related data structures - it's expected that students will read this material from the textbook (since this material should be known to all students))

  6. (2/6/06)
    • Efficient implementations of heap operations:
      • insert (insert a new element in the heap) in O(log n) time
      • deleteMin (removing the smallest element from the heap) in O(log n) time
      • createHeap (create the heap) in O(n) time
    • Trees and their basic properties.
    • Suggested reading: Chapter 2 in the textbook (in the class we will not discuss details of linked lists, queues, stacks, and related data structures - it's expected that students will read this material from the textbook (since this material should be known to all students))

  7. (2/8/06)
    • Long discussion on Homework 1.
    • Trees and binary trees.
    • Tree traversal algorithms: preorder and postorder.
    • Finding the number of nodes in a tree and the depth of every node in the tree.
    • Suggested reading: Chapter 2.3 in the textbook.

  8. (2/13/06)
    • Basic algorithms on trees:
      • An O(n)-time algorithm for finding the height of a tree (with n nodes).
      • An O(n)-time algorithm for finding the depth of every node in a tree (with n nodes).
    • Binary trees and their basic properties.
    • An O(n)-time algorithm
    • Binary Search Trees and their basic properties:
      • Supports basic operations on a set of objects with total order: create(), isEmpty(), size(), and most importantly findElement(x), insertItem(x), and removeElement(x).
      • Basic O(n)-time algorithm testing in a binary tree is a binary search tree.
    • Implementation of findElement(x)
      • The worst-case running-time is proportional to the height of the tree (and hence we wish we could have trees of small height).
    • Suggested reading: Chapters 2.3 and 3.1 in the textbook.

  9. (2/15/06)
    • Binary search trees (continuation):
      • Implementation of findElement(x) in time O(height) in the worst-case.
      • Implementation of insertItem(x) in time O(height) in the worst-case.
      • Implementation of removeElement(x) in time O(height) in the worst-case.
    • Sorting using binary search tree:
      • Once we have a binary search tree, an inorder numbering of the nodes will give the sorted order of the elements.
      • O(n2)-time sorting algorithm (because in the worst-case, the height of the built tree may be linear).
      • ... but if we could ensure that a binary tree has O(log n) height then we could do it better (get O(n log n) sorting algorithm).
    • In the quest for shallow (O(log n)-height) binary search trees: AVL-trees
      • Definition.
      • Key property: AVL-tree with n nodes has height not greater than 2 log2n + 2.
    • Suggested reading: Chapters 3.1 and 3.2 in the textbook.

  10. (2/20/06)
    • AVL-trees.
    • Insertion algorithm in AVL-trees: O(log n) time (requires 1-2 rotations).
    • Deletion algorithm in AVL-trees: O(log n) time (requires O(log n) rotations).
    • B-trees - basic properties.
    • Suggested reading: Chapters 3.1, 3.2, 3.3 in the textbook.

  11. (2/22/06)
    • 2-4 trees, their definition and basic properties.
    • Red-black trees: their definition and basic properties.
    • O(log n)-time algorithm for insertion in Red-Black trees.
    • Suggested reading: Chapter 3.3 in the textbook.

  12. (2/27/06)
    • Splay trees
    • Skip-lists
    • Suggested reading: Chapter 3.4 in the textbook.

  13. (3/1/06)
    • Hashing
      • Definitions
      • Collision handling schemes
      • Choice of good hash functions
    • Suggested reading: Chapter 2.5 in the textbook.

  14. (3/6/06)
    • Skip-lists.
    • General discussion about dynamic dictionaries.
    • Comparison of binary search trees, balanced binary search trees, AVL-trees, red-black trees, B-trees, spaly trees, skip-lists, and hashing techniques.
    • Discussion about Midterm.
    • Suggested reading: Chapter 3.4 in the textbook.

  15. (3/8/06)
    • Midterm exam. The exam took place at 4:00-5:40 at room GITC 3710.

  16. (3/20/06)
    • Midterm discussion
    • MergeSort sorting algorithm:
      • Divide-and-conquer approach
      • To sort n elements, first (recursively) sort the first half, then (recursively) sort the other half, and then merge the two sorted sequences into one sorted sequence.
      • O(n)-time algorithm for merging.
      • Running-time of MergeSort:
        • recurrence T(n) = O(n) + 2 T(n/2)
        • running-time O(n log n)
    • QuickSort sorting algorithm
    • Suggested reading: Chapters 4.1.1 and 4.1.2 in the textbook.

  17. (3/22/06)
    • Detailed discussion about Programming Assignment I (due date has been moved to April 10)
    • QuickSort algorithm
      • Divide-and-Conquer approach
      • Detailed implementation:
        • pick one element (called a pivot) and partition all elements into three sets:
          • elements smaller than or equal to the pivot,
          • the pivot, and
          • elements greater than or equal to the pivot
        • recursively sort the first set
        • recursively sort the third set
      • O(n)-time implementation of partitioning procedure
      • Running-time analysis: worst-case running-time of O(n2)
      • So why QuickSort is called "quick"? average-case running-time of O(n log n)
        • in an average-case we expect that partition will split n elements into two sets of size close to n/2 each
        • then the running-time would be similar to the solution of the recurrence T(n) = O(n) + 2 T(n/2), that is, it would be O(n log n)
    • Suggested reading: Chapter 4.3 in the textbook.

  18. (3/27/06)
    • Quick-sort: "quick" (but on average only) sorting algorithm using divide-and-conquer approach
    • Analysis of randomized Quick-sort: expected running time is O(n log n)
    • Lower bound for comparison-based sorting algorithms
      • Every comparison-based sorting algorithm requires in the worst-case at least log (n!) = Q(n log n) comparisons to sort
      • Note: the same result holds for average-case running-time for sorting! (not discussed in the class)
    • Bucket-sort: how to beat the W(n log n) lower bound when input has small domain
    • Suggested reading: Chapters 4.3, 4.4, and 4.5.1 in the textbook

  19. (3/29/06)
    • BucketSort and CountingSort (which is "BucketSort without using buckets")
      • if we have n keys, each key being an integer from {1, ..., m} then
        • BucketSort as well as CountingSort will sort these numbers in time O(n+m)
      • BucketSort and CountingSort are very efficient when either m n or m is not much greater than n
    • RadixSort: sorts very fast even if m >> n
    • Comparison (brief) of sorting algorithms
    • Suggested reading: Chapters 4.5 and 4.6 in the textbook

  20. (4/3/06)
    • Selection problem:
      • Sorting gives a trivial algorithm for selection running in time O(n log n)
      • By using priority queues (heaps) we can solve selection in time O(n + k log n)
      • A modification of Quick-Sort (called Quick-Select in the textbook) solves the selection problem in O(n2) worst-case running time, and in O(n) average-case running time
      • Brief discussion on a (worst-case) linear-time algorithm for selection
    • Greedy approach for problem solving
    • The fractional knapsack problem: a greedy algorithm finds an optimal solution in O(n log n) time
    • Suggested reading: Chapters 4.7 and 5.1 in the textbook

  21. (4/5/06)
    • Greedy algorithms: basic task scheduling problem: a greedy algorithm finds an optimal solution in O(n log n) time
    • Divide-and-conquer technique for problem solving and recursive algorithms
    • Master theorem: a generic approach for solving basic recurrences
    • Suggested reading: Chapters 5.1 and 5.2 in the textbook

  22. (4/10/06)
    • Divide-and-conquer technique for problem solving
    • Faster (than naive) algorithm for integer multiplications
      • Naive (school) algorithm for multiplying two n-bits numbers requires O(n2) operations
      • A divide-and-conquer scheme leads to an algorithm that multiplies two n-bit numbers in time O(nlog23) = O(n1.585)
    • Strassen's algorithm for fast matrix multiplication
      • Standard (school) algorithm for multiplying two n x n matrices requires O(n3) operations
      • A clever divide-and-conquer scheme leads to an algorithm that multiplies two n x n matrices in time O(nlog27) = O(n2.808)
    • Dynamic programming: improving running time of brute-force recursive/backtracking algorithms by memorizing partial solutions
    • Suggested reading: Chapters 5.2 and 5.3 in the textbook

  23. (4/12/06)
    • Divide-and-conquer technique for problem solving
    • Faster (than naive) algorithm for integer multiplications
    • Dynamic programming: improving running time of brute-force recursive/backtracking algorithms by memorizing partial solutions
    • 0-1 knapsack problem: efficient solution using dynamic programming
    • Matrix chain product problem: definition, applications, and preliminary discussion
    • Suggested reading: Chapters 5.2 and 5.3 in the textbook

  24. (4/17/06)
    • Matrix chain product problem: definition, applications, and preliminary discussion
    • Counting the number of possible parenthesizations for multiplying a chain of n matrices
    • Computing the minimum number of operations to multiply a chain of n matrices
    • Graphs, and their applications
    • What are graphs: definitions of undirected graphs, directed graphs, weighted graphs
    • Graph representations:
      • Adjacency matrix
      • Adjacency lists
    • Basic graph properties
    • Suggested reading: Chapters 5.3, 6.1, and 6.2 in the textbook

  25. (4/19/06)
    • Discussion on Programming Assignment 2
    • Graph traversal
    • Depth-first search algorithm
    • Breadth-first search algorithm
    • Using breadth-first search to find shorted path from a source node to every other node (in unweighted graph)
    • Suggested reading: Chapters 6.1, 6.3.1, and 6.3.3 in the textbook

  26. (4/24/06)
    • Shortest path problems in graphs
      • Introduction to the problem, basic definitions, and basic properties
    • Dynamic programming all-pairs shortest path algorithm (Floyd-Warshall algorithm) running in time O(|V|3)
    • Introduction to Dijkstra's algorithm
    • Suggested reading: Chapter 7.2 in the textbook

  27. (4/26/06)
    • Shortest path problems in graphs
      • Single-source single-destination shortest path problem
      • Single-source shortest path problem
      • All-pairs shortest path problem
      • Relation between these problems
    • Using matrix multiplication to solve the all-pairs shortest path problems in O(n3 log n) time
    • Dealing with negative weights and negative-weight cycles
    • Dijkstra's algorithm for the single-source shortest path problem (works only when all weights are non-negative)
      • Basic algorithm
      • Simple O(|V|2) time implementation
      • Using priority queues (heaps) to implement Dijkstra's algorithm to run in O((|V| + |E|) log n) time
    • Suggested reading: Chapters 7.1.1 and 7.2 in the textbook

  28. (5/1/06) Last day of classes
    • Minimum spanning tree problem
    • Discussion about Programming Assignment 2 (see for updates!)
      • A few words about Strassen's algorithm
    • Discussion about Final exam
    • Special office hours will be made on Monday 2:35-3:55pm (and these are the LAST office hours in the semester! - no office hours during the Reading Period or Exams)
    • Suggested reading: Chapters 7.3 in the textbook (and work hard to prepare to the final exam)


  29. (5/5/06) FINAL EXAM, Room KUPF 106
    • 8:30-11:00am
    • Exam covered ALL material discussed in the class and presented in the textbook (Chapters 1-7)
    • It was open textbook (but you're not supposed to bring other materials, like notes, laptops, computers, calculators, etc.)
    • Suggested reading: Chapters 1-7 in the textbook (don't forget to solve all exercises from Reinforcement and Creativity sections)
    • NEW! Grades: No grades for the final exam will be posted here. The final grades for the course will be posted directly to the registrar on around Thursday/Friday, May 11/12. From there, students should be able to access the grades via Pipeline. Every student can always ask the instructor per email (czumaj@cis.njit.edu) about her/his performance on the final and individual scores for every problem.


Artur Czumaj 11:37 AM, May 05, 2006