ECS 289G TOPICS IN DATA MINING (4) II

Lecture: 3 hours

Discussion: 1 hour

Prerequisite: Course 165A or 270 recommended

Catalog Description:
Selected topics in efficient data mining algorithm design and its application to novel areas.

Expanded Course Description:

  1. Overview of key data mining tasks
  2. Classification Algorithms
    1. Probabilistic graphical techniques
    2. Linear algebra techniques (Support vector machines)
    3. Combinations of predictive models
  3. Clustering Algorithms
    1. Hierarchical
    2. Non-hierarchical algorithms
    3. Speeding up algorithms using KD-Trees and the triangle inequality
  4. Association Rule Algorithms
    1. Basic Apriori and Sequential Apriori Algorithms
    2. Advanced data structures to improve efficiency of basic algorithms
  5. Anomaly/Outlier Detection

Textbook:
P. Tan, M. Steinbach, V. Kumar, Introduction to Data Mining, Addison Wesley, US edition, May 2, 2005. Supplemented by technical papers addressing more advanced topics.

Project:
There will be homework assignments to reinforce key concepts, plus two projects. The individual project is the implementation and evaluation of a data mining algorithm; the team project is the application of these algorithms to a challenging data set.

Computer Usage:
Students will work in the Linux/UNIX workstation environment to develop and evaluate their algorithms. Computer usage is not required for homework assignments.

Goals:
This course will provide an overview of data mining algorithms and the challenges of applying them to real data. We will focus on several types of data mining algorithms that are presented in the course text and reading assignments.

Instructor: Ian Davidson

Prepared by: I. Davidson (September 2007)

Overlap Statement:
There is no significant overlap with any other course.