Syllabus - CS378H - Introduction to Data Mining: Honors
Spring 2017
MW 9:30 - 11:00 CLA 0.102
Unique: 52278

Description

Data mining is the process of analyzing data sets for patterns or structure that once discovered can be used for various tasks, generally described as descriptive tasks and predictive tasks. Most businesses and organizations collect data about their operations, products, and customers and use data mining techniques on this data to better understand and improve these activities. Research projects in many areas of science employ data mining to develop descriptive models for understanding in their respective domains and to develop predictive models.

 

Objectives

This course provides the student with an understanding of issues around collecting, understanding, and preparing data for use in a broad range of data mining areas (including classification, association analysis, clustering, anomaly detection, and regression). For each of these tasks (classification, association analysis, clustering, anomaly detection, regression, and dimensionality reduction), the student will become familiar with multiple techniques for generating descriptive or predictive models, and with metrics and methods for evaluating the suitability of generated models for the associated task. To practice these techniques for generating and evaluating models, the student will apply data mining tools to provided data sets.

 

Prerequisites

CS 429 (or 310) or 429H (or 310H).
Discrete Math, Linear Algebra, Probability and Statistics recommended.

 

Textbook

  • Introduction to Data Mining, by Pang-Ning Tan, Michael Steinbach, Vipin Kumar
    • Pearson/Addison Wesley
    • ISBN: 0-321-32136-7
 

Instructor

David Franke
Email: dfranke at cs.utexas.edu
Office: GDC 6.402
Office Hours:

  • MW 11:00 AM - 12:00 PM
  • Or by appointment

 

TA

Xueyu Mao
Email: maoxueyu at gmail.com
Office: GDC 4.718D
Office Hours:

  • Tuesdays, 2:00 - 3:15 PM (For assignments due Wednesdays)
  • Thursdays, 2:00 - 3:15 PM (For assignments due Mondays)

 

Class Schedule (tentative)

              Assignment #
Date          Given  Due  Points  Topics - Reading
Jan. 18                           Introduction
Jan. 23                           Data
Jan. 25                           Classification
Jan. 30
Feb. 1                            Sections 4.3.7, 4.4
Feb. 6                            Sections 4.2, 4.5, 4.6
Feb. 8                            Sections 5.1, 5.2
Feb. 13                           Sections 2.4, 5.3
Feb. 15                           Sections 5.4, 5.5
Feb. 20                           Sections 5.6, 5.7
Feb. 22                           Association Analysis: Sections 6.2, 6.3, 6.4
Feb. 27
Mar. 1                            Sections 6.5, 6.6
Mar. 6                            Sections 6.7, 6.8
Mar. 8                            Midterm Exam
Mar. 13                           Spring Break
Mar. 15
Mar. 20                           Clustering: Sections 8.1, 8.2
Mar. 22                           Sections 8.3, 8.4
Mar. 27                           Sections 8.5, 9.1
Mar. 29                           Sections 9.2, 9.3, 9.4
Apr. 3                            Anomaly Detection: Sections 10.1, 5.7
Apr. 5                            Sections 10.2, 10.3
Apr. 10                           Sections 10.4, 10.5
Apr. 12                           Regression
Apr. 17
Apr. 19
Apr. 24
Apr. 26                           Dimensionality Reduction
May 1
May 3                             Review
May 13, 9-12                      Final Exam: CLA 0.102

Late Policy

Assignments are due at the start of class (9:30 AM) on the due date, as we will be discussing the assignment solution during that class period.
Penalty for late submission is 40%, and late submissions will only be accepted up to the next assignment due date.

Final Grades

Final grade will be determined as follows:

Class Participation

As the practice of data mining requires interpretation of results and an understanding of the strengths and limitations of techniques and algorithms, it is expected that significant learning will be gained from discussion of problems and experiences. Students are expected to participate in class discussions, posing questions and proposing solutions.

Academic Honesty

You are free to discuss approaches to solving the assigned problems with your classmates, but each student is expected to write their own solutions. If duplicate work is detected, all parties involved will be penalized. All students should read and be familiar with the UTCS Rules to Live By.

 

Special Accommodations

Any student with a documented disability who requires academic accommodations should contact Services for Students with Disabilities (SSD) at (512) 471-6259 (voice) or 1-866-329-3986 (video phone). Faculty are not required to provide accommodations without an official accommodation letter from SSD.

  • Please notify me as quickly as possible if the material being presented in class is not accessible (e.g., instructional videos need captioning, course packets are not readable for proper alternative text conversion, etc.).
  • Please notify me as early in the semester as possible if disability-related accommodations for field trips are required. Advance notice will permit the arrangement of accommodations on the given day (e.g., transportation, site accessibility, etc.).
  • Contact Services for Students with Disabilities at 471-6259 (voice) or 1-866-329-3986 (video phone), or see SSD's website for more disability-related information.
 

Important Dates

Important dates for the Spring 2017 semester can be found on the Academic Calendar.