Lehigh University

CSE 347  Data Mining  (3)

Instructor: Brad Askins

Current Catalog Description
Overview of modern data mining techniques: data cleaning; attribute and subset selection; model construction, evaluation and application. Fundamental mathematics and algorithms for decision trees, covering algorithms, association mining, statistical modeling, linear models, neural networks, instance-based learning and clustering. Practical design, implementation, application and evaluation of data mining techniques in class projects. Credit will not be given for both CSE 347 and CSE 447. Prerequisites: Either CSE 17 and MATH 231, or BIS 120 and ECO 145.

Textbook
Jiawei Han and Micheline Kamber, "Data Mining: Concepts and Techniques", 2nd Ed., 2006, Morgan Kaufmann.

References
None

Course Outcomes

Students will have:

  1. Understanding of the problems involved in statistical processing of large databases.
  2. Ability to identify associations, classes, and clusters in large data sets.
  3. Ability to use the R system for data analysis.
  4. Ability to use commercial database management systems for data mining.

Relationship between Course Outcomes and Program Outcomes

CSE 347 substantially supports the following program outcomes:

A. An ability to apply knowledge of computing and mathematics appropriate to the discipline.

C. An ability to design, implement, and evaluate a computer-based system, process, component, or program to meet desired needs.

CSE 347 provides modest support to the following program outcomes:

I. An ability to use current techniques, skills, and tools necessary for computing practices.

K. An ability to apply design and development principles in the construction of software systems of varying complexity.

Prerequisites by Topic
1. Top-down design
2. High-level programming concepts
3. Facility with math through calculus

Major Topics Covered in the Course
1. Review of Statistical Thinking
2. Data Warehousing
3. Data Cubes
4. Data Visualization
5. Frequent Pattern and Association Mining
6. Classification
7. Cluster Analysis

Assessment Plan for the Course

Students are given ten lab/homework assignments that relate to the material presented in lectures. There is one midterm, about halfway through the course, that covers the first four major topics. In lieu of a final exam, there is a course project in which students identify and tackle a "real-life" data mining problem. In place of a second midterm, students give an oral presentation of the problem, the plan, and any results to date for the class project.

How Data in the Course are Used to Assess Program Outcomes: (unless adequately covered already in the assessment discussion under Criterion 4)

Each semester, I include the above data from the assessment plan in my self-assessment of the course. That self-assessment is reviewed, in turn, by the Curriculum Committee.

     


©2012 P.C. Rossin College of Engineering & Applied Science
Computer Science & Engineering, Packard Laboratory, Lehigh University, Bethlehem PA 18015