Data Mining Techniques


Units: 6

Description: Knowledge discovery in data is "the non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data" (Fayyad et al., 1996).

Motivation: Data generated by humans and machines is available everywhere and growing steadily. In today's data-driven world, it is crucial for students to acquire the fundamental skills needed to analyze massive datasets and to develop data-driven techniques for solving real-world problems. This course covers the fundamental concepts and techniques in data mining and equips students with the basic skill set for becoming good data scientists. Major topics include algorithms and tools for data exploration, fast similarity search, pattern mining, outlier detection, dimensionality reduction, ranking, and recommender systems. The coursework involves mini-projects on various datasets that give students hands-on experience with data analytics. For course-related details (syllabus, assignments, etc.) see:

Learning Outcomes: By the end of this class, students will
• develop a basic understanding of core data mining concepts
• learn algorithmic approaches to various data mining problems
• be able to analyze and assess data mining algorithms based on their accuracy, their computational and storage complexity, and the tradeoffs among these
• gain hands-on experience using data analytics techniques on real-world datasets