Machine Learning for Problem Solving

95-828

Units: 12

Description:

Machine Learning (ML) is centered around automated methods that improve their own performance by learning patterns in data and then using those patterns to predict the future and make decisions. ML is heavily used in a wide variety of domains, such as business, finance, healthcare, and security, for problems including display advertising, fraud detection, disease diagnosis and treatment, face and speech recognition, and automated navigation, to name a few.

“If I had an hour to solve a problem I'd spend 55 minutes thinking about the problem and 5 minutes thinking about solutions.” — Albert Einstein

“A problem well put is half solved.” — John Dewey

The main premise of the course is to equip students with an intuitive understanding of machine learning concepts grounded in real-world applications. The course is designed to deliver the practical knowledge and experience necessary for recognizing and formulating machine learning problems in the wild, as well as for applying machine learning techniques effectively in practice. The emphasis will be on learning and practicing the machine learning process, which involves the cycle of feature design, modeling, and evaluation.
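As a rough illustration of the feature design, modeling, and evaluation cycle mentioned above, the following is a minimal Python sketch rather than course material. The toy dataset, the function names, and the choice of a nearest-centroid classifier are all assumptions made for this example; it uses only the standard library.

```python
import random
import statistics


def make_toy_data(n=200, seed=0):
    """Hypothetical two-class dataset: two noisy numeric features per example."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = (160.0, 55.0) if label == 0 else (180.0, 80.0)
        x = (rng.gauss(center[0], 5.0), rng.gauss(center[1], 5.0))
        rows.append((x, label))
    return rows


def fit_scaler(rows):
    """Feature design: learn per-feature mean and std from training data only."""
    columns = list(zip(*(x for x, _ in rows)))
    return [(statistics.mean(c), statistics.pstdev(c)) for c in columns]


def scale(x, scaler):
    """Standardize one example with the statistics learned by fit_scaler."""
    return tuple((v - mean) / std for v, (mean, std) in zip(x, scaler))


def fit_centroids(rows, scaler):
    """Modeling: a nearest-centroid classifier (one mean vector per class)."""
    by_label = {}
    for x, y in rows:
        by_label.setdefault(y, []).append(scale(x, scaler))
    return {
        y: tuple(statistics.mean(col) for col in zip(*xs))
        for y, xs in by_label.items()
    }


def predict(x, scaler, centroids):
    """Assign the label whose centroid is closest in squared Euclidean distance."""
    z = scale(x, scaler)
    return min(
        centroids,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(z, centroids[y])),
    )


def accuracy(rows, scaler, centroids):
    """Evaluation: fraction of examples classified correctly."""
    hits = sum(predict(x, scaler, centroids) == y for x, y in rows)
    return hits / len(rows)


data = make_toy_data()
train, test = data[:150], data[150:]          # hold out data for evaluation
scaler = fit_scaler(train)                    # feature design
centroids = fit_centroids(train, scaler)      # modeling
test_accuracy = accuracy(test, scaler, centroids)  # evaluation
print(f"held-out accuracy: {test_accuracy:.2f}")
```

In practice the cycle repeats: a disappointing held-out score would send one back to revise the features or try a different model.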

“All models are wrong, but some models are useful.” — George Box

As there is “no free lunch”, we will cover a wide range of models and learning algorithms, which can be applied to a variety of problems and have varying speed-accuracy-interpretability tradeoffs. In particular, the topics include supervised learning (linear models, decision trees, ensemble methods, kernel methods, and nonparametric learning) and unsupervised learning (density estimation, clustering, and dimensionality reduction).

The class will include biweekly homework assignments, each containing a mini-project (i.e., a problem-solving assignment that involves programming) in addition to other conceptual and technical questions, as well as a midterm, a final exam, and a case study at the end of the course. The case study gives students a chance to dig into a substantial problem using a large dataset and to apply the machine learning concepts they have learned throughout the course.

This course is designed to give graduate-level students a thorough grounding in the methodologies, technologies, and best practices used in machine learning. It does not assume any prior exposure to machine learning theory or practice. The prerequisites are basic knowledge of linear algebra and probability as well as proficiency in Python. For course-related details (syllabus, assignments, etc.) see: http://www.andrew.cmu.edu/user/lakoglu/courses/95828/index.htm

Learning Outcomes:

By the end of the semester, students should be able to approach problems data-analytically: look at a real-world problem and decide whether ML is an appropriate approach and, if so, identify which type of ML problem it is and what types of models and algorithms might be applicable. They should also be able to implement ML solutions by applying ML algorithms to real-world data using best practices, evaluate their solutions, and technically communicate performance results.

Prerequisites Description:

Students are expected to have the following background:
• Basic knowledge of probability
• Basic knowledge of linear algebra
• Working knowledge of basic computing principles
• Basic programming skills at a level sufficient to write a reasonably non-trivial computer program in Python

Syllabus: