Machine Learning in Policy

Units: 12

Description: Machine learning, a field derived primarily from computer science and statistics, has matured and gained wide adoption over the past decades. Alongside exponential increases in data measurement and availability, the ability to develop appropriate and tailored analyses is in demand. As practitioners in the social sciences consider machine learning methods, however, we are identifying limitations and externalities of the applications of machine learning techniques, such as overconfidence in settings with concept drift, lack of generalizability due to selection bias, and magnification of inequities. Machine Learning and Policy seeks to (1) demonstrate the motivations and successes of machine learning, (2) contrast them with more classical methods, and (3) investigate the promise and cautions of machine learning for public policy. The course will cover a variety of topics, including:
Basics of machine learning: probability/Bayes/likelihood/conjugacy, terminology, code/algorithm design, evaluation, mathematical formulations
Popular and well-performing methods: random forests/trees/ensembles, neural networks/backpropagation/embeddings/generative adversarial networks, generalized linear models/shrinkage/convexity/basis functions, support vector machines/kernels/optimization/Lagrangian
Leveraging other data sources: natural language processing/topic modeling/relational (non-i.i.d.)/relational (Markov logic networks)/temporal data
Additional topics: causality/confounding/propensity scoring/inverse weighting/causal directed acyclic graphs, fairness/ethics, interpretation/explanation/visualization, anomaly detection, semi-supervised and active learning, reinforcement learning
For policy students, Machine Learning and Policy will develop your skills in machine learning methods motivated by policy applications.
For machine learning students, Machine Learning and Policy will develop your ability to formulate machine learning techniques that inform real-world policy and will demand that the formulations address the applications, consequences, and limitations of existing techniques. Students will present mathematical formulations in TeX and Markdown and implement algorithms in Python. Machine Learning and Policy will also involve a substantial discussion component: approximately 25% of class time will be devoted to discussions of recent applications of machine learning in policy settings. Class attendance is therefore required. Many readings will come from the health care field; however, the methods will apply across policy domains.

Learning Outcomes:
Learn and adapt the mathematical formulations of machine learning methods for principled application
Perform end-to-end machine learning analysis, including data exploration, preparation, cleaning, prediction, validation, visualization, and interpretation
Develop machine learning algorithms tailored to the data and policy research question
Contrast the strengths, limitations, applications, and externalities of machine learning approaches
Conduct a machine learning analysis to provide insight into a policy question and present the analysis in a conference-style paper in LaTeX

Prerequisites: Background in either machine learning or policy is required. This is a PhD-level course. Experience with Python is highly recommended.