Using R for Policy Data Analysis
Data analysis is an essential part of quantitative policy analysis; however, focused application of statistical methods is beyond the scope of what can be taught in classes such as Cost Benefit Analysis (CBA) and Program Evaluation. In this course, students who have completed CBA and Program Evaluation will apply a variety of data analysis techniques using R, a free, open-source environment for statistics and graphics that is increasingly used by data miners and analysts. Class sessions will combine instruction on data analysis techniques, in-class application using R, and critical evaluation and discussion of published CBA and Program Evaluation cases. Applications will focus on analyses that support the execution of CBA and Program Evaluation, including cases on consumer protection, affordable housing, and public health.

Learning Objectives: Students will learn why and how to execute a variety of statistical techniques (e.g., calculating sample means and proportions, confidence intervals, contingency tables, t-tests, ANOVA, regression) to support CBA and Program Evaluation. Students will develop familiarity with the R programming environment and will gain experience using R with public and restricted data sources. Students will gain an understanding of how the data analysis underlying CBA and Program Evaluation studies affects the outcomes of those studies.
- Import data into R data structures and check for missing data.
- Produce scatterplots, histograms, boxplots and other graphs to identify data problems and test data assumptions (e.g. normality, linearity).
- Produce descriptive statistics, including mean, median, mode, proportion, standard deviation and other measures of central tendency and dispersion.
- Calculate confidence intervals. Conduct t-tests for differences in sample means and chi-square tests for differences in categorical variables across groups.
- Conduct bivariate correlation and simple regression analysis.
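
As a rough illustration of the skills listed above, the sketch below walks through them in base R. It assumes `mtcars`, a small dataset that ships with R, as a stand-in for a policy dataset; it is not a course dataset, and the variable names and the commented-out file name are hypothetical.

```r
# Stand-in data (assumption): mtcars ships with base R.
# A real analysis would typically import a file instead, for example:
# cars <- read.csv("my_policy_data.csv")   # hypothetical file name
cars <- mtcars
cars$am  <- factor(cars$am, levels = c(0, 1),
                   labels = c("automatic", "manual"))
cars$cyl <- factor(cars$cyl)

# Check for missing data: count NA values in each column.
missing_per_column <- colSums(is.na(cars))
print(missing_per_column)

# Graphs for spotting data problems and checking assumptions.
hist(cars$mpg, main = "Distribution of mpg", xlab = "Miles per gallon")
boxplot(mpg ~ am, data = cars, ylab = "Miles per gallon")
plot(cars$wt, cars$mpg, xlab = "Weight (1000 lbs)",
     ylab = "Miles per gallon")          # rough linearity check
qqnorm(cars$mpg); qqline(cars$mpg)       # rough normality check

# Descriptive statistics.
mean_mpg    <- mean(cars$mpg)
median_mpg  <- median(cars$mpg)
sd_mpg      <- sd(cars$mpg)
prop_manual <- mean(cars$am == "manual")  # sample proportion
# Base R has no function for the statistical mode,
# so take the most frequent category from a frequency table.
cyl_mode <- names(which.max(table(cars$cyl)))

# 95% confidence interval for the mean of mpg.
ci_mpg <- t.test(cars$mpg)$conf.int

# t-test for a difference in mean mpg between transmission types
# (t.test uses the Welch unequal-variance version by default).
mpg_by_am <- t.test(mpg ~ am, data = cars)

# Contingency table and chi-square test for two categorical variables.
# With cell counts this small, R warns that the chi-square
# approximation may be inaccurate; fisher.test() is one alternative.
am_by_cyl  <- table(cars$am, cars$cyl)
chisq_test <- chisq.test(am_by_cyl)

# Bivariate correlation and simple linear regression.
mpg_wt_cor <- cor.test(cars$wt, cars$mpg)
fit <- lm(mpg ~ wt, data = cars)
print(summary(fit))
```

Each test object (`mpg_by_am`, `chisq_test`, `mpg_wt_cor`) can be printed to see its statistic, degrees of freedom, and p-value. The graphs come before the tests on purpose, since the plots are what indicate whether assumptions such as normality and linearity look reasonable.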