Software Design for Data Scientists


Units: 6


Data science projects benefit significantly from software engineering principles and practices that have evolved over many decades. This course offers an overview of the best practices and principles that data scientists need in their toolkit to become more effective in their roles. Many data science methodologies have emerged, such as KDD, CRISP-DM, ASUM, SEMMA, and TDSP. This course helps students develop an appreciation for these methodologies and how they differ from one another. As data scientists, students will learn not only how to develop products that align with business needs but also how to build long-term organizational capabilities to use those products effectively. Data science projects have unique exploratory characteristics, for which Agile methodologies from software engineering have also been adopted. The course introduces these methodologies and shows how to adopt them in data science projects. Students will work in small groups, with each group responsible for the design and implementation of a small data science project. Case studies will be used to extend what students learn to different contexts.



Learning Outcomes

-Understand the unique nature of data science projects and how they are similar to or different from software development projects
-Develop an appreciation for how software development methodologies have evolved, and analyze their suitability, or lack thereof, for data science projects
-Compare and analyze various data science methodologies and their application in real-world projects
-Trace the development of a data science product from an organizational initiative to its deployment

Prerequisites Description