Statistics? Yes. Computer science? Obviously. But for undergraduates exploring data science, the most important concepts to learn are problem-solving, synthesis—correlating one dataset with another—and storytelling.
Those are the skills that will best help the next generation of data scientists combine seemingly unrelated data from multiple sources and discover correlations that explain how people, businesses and machines behave, so that they can solve important problems.
That was the principle behind two courses taught to undergraduates for the first time in the fall of 2012: Columbia’s “Introduction to Data Science” and San Jose State’s “Introduction to Big Data.”
At Columbia, “Introduction to Data Science” was taught by Rachel Schutt, a senior statistician at Google’s research division in New York. Schutt, who has a Ph.D. in statistics from Columbia, first proposed the course as a seminar featuring several guest speakers talking about their jobs in data science.