One in a series of articles profiling university programs focusing on big data and analytics education.
The data mining certificate program at University of California at San Diego’s Extension School was created in 2003. The program began just as new data mining tools with graphical user interfaces came on the market, meaning that command line programming was no longer a barrier to entry into data mining.
Natasha Balac, now the director at UCSD’s recently opened Predictive Analytics Center of Excellence, was then a consultant working with the Extension School when she proposed a single class to see if there was any interest in the topic.
“Even 10 years ago, there was a lot of interest,” Balac said. “The reason we had students interested in these classes, even 10 years ago, was the practical aspect. At the time in data mining textbooks, you would open them and it would be just a bunch of equations and proofs, and very statistically heavy books. The tools evolved, and they had nice [user interfaces], and people started looking around for a program because they could use these tools but didn’t have the technical background yet to understand what they were doing.”
The program features three required courses in data mining, a required data preparation course and one elective. The entire program is completed online.
Balac said students still use the open source data mining tool Weka, and the program is still focused on the practical application of data mining techniques. Students also learn some programming in the R and SAS languages and can take an elective course to learn more.
The first class had 10 people in it, Balac said; now there are 60 people in each class and a waiting list. Cindy Hanson, the assistant director of science and mathematics at the Extension School, said that as interest in the program has increased over the last few years, the program has begun adding new features.
Now students are able to take workshops and gain access to UCSD’s supercomputer, named Gordon, to crunch big data projects. Hanson said the program is also developing a capstone course for the certificate program where students will work with local industry partners on projects using business data. That class is in development and should be ready by the winter quarter in 2013.
Courses run quarterly; Hanson said most students complete the certificate in four or five quarters. Each class costs $625. Students who are new to statistical analysis are asked to take an introduction to statistics course as a prerequisite. Most students have already had some experience in the workplace, but some graduate students take the courses to enhance their studies.
- Program began in 2003; courses are online only.
- Five-course certificate program, usually completed in four to five quarters.
- Each course costs $625.
- Access to UCSD’s supercomputer for workshops.
- Capstone course with local business data in development, ready by 2014.
- Students learn tools like SAS, R, and Weka.