From Twitter, Instagram and Yik Yak to search engines and route maps, college students spend hours a day consuming, consulting and comparing data. In class, they’re increasingly asked to use complex data to explore everything from voter behavior to changing uses of language to animal biodiversity.
But where does a generation living in a world awash in data learn to work with and think critically about it? The new Data Science Education Program aims to prepare Berkeley students for that future.
This fall, pilot courses in data literacy for freshmen are the first steps in an effort being led by a multidisciplinary faculty team. Students in the one-semester, four-unit “Foundations of Data Science” will learn the computing and statistical concepts that underlie data analysis, as well as how to understand context when working with data.
Along with that course, students will choose one of six “connector courses,” which apply those skills to a particular field of study, such as history or health, and to real-world issues. They will also examine best practices for both individual and project-based computing, privacy and anonymization, data and ethical decision-making, and data validity and reliability.
“Data competency is important; it’s one of the fundamental skills that all Berkeley graduates, in our more than 100 majors on campus, should have,” says David Culler, a professor of electrical engineering and computer sciences. “If our students get this training early, it will enable them to go far – in graduate school, in their careers, and as informed citizens.”
The new data science program also reflects the changing nature of a university education.
“Berkeley is doing something transformational, something no other university has imagined – it’s integrating data science as a core component of liberal education,” says Bob Jacobsen, interim dean of undergraduate studies in the College of Letters and Science. “This program will be a defining feature of a Berkeley undergraduate education, and hopefully a model for other institutions.”
This fall, up to 150 freshmen will take the pilot courses, which will be refined and become permanent in the semesters ahead. Berkeley plans to steadily grow the number of undergrads enrolled in them each semester until spring 2017, when a projected 3,000 students at a time will take part.
Waves of student interest
A year ago, Chancellor Nicholas Dirks called together a rapid-response committee to develop a data science curriculum, one of an array of innovations he is launching to benefit Berkeley undergraduates. The demand was pressing: During the past five years, a massive wave of undergrads, many of them seeking data skills, has flooded Berkeley computer science and statistics courses. Today, the bulk of the campus’s 27,000 undergraduates elect to take entry-level courses in computing and statistical reasoning, and many continue on to upper-division courses.
“Students’ minds may start with, ‘I need a pathway to a job,’ and they’re seeing how many great jobs are out there for people with data science skills,” says Cathryn Carson, a professor of history. “For the faculty, we see the research world being transformed by the availability of data, and we see how students need to be skilled at working with it by the time they get to graduate school.”
Campus leaders concur that a thoughtful, integrated approach is needed for the broad student population. “Data science is more than just technical tools, it’s a way of thought,” says AnnaLee Saxenian, dean of the School of Information. “Data science has its own distinct rigor, and at the same time, it brings contextual and ethical issues to play – for instance, where did the data come from, how can you ask good questions with it, and what are its limits?”
The faculty team planning the new curriculum recognizes the need for an approach to data science that is accessible to the entire student population, from humanities students who want to learn to work with big data but haven’t been able to find a class that fits their needs to those who want to specialize in data science itself.
“These are new, exciting courses, and they will evolve and get refined as the program matures,” says Culler of the pilot offerings. “We are devoted to creating a thoughtful, integrated approach for the broad student population and pathways through the curriculum that open doors for students into data science in ways that suit their current needs, diverse backgrounds and future plans.”
Preliminary data from the team’s preparatory work suggest that there will be broad interest among students. “Data is the future right now,” says Jerome Rufin, a senior in industrial engineering and operations research. “It would be awesome if you could have exposure to that at an early stage.”