Berkeley inaugurates Division of Data Science and Information, connecting teaching and research from all corners of campus

By Kara Manke

Different Cal buildings appear on a blue background
The new division will merge the boundaries of existing departments. (UC Berkeley graphic by Hulda Nelson)

In a direct response to the profound and growing impact of data and computing in a rapidly evolving digital world, UC Berkeley today announced its plan to form a new division, provisionally referred to as the Division of Data Science and Information, which will harness the university’s leadership in the field to prepare thousands of students and researchers to bring data science to bear in the classroom, the laboratory and the workplace. The first-of-its-kind division, situated as a peer of Berkeley’s colleges and schools, represents one of the most profound changes in the university’s organization in decades.

The new division is designed to engage faculty and students across the Berkeley campus and connects departments from the College of Engineering, the College of Letters and Science and the School of Information. It will be led by a new associate provost, who will report directly to the executive vice chancellor and provost. This new structure will accelerate breakthrough research across scientific and technological frontiers and facilitate the creation of new multi- and cross-disciplinary programs of study. The division will enable students and researchers to tackle not just the scientific challenges opened up by pervasive data, but the societal, economic and environmental impacts as well. The university will bring to bear the school’s excellence in social sciences and humanities to explore these widespread implications.

“Berkeley’s powerful research engine, coupled with its deep commitment to equity and diversity, creates a strong bedrock from which to build the future foundations of this fast-changing field while ensuring that its applications and impacts serve to benefit society as a whole,” said Paul Alivisatos, executive vice chancellor and provost. “The division’s broad scope and its facilitation of new cross-disciplinary work will uniquely position Berkeley to lead in our data-rich age. It will simultaneously allow a new discipline to emerge and strengthen existing ones.”

“It is pretty clear that almost every field today needs data-savvy researchers. It has become something as basic as driving a car in our intellectual work,” said Saul Perlmutter, director of the Berkeley Institute for Data Science (BIDS) and winner of the 2011 Nobel Prize in Physics. “Data science — and the new division — offers the university community unusual opportunities to bridge what can otherwise be disparate disciplines. It’s exciting to see the social scientists, the humanities scholars, the natural scientists and the computational and statistical scholars all in the same conversation, with novel collaborations beginning to address previously intractable and fascinating questions.”

“The creation of a new cross-campus data science division at Berkeley will ensure not only that all Berkeley students graduate with high levels of data literacy, irrespective of their majors or concentrations, but also that we are positioned as a research university to both accelerate computational research and at the same time to come to grips with the human, social and political transformations the digital revolution has instigated,” said Carla Hesse, dean of the Social Sciences Division and executive dean of the College of Letters and Science.

Berkeley’s wildly popular Data 8 course grew from 100 students in fall 2015 to 1,300 students in fall 2018. (UC Berkeley photo by Keegan Houser)

The new division is the next step in an evolution that has engaged researchers and students across the university for several years. It will leverage Berkeley’s preeminence in computational and statistical research to lay the foundations of new technologies and methods, while developing the future tools and techniques of data science. It will also accelerate and support Berkeley’s already strong programs in data science. These include the popular Foundations of Data Science (Data 8) course, which saw enrollment grow from 100 in fall 2015 to 1,300 in fall 2018; its advanced follow-on, Principles and Techniques of Data Science (Data 100), which grew from 100 in fall 2016 to 800 in fall 2018; dozens of other new courses; the integrative data science major launched this fall; the popular master’s programs; and BIDS, along with several other key research centers.

“Data science doesn’t just have applications in essentially every other academic field, but it is actually changing the foundations of those fields as well,” said David Culler, interim dean for data sciences. “Berkeley is essentially shaping what it will mean to be a research university in the connected century ahead, pursuing new knowledge that transcends disciplinary boundaries.”

In 2018, nearly one-quarter of open faculty searches at Berkeley seek applicants who will explore aspects of data science throughout the sciences, humanities, professions and engineering fields — more than any other discipline. The new division will engage with schools and colleges across Berkeley to spur faculty hiring in data science-related fields.

“The new division makes Berkeley a magnet for data science, machine learning, artificial intelligence and all the fields that engage with them in any way,” said David Wagner, professor of computer science at Berkeley and an instructor for Data 8. “I think it will help us attract the best and the brightest.”

“The new division will provide a framework that facilitates interactions between researchers in computer science and engineering and researchers in other disciplines to forge new multi-disciplinary collaborations for the development and application of new tools and methods of deriving information from large datasets,” said Tsu-Jae King Liu, dean of UC Berkeley’s College of Engineering.

Two students lean over a histogram.

The new division will prepare students and researchers to bring data science to bear in the classroom, the laboratory and the workplace. (UC Berkeley photo by Keegan Houser)

Data and algorithms have become central to how individuals, researchers and businesses learn about the world and make decisions. Banks use them for fraud detection and to calculate credit scores, retailers use them to model supply chains to get products to the right place at the right time, and the healthcare industry uses them to prescribe drugs and make diagnoses. Remote sensing produces data and imagery that not only reveal trends in weather and climate, but also uncover patterns of poverty, migration and energy use. Data determines what we see on social media, how easily we can get a loan for a car and what types of healthcare treatments we receive.

But the rapid rise of data and algorithms in our decision-making comes hand-in-hand with major societal questions about how they are used. Recent controversies over facial recognition technology, privacy protection or predictive policing show how computing and data are changing the social order. As the world’s leading public university with a long-standing dedication to public service, Berkeley is uniquely poised to advance research, education, and public engagement with the ethics and human impacts of these new technologies while ensuring that they are made available to diverse communities.

“We need to consider the ethical implications of these technologies as they are being developed — what does the world look like when decisions are made by algorithms rather than people, and how do we ensure that when we analyze data our decisions reflect not just numbers but the humans behind them?” Wagner said. In the new division, faculty from Berkeley’s social science and humanities programs will be working hand-in-hand with technologists. They will enable the division to develop cross-disciplinary expertise at the interface between humans and technology, and their classes will help students navigate societal opportunities and challenges in a world filled with data.

“The opportunities and challenges here are so big and so diverse, and it is going to take people with expertise that is as broad as the Berkeley campus to grapple with the many ways data is changing our world,” said Anno Saxenian, dean of the School of Information.

The Data 8 class, and the major that is built on it, pay close attention to questions of diversity, the broad range of data science applications, and the human contexts of data no less than technical foundations. In fall 2018, half of the students in Data 8 are women, 11 percent come from an underrepresented ethnic or racial group, and more than half have little or no computer programming experience. Together they pursue 68 different majors.

Employers like Google and Facebook are hungry for graduates who can spot patterns in data, and for research that builds their capacity to understand the data at their fingertips — but the need for data-savvy graduates and research extends far beyond the tech industry.

A poll conducted by Gallup for the Business-Higher Education Forum revealed that by 2021, 69 percent of employers expect candidates with data science skills to get preference for jobs in their organizations, while only 23 percent of college and university leaders say their graduates will have those skills. LinkedIn’s August jobs report found significant gaps between the number of data science jobs and the number of people available to fill them, particularly in areas such as San Francisco and Los Angeles. “Expanding Berkeley’s data science programs along with our University of California partners is critical to preparing the state’s workforce for the economy of the rest of the century,” Culler said.

“Data science began to emerge as a major phenomenon in industry 10 to 15 years ago, and it continues to grow rapidly,” said Michael I. Jordan, a professor in the departments of statistics and electrical engineering and computer science at Berkeley. “It is, accordingly, high time for the university to respond, creating educational and research initiatives that reflect and help to understand and shape this phenomenon.”

“The emphasis that Berkeley is placing on data literacy and making it available to diverse communities creates a very talented pool of graduates — employable in traditional computing and engineering fields as well as a variety of others. They will be prepared for what the world will be like in 10 or 15 or even 20 years,” said Vinitra Swamy, an artificial intelligence software engineer at Microsoft. Swamy completed a bachelor’s degree (2017) and a master’s degree (2018) in computer science at Berkeley and served as head graduate student instructor of Data 8, Berkeley’s introductory data science course.

“The best part of the data science program is that the skills taught in the courses are immensely valuable to industry and research. I use data science tools I learned at Berkeley every day,” Swamy said.

“These changes around computation and data are coming at a scale and at a pace like no other scientific change in recent memory. Both its foundational aspects and impacts are comparable to the biggest transformations in society that we have ever seen,” said Cathryn Carson, a professor in the department of history at Berkeley. “We want everyone to have the power of data and computing at their disposal in a way that works for them.”