In recent weeks, alarming statistics have emerged detailing the disproportionate impact of COVID-19 on black communities, Latinx communities and other marginalized groups in the United States.
In today’s Berkeley Conversations: COVID-19 event, Jennifer Chayes, associate provost of the Division of Computing, Data Science, and Society and dean of the School of Information, spoke with three UC Berkeley experts about how relying on data and algorithms to guide pandemic response may actually serve to perpetuate these inequities — and what researchers and data scientists can do to reverse the patterns.
Algorithms are only as good as the input we give them. Ziad Obermeyer, acting associate professor of health policy and management at Berkeley, learned this lesson when he and his collaborators dug into a commonly used algorithm that decides who gets access to health care management services in hospitals.
The services are meant to be directed to the sickest patients, or those who need them most. However, his team found that the algorithm was letting in healthier white patients ahead of sicker black patients because it used health care cost as a proxy for illness, and white patients, who have more access to health care services, are more likely to generate higher costs.
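The effect of ranking on a cost proxy can be illustrated with a minimal sketch. Everything in it is hypothetical: the four patients, the `access` factor, the flat `COST_PER_CONDITION` and the assumption that cost scales with the care a patient actually receives are illustrative choices, not the algorithm Obermeyer's team studied.

```python
from dataclasses import dataclass

# Hypothetical flat cost of treating one chronic condition for a year.
COST_PER_CONDITION = 1000


@dataclass
class Patient:
    name: str
    chronic_conditions: int  # stand-in for how sick the patient really is
    access: float            # share of needed care actually received (0 to 1)


def annual_cost(p: Patient) -> float:
    # Assumption: recorded cost reflects care received, not care needed.
    return p.chronic_conditions * p.access * COST_PER_CONDITION


def select(patients, score, slots):
    # Give the limited program slots to the highest-scoring patients.
    ranked = sorted(patients, key=score, reverse=True)
    return [p.name for p in ranked[:slots]]


patients = [
    Patient("A", chronic_conditions=1, access=1.0),  # cost 1000
    Patient("B", chronic_conditions=3, access=1.0),  # cost 3000
    Patient("C", chronic_conditions=4, access=0.5),  # cost 2000
    Patient("D", chronic_conditions=5, access=0.5),  # cost 2500
]

by_cost = select(patients, annual_cost, slots=2)
by_need = select(patients, lambda p: p.chronic_conditions, slots=2)

print(by_cost)  # ['B', 'D']
print(by_need)  # ['D', 'C']
```

In this toy data, ranking by cost admits patient B, who has three conditions and full access to care, ahead of patient C, who has four conditions but half the access.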
Obermeyer sees similar potential pitfalls in using testing data to guide the allocation of health care resources during the COVID-19 pandemic. Black, Latinx and other marginalized populations may have less access to the limited number of tests that are available, giving the impression that their communities are healthier, when in fact they are the ones facing the most dire need.
“The [disparities] could be much, much worse, because we are only measuring the tip of the iceberg of the actual COVID cases, and we don’t have the data infrastructure that lets us track what’s actually happening,” Obermeyer said.
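The testing concern comes down to simple arithmetic, sketched below with invented numbers. The two communities, their infection counts, the `detected_pct` values and the 90 units of resources are all hypothetical and are not drawn from any real COVID-19 data.

```python
# Hypothetical communities: true infections and the percentage of those
# infections that limited testing actually confirms.
communities = {
    "X": {"infections": 1000, "detected_pct": 50},
    "Y": {"infections": 2000, "detected_pct": 20},
}


def confirmed_cases(c):
    # Confirmed cases are only the tested share of true infections.
    return c["infections"] * c["detected_pct"] // 100


def allocate(units, counts):
    # Split resources in proportion to each community's count.
    total = sum(counts.values())
    return {name: units * n // total for name, n in counts.items()}


confirmed = {name: confirmed_cases(c) for name, c in communities.items()}
true_counts = {name: c["infections"] for name, c in communities.items()}

by_confirmed = allocate(90, confirmed)
by_true = allocate(90, true_counts)

print(confirmed)     # {'X': 500, 'Y': 400}
print(by_confirmed)  # {'X': 50, 'Y': 40}
print(by_true)       # {'X': 30, 'Y': 60}
```

Here community Y has twice as many infections as X, yet its lower testing rate makes it look less affected, so an allocation based on confirmed cases sends it fewer resources than one based on true need.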
Plans to direct limited health care resources to healthier patients, who are more likely to survive, may likewise reinforce existing disparities, Obermeyer said, because individuals from marginalized groups have higher rates of pre-existing illnesses, possibly due to the stress of experiencing racism or poverty.
Niloufar Salehi, an assistant professor in the School of Information, is developing tools to help combat misinformation and conspiracies about the coronavirus that are proliferating online. Her efforts are hampered by the fact that much of this “fake news” is spreading within private groups of friends and family members where it is hard to track, she said.
To help combat data’s potential to perpetuate disparities, she suggests researchers start by challenging the narratives of individual responsibility that often blame marginalized communities for their own struggles.
“I think one [narrative] that is very dangerous is that black and Hispanic people are dying in record numbers because of their own risky behavior,” Salehi said.
As an ethnographer, Sarah Vaughn brings a unique perspective to the conversation: Her data is usually qualitative, based on description and observation, rather than quantitative.
Vaughn, an assistant professor of anthropology, studies how different communities approach disaster preparedness and climate adaptation in the Caribbean. From her work, she’s learned that qualitative understanding of social and cultural context — such as considering who has access to COVID-19 tests and ventilators — also needs to be taken into account when looking at data.
“Data comes from somewhere,” Vaughn said. “We are recognizing that simply collecting demographic data is not enough, and, in fact, [we are] trying to understand the broader relationship in not just how that demographic data is produced, but also relate it to broader ways in which people use or don’t use technologies.”