Tata Consultancy partners with campus to interpret personal genomic variation

Although more than 3,000 people have had at least a portion of their genomes sequenced, and a growing number of personal genomics companies are urging you to be next, scientists still have a poor understanding of what the differences in your genome really mean.

That, say University of California, Berkeley, scientists, is the impetus behind a new campus initiative to develop a pioneering software platform to analyze these differences and bring closer the era when one’s personal genome will be a starting point for health and medical advice.

“What we have now are numerous disparate sets of incompatible databases, and no common infrastructure for integrating and analyzing genetic variation,” said Steven E. Brenner, a UC Berkeley genomics professor in the Center for Computational Biology. “We are focusing on building a robust platform to identify the genetic basis of disease and inherited traits, as well as, ultimately, a resource that clinicians can use to inform their interpretation of genetic information for medical purposes.”

Establishing a Genome Commons

The Genome Commons Navigator interpreter is one of the proposals made by UC Berkeley professor Steven Brenner in 2007 when, in a commentary in the journal Nature, he urged formation of a Genome Commons, “a public knowledge base of human genetic variation and its effect, culled from databases, diagnostic laboratories, and the scientific literature.”

Brenner’s proposals emerged from frustration after the announcement that year of the first two human genomes sequenced, those of Nobelist James Watson and of biologist and entrepreneur Craig Venter.

“There was virtually nothing said about what the sequences actually meant for the biology, health or diseases of these individuals,” said Brenner, who is a professor of plant and microbial biology, a faculty scientist at Lawrence Berkeley National Laboratory and an affiliate of the departments of bioengineering and molecular and cell biology. “Given that interpreting an individual’s genome was one of the ultimate goals of the Human Genome Project, the fact that we were able to do so little with those initial genomes was a profound disappointment.”

This was, in part, because the data on genetic variation are dispersed among a wide array of sources, from scientific papers to some 2,000 individual databases focused only on specific mutations. Even today, there are no public resources that actually integrate this information, nor standard, accessible tools to analyze the data once they are integrated.

Since 2007, Brenner has focused on remedying the situation. He has already achieved some success by convincing the owners of some of the individual databases to share their information. He was recently named to the board of the Human Genome Variation Society, a group composed largely of clinical geneticists who compile and maintain these databases.

“I think we have convinced the community of the importance of integrating this information to allow analyses across an entire genome,” he said.

The initiative was launched this year with an initial $900,000 grant over four years, part of a collaborative program with Tata Consultancy Services (TCS), a global IT services and business solutions firm headquartered in India. In addition to funding the project, TCS will offer research and software expertise from its global Innovation Labs to work with UC Berkeley in taking the project forward. The program will provide a fruitful exchange of ideas and researchers between TCS and UC Berkeley. The entire contribution from TCS to the campus is estimated to be $3 million to $4 million.

“TCS is proud to support, and be part of, the Berkeley Interpreter for Genome Variation consortium to help realize the clinical benefits of personalized medicine. The collaboration with UC Berkeley represents an extension of TCS’ successful Co-Innovation Network (COIN™) program with academic institutions. Researchers from TCS Innovation Labs in Hyderabad will work closely with the faculty at Berkeley on this engagement,” said K. Ananth Krishnan, vice president and chief technology officer for TCS. “Healthcare is a key research theme in TCS and this project is sure to have a big impact on biology and molecular medicine.”

The Center for Computational Biology (CCB), one of UC Berkeley’s strategic initiatives, includes more than 35 faculty members from nine departments, including molecular and cell biology, mathematics, computer science, biostatistics, and plant and microbial biology. The center’s main goal is the computational interpretation of variation in the human genome.

“Given that the $1,000 genome sequence is two to three years away, and given that so little of human genetic variation can be interpreted with respect to its biological significance, the center faculty believes that this is the defining grand challenge for human biology in this century,” said center director Lior Pachter, a UC Berkeley professor of mathematics, molecular and cell biology, and electrical engineering and computer sciences.

“This project will leverage the center’s talents in computer science, experimental biology and model systems to tackle a major problem today, how to assess genetic mutations and predict their implications,” he said. “TCS is making a complementary and very powerful contribution to this effort, bringing its strengths in engineering, global delivery and consulting services to bear on development of the interpreter. We are extremely pleased and excited about the Berkeley-TCS collaboration.”

“Some have questioned the significance of the Human Genome Project for the practice of medicine,” noted Mark Schlissel, dean of the biological sciences in UC Berkeley’s College of Letters & Science. “These critics fail to realize that these are still early days. Our ability to acquire genome data has outpaced our ability to mine its riches. The Berkeley Interpreter for Genome Variation represents a giant step toward unlocking this potential for the benefit of human health.”

The new project will initially focus on building the Genome Commons Navigator interpreter, leaving unification of the databases for the future. Brenner anticipates that TCS’ support will allow the UC Berkeley team to first construct a pilot interpreter that will integrate numerous plug-in modules. The entire project will be open-source, so that the scientific community can use, test, share and refine the software.

“No one has a monopoly on all the good ideas for interpreting the meaning of genetic variation and its likely impact, so we hope to create a system that will let the good ideas blossom,” said former CCB director Jasper Rine, a professor of molecular and cell biology who was instrumental in establishing the collaboration with TCS. “People will be able to test their ideas in parallel or competitively against others to figure out which applications are best for which purposes.”

Work has already begun, with Brenner, Pachter, Rine and others in the computational biology center meeting with TCS engineers to sketch out a design for the software platform.

“The university can play a very important role in actual basic research on molecular biology to help us understand all these genetic variations,” said Pachter. “But outside investment is key to developing an open-source platform that gives all researchers, whether in academia or industry, easy, seamless access. TCS is the first of what we hope will be many corporate sponsors in this open-source consortium.”