University of Minnesota
Institute of Technology
Inventing Tomorrow

Harvesting genomes to feed the world

by Paul Sorenson

Genetic engineering promises to revolutionize both medicine and agriculture by giving doctors new tools for diagnosis and treatment and by enabling biologists to create new strains of fruits, vegetables, and grains that are hardier, more nutritious, and more flavorful.

But those breakthroughs won't happen overnight. Before researchers can precisely manipulate an organism's DNA to create a desired effect, they must map and analyze its entire genetic code, or genome, to determine which genes correspond to specific traits. That process involves collecting and organizing enormous amounts of genetic information.

“Researchers are generating [genome] data on a scale no one ever dreamed of before,” says Ernest Retzel, director of biocomputing for the University's Academic Health Centers. The challenge, he says, is to find a way to organize and analyze unprecedented amounts of data.

Toward that end, he and an interdisciplinary research team led by Professor Vipin Kumar are designing new computer techniques for working with genetic data.

“At one time, data acquisition was the slow part,” says Retzel. “We could analyze data using old computer techniques on new CPUs and keep ahead. Now, [researchers] are generating so much data, the bottleneck is in the analysis.”

Analyzing genetic data is unusually difficult because so many variables in the data are interconnected, and different labs have varying standards of data collection. Moreover, few existing techniques apply to this kind of “high-dimensional” data.

“This is perhaps the most challenging data-mining problem that exists,” says Jaideep Srivastava, an associate professor of computer science and member of the research team. “The knowledge to be gained is of the most complex kind.”

To meet that challenge, the team is developing new computer programs to help biologists integrate and visualize genetic data from various sources and examine it to identify structural and functional patterns linked to specific traits. Once researchers link patterns to functionality, they can begin to manipulate both.

“You don't need to think of a cell as magical anymore,” says Retzel. “It's a system you can tweak like anything else.”

Retzel and research associate Elizabeth Shoop provide the biological expertise and data around which the project is built. Kumar, Srivastava, and fellow computer scientists Ravi Janardan, George Karypis, and Shashi Shekhar are working to formulate and solve the computational problems. Once those problems are solved, the team will produce a collection of data-mining, data-integration, and data-visualization software for other biologists to use.

The project is being developed and tested using genetic information from a wide variety of plants, including soybeans, rice, eucalyptus, and corn, because the group wanted to avoid the ethical issues that might arise from working with animal or human DNA. Team members also share a concern about the world's food supply.

“Food production and population growth are way out of sync,” says Retzel. “The 21st-century problem will be sustainable agriculture.” He hopes the team's research will speed development of genetically enhanced crops to better meet the nutritional needs of the world's burgeoning population.

“For example, 90 percent of the world relies on rice as a major source of food,” he says. “Using these tools to enhance its yield and improve its resistance to disease should be a top priority.” The group is collaborating with researchers in other University departments and in universities and companies around the world to achieve that goal.

Unfortunately, creating successful new breeds of super-crops may prove far more complicated than mixing and matching genes. “If you change the data in a tomato to make it stay ripe longer, how will that affect its nutritional value?” asks Kumar. “We don't know.”

But the pace of genetic research is accelerating, and computer tools may someday allow biologists to predict those effects. “The techniques we're developing will allow molecular biologists to ask and answer questions that they cannot even consider asking today,” says Kumar.

For more information, see www.cs.umn.edu.