Do genes express themselves through poetry?

Published May 9, 2016

A new study from Michigan State University makes inroads in learning to “read” the genome, a key goal of modern biology.

The results were recently published in eLife, a premier peer-reviewed open-access scientific journal for the biomedical and life sciences.

Composed of coding regions and regulatory regions, the DNA content of our genomes resembles a complex biological language. Although protein-coding regions in DNA could be compared to a traffic signal—utilizing a simple stop or go message—the regulatory regions in DNA are more like poetry.

“The regulatory sites in DNA operate like a light switch to turn a gene on and off. In animals, it’s extremely complex,” said David Arnosti, MSU professor of biochemistry and molecular biology and the paper’s lead author. “There might be hundreds of protein factors in the cell that bind to the gene and impact activity. And there might be hundreds of binding places.”

He compares the “language” used in these regulatory sites to poetry.

“It may be Emily Dickinson, or Shakespeare or Allen Ginsberg; but all are using 'words' to evoke thoughts and emotions, to control the message,” he said.

“As we enter an era where the DNA sequences of entire human populations are increasingly accessible, we would like to know the functional significance of changes in gene regulatory regions,” Arnosti said.

Arnosti conducted the study with Rupinder Sayal, now assistant professor of biochemistry at DAV University in Jalandhar, India; Jacqueline Dresch, now professor of mathematics at Clark University in Worcester, Mass.; and Irina Pushel, formerly an MSU College of Natural Science Dean’s Research Scholar, now a pre-doctoral researcher at the Stowers Institute for Medical Research in Kansas City, Mo.

The team studied a set of regulatory proteins responsible for switching genes on and off in the Drosophila embryo. A regulatory factor called Dorsal controls a network of genes crucial for development of fruit fly embryos. Dorsal binds to the regulatory region or “enhancer” of a gene called rhomboid; the element has been well studied and is known to be a fairly typical example of an enhancer region. Dorsal is the “kissing cousin” of NF-kappaB (NF-kB), a factor critical for human immunity and for inflammation in disease.

“We analyzed dozens of variants of this gene and quantitatively measured expression in about 1,000 embryos, creating a quantitative data set that could be used to train mathematical models, utilizing parameter optimization,” Arnosti explained. “Our study shows that the regulatory properties of specific control proteins are accessible by employing quantitative experiments and mathematical models.”

By applying an ensemble of models, the research team was able to identify conserved regulatory properties in other sequences, a step toward “reading” the genome.

“Using this approach, we will eventually be able to do the same thing you would do in English class—pick up a book of haiku or Shakespeare and understand that 'this is a love poem,' or 'this is an elegy,' because we’ll understand how the words—the DNA elements—are used in different contexts to convey different meanings on the regulation of genes,” Arnosti said.

Similar studies will be required to learn how mutations found across the genome may impact gene expression, knowledge that could lead to better diagnosis and treatment of disease.

“For example,” Arnosti said, “we can compare gene expression in a tumor and in normal tissue from the same patient to figure out what went wrong in this gene network. Having the power to read regulatory potential from DNA sequences can contribute to using precision medicine—prescribing certain drugs or treatments that would work specifically on that patient’s cancer.”

This research could also contribute to evolutionary studies that survey genomes not yet examined, revealing important regulatory properties that may aid development of new products or disease treatments.

The work was funded by a grant from the National Institutes of Health.