May 10, 2004
Surprising 'ultra-conserved' regions discovered in human genome
By Branwyn Wagman and Tim Stephens
Researchers comparing the human genome with the genomes of other species
have discovered a surprising number of matching DNA sequences in a variety
of vertebrate species, including the mouse, rat, dog, and chicken.
The fact that these sequences have remained unchanged over long periods
of evolutionary history indicates that they are biologically important,
but for now their functions are largely a mystery.
Published May 6 by Science Express (the online edition of the
journal Science), these findings are the joint work of UCSC researchers
Gill Bejerano, a postdoctoral researcher; David Haussler, professor
of biomolecular engineering and a Howard Hughes Medical Institute investigator;
and research scientist W. James Kent, together with a team of researchers
from the University of Queensland in Australia.
By scanning the human, rat, and mouse genomes for matching regions
of 200 or more DNA bases (As, Cs, Gs, and Ts), the researchers found
481 regions that were completely unchanged. All of the unchanged regions,
referred to as "ultra-conserved elements," were also found
in the dog and chicken genomes, and two-thirds of them were found in
the fish genome. But they could not be traced beyond the fish to nonvertebrate
species whose genomes have been sequenced, such as the sea squirt, fly,
and worm.
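The scan described above can be illustrated with a minimal sketch. It is not the researchers' actual pipeline: it assumes the genomes have already been aligned into equal-length strings (gaps written as "-"), and the function name `ultraconserved_runs` and the toy sequences are hypothetical. It reports every stretch of alignment columns in which all species carry the same base for at least a minimum length (200 in the study, 6 in the toy example).

```python
def ultraconserved_runs(aligned, min_len=200):
    """Return half-open (start, end) column intervals in which every
    aligned sequence carries the same base, for runs of >= min_len.

    Assumption: `aligned` holds pre-aligned, equal-length strings,
    with "-" marking alignment gaps.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")

    runs = []
    start = None  # first column of the current identical run, if any
    for i in range(length):
        column = {seq[i].upper() for seq in aligned}
        # A column counts only if all species agree on a real base.
        identical = len(column) == 1 and column <= set("ACGT")
        if identical:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    # Close a run that extends to the end of the alignment.
    if start is not None and length - start >= min_len:
        runs.append((start, length))
    return runs


# Toy alignment (made up for illustration): one mismatch at column 8.
human = "ACGTACGTTTGACCA"
mouse = "ACGTACGTATGACCA"
rat = "ACGTACGTTTGACCA"
print(ultraconserved_runs([human, mouse, rat], min_len=6))
```

With the study's threshold of 200 bases, the same toy alignment yields no runs at all, which gives a sense of how demanding the criterion is.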
"As far as we can tell, most of these ultra-conserved elements
showed up during the evolution of vertebrates, perhaps during the period
when land animals emerged, or a bit earlier. But their early evolutionary
history is still mysterious," Haussler said.
Although they have been conserved meticulously through hundreds of
millions of years of evolution, only a small fraction of these elements
code for proteins. Protein coding, whereby DNA code directs the production
of a specific protein, is how most genes carry out their functions.
But fewer than a quarter of the ultra-conserved elements overlap coding
regions of the human genome, and in most of those cases they overlap
only a short span of the coding region and extend beyond it to noncoding
areas.
Nevertheless, most of the 481 ultra-conserved elements appear to be
associated in some way with genes: if they do not overlap genes, they reside
near them or in their noncoding portions. Furthermore, they
tend to be associated with parts of the genome that are involved in
regulating the expression of genes in various ways.
"These parts of the genome are far more conserved than we would
have imagined. We think these segments evolved in the past, then froze
into place and were inherited unchanged from then on," Bejerano
said.
More than half of the ultra-conserved elements that overlap coding
regions are associated with genes that take more than one form, depending
on how they are transcribed to RNA. Through a process known as alternative
splicing, different parts of a gene may be spliced out under different
circumstances, so that a single gene can produce several different proteins.
Bejerano thinks the association of ultra-conserved elements with alternatively
spliced genes is significant.
"It's a cautious hypothesis that these elements may cause some
type of interaction to determine what part of the gene will be spliced
out," he said.
The ultra-conserved elements that do not overlap with any coding region
tend to be found in regions of the genome that are associated with gene
regulation, the transcription of DNA to RNA, or the binding of regulatory
proteins to the DNA.
"There was some speculation among biologists as to whether we
would find new kinds of things when we sequenced the complete genomes
of animals, including our own genome, or if it would just be more of
the same kinds of things that were already known. Well, it's not just
more of the same," Haussler said.
The discovery of ultra-conserved elements in the human genome came
from investigating the 5 percent of the genome known to be highly conserved
among the human, mouse, and rat genomes. Most of this highly conserved DNA--3.5
percent of the genome--is in noncoding regions and has no known purpose.
Bejerano and his colleagues sought to find functional elements within
that 3.5 percent.
"We began by looking for meaningful families within these conserved,
noncoding elements," he said.
Some of them fell into clusters of at least two elements with a common
genomic ancestor. These were most likely duplications within one genome
that were retained and modified over time. In one such family, the different
elements within the human genome were 80 to 90 percent similar, while
each human element was 96 percent similar to the corresponding elements
in the mouse and rat genomes.
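As a rough illustration of what a figure such as "96 percent similar" means, the sketch below computes percent identity between two aligned sequences. It assumes that similarity is measured as the share of aligned columns with matching bases, which the article does not spell out; the function name `percent_identity` and the example sequences are hypothetical.

```python
def percent_identity(a, b):
    """Percentage of aligned columns in which two sequences agree.

    Assumptions: `a` and `b` are pre-aligned, equal-length strings with
    "-" for gaps; columns that are gaps in both are ignored, and a gap
    opposite a base counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" and y == "-":
            continue  # column carries no information for this pair
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Made-up 10-base example with a single mismatch in the last column.
print(percent_identity("ACGTACGTAC", "ACGTACGTAT"))
```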
Because 96 percent agreement between the similar elements in two species
was surprisingly high, the researchers wondered if they could find other
areas of such agreement. They did, and eventually looked for 100 percent
agreement, which uncovered the 481 ultra-conserved segments that were
the focus of this research.
Looking beyond the human, mouse, and rat genomes for earlier evidence
of these ultra-conserved elements, the researchers found that 97 percent
of them can be aligned with similar regions in the chicken with 95 percent
agreement, even though only 4 percent of the human genome can be aligned
with the chicken genome at all. Human and chicken lines are thought
to have diverged about 300 million years ago. In the fugu fish, which
diverged from the human line more than 400 million years ago, two-thirds
of the ultra-conserved regions could be aligned with 77 percent similarity.
Because they were not able to trace the ultra-conserved segments to
even more distant species, the authors speculate that these particular
parts of the genome represent innovations in the genomes of chordate
species, innovations that evolved rapidly at first and then became
effectively frozen in birds and mammals.
"These ultra-conserved elements are long, they evolved rather
rapidly, and they are now evolutionarily frozen. We don't know of a
biomolecular mechanism that would explain them," Haussler said.
When the researchers compared individuals within the human population,
they also found little variation in these ultra-conserved elements.
To determine this, the team combed the conserved elements for genome
variations called single nucleotide polymorphisms (SNPs), which are
changes in individual genomes that are often used in genetic testing
to distinguish one human from another. They found 20-fold fewer SNPs
in the conserved regions than would be expected if SNPs were randomly
distributed throughout the genome.
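The arithmetic behind a "20-fold fewer" comparison can be sketched as follows. The expected count assumes SNPs fall uniformly across the genome, and every number in the example is a made-up placeholder chosen to show the calculation, not a figure from the study; the function name `fold_depletion` is likewise hypothetical.

```python
def fold_depletion(observed_snps, region_bases, genome_snps, genome_bases):
    """How many times fewer SNPs a region has than expected if SNPs
    were spread uniformly across the genome."""
    if observed_snps <= 0:
        raise ValueError("fold depletion is undefined for zero observed SNPs")
    # Expected count = genome-wide SNP density * size of the region.
    expected = genome_snps * region_bases / genome_bases
    return expected / observed_snps


# Placeholder numbers for illustration only (NOT from the study):
# 1 SNP per 1,000 bases genome-wide, 100,000 bases of conserved
# sequence, and 5 SNPs actually observed there.
print(fold_depletion(observed_snps=5,
                     region_bases=100_000,
                     genome_snps=3_000_000,
                     genome_bases=3_000_000_000))
```

In this toy case 100 SNPs would be expected and 5 are seen, so the region is depleted 20-fold.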
Interestingly, the rate of change in these regions between species
also appears to be 20-fold lower than expected, supporting the idea
that the conserved elements evolve about 20 times more slowly than
the genome as a whole.
The only other part of the genome having a level of conservation approaching
that of these ultra-conserved elements is the DNA that codes for the
components of ribosomes. Ribosomes are complex molecular machines
made of RNA and proteins. They translate the genetic code to carry out
protein synthesis in all cells. According to Bejerano, ribosomal sequences
are highly conserved because they are essential to all forms of life.
"Ribosomes are crucial. If anything goes wrong with them, the
organism will not survive," he said.
The DNA sequences that code for ribosomal RNA contain long stretches
of bases that are perfectly conserved throughout evolution. Unlike the
ultra-conserved elements uncovered in this study, though, ribosomal
RNA is ancient and is common to all species.
"The ultra-conserved elements are not nearly as old as the previously
known conserved elements in the ribosomal RNA, which is truly ancient
and fundamental to all branches of life," Haussler noted.
In addition to Bejerano, Haussler, and Kent, the coauthors on the new
paper include University of Queensland researchers Michael Pheasant,
Igor Makunin, Stuart Stephen, and John Mattick.
The work at UCSC on ultra-conserved elements in the human genome was
supported by the National Human Genome Research Institute (NHGRI), the
National Cancer Institute, and the Howard Hughes Medical Institute.