|
The common ancestor of placental
mammals probably looked like Eomaia scansoria,
the earliest known placental mammal, shown here in an
artist's reconstruction based on a 125-million-year-old
fossil skeleton found in China in 2002. Now UCSC researchers
say they can reconstruct the DNA of the common ancestor
of placental mammals with 98 percent accuracy.
Image by Mark A. Klingler, Carnegie Museum of Natural
History
|
December 6, 2004
Computer analysis shows scientists could
reconstruct the genome of the common ancestor of all placental
mammals
By Tim Stephens
Contrary to the movie Jurassic Park, in which scientists
recreate dinosaurs from ancient DNA, genetic material more than
about 50 thousand years old cannot be reliably recovered. Nevertheless,
a team of scientists has now demonstrated that computers could
be used to reconstruct with 98 percent accuracy the DNA of a
creature that lived at the time of the dinosaurs more than 75
million years ago--a small, furry nocturnal animal that was
the common ancestor of most placental mammals, including humans.
Knowing this ancestral mammal's complete genome--the sequence
of As, Cs, Ts, and Gs in the DNA that made up its chromosomes--would
not mean that scientists could bring it to life. (For one thing,
synthesizing that much DNA would be prohibitively expensive
and technically difficult.)
But that's not the point. The point is to understand the evolution
of humans and other mammals at the molecular level, said David
Haussler, professor of biomolecular engineering at UCSC.
"We will be able to trace the molecular evolution of our
genome over the past 75 million years. It's a very exciting
new way to think about our origins, a kind of DNA-based archaeology
to understand how we came to be," said Haussler, a Howard
Hughes Medical Institute (HHMI) investigator and director of
UCSC's Center for Biomolecular Science and Engineering.
Haussler and Mathieu Blanchette, a postdoctoral researcher
at UCSC who is now at McGill University, teamed up with Eric
Green, scientific director at the National Human Genome Research
Institute (NHGRI) and director of the NIH Intramural Sequencing
Center, and Webb Miller, professor of biology and computer science
and engineering at Pennsylvania State University, to look at
the possibility of reconstructing the ancestral mammalian genome.
A paper describing their findings appears in the December issue
of the journal Genome Research.
The study is an extension of ongoing research in comparative
genomics--the effort to understand the human genome by comparing
it with the genomes of other species. By comparing the human
genome to the ancestral genome, scientists may learn much more
than they can from comparisons with other living species, such
as the mouse, rat, and chimpanzee, Haussler said.
"If we find a DNA sequence in the human genome that is
missing in the corresponding place in the mouse genome, we can't
tell whether that DNA was inserted in the evolution of humans
from the mammalian ancestor or deleted in the evolution of mice,"
he said. "If the ancestral genome is available, this ambiguity
disappears."
The researchers developed a computational procedure for reconstructing
ancestral genome sequences that was based primarily on Miller's
widely used genome comparison software, together with an assortment
of new computer algorithms devised for this project. To test
the reconstruction process, they created an artificial set of
mammalian genome sequences for which the ancestral sequence
was known.
Blanchette, who is the first author of the paper, generated
this artificial evolutionary tree by creating a massive software
program to simulate all the known processes that modify DNA
as it evolves. The program was based on a huge amount of data
from Green's research at NHGRI, where scientists are doing comparative
analyses of genomic sequences from many different vertebrate
species. In particular, the researchers focused on a region
called the CFTR locus, which includes the gene involved in cystic
fibrosis. This region--encompassing 10 genes and adjacent stretches
of DNA, for a total of more than one million base pairs or "letters"
of genetic code--had been completely sequenced in many different
mammals.
"This region served as an example of the evolutionary changes
that happened in all these different mammalian lineages. [Blanchette]
took all of the information we learned from detailed examination
of this one region and incorporated it into a software program
that is able to simulate mammalian evolution at the molecular
level," Haussler said.
The researchers applied the software to a hypothetical ancestral
DNA sequence, artificially evolving the DNA along all the various
pathways of mammalian evolution to create simulated modern DNA
sequences for 20 different species. Then they used their reconstruction
procedure (a completely separate process that incorporates no
information from the simulation process) to recreate the ancestral
sequence. Comparing the reconstructed sequence with the original
ancestral sequence, they found that it was 98 percent accurate.
The next step was to apply the reconstruction process to the
actual genomic sequences for the CFTR locus in the 19 mammalian
species for which this DNA sequence was available, including
humans. The researchers evaluated the accuracy of this reconstruction
by comparing it to species not included in the group from which
it was derived, such as chickens and opossums. These comparisons
suggested an accuracy of more than 99 percent in the places
they could test, including the cystic fibrosis gene itself.
"Overall, we think that we can reconstruct the DNA sequences
of the ancestral genome with an accuracy of more than 98 percent,"
Haussler said. "That is much higher accuracy than people
expected would be possible."
Haussler said he hopes these findings will encourage the additional
genome sequencing that would be needed to do a complete reconstruction
of the ancestral mammalian genome. Living mammals, from apes
to bats to whales, are all variations on a common mammalian
theme. Comparisons with their common ancestor can reveal new
insights into not only the core biology that all mammals share
in common, but also the unique traits that define each species,
he said.
Nearly complete genome sequences are available now for five
mammals, and about 20 would be needed for an accurate reconstruction.
NHGRI and other organizations are planning to fund sequencing
projects for a number of additional mammals, but it remains
to be seen whether and how soon enough genomic data will be
available, Haussler said.
"It's a great challenge to reconstruct the history of the
entire human genome, but it's the key to understanding what
makes us special," he said.
Return to Front Page