by Jeffrey R. Thompson*1
Introduction:
Palaeontology is truly a science of the twenty-first century. Palaeontologists are no longer concerned only with fossils, but also with topics such as genetics, developmental biology and chemistry — although most of us can’t resist digging around in the dirt from time to time! You are almost as likely to find a palaeontology graduate student in a class on molecular biology as in one on stratigraphy. This is because, in recent years, the integration of fossil, developmental and genetic data has fast become one of the most promising ways to study the patterns and processes of evolution.
At this point, it may be helpful to introduce some of the sources of information that palaeontologists use to address large-scale evolutionary questions. Molecular biology mostly involves the study of DNA, the related molecule RNA and proteins. Molecular developmental biology looks at the role of DNA, RNA and proteins in development: the process by which all animals and plants, including you and me, your dog and even a 66-million-year-old Tyrannosaurus rex, grew from microscopic single cells into their more recognizable forms. The processes of development are controlled or regulated by the expression of genes in certain tissues at different times. Genes are segments of DNA that encode the instructions for making RNA and proteins (Fig. 1), which are in turn responsible for regulating the expression of other genes and for building the various tissues that make up an organism’s body.

Piecing together the evolution of animals, plants and other organisms requires an understanding of the genomes underlying their body plans. Genomes are made up of the entire suite of an organism’s DNA, and researchers have worked out how to read, or sequence, them: the genomes of more than 17,000 plants, animals, fungi, bacteria and archaea have been sequenced so far. Furthermore, advances in molecular sequencing have allowed scientists to extract DNA and whole genomes from subfossils — the remains of dead organisms that are on their way to becoming fossils — albeit only those younger than one million years old. But how can we get a glimpse of the genomic material of older fossils, in which the original DNA has long since decomposed? The answer lies in the emerging field of palaeogenomics. This involves inferring the presence of particular genes from analyses of the fossil record. Certain genes code for RNA and proteins specific to particular characteristics, so the presence of these characteristics in the fossil record implies the presence of those genes. With this reasoning, palaeontologists can attempt to understand the genomes of fossil organisms, and determine the timing of key evolutionary innovations.
Putting the genome in palaeogenomics:
DNA makes RNA and RNA makes proteins. This simple idea was first proclaimed by the British biologist Francis Crick, who is famed for discovering the structure of DNA (together with James Watson). DNA and RNA are both nucleic acids, which are made up of numerous small molecules called nucleotides. Each nucleotide contains one of four nitrogenous bases: adenine, guanine, cytosine or thymine (thymine is replaced with uracil in RNA). These bases hold the strands of nucleic acids together, and each base pairs specifically with one other base: guanine with cytosine, and adenine with thymine or uracil. DNA is double-stranded and arranged in a characteristic ‘double helix’, whereas RNA is often single-stranded and usually arranged in a single, lonely helix.
As stated above, DNA is responsible for making RNA, a process that is known as transcription. During transcription, the DNA double helix (Fig. 1A) unwinds into two linear strands, and an enzyme known as RNA polymerase then moves along a small portion of one of these strands (Fig. 1B), producing a ‘complementary’ strand of messenger RNA (mRNA; Fig. 1C), in which each base is the pairing partner of the corresponding base in the original DNA strand. For example, if the bases of a strand of DNA read TCGAA, the complementary mRNA will read AGCUU (remembering that pesky uracil takes the place of thymine in RNA). This mRNA is then translated into an amino-acid chain with the help of transfer RNA (tRNA), which reads the mRNA in three-nucleotide sequences called codons (Fig. 1D). Each codon corresponds to a specific amino acid, and these amino acids are linked together into a chain. When translation has finished, the resulting chain of amino acids folds into a protein (Fig. 1E). Proteins perform a wide range of jobs in cells, including cell–cell signalling, catalysing reactions and, most importantly, controlling the transcription of other genes. All of these tasks are essential to the development of an organism and its tissues, and in this way, genes and groups of genes encoded in the genome are directly responsible for making the various tissues, morphologies and body parts that make up a whole organism.
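For readers who like to see the bookkeeping spelled out, the two steps can be sketched in a few lines of Python. This is only a toy illustration: it follows the simplification used in the text of pairing bases position by position (ignoring strand direction), the function names are invented for this example, and the codon table is a tiny excerpt of the standard genetic code rather than the full set of 64 codons.

```python
# Base-pairing rules for copying a DNA template strand into mRNA
# (uracil, U, takes the place of thymine in RNA).
DNA_TO_MRNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

# A deliberately small excerpt of the standard genetic code.
CODON_TABLE = {"AUG": "Met", "UUU": "Phe", "GGC": "Gly", "UAA": "Stop"}


def transcribe(template_dna: str) -> str:
    """Return the mRNA complementary to a DNA template, base by base."""
    return "".join(DNA_TO_MRNA[base] for base in template_dna.upper())


def translate(mrna: str) -> list[str]:
    """Read the mRNA one codon (three bases) at a time until a stop codon."""
    amino_acids = []
    for start in range(0, len(mrna) - 2, 3):
        codon = mrna[start:start + 3]
        residue = CODON_TABLE.get(codon, "???")  # "???" = not in our excerpt
        if residue == "Stop":
            break
        amino_acids.append(residue)
    return amino_acids


if __name__ == "__main__":
    print(transcribe("TCGAA"))                    # the example from the text
    print(translate(transcribe("TACAAACCGATT")))  # a made-up 12-base template
```

The first call reproduces the TCGAA example above; the second uses a made-up template chosen only so that every codon appears in the small table.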
Modern molecular biology uses methods such as in situ hybridization (Fig. 2) to show exactly where in the body certain genes are expressed during development. The developing organism is exposed to a ‘probe’, a strand of RNA or DNA complementary to the RNA product of the gene of interest. This probe hybridizes (binds) with the complementary nucleic acids present in the embryo, in places where the DNA is actively being transcribed into RNA. The trick to all of this is that the probe contains a marker of some sort, usually a dye or a radioactive isotope, which indicates exactly where the expression is occurring. Places where the probe has bound stand out relative to the rest of the organism when viewed under a microscope. For example, in Figure 2, the dark-purple areas indicate where the RNA probe has bound to RNA in the embryo, thus marking where transcription of the gene of interest was occurring.
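The logic of a probe can also be mimicked with simple string matching. The sketch below is a cartoon, not a model of real hybridization chemistry: the sequences and tissue names are hypothetical, and the functions are invented for illustration. Here the probe is built as the reverse complement of the target RNA, and a tissue counts as 'stained' if any of its transcripts contains the stretch the probe can pair with.

```python
# Pairing partners for RNA bases.
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the strand that would pair with `rna`, read in its own direction."""
    return "".join(RNA_PAIR[base] for base in reversed(rna))


def probe_binds(probe: str, transcript: str) -> bool:
    """A probe binds if the transcript contains the probe's pairing partner."""
    return reverse_complement(probe) in transcript


def stain(embryo: dict[str, list[str]], probe: str) -> dict[str, bool]:
    """Map each tissue to True where at least one transcript binds the probe."""
    return {
        tissue: any(probe_binds(probe, t) for t in transcripts)
        for tissue, transcripts in embryo.items()
    }


if __name__ == "__main__":
    target = "AUGGCUUCA"                 # hypothetical stretch of the gene's RNA
    probe = reverse_complement(target)   # the labelled probe we would add
    embryo = {                           # hypothetical tissues and transcripts
        "skeleton-forming cells": ["GGAUGGCUUCAGG"],
        "gut": ["CCCUUUAAAGGG"],
    }
    print(stain(embryo, probe))
```

In this made-up embryo only the skeleton-forming cells carry the target RNA, so they are the only tissue flagged, just as only the dark-purple regions stand out under the microscope.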

Case studies:
You might be wondering how all this helps us to recover the genomes of extinct organisms. To address this, I want to describe two examples related to my favourite group of animals, the echinoderms. Modern echinoderms are classified into five different groups: asteroids (sea stars); ophiuroids (brittle stars and basket stars); crinoids (sea lilies and feather stars); holothurians (sea cucumbers); and echinoids (sea urchins). Echinoderms have been at the heart of palaeogenomics because, apart from being amazingly cool (they can regenerate their body parts!), they have an excellent fossil record and certain species are widely used as model organisms in developmental biology. This combination has made them ideal organisms for palaeogenomic studies.
One of the first uses of palaeogenomics was in 2006, by US palaeobiologist David Bottjer and his colleagues. They used it to date the origin of the genes underpinning the development of the echinoderm skeleton, which is composed of the mineral calcite and has a distinctive porous microstructure called stereom. This skeleton is present in all known modern and fossil echinoderms (Fig. 3). For 30 years, molecular developmental biologists have studied how the genome controls the formation of the echinoderm skeleton, and the genes and proteins involved in the formation of sea urchin stereom are now very well understood. In situ hybridization (Fig. 3A), paired with sequencing of the genome of the purple sea urchin (Strongylocentrotus purpuratus), has served to identify many of these genes. Among them are transcription factor genes such as Alx1 and Ets1, and the spicule matrix genes, including Sm37, Sm50 and Sm30.

Echinoderms are the only known creatures with a skeleton made of stereom calcite. Animals in the group most closely related to echinoderms, the hemichordates, are predominantly soft-bodied, and do not have many of the genes involved in echinoderm biomineralization. This indicates that stereom and its associated genes must have first evolved somewhere along the lineage between the last common ancestor of echinoderms and hemichordates, and the first echinoderms (Fig. 4). Using the fossil record, it is possible to work out the most recent date by which the genes responsible for biomineralization must have evolved. The oldest evidence of stereom in the fossil record comes from slightly above the Cambrian Stage 3 boundary, approximately 520 million years ago. Because many of the genes that make stereom are specific to echinoderms, the presence of stereom 520 million years ago indicates that these genes must also have been present at this time.

Another example of palaeogenomics concerns the echinoids, or sea urchins. All post-Palaeozoic sea urchins (those that have existed since the Permian–Triassic mass extinction, that is, for the past 252 million years) belong to one of two clades: the euechinoids (Fig. 5A–D) and the cidaroids (Fig. 5E–H). A number of characters differentiate these two groups, but perhaps the most distinctive difference is the perignathic girdle. This is a series of skeletal protrusions where the muscles that push the jaws in and out of the urchin’s mouth are attached (Fig. 5). In euechinoid echinoids, the perignathic girdle is derived from skeletal plates called auricles, while in cidaroids it is formed from different skeletal plates, which are called apophyses. The first occurrence of a cidaroid echinoid in the fossil record, during the Permian, dates the euechinoid–cidaroid divergence to at least 268 million years ago.

The appearance of this oldest cidaroid echinoid, Eotiaris guadalupensis, also allows scientists to date the latest possible appearance of the genes responsible for forming apophyses and auricles. Through the use of in situ hybridization techniques on modern juvenile euechinoid and cidaroid echinoids, it was possible to determine where and when a number of the genes responsible for the formation of apophyses and auricles were expressed. The genes Alx1 and Sm37 were found to be directly involved in the development of the apophyses and auricles (Fig. 6). Furthermore, the gene VegfR was also expressed in the apophyses and auricles, and may have been responsible for the formation of these different structures. These genes must have had similar roles in the first fossil forms with these structures. Eotiaris guadalupensis has apophyses, indicating that Sm37, Alx1 and VegfR must have been involved in the formation of the perignathic girdle by at least 268 million years ago.
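The reasoning in both case studies boils down to the same step: find the oldest fossil that shows a character, and treat its age as the minimum age of the genes that build that character. A minimal sketch of that step is given below. The two ages come from the examples above; the record structure, labels and function name are invented for illustration, and real analyses involve far more careful treatment of phylogeny and dating uncertainty.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FossilOccurrence:
    label: str          # name or description of the fossil
    age_ma: float       # age in millions of years
    traits: frozenset   # skeletal characters visible in the fossil


def minimum_age(occurrences: list, trait: str) -> Optional[float]:
    """Oldest occurrence of a trait = latest date by which its genes existed."""
    ages = [o.age_ma for o in occurrences if trait in o.traits]
    return max(ages) if ages else None


if __name__ == "__main__":
    records = [
        # Ages as quoted in the text; the first label is a placeholder
        # because the text does not name that fossil.
        FossilOccurrence("oldest known stereom", 520.0, frozenset({"stereom"})),
        FossilOccurrence("Eotiaris guadalupensis", 268.0,
                         frozenset({"stereom", "apophyses"})),
    ]
    for trait in ("stereom", "apophyses", "auricles"):
        print(trait, minimum_age(records, trait))
```

A result of None for auricles simply means that this toy dataset contains no fossil showing them; it says nothing about when auricles actually arose. Note also that these are minimum ages only: the genes could be older than the oldest fossil that happens to preserve the character.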

Summary:
Palaeogenomics, the study of the genomes of extinct organisms, is a promising new frontier in palaeontology. With the integration of fossil, molecular biological and developmental data, palaeontologists are able to determine not only when organisms first appeared, but also when specific genes evolved. As more genomic and developmental data become available for a wide variety of organisms, palaeontologists will continue to integrate these data with fossils to gain a better understanding of the evolution of genomes, and the organisms they are responsible for building.
Suggestions for further reading:
Bottjer, D. J., Davidson, E. H., Peterson, K. J. & Cameron, R. A. Paleogenomics of echinoderms. Science 314, 956–960 (2006). DOI: 10.1126/science.1132310
Donoghue, P. C. J. & Purnell, M. A. Genome duplication, extinction and vertebrate evolution. Trends in Ecology & Evolution 20, 312–319 (2005). DOI: 10.1016/j.tree.2005.04.008
Gao, F., Thompson, J. R., Petsios, E., Erkenbrack, E., Moats, R. A., Bottjer, D. J. & Davidson, E. H. Juvenile skeletogenesis in anciently diverged sea urchin clades. Developmental Biology 400, 148–158 (2015). DOI: 10.1016/j.ydbio.2015.01.017
Peterson, K. J., Summons, R. E. & Donoghue, P. C. J. Molecular palaeobiology. Palaeontology 50, 775–809 (2007). DOI: 10.1111/j.1475-4983.2007.00692.x
Telford, M. J., Lowe, C. J., Cameron, C. B., Ortega-Martinez, O., Aronowicz, J., Oliveri, P. & Copley, R. R. Phylogenomic analysis of echinoderm class relationships supports Asterozoa. Proceedings of the Royal Society B 281, 20140479 (2014). DOI: 10.1098/rspb.2014.0479
Thompson, J. R., Petsios, E., Davidson, E. H., Erkenbrack, E. M., Gao, F. & Bottjer, D. J. Reorganization of sea urchin gene regulatory networks at least 268 million years ago as revealed by oldest fossil cidaroid echinoid. Scientific Reports 5, 15541 (2015). DOI: 10.1038/srep15541
Zamora, S. & Rahman, I. A. Deciphering the early evolution of echinoderms with Cambrian fossils. Palaeontology 57, 1105–1119 (2014). DOI: 10.1111/pala.12138
1Department of Earth Sciences, University of Southern California, USA. Email: thompsjr@usc.edu