Applications of Molecular Phylogenetics
Although the methods used in cladistic analysis are the same for both molecular and morphological characters, molecular data provide several advantages. First, molecular data offer a large and essentially limitless set of characters. Each nucleotide position can, in theory, be considered a character and assumed to be independent. The DNA of any given organism has millions to billions of nucleotide positions. In addition, the large size of the genome makes it unlikely that natural selection strongly drives changes at any particular nucleotide. Instead, most nucleotide changes are "unseen" by natural selection, subject only to mutation and random genetic drift. If natural selection is a less prevalent driving force for molecular characters, then the probability of convergence for molecular characters should also be lower.
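To make the idea of "each nucleotide position as a character" concrete, here is a minimal Python sketch. The toy alignment, the taxon names, and the helper functions are all hypothetical and chosen only for illustration. The sketch treats every alignment column as one character and reports which columns vary and which are parsimony-informative (at least two states, each shared by at least two taxa), a common working definition in cladistic analysis.

```python
from collections import Counter

# Hypothetical toy alignment: four taxa, eight aligned nucleotide positions.
alignment = {
    "taxon_A": "ACGTACGT",
    "taxon_B": "ACGTACGA",
    "taxon_C": "ACCTACGA",
    "taxon_D": "ACCTTCGT",
}

def columns(aln):
    """Return each alignment column (one character) as a tuple of states."""
    return list(zip(*aln.values()))

def is_variable(column):
    """A column is variable if more than one state is present."""
    return len(set(column)) > 1

def is_parsimony_informative(column):
    """At least two states must each occur in at least two taxa."""
    counts = Counter(column)
    return sum(1 for n in counts.values() if n >= 2) >= 2

cols = columns(alignment)
variable_sites = [i for i, c in enumerate(cols) if is_variable(c)]
informative_sites = [i for i, c in enumerate(cols) if is_parsimony_informative(c)]

print("variable positions:", variable_sites)
print("parsimony-informative positions:", informative_sites)
```

In a real genome-scale alignment the same loop would run over thousands to millions of columns, which is where the large supply of characters comes from.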
By selecting a particular class of morphological characters, researchers may also bias the analysis so that groups with certain characteristics cluster with others for reasons other than homology. For instance, if the set of characters were weighted toward those involved in carnivory, carnivorous animals might cluster together, not because of homology but because of shared function. This problem is less likely to arise with molecular characters.
Another advantage of molecular data is that all known life is based on nucleic acids; thus, studies involving any type of taxa can use DNA sequence data. Some genes or regions of genes evolve quickly. These are most useful in studies of closely related taxa. Conversely, other genes (or regions) are slower to evolve; these are the most useful for studies of more distantly related organisms. At the extreme, some evolutionarily related genes have been found in organisms as disparate as yeast and humans. The rates at which sections of DNA evolve are primarily determined by the extent of functional constraint. Genes and positions within genes that are the most functionally important generally are the slowest to evolve. This is because they are the least able to tolerate mutational change without substantially reducing the fitness of the individuals that harbor them. Many of these highly conserved genes play a role in development. (See the Genetics of Development unit.)
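One simple way to see the difference between fast- and slow-evolving regions is the proportion of aligned sites that differ between two sequences (often called the p-distance). The sketch below is illustrative only: the sequences are invented, and the function name `p_distance` is an assumption, not something taken from a particular library.

```python
def p_distance(seq1, seq2):
    """Proportion of aligned positions at which two sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(a != b for a, b in zip(seq1, seq2))
    return differences / len(seq1)

# Hypothetical aligned regions from the same pair of taxa.
slow_a, slow_b = "ATGGCCAAGT", "ATGGCTAAGT"   # strongly constrained region
fast_a, fast_b = "ATGGCCAAGT", "TTGACTAGGA"   # weakly constrained region

print("slow region p-distance:", p_distance(slow_a, slow_b))
print("fast region p-distance:", p_distance(fast_a, fast_b))
```

For closely related taxa, a slow region like the first one may show too few differences to resolve relationships, whereas for distantly related taxa a fast region may have changed so much that the signal is obscured. This is why the choice of gene depends on the depth of the question.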
Starting in the late 1970s, Carl Woese took on an ambitious project: determining the relationships of all life. The work resulted in a reorganization of the tree of life. To do this, Woese and his associates took advantage of a molecule that evolves extremely slowly: rDNA, the DNA that encodes the small-subunit ribosomal RNA. They found that the sequences cluster in three groups corresponding to the eukaryotes (Eukarya), the archaea, and the eubacteria. We discussed these three domains earlier.
The three-domains model was controversial for several reasons. First, the conclusions Woese drew were initially based on evidence from a single gene. Perhaps there was something unusual about the way the small-subunit rDNA evolved, his critics said. That controversy was easily resolved by generating more data. Sequences from other slowly evolving genes seemed to confirm the rationale for the three domains. A more fundamental problem was that Woese's tree was unrooted. If each domain represents a monophyletic group, three possibilities existed: (1) that the eubacteria and archaea are sister groups, with the eukaryotes branching off first; (2) that eubacteria and eukaryotes are sister groups; or (3) that archaea and eukaryotes are sister groups. Woese himself suspected this third possibility. A fourth possibility was that the root of the tree lay within one of the domains and, therefore, that domain was not monophyletic. To root a tree, one generally requires an outgroup. But what is the outgroup to all known life? Rocks?
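The three monophyletic scenarios correspond to the three branches of the unrooted tree on which the root could sit. The short Python sketch below simply enumerates them; the function name `possible_rootings` is hypothetical, and the sketch deliberately ignores the fourth possibility of a root falling inside a domain.

```python
from itertools import combinations

domains = ["Eubacteria", "Archaea", "Eukarya"]

def possible_rootings(taxa):
    """For an unrooted three-taxon tree, list (first_branching, sister_pair)."""
    rootings = []
    for pair in combinations(taxa, 2):
        # The taxon left out of the sister pair is the one that branches first.
        first_branching = next(t for t in taxa if t not in pair)
        rootings.append((first_branching, pair))
    return rootings

rootings = possible_rootings(domains)
for first, sisters in rootings:
    print(f"{first} branch first; {sisters[0]} and {sisters[1]} are sister groups")
```

Sequence data from the three domains alone cannot distinguish among these rootings, which is why an outgroup is needed.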
Margaret Dayhoff proposed an ingenious solution to this rooting dilemma: using ancestral genes that are present in multiple copies in the same organism because of gene duplication. If such genes had duplicated before the split among the three domains, one copy could be used as an outgroup to root the tree built from the other copy, and thus the tree of life. In 1989, many years after Dayhoff's suggestion, Naoyuki Iwabe and colleagues used this approach [3]. Organisms in all three domains have two distinct genes that code for the two subunits (alpha and beta) of ATPase, the enzyme that hydrolyzes ATP to yield energy. DNA sequence similarity strongly suggests that these two genes are derived from a gene duplication pre-dating the divergence of the domains. The ATPase-alpha tree, using an ATPase-beta gene as an outgroup, showed that each of the domains was monophyletic, and that eukaryotes and archaea are sister groups. The same result was obtained when ATPase-alpha was used as an outgroup to root the ATPase-beta tree. Similar trees were obtained with other pairs of duplicated genes. In conclusion, Woese was right.
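The logic of reading a paralog-rooted tree can be sketched in a few lines of Python. Everything in this example is a hypothetical illustration: the nested-tuple tree, the leaf labels, and the helper functions are invented for the sketch and are not the data or method of the original study. The tree stands for an ATPase-alpha gene tree that has already been rooted with a beta copy as the outgroup; the code then asks whether each domain is monophyletic and which two domains are sister groups.

```python
# Hypothetical rooted gene tree as nested tuples.
# Leaf labels follow the pattern "<paralog>_<domain>_<sample number>".
rooted_tree = (
    "beta_outgroup",
    (
        ("alpha_Eubacteria_1", "alpha_Eubacteria_2"),
        (
            ("alpha_Archaea_1", "alpha_Archaea_2"),
            ("alpha_Eukarya_1", "alpha_Eukarya_2"),
        ),
    ),
)

def leaves(node):
    """Return the set of leaf labels below a node."""
    if isinstance(node, str):
        return {node}
    return set().union(*(leaves(child) for child in node))

def clades(node):
    """Return the leaf set of every internal node (clade) in the tree."""
    if isinstance(node, str):
        return []
    found = [frozenset(leaves(node))]
    for child in node:
        found.extend(clades(child))
    return found

def domain_leaves(tree, domain):
    """Leaves whose label names the given domain."""
    return {leaf for leaf in leaves(tree) if f"_{domain}_" in leaf}

def is_monophyletic(tree, group):
    """A group is monophyletic if its leaves form exactly one clade."""
    return frozenset(group) in clades(tree)

def are_sister_groups(tree, domain_a, domain_b):
    """Two monophyletic groups are sisters if together they form a clade."""
    a = domain_leaves(tree, domain_a)
    b = domain_leaves(tree, domain_b)
    return (is_monophyletic(tree, a) and is_monophyletic(tree, b)
            and is_monophyletic(tree, a | b))

for domain in ("Eubacteria", "Archaea", "Eukarya"):
    print(domain, "monophyletic:",
          is_monophyletic(rooted_tree, domain_leaves(rooted_tree, domain)))
print("Archaea + Eukarya sisters:",
      are_sister_groups(rooted_tree, "Archaea", "Eukarya"))
```

The reciprocal analysis described in the text would use the same checks on a beta gene tree rooted with an alpha copy; agreement between the two trees is what makes the rooting convincing.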