Biocomputing Unit
Sequence Analysis Service
Gibson Group
EMBL

Practical Courses on Sequence Analysis

Molecular Phylogeny with Maximum Likelihood

by Jose Castresana, July 5th, 2001


 



 

Introduction

In this practical we will learn how to generate trees by maximum likelihood from a set of aligned sequences. We will also use one of the most powerful features of this approach: the ability to evaluate a tree statistically and to compare alternative hypotheses about the branching order in the tree and about the process of evolution that generated the sequences.

We will use aligned DNA and amino acid sequences of several species in order to study the phylogenetic relationships among these species. The same principles apply to alignments of protein families that include paralogues, where the aim is to study the evolution of the protein family.

There are several good programs for maximum-likelihood analysis of a set of aligned sequences. Here we will use only MOLPHY and PUZZLE. Both can analyse DNA and amino acid sequences (maximum-likelihood analysis of amino acid sequences is not yet possible in current versions of other very good packages such as PAUP* or PHYLIP) and both are free. They have complementary features, so it is worth getting used to both. If you mainly analyse DNA sequences, you should also have a look at PHYLIP and PAUP*, which include other methods of tree reconstruction as well, such as distance methods and parsimony (in these cases for both DNA and amino acids).

To follow the practical more easily, it helps to have some knowledge of the basic aspects of phylogenetic analysis, including sequence alignments and hidden multiple substitutions, tree topology, the concepts of monophyly and paraphyly, and the meaning of the molecular clock. You can take a quick look at some pages that explain some of these concepts very briefly:





 

Exercise 1: Phylogenetic trees of amniotes with mitochondrial protein sequences: construction of maximum-likelihood trees and evaluation of alternative phylogenies

Amniotes are those vertebrates that possess many adaptations for terrestrial life: reptiles, birds and mammals. Although it is clear that amniotes form a monophyletic group, the phylogenetic relationships among them have been widely debated from both morphological and molecular points of view. There are three possibilities for the relationship among reptiles, birds and mammals:
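As a small sketch, the three possible rooted arrangements of three groups can be enumerated in Newick notation. The function name `rooted_topologies` is made up for illustration, and the code assumes that each of the three groups is monophyletic.

```python
from itertools import combinations


def rooted_topologies(groups):
    """Return the rooted, fully resolved trees for exactly three groups.

    Each tree joins two groups as sister taxa and leaves the third
    group outside that pair. Trees are returned as Newick strings.
    """
    if len(groups) != 3:
        raise ValueError("this sketch only handles exactly three groups")
    trees = []
    for pair in combinations(groups, 2):
        # The group that is not in the pair branches off first.
        (outsider,) = [g for g in groups if g not in pair]
        trees.append(f"(({pair[0]},{pair[1]}),{outsider});")
    return trees


for tree in rooted_topologies(["reptiles", "birds", "mammals"]):
    print(tree)
# ((reptiles,birds),mammals);
# ((reptiles,mammals),birds);
# ((birds,mammals),reptiles);
```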
 


 

In fact, there may still be other alternatives if reptiles do not form a monophyletic group, as explained below.

 

We will use a given alignment of mitochondrial proteins to construct maximum-likelihood trees of representatives of the main amniote groups. This alignment was generated by concatenating the amino acid alignments of subunits 1, 2 and 3 of cytochrome oxidase, subunits 1 and 2 of NADH dehydrogenase, and cytochrome b. All positions containing a gap were removed from these alignments, since insertion/deletion events are not easy to study.
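The preparation described above (concatenating per-protein alignments, then dropping every column that contains a gap) can be sketched as follows. This is only an illustration, not the procedure actually used to build the course alignment: the function names are hypothetical, the alignments are assumed to be held as dictionaries mapping species name to aligned sequence, gaps are assumed to be written as `-`, and the toy sequences are invented.

```python
def concatenate_alignments(alignments):
    """Join several alignments end to end, species by species.

    `alignments` is a list of dicts {species: aligned_sequence}.
    Every alignment must contain the same species, and all sequences
    within one alignment must have the same length.
    """
    species = list(alignments[0])
    for aln in alignments:
        if set(aln) != set(species):
            raise ValueError("all alignments must contain the same species")
        if len({len(seq) for seq in aln.values()}) != 1:
            raise ValueError("sequences in an alignment differ in length")
    return {sp: "".join(aln[sp] for aln in alignments) for sp in species}


def remove_gapped_columns(alignment, gap_chars="-"):
    """Drop every alignment column in which any species has a gap."""
    names = list(alignment)
    columns = zip(*(alignment[name] for name in names))
    kept = [col for col in columns if not any(c in gap_chars for c in col)]
    return {name: "".join(col[i] for col in kept)
            for i, name in enumerate(names)}


# Invented toy data, only to show the mechanics.
protein_a = {"human": "MFAD-W", "chicken": "MFINRW", "alligator": "MF-NRW"}
protein_b = {"human": "MTPM", "chicken": "MAP-", "alligator": "MTHP"}

combined = concatenate_alignments([protein_a, protein_b])
print(remove_gapped_columns(combined))
# {'human': 'MFDWMTP', 'chicken': 'MFNWMAP', 'alligator': 'MFNWMTH'}
```

Removing whole columns, rather than masking single residues, keeps every remaining position comparable across all species.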
 

 

Calculating the neighbor-joining tree with MOLPHY

Calculating a maximum-likelihood tree with MOLPHY (heuristic search)

Calculating a maximum-likelihood tree with PUZZLE (heuristic search; no rate differences)


Calculating a maximum-likelihood tree with PUZZLE (heuristic search; rate differences)


Comparing alternative phylogenies with PUZZLE (Kishino-Hasegawa test)

Exhaustive searches of maximum-likelihood trees with MOLPHY
(This exercise is for the experts)


Exercise 2: Phylogenetic trees of primates with cytochrome b sequences: testing the molecular clock (likelihood ratio test)
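As a rough sketch of the arithmetic behind a likelihood ratio test of the molecular clock: the statistic is twice the difference between the log-likelihood of the tree without a clock and that of the same tree with a clock, and it is commonly compared with a chi-square distribution with n - 2 degrees of freedom for n sequences (an unrooted tree has 2n - 3 free branch lengths, a clock tree has n - 1 node heights). The function names and the log-likelihood values below are hypothetical; in practice the values come from the output of the tree program, and the chi-square comparison is only a large-sample approximation.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df.

    Uses the closed-form finite sums that exist for integer degrees
    of freedom, so no external library is needed.
    """
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_i (x/2)^(i-1/2) / Gamma(i+1/2)
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(half) / math.gamma(1.5)
    for i in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * term
        term *= half / (i + 0.5)
    return total


def molecular_clock_lrt(lnl_no_clock, lnl_clock, n_taxa):
    """Return (statistic, degrees of freedom, approximate p-value)."""
    if n_taxa < 3:
        raise ValueError("need at least three sequences")
    statistic = 2.0 * (lnl_no_clock - lnl_clock)
    df = n_taxa - 2
    return statistic, df, chi2_sf(statistic, df)


# Hypothetical log-likelihoods for a six-sequence data set.
stat, df, p = molecular_clock_lrt(-2500.0, -2506.0, n_taxa=6)
print(f"2*delta(lnL) = {stat:.2f}, df = {df}, p = {p:.4f}")
```

A small p-value suggests that the clock model fits significantly worse than the model without a clock.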

References



 

Programs and internet resources

Here you can find a very thorough list containing most of the available programs for phylogenetic reconstruction including the ones mentioned in this course:

And here you can find phylogenetic information about many organisms: