Biocomputing Unit
Sequence Analysis Service
Gibson Group
EMBL

Predoc Practical Course on Sequence Analysis

Sequence Alignment and Analysis Practical: Part 1, Clustal X

by Toby Gibson, Chenna Ramu and Christine Gemünd, 28/10/97

In this practical we will use the multiple alignment program Clustal X, a window-based (graphical interface) version of the widely used Clustal W program. We will run Clustal X on UNIX today, but it is also available for Macs and PCs. Clustal X colours the alignment according to the conserved features in each column, which helps to highlight structurally or functionally important aspects of the alignment. It can also mark regions of the alignment that score badly, which allows you to home in on problem regions. Problem regions may be due to alignment error, inappropriate sequences or errors in the sequences. It is often not realised how frequent and severe the problems with sequence alignments can be, so today's exercises will illustrate some of them!
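To make the idea of a "badly scoring" column concrete, here is a minimal Python sketch. It is not the scoring scheme Clustal X uses: it assumes a much cruder measure (the fraction of sequences that share the most common residue in a column), and the function names, the threshold of 0.5 and the toy alignment are all invented for illustration.

```python
from collections import Counter


def column_conservation(alignment):
    """Score each column as the fraction of sequences sharing its commonest residue.

    This is an illustrative measure only, not the Clustal X quality score.
    Gap characters ('-') never count towards the commonest residue.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    scores = []
    for col in range(length):
        residues = [seq[col] for seq in alignment if seq[col] != "-"]
        if not residues:
            scores.append(0.0)  # a column made only of gaps
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(alignment))
    return scores


def low_scoring_columns(alignment, threshold=0.5):
    """Return the 0-based indices of columns scoring below the threshold."""
    return [i for i, s in enumerate(column_conservation(alignment)) if s < threshold]


# Hypothetical toy alignment: the fourth column is gappy and unconserved.
toy_alignment = [
    "MKV-LA",
    "MKVQLA",
    "MRVTLG",
    "MKI-WA",
]

print(column_conservation(toy_alignment))
print(low_scoring_columns(toy_alignment))
```

Columns flagged in this way are only candidates for inspection; whether they reflect alignment error, inappropriate sequences or sequence errors still has to be judged by eye.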


Sequence extraction and alignment tools

We will use:


Exercise 1. Extracting and aligning NusA sequences

NusA is an RNA-binding transcriptional regulatory protein within the RNA polymerase complex of E. coli and other prokaryotes, including the Archaea. It influences whether transcription terminates at, or reads through, termination control sites in operons.

Extracting NusA Sequences:

Aligning NusA sequences:

Questions

Controlling the Alignment:

Questions


Exercise 2. Extracting and aligning EF-Tu sequences

The elongation factors EF-Tu in bacteria and EF-1alpha in eukaryotes are conserved orthologous families. They are important in molecular evolutionary analysis and have been used to draw trees that include the three kingdoms of life.

Extracting EF-Tu Sequences:

Aligning the EF-Tu sequences:

Questions

Controlling the Alignment:

Questions


Exercise 3. Making and displaying trees from the EF-Tu sequence alignment

Clustal X can calculate a tree using the Neighbour-Joining algorithm and run simple tests on the reliability of the tree branching order. However, it cannot display trees, so a separate tree display package must be used. On a Mac, you could use TreeView. On Sun UNIX only, you can use TreeTool, which is quite a nice program. Today we will use NJplot, a simple but useful tree display program.
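For those curious about what Neighbour-Joining does with a distance matrix, the following Python sketch implements the textbook form of the algorithm and writes the result as a Newick string, a plain-text tree format that tree display programs commonly read. It is a teaching sketch, not the Clustal X implementation: the function name, the 5-sequence distance matrix and the three-decimal branch-length formatting are assumptions made for illustration, and it does no bootstrap testing.

```python
from itertools import combinations


def neighbour_joining(names, matrix):
    """Build an unrooted tree by Neighbour-Joining and return it as Newick text.

    names  : list of unique sequence labels
    matrix : symmetric list-of-lists of pairwise distances, zero on the diagonal
    """
    if len(names) < 3:
        raise ValueError("need at least three sequences")

    nodes = list(names)
    # Distances keyed by (label, label); joined nodes use their Newick text as label.
    d = {(a, b): matrix[i][j]
         for i, a in enumerate(names) for j, b in enumerate(names)}

    while len(nodes) > 3:
        n = len(nodes)
        # Net divergence of each node from all the others.
        r = {a: sum(d[a, b] for b in nodes) for a in nodes}
        # Join the pair that minimises the Neighbour-Joining criterion.
        a, b = min(combinations(nodes, 2),
                   key=lambda p: (n - 2) * d[p] - r[p[0]] - r[p[1]])
        len_a = d[a, b] / 2 + (r[a] - r[b]) / (2 * (n - 2))
        len_b = d[a, b] - len_a
        u = f"({a}:{len_a:.3f},{b}:{len_b:.3f})"

        rest = [c for c in nodes if c not in (a, b)]
        for c in rest:
            # Distance from the new internal node to every remaining node.
            d[u, c] = d[c, u] = (d[a, c] + d[b, c] - d[a, b]) / 2
        d[u, u] = 0.0
        nodes = rest + [u]

    # Three nodes remain: join them at a central trifurcation.
    a, b, c = nodes
    len_a = (d[a, b] + d[a, c] - d[b, c]) / 2
    len_b = (d[a, b] + d[b, c] - d[a, c]) / 2
    len_c = (d[a, c] + d[b, c] - d[a, b]) / 2
    return f"({a}:{len_a:.3f},{b}:{len_b:.3f},{c}:{len_c:.3f});"


# Hypothetical additive distance matrix for five sequences a to e.
example_names = ["a", "b", "c", "d", "e"]
example_matrix = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]

print(neighbour_joining(example_names, example_matrix))
```

Note that the tree is only as good as the distances, and the distances are only as good as the alignment they were computed from.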

Calculating a tree with Clustal X:

Displaying the tree with NJplot:

Questions


Take Home Lessons

We undertook two alignment exercises that turned out to be "naive", in the sense that it would be incorrect to use those alignments for any further tasks, such as making trees. We would need to go back and redo the alignments without the problem areas. Otherwise, the aligned sequences would not be fully related: in one case because of a lack of co-linearity, in the other because of frameshift errors made when the sequences were determined. Were we to persist in drawing trees, we might think we had found some unusual evolutionary relationships, but unfortunately they would be rubbish. Sequence alignments must always be carefully checked, and revised as necessary, before any trees are made. Uncertain regions of sequence alignments must be excluded from phylogenetic analysis.
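As a small illustration of excluding regions before tree building, the Python sketch below removes every alignment column that contains a gap in any sequence. This assumes that gapped columns are a reasonable first proxy for uncertain regions, which is a simplification: frameshifted or non-colinear stretches can be ungapped and still need to be removed by hand. The function name and the toy sequences are invented for illustration.

```python
def drop_gapped_columns(alignment, gap_chars="-."):
    """Return a copy of the alignment without any column that contains a gap.

    alignment : dict mapping sequence name to aligned sequence (equal lengths)
    gap_chars : characters treated as gaps
    """
    seqs = list(alignment.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")

    # Keep only the columns in which no sequence has a gap character.
    keep = [i for i in range(length)
            if all(s[i] not in gap_chars for s in seqs)]
    return {name: "".join(seq[i] for i in keep)
            for name, seq in alignment.items()}


# Hypothetical toy alignment with gaps in the third and fifth columns.
toy = {
    "seq1": "MK-VLA",
    "seq2": "MKQVLA",
    "seq3": "MR-V-G",
}

for name, seq in drop_gapped_columns(toy).items():
    print(name, seq)
```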

