Using ProtTest


ProtTest is a tool for selecting, from a set of candidate models of amino acid substitution, the model that best describes the evolutionary process that produced a given protein multiple sequence alignment.

It can be seen as the protein-sequence equivalent of ModelTest, which is used for selecting models of nucleotide sequence evolution; indeed, both programs are developed by the same group.

The software can be downloaded from the ProtTest website, along with a comprehensive manual describing both the theory of model selection in this context and how to use the software.

The instructions below give a simple description of how to run an analysis with the software in our practical sessions. Running the analysis differently, for example by testing more models, may well give a more accurate model selection. However, such a more comprehensive use of ProtTest can take many hours or even days to complete, so our description focuses on obtaining results reasonably quickly.

Alignment format and example alignment file

ProtTest compares amino acid substitution models in the context of an alignment. Thus, you will need to upload an alignment to ProtTest to use it in your analysis.

Provide the alignment in PHYLIP format. An alignment in a different format (e.g. CLUSTAL or FASTA) can be converted by loading it into CLUSTALX and saving it in PHYLIP format.
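
If CLUSTALX is not to hand, the conversion can also be scripted. The following is a minimal sketch, not part of ProtTest or CLUSTALX: it assumes an already-aligned FASTA input and writes sequential PHYLIP with names truncated or padded to 10 characters. The function names read_fasta and to_phylip and the two example sequences are invented for illustration, and whether ProtTest accepts this exact layout should be checked against its manual.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (name, sequence) pairs."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            # Keep only the first word of the header as the sequence name.
            fields = line[1:].split()
            name, chunks = (fields[0] if fields else "unnamed"), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def to_phylip(records):
    """Format (name, sequence) pairs as sequential PHYLIP text."""
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError(
            "need at least one sequence, and all sequences must have the same length"
        )
    n_sites = lengths.pop()
    # Header line: number of sequences, then number of alignment columns.
    lines = [f" {len(records)} {n_sites}"]
    for name, seq in records:
        # Names are cut or padded to 10 characters; note that truncation
        # can make two long names identical, which this sketch does not check.
        lines.append(f"{name[:10]:<10} {seq}")
    return "\n".join(lines) + "\n"


# Hypothetical two-sequence alignment, used only to exercise the functions.
example_fasta = """>seq_one
MKT-AYIAKQR
>seq_two
MKTLAYIA-QR
"""

phylip_text = to_phylip(read_fasta(example_fasta))
print(phylip_text)
```

The length check matters because a FASTA file of unaligned sequences would otherwise be written out as if it were an alignment.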

An example PHYLIP format alignment file is provided with the ProtTest distribution.

Specifying settings and running a ProtTest analysis

To run a ProtTest analysis you need to:
  1. upload an alignment using the "Select file" button in the "Alignment" zone of the main GUI window
  2. specify the source of the tree used in the likelihood calculations
  3. select an Optimization strategy
  4. select the set of models to be compared by clicking the "models" button
  5. run the analysis by clicking "Start"

Processing analysis results

When the analysis is started, ProtTest reports on the progress of the analysis in the console window (see image below).

Once the analysis is complete, use the console window to sort the results by each of the criteria offered in the "Sort by" drop-down box. When sorting by AICc or BIC, keep the default option of basing the sample size on the alignment length.
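
To see why AICc and BIC need a sample size while AIC does not, it can help to write the criteria out. The sketch below uses the standard textbook definitions, with lnL the log-likelihood, k the number of free parameters and n the sample size; it is not taken from the ProtTest source code. The value k = 8 is inferred from the WAG+G row of the example output later on this page (it is the value that reproduces the reported AIC), and n = 250 is a purely hypothetical alignment length.

```python
import math


def aic(lnL, k):
    """Akaike Information Criterion: 2k - 2 lnL."""
    return 2 * k - 2 * lnL


def aicc(lnL, k, n):
    """AIC with the small-sample correction, which depends on sample size n."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    return aic(lnL, k) + (2 * k * (k + 1)) / (n - k - 1)


def bic(lnL, k, n):
    """Bayesian Information Criterion: k ln(n) - 2 lnL."""
    return k * math.log(n) - 2 * lnL


lnL = -1510.90  # log-likelihood reported for WAG+G in the example output
k = 8           # assumption: inferred from the reported AIC, not stated by ProtTest
n = 250         # assumption: hypothetical alignment length

print(f"AIC  = {aic(lnL, k):.2f}")
print(f"AICc = {aicc(lnL, k, n):.2f}")
print(f"BIC  = {bic(lnL, k, n):.2f}")
```

Only aicc and bic take n as an argument, which is why the choice of sample size affects the AICc and BIC sorts but not the AIC sort.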

Finally, click on the Save button of the console window to save the results of the analysis (which is just the text shown in the console window) - here is an example of such an analysis file.

For now, simply note that for each of the sorts you carried out, the console text reports the best model. For example, in the excerpt from the example analysis file shown below, the sort based on AIC reports that the best of the tested models was WAG+G.
Akaike Information Criterion (AIC) framework
Best model according to AIC: WAG+G
Model      deltaAIC*  AIC      AICw  -lnL
WAG+G       0.00      3037.80  0.99  -1510.90
WAG         9.73      3047.53  0.01  -1516.77
WAG+G+F    18.65      3056.45  0.00  -1501.23
WAG+F      26.24      3064.04  0.00  -1506.02
JTT+G      28.06      3065.86  0.00  -1524.93
JTT+G+F    45.64      3083.45  0.00  -1514.72
JTT        47.88      3085.69  0.00  -1535.84
JTT+F      62.86      3100.66  0.00  -1524.33
*: models sorted according to this column
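
The deltaAIC and AICw columns can be recomputed from the AIC column alone, which is a useful check when reading such output. The sketch below applies the standard definition of Akaike weights (each model's exp(-delta/2), divided by the sum over all models) to the AIC values in the excerpt above. The function name akaike_weights is invented here, and this is an illustration of the standard formula rather than ProtTest's own code.

```python
import math

# AIC values copied from the example output above.
aic_scores = {
    "WAG+G": 3037.80,
    "WAG": 3047.53,
    "WAG+G+F": 3056.45,
    "WAG+F": 3064.04,
    "JTT+G": 3065.86,
    "JTT+G+F": 3083.45,
    "JTT": 3085.69,
    "JTT+F": 3100.66,
}


def akaike_weights(scores):
    """Return {model: (delta, weight)} for a dict of AIC scores."""
    best = min(scores.values())
    # delta is the difference from the best (lowest) score.
    deltas = {model: score - best for model, score in scores.items()}
    # Relative likelihood of each model given the data.
    relative = {model: math.exp(-0.5 * d) for model, d in deltas.items()}
    total = sum(relative.values())
    return {model: (deltas[model], relative[model] / total) for model in scores}


results = akaike_weights(aic_scores)
# Print the models from best to worst, as in the "Sort by" view.
for model, (delta, weight) in sorted(results.items(), key=lambda item: item[1][0]):
    print(f"{model:<8} {delta:6.2f} {weight:5.2f}")
```

Rounded to two decimal places, the recomputed weights are 0.99 for WAG+G and 0.01 for WAG, as in the excerpt. A weight this close to 1 indicates that, among the models tested, the support is concentrated almost entirely on the best model.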

The image below shows the console window, which has just been sorted using the AIC criterion.

Author: Aidan Budd