Biocomputing Unit
Biocomputing
Sequence Analysis Service
Gibson Group
EMBL

Predoc Practical Course on Sequence Analysis

October 11th-17th, 2000

by Toby Gibson, Chenna Ramu, Christine Gemünd and José Castresana


The course consists of four practicals introducing some of the sequence analysis tools available at EMBL, either on UNIX computers or through WWW interfaces. Sequence analysis tasks will include database search and retrieval, sequence alignment, and the calculation and display of trees. Schedules for the practicals are provided on Web pages accessible by clicking on the links below.

In the practicals, the students will investigate some protein families as an introduction to the sequence analysis tools. The tools used will include SRS, Blast 2, Bioccelerator, GCG, Clustal X and tree display programs. The students will be split into two groups and will be paired up for each X-terminal. Practicals will take place in the computer teaching lab, room V125.

Wednesday 11th October. Group C+D practicals

Thursday 12th October. Group C+D practicals

Monday 16th October. Group A+B practicals

Tuesday 17th October. Group A+B practicals


Practical 1. Web Tools (SRS, Blast2, Bioccelerator, SMART)

This practical introduces some of the web servers provided at EMBL. These can be accessed from any computer and are simple to use. Web servers are often the most convenient way to do sequence analysis, but you should be aware that they can be unreliable, need constant care from their providers and are not suited to every task, so some programs still have to be run on local machines. Database search tools are well suited to web servers, and we will try out the Blast and Bioccelerator servers at EMBL.


Practical 2. Using the GCG Sequence Analysis Package on UNIX

At EMBL, it is especially important to be able to use the GCG package on UNIX. It offers many aspects of sequence analysis that are either not available on WWW servers or not suited to them. We will do exercises involving one, two or many sequences. Once you have mastered the WPI X-windows interface and the SEQLAB multiple sequence editor, you will be able to run many other applications without much difficulty.


Practical 3. Exploring gene sequences with Artemis and Gene2EST

Artemis is a genome sequence display and annotation program from the Sanger Centre. It reads EMBL format sequence files and displays the features listed in the feature table. These features can be edited: Artemis is used at the Sanger Centre to annotate newly sequenced "small" genomes (bacteria and so forth). It is very useful for anyone who has to work with large DNA segments encoding complex genetic information. Gene2EST is a new server developed by our group for querying EST databases with large (gene-sized) queries. In favourable cases (that is, when many ESTs are available) Gene2EST can efficiently reveal the spliced gene structure and show whether there are alternatively spliced products. Artemis is used to display the graphical output of Gene2EST.
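
To give a feel for what a feature table looks like to a program, here is a minimal Python sketch that pulls feature keys, locations and qualifiers out of the "FT" lines of an EMBL format entry. It is only an illustration and not how Artemis itself works. It assumes the usual flat-file layout (feature key starting in column 6, location or qualifier starting in column 22). The names parse_feature_table and ft, and the small example record, are made up for this sketch.

```python
def parse_feature_table(lines):
    """Collect features from the 'FT' lines of an EMBL format entry.

    Assumes the standard layout: key in columns 6-20, location or
    qualifier text from column 22 onwards.
    """
    features = []
    for line in lines:
        if not line.startswith("FT"):
            continue  # skip ID, XX, SQ and all other line types
        key = line[5:20].strip()
        text = line[21:].strip()
        if key:
            # A non-blank key starts a new feature.
            features.append({"key": key, "location": text, "qualifiers": []})
        elif features:
            current = features[-1]
            if text.startswith("/"):
                current["qualifiers"].append(text)
            elif current["qualifiers"]:
                # Continuation of a long qualifier value.
                current["qualifiers"][-1] += " " + text
            else:
                # Continuation of a location split over several lines.
                current["location"] += text
    return features


def ft(key, text):
    """Build one FT line with the key and text in the expected columns."""
    return "FT   " + key.ljust(16) + text


# Invented example record (hypothetical data, for illustration only).
record = [
    "XX",
    ft("source", "1..600"),
    ft("", '/organism="Example organism"'),
    ft("CDS", "join(10..120,"),
    ft("", "200..350)"),
    ft("", '/gene="exampleA"'),
    "XX",
]

features = parse_feature_table(record)
for feature in features:
    print(feature["key"], feature["location"], feature["qualifiers"])
```

A location such as join(10..120,200..350) describes a feature made of two segments, which is how a spliced coding region is written in the feature table.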


Practical 4. Generating trees from aligned sequences

Molecular evolution has revolutionised phylogenetic analysis (although it can never replace paleontology, because sequence data carry no implicit measure of time). For bench scientists, it can also be useful to produce trees from multiple alignments, even when evolution is only indirectly relevant. For example, it is important to distinguish orthologues from paralogues when extrapolating gene function between organisms.
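
The basic idea of a distance tree can be sketched in a few lines of Python. The example below computes the fraction of differing positions between each pair of aligned sequences and then clusters them with UPGMA. UPGMA is chosen here only because it is short to write down; it is a simpler method than the neighbour-joining used by Clustal X and is not the procedure followed in the practical. The function names and the toy alignment are invented for illustration.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of differing positions, ignoring columns with a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def upgma(alignment):
    """Cluster {name: aligned sequence} and return a topology-only Newick string."""
    names = sorted(alignment)
    sizes = {name: 1 for name in names}  # cluster label -> number of leaves
    dist = {(a, b): p_distance(alignment[a], alignment[b])
            for a, b in combinations(names, 2)}

    def lookup(x, y):
        return dist[(x, y)] if (x, y) in dist else dist[(y, x)]

    while len(sizes) > 1:
        # Join the two closest clusters.
        a, b = min(dist, key=dist.get)
        total = sizes[a] + sizes[b]
        # Distance from the new cluster to each other cluster is the
        # size-weighted average of the two old distances.
        updated = {c: (lookup(a, c) * sizes[a] + lookup(b, c) * sizes[b]) / total
                   for c in sizes if c not in (a, b)}
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        merged = f"({a},{b})"
        for c, v in updated.items():
            dist[(merged, c)] = v
        del sizes[a], sizes[b]
        sizes[merged] = total
    return next(iter(sizes)) + ";"


# Toy alignment (invented sequences, for illustration only).
alignment = {
    "human":   "ACGTACGTAC",
    "chimp":   "ACGTACGTAT",
    "mouse":   "ACGAACCTAT",
    "chicken": "TCGAATCTGT",
}

tree = upgma(alignment)
print(tree)  # prints (((chimp,human),mouse),chicken);
```

The output is a tree in Newick format without branch lengths, which tree display programs can read. UPGMA assumes that all lineages evolve at the same rate, which is one reason other methods are usually preferred for real data.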


You can find this page at http://www.embl-heidelberg.de/~seqanal/courses/predoc00/Top.html