Practical Course on Sequence Analysis

November 2nd - November 4th, 1999

by Toby Gibson, Chenna Ramu, Christine Gemünd and Jose Castresana


The course consists of four half-day modules that you can take individually or together. A short introductory lecture will be followed by hands-on use of sequence analysis programs with typical sequence examples. The course covers the most common tasks in sequence analysis, including database search and retrieval, multiple sequence alignment and sequence trees. Once you are comfortable with these core tasks, it should not be difficult to work out how to do other analyses in the future.

In the practicals, you will investigate some protein families as an introduction to the sequence analysis tools available at EMBL on UNIX and the WWW. These will include SRS, Blast 2, Bioccelerator, SMART, GCG, Clustal X and tree display programs. Students will be split into two groups and will work in pairs at each X-terminal. Practicals will take place in the computer teaching lab, room V125.

Tuesday 2nd November.

Wednesday 3rd November.

Thursday 4th November.

Wednesday 3rd November. (changed to avoid clash with staff meeting)


Practical 1. Web Tools (SRS, Blast 2, Bioccelerator, SMART)

This practical introduces some of the web servers for sequence databases provided at EMBL. They can be accessed from any computer and are simple to use, so web servers are often the most convenient way to do sequence analysis. You should be aware, however, that they can be unreliable, need constant care from their providers and are not suited to every task (e.g. computationally intensive ones), so you will sometimes have to run programs on local machines as well. Database search tools are well suited to web servers, and we will try out the Blast and Bioccelerator servers at EMBL.


Practical 2. Using the GCG Sequence Analysis Package on UNIX

At EMBL, it is especially important to be able to use the GCG package on UNIX. It offers many aspects of sequence analysis that are not available on, or are unsuited to, WWW servers. We will do exercises involving one, two or many sequences. Once you have mastered the WPI X-windows interface and the SEQLAB multiple sequence editor, you will be able to run many other applications without much difficulty.


Practical 3. Exploring sequence databases with SRS

Like it or not, database retrieval is one of the most important activities in biological research and, in the genome era, is becoming indispensable. SRS, the Sequence Retrieval System, is a powerful program for indexing and retrieving data from sequence databases. SRS originated at EMBL and has more recently been maintained at the EBI (it is now being commercialised by LION AG). Keywords are used to retrieve information from the databases indexed in SRS, but links between databases allow users to get much more information than a keyword alone could retrieve.
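
The idea of keyword retrieval followed by links between databases can be illustrated with a minimal Python sketch. This is not how SRS is implemented and it does not use the SRS query language; the database names, entry IDs, descriptions and function names below are all invented for illustration.

```python
from collections import defaultdict

# Toy records for two hypothetical databases; every ID and field is invented.
PROTEIN_DB = {
    "PROT_A": {"description": "example kinase", "links": {"STRUCT_DB": ["STR_1"]}},
    "PROT_B": {"description": "example phosphatase", "links": {}},
}
STRUCT_DB = {
    "STR_1": {"description": "crystal structure of a catalytic domain", "links": {}},
}
DATABASES = {"PROTEIN_DB": PROTEIN_DB, "STRUCT_DB": STRUCT_DB}


def build_index(db):
    """Map each lower-cased word of a description to the entries containing it."""
    index = defaultdict(set)
    for entry_id, record in db.items():
        for word in record["description"].lower().split():
            index[word].add(entry_id)
    return index


def search(db_name, keyword):
    """Return the sorted entry IDs in one database that match a keyword."""
    index = build_index(DATABASES[db_name])
    return sorted(index.get(keyword.lower(), set()))


def follow_links(db_name, entry_ids, target_db):
    """Return the sorted entries in target_db that the given entries link to."""
    linked = set()
    for entry_id in entry_ids:
        linked.update(DATABASES[db_name][entry_id]["links"].get(target_db, []))
    return sorted(linked)


hits = search("PROTEIN_DB", "kinase")
structures = follow_links("PROTEIN_DB", hits, "STRUCT_DB")
print(hits, structures)
```

In this toy data the keyword "kinase" does not occur in the structure database at all, yet the structure entry is still reached through the link from the protein entry. That is the point the paragraph above makes about links retrieving more than the keyword alone.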


Practical 4. Generating trees from aligned sequences

Molecular evolution has revolutionised phylogenetic analysis (although it can never replace palaeontology, because sequences carry no implicit measure of time). For bench scientists, it can also be useful to produce trees from multiple alignments even when evolution is only indirectly relevant: for example, it is vital to distinguish orthologues from paralogues when extrapolating gene function between organisms.
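
Distance-based tree methods such as neighbour-joining start from a matrix of pairwise distances between the aligned sequences. As a rough sketch of that first step only, the Python below computes the uncorrected proportion of differing positions (the p-distance) for each pair, skipping any column where either sequence has a gap. The sequences, names and function names are hypothetical, and this is not presented as the calculation performed by any of the programs used in the course.

```python
from itertools import combinations

# Hypothetical aligned sequences (equal length, '-' marks a gap).
ALIGNMENT = {
    "seq1": "MKV-LAG",
    "seq2": "MKVALAG",
    "seq3": "MRV-LSG",
}


def p_distance(a, b):
    """Proportion of differing residues over columns without a gap in either sequence."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    mismatches = sum(1 for x, y in pairs if x != y)
    return mismatches / len(pairs)


def distance_matrix(alignment):
    """Return {(name1, name2): p-distance} for every unordered pair of sequences."""
    return {
        (n1, n2): p_distance(alignment[n1], alignment[n2])
        for n1, n2 in combinations(sorted(alignment), 2)
    }


matrix = distance_matrix(ALIGNMENT)
for pair, dist in matrix.items():
    print(pair, round(dist, 3))
```

A tree-building method would then join the closest sequences first. The uncorrected p-distance underestimates the true amount of change for divergent sequences, which is why corrected distances are usually preferred in practice.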


You can find this page at http://www.embl-heidelberg.de/~seqanal/courses/Nov99/Top.html