Multiple Sequence Alignment How-Tos
Collecting sequences by keyword search
NCBI Entrez Keyword Query
Go to the NCBI Entrez website.
Switch the search to "Proteins" (assuming you are searching for
protein sequences).
Enter your query into the "for" box and run it with the "Go" button.
This will identify those records within the Entrez protein sequence
databases that match your query.
For help building appropriate queries, see the following pages
provided by NCBI:
Entrez Help
Entrez sequence database searching tutorial
Entrez Search Field Descriptions and Qualifiers
To alter the way the records are displayed:
View the sequences in FASTA format by switching "Display" to "fasta".
Other formats (e.g. "genpept") can be used if you want more
information about the sequences.
To save the results:
Save the results to a file by switching "send to" to "file".
Store the file somewhere you will find it again, and give it an
appropriate name.
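The same keyword search can also be scripted. The following is a minimal sketch, assuming the public NCBI E-utilities service (ESearch and EFetch) and a local curl. The function names extract_ids and fetch_proteins are hypothetical helpers introduced here for illustration; they are not part of any NCBI tool.

```shell
# Base URL of the NCBI E-utilities (assumption: the public service is reachable).
EUTILS="https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

# Read ESearch XML on stdin; print the record IDs as one comma-separated line.
extract_ids() {
    grep -o '<Id>[0-9]*</Id>' | tr -cd '0-9\n' | paste -sd, -
}

# fetch_proteins QUERY OUTFILE [RETMAX]
# Search the protein database for QUERY and save the hits to OUTFILE in FASTA format.
fetch_proteins() {
    ids=$(curl -s -G "$EUTILS/esearch.fcgi" \
            --data-urlencode "db=protein" \
            --data-urlencode "term=$1" \
            --data-urlencode "retmax=${3:-20}" | extract_ids)
    if [ -z "$ids" ]; then
        echo "no records matched: $1" >&2
        return 1
    fi
    curl -s -G "$EUTILS/efetch.fcgi" \
        --data-urlencode "db=protein" \
        --data-urlencode "id=$ids" \
        --data-urlencode "rettype=fasta" \
        --data-urlencode "retmode=text" > "$2"
}

# Example call (not run here; the query is only an illustration):
# fetch_proteins "your query" results.fasta
```

The query string accepts the same syntax as the "for" box on the website, so a query tested interactively can be reused unchanged.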
SRS Keyword Query
Automatic Alignment of Multiple Sequences (command line - default settings)
MUSCLE
./muscle -in in_filename.fasta -out out_filename.fasta
MAFFT
./mafft infile.fasta > outfile.fasta
PROBCONS
./probcons infile.fasta > outfile.fasta
CLUSTALW
./clustalw infile.fasta
(writes the alignment to the file "infile.aln")
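Before running any of the commands above, it can help to confirm that the input file really holds at least two sequences. The following is a minimal sketch; count_seqs and align_checked are hypothetical helper names, and the MUSCLE call simply reuses the command shown above with the same option names.

```shell
# Count the sequences in a FASTA file (one header line starting with '>' each).
count_seqs() {
    grep -c '^>' "$1" || true   # grep exits non-zero on a count of 0; ignore that
}

# align_checked IN OUT
# Refuse to align unless IN is readable and contains at least two sequences.
align_checked() {
    if [ ! -r "$1" ]; then
        echo "$1: cannot read input file" >&2
        return 1
    fi
    n=$(count_seqs "$1")
    if [ "$n" -lt 2 ]; then
        echo "$1: need at least 2 sequences, found $n" >&2
        return 1
    fi
    ./muscle -in "$1" -out "$2"
}

# Example call (not run here):
# align_checked in_filename.fasta out_filename.fasta
```

The same check works in front of the MAFFT, PROBCONS and CLUSTALW commands; only the last line of align_checked would change.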
Automatic Alignment of Multiple Sequences (Webservers)
MAFFT
Open the link to the Japanese MAFFT webserver.
To collect the alignment, use the "Fasta format" link at the top left of
the page.
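Once the alignment has been saved, a quick sanity check is that every sequence in the file has the same length, gaps included. The following is a minimal sketch using awk; check_alignment is a hypothetical helper name, and the file name in the example call is only an illustration.

```shell
# check_alignment FILE
# Print the common alignment length if all sequences in the FASTA file have
# the same length (gap characters count); return non-zero otherwise.
check_alignment() {
    awk '
        # Compare the length of the sequence just read with the first one.
        function finish() {
            if (seen) {
                if (first < 0) first = n
                else if (n != first) bad = 1
            }
        }
        BEGIN { first = -1 }
        /^>/  { finish(); seen = 1; n = 0; next }        # new header line
              { gsub(/[ \t\r]/, ""); n += length($0) }   # sequence line
        END   { finish(); if (!seen || bad) exit 1; print first }
    ' "$1"
}

# Example call (not run here):
# check_alignment alignment.fasta
```

A non-zero exit status usually means the unaligned input was saved by mistake, or that the download was cut short.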
Back to Gibson Team course pages at EMBL.