Discovering Bioinformatics: a protein in the World Wide Web


Sami Khuri, Natascha Khuri, Alexander Picker, Aidan Budd, Sophie Chabanis-Davidson and Julia Willingale-Theune

16 years and up Computer activity Creative commons


In this activity, students search for information about a protein using databases of biological information on the World Wide Web. These databases collect and store information about genes and proteins (sequence, structure, expression), human inherited diseases for which the genetic cause is known, scientific literature, etc. Many databases that are accessible via the World Wide Web offer so-called 'Query Interfaces': special web pages on which you can enter and combine search terms and restrict them to particular sections or fields of the database. In a text search, you enter a search term (the name of a protein, a disease or a cell type), which is then compared to the textual content of the database. You can also compare the sequence of a protein or gene to the collection of known, annotated sequences stored in a protein or gene database. In other words, you can search these databases to find out what is already known about your favourite protein. As we will see, the main biological databases are interconnected through so-called cross-references: they link to one another, allowing the user to reach different types of information from the result of a single 'Query'.
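To make the idea of a field-restricted text search and of cross-references more concrete, here is a minimal sketch in Python. It does not talk to any real database: the records, identifiers (TOY-P1, TOY-D1, ...), field names and the functions text_search and follow_xrefs are all invented for illustration, and a real query interface will behave differently in its details.

```python
# Toy, in-memory stand-ins for two biological databases.
# Every identifier and entry below is an invented placeholder, not real data.
PROTEINS = [
    {"id": "TOY-P1", "name": "Pax6", "organism": "zebrafish",
     "note": "involved in eye development", "xrefs": {"disease": []}},
    {"id": "TOY-P2", "name": "Pax6", "organism": "human",
     "note": "orthologue of zebrafish Pax6", "xrefs": {"disease": ["TOY-D1"]}},
    {"id": "TOY-P3", "name": "Example protein", "organism": "zebrafish",
     "note": "placeholder entry", "xrefs": {}},
]
DISEASES = {
    "TOY-D1": {"id": "TOY-D1", "title": "placeholder disease entry"},
}


def text_search(records, terms, fields=None):
    """Return records in which every term occurs in the chosen fields.

    If `fields` is None, all text fields of a record are searched.
    Matching is a simple case-insensitive substring comparison.
    """
    hits = []
    for record in records:
        searchable = fields or [k for k, v in record.items() if isinstance(v, str)]
        text = " ".join(str(record.get(f, "")) for f in searchable).lower()
        if all(term.lower() in text for term in terms):
            hits.append(record)
    return hits


def follow_xrefs(record, databases):
    """Look up the entries that a record cross-references in other databases."""
    linked = []
    for db_name, ref_ids in record.get("xrefs", {}).items():
        for ref_id in ref_ids:
            linked.append(databases[db_name][ref_id])
    return linked


# Unrestricted search: "zebrafish" also matches the note of the human entry.
everywhere = text_search(PROTEINS, ["zebrafish"])
# Restricting the term to the "organism" field narrows the result.
by_organism = text_search(PROTEINS, ["zebrafish"], fields=["organism"])
# Combining two terms: both must be found in the selected fields.
combined = text_search(PROTEINS, ["pax6", "human"], fields=["name", "organism"])
# One hit leads on to another database through its cross-references.
linked = follow_xrefs(combined[0], {"disease": DISEASES})
```

The difference between the unrestricted and the field-restricted search is the point of the sketch: the same term returns fewer, more relevant hits once it is tied to a field, and a single hit can then lead to entries in a second database.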

We are going to look at the Pax6 protein from zebrafish, which is involved in eye development. By 'following' this protein on the World Wide Web, we can find the human protein corresponding to the zebrafish Pax6 (its orthologue), along with information about its function, structure and sub-cellular localization, and the molecular basis of diseases linked to mutations in its sequence.

The standard conventions to denote genes and their products (proteins) are as follows:
PAX6 = human gene
pax6 = gene in any other species
Pax6 = protein
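
The three conventions above differ only in capitalisation, which can be sketched as a small Python helper. The function name format_symbol and the labels "human gene", "gene" and "protein" are hypothetical choices made for this example only.

```python
def format_symbol(symbol, kind):
    """Write a gene or protein symbol using the convention listed above.

    kind is one of:
      "human gene" -> all capitals           (PAX6)
      "gene"       -> all lower case         (pax6, any other species)
      "protein"    -> first letter capital   (Pax6)
    """
    if kind == "human gene":
        return symbol.upper()
    if kind == "gene":
        return symbol.lower()
    if kind == "protein":
        return symbol[:1].upper() + symbol[1:].lower()
    raise ValueError(f"unknown kind: {kind!r}")


human_gene = format_symbol("pax6", "human gene")
other_gene = format_symbol("PAX6", "gene")
protein = format_symbol("pax6", "protein")
```

Because many text searches ignore case, all three spellings may return the same hits; the convention matters mainly when reading and writing about the results.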


Bioinformatics PAX6 Module

GCSE Syllabus

Advanced Subsidiary GCE and Advanced GCE specifications for Human Biology

  • This activity can be used to challenge students' key skills, especially IT (level 3).
  • Plan and use different sources to search for, and select, information required for two different purposes.
  • Explore, develop and exchange information and derive new information to meet two different purposes.
  • Present information from different sources for two different purposes and audiences.
