Location: EMBL-EBI Hinxton near Cambridge, UK
Contract Duration: Until 30.09.2017
Closing Date: 8 June 2017
We are seeking an intern to work for 4-6 months at EMBL-EBI on two bioinformatics projects involving the discovery and annotation of genetic variation in human disease-associated stem cells and in inbred laboratory mouse models of human disease. Differences in DNA sequences (e.g. SNPs, indels) have regularly been implicated in disease susceptibility, resistance and altered biological function. Elucidating the genetic differences among populations of related samples is an important initial step towards improving our understanding of complex traits, including many diseases.
The inbred laboratory mouse is the premier model organism for investigating mammalian biology and human disease. Whole genome sequencing of over 40 inbred laboratory mouse strains will be used to identify DNA sequence differences relative to the mouse reference genome. These strains represent a diverse array of models of human disease, and are particularly important for understanding the genetic differences underlying susceptibility and resistance to the wide range of human diseases that they are used to study. The Mouse Genomes Project is a collaboration between the Wellcome Trust Sanger Institute and EMBL-EBI (http://www.sanger.ac.uk/science/data/mouse-genomes-project). Several high-profile publications have been produced by the project to date (1,2), and in this project you will prepare the next release of genetic variants for the project.
Disease-associated cell lines are used to investigate the genetic mechanisms underlying altered biological functions and processes. Eighty disease-associated human induced pluripotent stem cell lines have undergone whole genome sequencing (WGS) as part of the European Bank for induced pluripotent Stem Cells (EBiSC) project. Predicted DNA sequence variation among these samples will form the initial release of a publicly available variation catalogue.
(1) Keane et al. (2011) Mouse genomic variation and its effect on phenotypes and gene regulation, Nature 477 (7364), 289-294.
(2) Yalcin et al. (2011) Sequence-based characterization of structural variation in the mouse genome, Nature 477 (7364), 326-329.
The successful candidate will work primarily on the production of variation catalogues for the Mouse Genomes Project (MGP) and the EBiSC project. Work on the MGP may also lead to opportunities to attend scientific meetings and present the results. In particular, the successful candidate will:
- Implement methods for processing NGS data and work with variant calling tools;
- Work with large data sets of whole genome sequencing;
- Optimise and implement variant calling pipelines;
- Perform variant functional annotation using publicly available tools;
- Submit sequencing and variation data to the appropriate public repository.
Qualifications and Experience
- Experience of using the Linux command-line;
- Experience of programming in a scripting language (e.g. Python or Perl);
- Excellent time management;
- An interest in understanding biology and disease.
- Experience of high-performance cluster computing environments (e.g. LSF, Amazon EC2, PBS or OpenLava);
- Experience of next-generation sequencing analysis.
We have an informal culture, an international working environment and excellent professional development opportunities, but what really sets us apart is the concentration of technical and scientific expertise – something you probably won’t find anywhere else.
If you’ve ever visited the campus you’ll have experienced first-hand our friendly, collegial and supportive atmosphere, set in the beautiful Cambridgeshire countryside. Our staff also enjoy excellent sports facilities including a gym, as well as a free shuttle bus, an on-site nursery, cafés, a restaurant and a library.
To apply please submit a covering letter and CV, with two referees, through our online system.
Applications are welcome from all nationalities - visa information will be discussed in more depth with applicants selected for interview.
This position is limited to the project duration specified.
Applications will close at 23:00 GMT on the date listed above.