Bioinformatician
Location: EMBL-EBI, Hinxton near Cambridge, UK
Staff Category: Staff Member
Contract Duration: 2 years
Grading: 5 (monthly salary starting at £2,738 after tax)
Closing Date: 6 August 2020
Reference Number: EBI01642
The MGnify resource facilitates the analysis and archiving of microbiome-derived sequence data (metabarcoding, metagenomic and metatranscriptomic) submitted to the European Nucleotide Archive (ENA). This includes publicly available data, as well as privately submitted datasets, covering a wide range of environments and biomes.

MGnify provides taxonomic and functional analysis, as well as assembly of shotgun metagenomic and metatranscriptomic datasets, as a service. The MGnify resource is produced by the Microbiome Informatics Team, led by Rob Finn, and provides access to one of the largest collections of publicly available analysed microbiome data.

As part of this analysis service role, the MGnify team also undertakes collaborations with the wider research community on specific projects that enable the showcasing of MGnify analyses, the development of new methodological approaches to enhance the existing set of analyses, and/or the integration of different data types.   

There are currently four 2-year positions within the MGnify team to take on such activities. More specifically, these focus on the analysis of human, animal and marine microbiomes, with the aim of incorporating metatranscriptomics and metaproteomics data, as well as mining the microbiome proteins with unknown function, the so-called protein “dark matter”. These new functions may be investigated by project partners for biotechnological applications, or for their associations with disease states.  

Your role

All of the projects involve the development and/or execution of analysis pipelines for metagenomic, metatranscriptomic and metaproteomic datasets. The data analysis may also require the evaluation of tools, basic statistical analyses and/or aggregations, and interpretation of multiple threads of analyses to gain insights into function. Where appropriate, new developments will be propagated to the MGnify production pipelines, which are expressed in workflow description languages (CWL and Nextflow), allowing the combination of third-party tools and in-house software (primarily written in Python).

Projects include the development and application of computational approaches for metagenomic, metatranscriptomic and metaproteomic data integration, with a view to elucidating function in the considerable portion of microbe-borne proteins that currently have no known function. A particular focus will be on the analysis of existing multi-omic datasets, for example those relating to chronic diseases such as inflammatory bowel disease, type 1 diabetes, and Parkinson’s disease. Another project aims to build both genomic and gene catalogs for animal microbiomes associated with food production. In particular, this will involve the analysis of large, longitudinal datasets provided by project partners. Another project specifically aims to analyse marine metagenomics datasets (metagenomics assembly and functional annotation) for novel examples of enzymes that are important for industrial biotechnology applications. Finally, another project will investigate the repertoire of bacteriocins encoded within microbial populations. Candidates are welcome to highlight projects that particularly appeal or match their skill sets.

The primary direction for the developments and data analyses will come from both the team leader Rob Finn, and the line manager Lorna Richardson. The successful candidate will be responsible for the smooth and timely development, testing and implementation of the tools and pipelines required. They will also be responsible for the ongoing maintenance of these workflows, and the throughput of analysis.

Further requirements may come from other MGnify team and external project members. There will be interaction with other teams at EMBL-EBI, such as those that look after the ENA and PRIDE databases, as well as those that are part of the technical services clusters (infrastructure). The postholder will be required to interact with various stakeholders both at EMBL-EBI and those associated more broadly with the project to ensure the smooth flow of data, and to report results as required.

You have

A Master's-level or equivalent qualification/experience in a computational, biological or related scientific discipline, as well as experience running analysis pipelines and contributing to their development. You will have strong scripting skills in Python (although other languages would be considered for the right candidate), be comfortable with software version control (e.g. GitHub), and have experience with relational database querying, basic schema design and optimization (e.g. MySQL, PostgreSQL).

Knowledge of metagenomics, metatranscriptomics and/or metaproteomics data is essential, as is an understanding of sequence analysis for function annotation. Evidence of prior work on similar multi-omics/data mining research activities would be a distinct advantage.

You will need to be able to work independently, as well as interact with the rest of the MGnify team. You will be required to interact with multiple stakeholders in the project, who may be internal and external staff, so good communication skills and attention to detail are essential, as is the ability to work to deadlines.

You might also have

Experience in multi-omics data analysis would be highly desirable. It would also be desirable if you have experience with workflow languages (e.g. CWL, Nextflow), the use of compute cluster job schedulers (e.g. LSF, Oracle Grid Engine), and an understanding of the software development cycle within a production setting. Experience with using cloud technologies would also be advantageous, but not essential.

Why join us

At EMBL-EBI, we help scientists realise the potential of ‘big data’ in biology by enabling them to exploit complex information to make discoveries that benefit mankind. Working for EMBL-EBI gives you an opportunity to apply your skills and energy for the greater good. As part of the European Molecular Biology Laboratory (EMBL), we are a non-profit, intergovernmental organisation funded by over 27 member states and two associate member states. We are located on the Wellcome Genome Campus near Cambridge in the UK, and our 850 staff are engineers, technicians, scientists and other professionals from all over the world. EMBL is an inclusive, equal opportunity employer offering attractive conditions and benefits appropriate to an international research organisation. The remuneration package comprises a competitive salary, a comprehensive pension scheme and health insurance, educational and other family related benefits where applicable, as well as financial support for relocation and installation. For more information about pay and benefits click here

We have an informal culture, an international working environment and excellent professional development opportunities, but one of the really amazing things about us is the concentration of technical and scientific expertise – something you probably won’t find anywhere else. If you’ve ever visited the campus you’ll have experienced first-hand our friendly, collegial and supportive atmosphere, set in the beautiful Cambridgeshire countryside. Our staff also enjoy excellent sports facilities including a gym, a free shuttle bus, an on-site nursery, cafés and a restaurant, and a library.

What else you need to know

To view a copy of the full job description please click here

To apply please submit a covering letter and CV through our online system. Applications are welcome from all nationalities and this will continue after Brexit. For more information please see our website. Visa information will be discussed in more depth with applicants selected for interview. EMBL-EBI is committed to achieving gender balance and strongly encourages applications from women, who are currently under-represented at all levels. Appointment will be based on merit alone. This position is limited to the project duration specified. Applications will close at 23:00 GMT on the date listed above.