About the team/job
This is an exciting opportunity to get involved in a research project building upon innovative AI approaches to define gene x drug exposure x phenotype relationships at scale over continuous time in large human cohorts. We are looking for a highly motivated bioinformatics expert or data scientist to join the project and drive specific aspects of this cutting-edge research, including the analysis of population-scale genetic cohorts with electronic healthcare and prescription records. You will be based in the Birney research and Open Targets groups at EMBL-EBI and work closely with the Gerstung group at DKFZ.
The Birney research group has several ongoing related projects, and the candidate will benefit from extensive knowledge of population-scale modelling of healthcare records, genetics and exposures. You will also benefit from the expertise in pharmacogenetics and drug target discovery within the Open Targets team, and be embedded in the EMBL Human Ecosystems transversal theme community, connecting to others involved in exposome and drug safety research.
Your role
You will work on direct linkage across a diverse set of human traits with detailed healthcare, genetic and additional information in multiple human cohorts, and as such we are looking for an experienced post-doctoral fellow with a strong interest in health-related data science and proven expertise in modern AI techniques.
You will lead on a project funded by the EMBL Human Ecosystems transversal theme programme to investigate gene x drug exposure x phenotype relationships at scale over continuous time in large human cohorts. The successful candidate will join the Birney research team at EMBL-EBI working closely with Open Targets to apply and further develop an AI framework based on generative transformers (Delphi) for multi-disease and multi-drug modelling across continuous time in human populations. Some of the initial work will include incorporating prescription exposome data into Delphi and developing a pharmacogenetic analysis model, correlating genetic variation with exposure to drug outcomes based upon known associations and exploring novel genetic-drug exposure-phenotype associations.
Drug exposure is a strong environmental modifier and is intrinsically linked to disease risk and outcome in human populations. There remain many open questions and undiscovered interactions between drug exposure, genetics and disease onset/outcome that can be investigated using large human cohorts with detailed health records and genetics, providing the candidate with many research opportunities and potential novel findings. One key factor that makes this type of research more powerful is that recent innovations in generative AI make it possible to model all disease and other important factors, such as drug exposure and genetics, at the same time, mapped to the same internal space (embedding) and across continuous time. This not only allows us to assess the overall impact of genetics and exposures on disease risk across a population but also provides a framework for assessing at which point in the life course these effects most strongly manifest.
Key responsibilities
* Integration of prescribing data and genetic data into the Delphi model, initially using UK Biobank data
* Developing benchmarking and validation datasets
* Benchmarking the model against known associations
* Analysis using the Delphi model to investigate gene x drug exposure x phenotype relationships over continuous time
* Exploring the ability to run the model in other human cohorts with genetic and prescribing data
You have
* Advanced degree (MSc, PhD) in computer science, bioinformatics, software development, or a related field
* Strong proficiency in Python and experience with Large Language Model integration
* Proven experience in applying modern ML/LLM frameworks and concepts
* Good understanding of ML principles including embeddings, cross-validation and fine-tuning
* Proficiency in common data preprocessing tasks and normalisation
* Experience with large-scale compute infrastructure, including high-performance computing (HPC) facilities and/or cloud-based workflow managers such as Nextflow
* Exposure to source code version control software such as Git and GitHub
* Experience in independent problem-solving and examples of resolving complex issues
* Fluency in written and spoken English
* Ability to effectively communicate ideas or issues and work with team members from multidisciplinary backgrounds
* Interest in promoting your work and the ways we have solved complex challenges
You might also have
* Experience in MLOps including experiment tracking and model deployment
* Experience with current LLM frameworks, such as LangChain and open-source LLM deployment (e.g., llama-cpp, ggml, Xorbits Inference)
* Knowledge of human genetics, genomics and/or pharmacogenetics - or are interested in learning about these topics
* Experience working with prescribing data, electronic medical records and large-scale cohort biobanks such as UK Biobank
* Enthusiasm for novel research and discovery
Why join us
Do something meaningful
At EMBL-EBI you can apply your talent and passion to accelerate science and tackle some of humankind's greatest challenges. EMBL-EBI, part of the European Molecular Biology Laboratory, is a worldwide leader in the storage, analysis and dissemination of large biological datasets. We provide the global research community with access to publicly available databases and tools which are crucial for the advancement of healthcare, food security, and biodiversity.
Join a culture of innovation
We are located on the Wellcome Genome Campus, alongside other prominent research and biotech organisations, and surrounded by beautiful Cambridgeshire countryside. This is a highly collaborative and inclusive community where our employees enjoy a relaxed atmosphere. We are committed to ensuring our employees feel valued, supported and empowered to reach their professional potential.
Enjoy lots of benefits:
* Financial incentives: Monthly family and child allowances, stipends reviewed yearly, death benefit (optional), long-term care, accident-at-work and unemployment insurances
* Flexible working arrangements
* Private medical insurance for you and your immediate family (including all prescriptions and generous dental & optical cover)
* Generous time off: 30 days annual leave per year, in addition to eight bank holidays
* Relocation package including installation grant (if applicable)
* Campus life: Free shuttle bus to and from work, on-site library, subsidised on-site gym and cafeteria, casual dress code, extensive sports and social club activities (on campus and remotely)
* Family benefits: On-site nursery, 10 days of child sick leave, generous parental leave, holiday clubs on campus and monthly family and child allowances
* Benefits for non-UK residents: Visa exemption.
For more details, please see our employee benefits page.
What else you need to know
* Contract duration: This position is a fixed-term project-limited contract of 2 years.
* International applicants: We recruit internationally and successful candidates are offered visa exemptions. Read more on our page for international applicants.
* Diversity and inclusion: At EMBL-EBI, we strongly believe that inclusive and diverse teams benefit from higher levels of innovation and creative thought. We encourage applications from women, LGBTQ+ individuals and people of all nationalities.
* Job location: This role is based in Hinxton, near Cambridge, UK. You will be required to relocate if you are based overseas and you will receive a relocation package to support you.
* How to apply: To apply please submit a cover letter and a CV through our online system. Please include a link to your GitHub/open source codebase contributions if possible as part of your application.
* DORA: EMBL is a signatory of DORA and is committed to hiring and training outstanding research, service, and administrative personnel.
* Interviews: Anticipated to take place in January 2025. Exact details will be confirmed with successful candidates after the job closes.