We are seeking a talented Machine Learning (ML) Scientist to join a groundbreaking initiative to develop Digital Twins for rare diseases. You will work within a multidisciplinary project team across the Open Targets, Molecular Systems and Petsalaki research groups at the EMBL European Bioinformatics Institute (EMBL-EBI). This project is funded through the Chan Zuckerberg Initiative, with a strong emphasis on making datasets and models open source where possible.

Rare diseases collectively impact approximately 300 million individuals worldwide, but their study is hindered by limited patient-level data. To address this challenge, this project aims to develop ‘Digital Twins’ of rare disease patients by combining mechanistic models with GenAI and other machine learning frameworks to integrate patient-level multi-omics and clinical data and provide insights into rare diseases. The models will utilize extensive public datasets of single-cell multi-omics, including transcriptomics from diverse disease conditions, and simulations from mechanistic models. This approach will be applied to the challenge of limited multi-omics data for rare diseases, with the aim of developing rare disease Digital Twins that provide new insights into disease mechanisms and potential treatments.

The role involves designing and implementing ML models that integrate multi-omics data and clinically relevant endpoints, contributing to the creation of virtual patient models to simulate disease trajectories and therapeutic responses. This is a unique opportunity to develop and apply advanced ML methodologies and contribute significantly to understanding rare disease biology, enabling applications such as diagnosis, drug repurposing, and new treatment development.

Your role

The ML Modeller’s primary tasks include developing and applying advanced ML and GenAI frameworks to integrate and analyze multi-omics datasets.
The role involves working collaboratively with biocurators, bioinformaticians, and mechanistic modellers to ensure seamless integration of data into Digital Twin models. Responsibilities include:
Designing, implementing, and optimizing ML models tailored to rare disease datasets.
Applying GenAI approaches to enhance data imputation, integration, and prediction capabilities.
Integrating multi-omics datasets (single-cell and bulk transcriptomics, genomics, and clinical data).
Collaborating with biocurators to curate and preprocess datasets for model training and validation.
Ensuring robust model performance through testing and validation using benchmarking datasets.
Contributing to open-source dissemination of datasets and models, following FAIR principles.
Engaging with global consortia to align models with the needs of the rare disease research community.
Documenting workflows, models, and results for reproducibility and transparency.

You have

A PhD (or equivalent experience) in Computer Science, Computational Biology, Bioinformatics, or a related field.
A proven track record in developing and deploying ML models for large datasets.
Experience with advanced ML, VAE and GenAI frameworks, and with large-scale data modelling.
Proficiency in Python, R, or similar programming languages.
Experience with ML frameworks such as TensorFlow, PyTorch, or Scikit-learn.
Strong knowledge of advanced statistical techniques and modern deep learning methods.
Expertise in pipeline workflow management tools like Nextflow or Snakemake.
Excellent communication skills, both written and verbal, for collaborative teamwork and reporting.
Self-motivation and the ability to work both independently and within multidisciplinary teams.
Enthusiasm for advancing research in disease modelling and patient care.
Demonstrated capacity to prioritize and manage multiple independent projects in a dynamic environment.

You may also have

Experience in developing ML models for biological or clinical datasets.
Hands-on experience with multi-omics data integration and analysis.
Experience publishing in high-impact journals and presenting at international conferences.
Familiarity with single-cell transcriptomics, bulk omics data, and genomics.
Strong knowledge of FAIR principles and open data standards.
Experience with cloud computing platforms and high-performance computing environments.
Strong ability to convey complex ML concepts to non-technical stakeholders.

Contract length: 2 years, fixed-term and grant-limited, to work on the CZI Digital Twin grant.
Salary: Grade 5 or 6 depending on qualifications and experience, with a monthly salary of £3,229 or £3,612 after tax but excluding pension and insurance contributions. Plus generous benefits.

Why join us

Do something meaningful
At EMBL-EBI you can apply your talent and passion to accelerate science and tackle some of humankind's greatest challenges. EMBL-EBI, part of the European Molecular Biology Laboratory, is a worldwide leader in the storage, analysis and dissemination of large biological datasets. We provide the global research community with access to publicly available databases and tools which are crucial for the advancement of healthcare, food security, and biodiversity.

Join a culture of innovation
We are located on the Wellcome Genome Campus, alongside other prominent research and biotech organisations, and surrounded by beautiful Cambridgeshire countryside. This is a highly collaborative and inclusive community where our employees enjoy a relaxed atmosphere. We are committed to ensuring our employees feel valued, supported and empowered to reach their professional potential. Watch this video to see how EMBL-EBI makes an impact.
Enjoy lots of benefits:
Financial incentives: Monthly family, child and non-resident allowances, annual salary review, pension scheme, death benefit, and long-term care, accident-at-work and unemployment insurances
Flexible working arrangements, including hybrid working patterns
Private medical insurance for you and your immediate family (including all prescriptions and generous dental and optical cover)
Generous time off: 30 days of annual leave per year, in addition to public holidays
Relocation package including installation grant (if required)
Campus life: Free shuttle bus to and from work, on-site library, subsidised on-site gym and cafeteria, casual dress code, extensive sports and social club activities (on campus and remotely)
Family benefits: On-site nursery, 10 days of child sick leave, generous parental leave, holiday clubs on campus and monthly family and child allowances
Benefits for non-UK residents: Visa exemption, education grant for private schooling, financial support to travel back to your home country every second year and a monthly non-resident allowance.
For detailed information please visit our employee benefits page here.

What else you need to know

International applicants: We recruit internationally and successful candidates are offered visa exemptions. Please take a look at our International Applicants page for further information.
EMBL is a signatory of DORA. Find out how we apply DORA principles to our recruitment and performance assessment processes here.
Diversity and inclusion: At EMBL, we strongly believe that inclusive and diverse teams benefit from higher levels of innovation and creative thought. We encourage applications from women, LGBTQ and individuals from all nationalities.
How to apply: To apply, please submit a cover letter and a CV through our online system. We aim to provide a response within two weeks after the closing date.
Closing date: 16/03/2025