We are looking for a Genomics Bioinformatician to join the Genomics team.
About Us
Basecamp Research is a market leader in mapping biodiversity for AI-based design of biological systems. With the world’s largest biological database built from biodata collected from every corner of the world, often in unexplored locations and biomes, we are able to create a foundational dataset tailored for AI. This leads to tremendous opportunities both to expand basic understanding of biology and to tackle some of the toughest challenges in the life sciences.
Basecamp Research is backed by leaders across technology, life sciences and biopharma and collaborates with global organisations such as Nvidia, Johnson Matthey, and world-leading academic labs such as the David Liu Lab of the Broad Institute. We are also listed among Bloomberg's top 25 UK startups to watch in 2024.
The Role
The successful candidate will, in the first instance, take ownership of, expand and manage our sequencing and metagenomic data analysis production operations.
* They will have the opportunity to investigate new methods to maximize the curation and annotation of the microbial dark matter and our unique sequencing datasets.
* This will be a strongly collaborative role, working closely with all teams at every data collection and analysis point. This will include, but is not limited to, our biodiversity partners, field scientists, sequencing ops, ML scientists, data engineers, and commercial stakeholders.
Responsibilities
Develop and run software to support the genome collection and sequencing operations, the post-curation and labelling of data, and the overall goals of the Genomics team. Our sequencing data stack includes second- and third-generation technologies.
Responsibilities may include, in coordination with other team members:
* Taking ownership of building, improving and managing the in-house genomic assembly and annotation pipeline. This will entail:
o Manage and audit the data workflow of our samples from collection to data warehousing. This includes the quality control of appropriate files and datasets.
o Collaborate with the Data Engineering team in building and managing the pipeline in our in-house designed infrastructure platform.
o Investigate, benchmark and integrate novel analyses into the pipeline.
o Write and document high-quality code and methodology of processes.
* Methods development to leverage the in-house sequencing datasets to create full high-quality genomes for context analysis.
* Contribute to problem-solving discussions within and across teams to generate ideas that will benefit all aspects of the organisation.
* Opportunity to lead from the front when it comes to bringing new ideas and approaches to the table.
Required Skills and Experiences
* A graduate (MSc/PhD) degree in the life sciences, computer science or similar.
* At least three years of post-bachelor's experience in the life sciences and genomics, preferably in industry or a high-throughput research institute.
* Have directly worked in an environment that involved processing hundreds to thousands of samples. This goes beyond downloading from public databases: it means handling data at the raw level and being able to track each datapoint.
o If microbial or metagenomic, have demonstrated managing samples in the upper hundreds.
o If human or clinical resequencing, have demonstrated managing samples in the thousands.
* Have experience writing complex pipelines using workflow languages and tools for genomic and protein analysis, and have demonstrated the ability to benchmark and choose the right tool for each step of a pipeline.
o Dagster, Nextflow, Snakemake, Step Functions, or other.
* Have worked with environmental metagenomic or microbiome datasets. This means experience beyond a single organism, parasite or cultured bacterium.
* Have extensive experience working with second- and third-generation sequencing datasets for downstream analysis.
o Experience with methods development with sequencing read datasets.
* Proficient at using Unix-based operating systems, libraries, and tools.
* Knowledge and experience of bioinformatics tools used in genomics, metagenomics and/or protein biology.
* Proficiency with a scripting language (Python, Bash, etc.).
* Excellent analytical and problem-solving skills.
* Excellent communication skills and ability to work closely with interdisciplinary teams.
* Fluency in English.
Advantageous Skills and Experiences
* Have developed analyses or methods for population sequencing data, or for analysing variation within metagenomic sequencing data.
* Have worked with novel deep learning methodology in analysing genomic data.
* Have worked on developing novel computational methods or analyses for the annotation of phage, viral and/or archaeal genomes.
* Experience in a techbio/biotech company focused on product development (therapeutics, protein/drug discovery, CRO, etc.).
* Experience with building databases (relational and/or non-relational).
* Experience using an HPC environment.
* Experience with containerization (Docker, Singularity).
* Experience with Git and GitLab or GitHub.
* Experience with Agile software development.