Salary for this Role: From £47,000 with benefits, subject to skills and experience
Job Title: HPC & Research Data Systems Engineer
Reports to: Wei Xing
Closing Date: 03/Dec/2024 23.59 GMT

Job Description:

HPC & Research Data Systems Engineer
Reports to: Research Computing Platforms Manager
This is a full-time, permanent position on Crick Terms and Conditions of Employment.

SUMMARY

The Crick’s mission is discovery without boundaries; we don’t limit the direction our research takes. We want to understand more about how living things work to help improve treatment, diagnosis, and prevention of human disease, and generate economic opportunities for the UK. Much of our research is both data- and compute-intensive and relies on advanced Scientific Computing systems, services, and skills.

As an HPC Systems Engineer you will provide user support and assist in the design, implementation, development, and service delivery of the institute’s HPC, Cloud, and Research Data Storage and Management hardware and software through a mix of on-premises systems and cloud services.

This role is part of the HPC team within ITO infrastructure and cloud platform, which supports the Crick’s research community and works closely with Research Labs and other STPs. The Crick has powerful CPU and GPU HPC clusters and a 16 Petabyte Spectrum Scale high performance storage system. This is a fantastic opportunity to apply your skills and experience in a stimulating environment to make a real difference.

KEY RESPONSIBILITIES

These include but are not limited to:

User Support and Training:
Help researchers make effective use of HPC and data storage systems by responding to support queries and providing advice, training and documentation.
Deploy Linux and Windows scientific applications on HPC platforms.
Provide HPC training for non-computing research scientists.

Systems Administration:
Monitor health, security and performance of systems software/hardware and scientific applications, working actively with vendors and members of the enterprise IT team to troubleshoot and quickly restore services when required.
Assist in the management of scheduler policies, access permissions, quotas, directory structures, and distribution of data across storage systems and tiers (accessed from Windows, Mac and Linux clients).
Automate operational tasks and perform changes to systems software/hardware required to improve management and service delivery.
Produce documentation for internal systems and support processes.
Create reports for presentation to management, governance groups and other key stakeholders regarding key research computing services managed by the Crick.

Systems Engineering:
Assist in the deployment of proof-of-concept systems and services to meet evolving scientific requirements.
Assist in the specification, selection and implementation of new research storage and HPC systems.
Work with researchers to integrate scientific instruments and software with research data analysis and management platforms.
Develop in-depth knowledge and skills to deliver new technologies for research, as well as data management and processing techniques and best practice.

KEY EXPERIENCE AND COMPETENCIES

The post holder should embody and demonstrate our core Crick values: Bold, Imaginative, Open, Dynamic and Collegial, in addition to the following:

Essential:
A degree in a computing/science/engineering subject with a significant computational component, or equivalent skills and experience.
Excellent Unix/Linux systems administration skills.
Experience in using/managing HPC systems and scheduler policies (e.g. SLURM, SGE).
Experience in using/operating IBM Spectrum Scale (GPFS) storage systems.
Excellent interpersonal and communication skills, and demonstrable ability to work collaboratively and flexibly as part of a technical team.
High attention to detail and accuracy, with the ability to analyse and interpret complex data and use it to solve complex technical problems quickly and effectively.
Enthusiasm to learn new skills and stay up to date on the most recent technologies.

Desirable (one or more will be advantageous):
Experience in using OS deployment, configuration management and continuous integration tools (e.g. xCAT, Ansible, Terraform, Git, GitHub, Jenkins).
Experience in development and/or deployment of scientific research software (e.g. Conda, EasyBuild, Spack, Singularity, Shifter).
Experience in management of InfiniBand networks (e.g. Mellanox).
Demonstrable experience in Unix/Linux systems integration & DevOps skills, including:
  Scripting and automation using at least one high-level language (e.g. Python, Perl).
  Networking: Ethernet, TCP/IP, ideally InfiniBand (e.g. Mellanox).
  Monitoring/logging tools (e.g. Icinga, Grafana, Splunk, ELK Stack).
Experience in using or managing public/private cloud computing and storage resources (e.g. AWS, Microsoft Azure, Google Cloud Platform).
Master's degree in a computational research field, or equivalent professional experience in a Research & Development work environment, preferably related to biomedical research.

Find out what benefits the Crick has to offer:
For more information on our great pay and benefits package, see: https://www.crick.ac.uk/careers-and-study/life-at-the-crick/pay-and-benefits

Equality, Diversity & Inclusion:
We welcome applications from all backgrounds. We are committed to providing equal employment opportunities, regardless of ethnicity, nationality, gender, sexual orientation, gender identity, religion, pregnancy, age, disability, or civil partnership, marital or family status.
We particularly welcome applications from people who are Minority Ethnic as they are currently underrepresented in the Crick at this level.

Diversity is essential to excellence in scientific endeavour. It increases breadth and perspective, leading to more innovation and creativity. We want the Crick to be a place where everyone feels valued and where diversity is celebrated and seen as part of the foundation for our Institute’s success.

The Crick is committed to creating equality of opportunity and promoting diversity and inclusivity. We all share in the responsibility to actively promote dignity, respect, inclusivity and equal treatment and it is our aim to ensure that these principles are reflected and implemented in all strategies, policies and practices.

Read more on our website: https://www.crick.ac.uk/careers-and-study/life-at-the-crick/equality-diversity-and-inclusion