Director of Engineering (High Performance Computing Team)
Cambridge, 3 days/week in the office, up to £170,000 per annum + benefits
We are looking for an experienced and innovative Director of Engineering to lead our client's global engineering team. This key leadership role is part of the Engineering IT Leadership team and will be responsible for overseeing several critical technical areas, including High-Performance Computing (HPC), Engineering Platform Access, Engineering Collaboration and Linux Platforms.
You will lead a global team to ensure seamless product development by maintaining and improving the infrastructure that supports engineering teams.
Key Responsibilities:
* High-Performance Computing (HPC): Manage and lead a large-scale HPC environment (handling half a million cores), using LSF (or similar schedulers) to ensure high availability, scalability, and operational efficiency.
* DevOps & Automation: Drive the implementation of DevOps best practices (CI/CD, Terraform, Ansible, GitLab) to automate infrastructure and improve the efficiency of development workflows.
* Engineering Collaboration Tools: Manage and optimize the Atlassian suite (Jira, Confluence) for enhanced engineering collaboration and compliance.
* Linux Platform Leadership: Oversee the Linux Platform team responsible for managing Linux-based infrastructure, especially for HPC servers.
* Virtualization & Kubernetes: Lead virtualization efforts involving VMware and Kubernetes clusters, ensuring efficient orchestration and resource utilization.
* Platform Access & Security: Lead teams handling login servers and user access solutions, ensuring seamless authentication experiences for engineers using OpenText ETX.
Leadership & Strategy:
* Strategic Roadmap: Define and implement a clear roadmap for the Engineering Platform that aligns with business goals and engineering needs.
* Team Leadership: Provide technical leadership, mentorship, and guidance to highly skilled teams, fostering a culture of innovation and continuous improvement.
* Cross-Functional Collaboration: Work closely with key stakeholders from engineering, IT security, and infrastructure teams to drive best practices and ensure excellent service delivery.
* Budget Management: Ensure cost-effective investments in technology while meeting the organization's strategic goals.
Required Skills & Experience:
* Expertise in HPC Environments: Strong experience managing large-scale HPC systems, preferably with LSF or similar schedulers.
* DevOps & Infrastructure as Code (IaC): Proficient in DevOps methodologies, CI/CD pipelines, and tools such as Terraform, Ansible, and GitLab.
* Experience with Cloud Platforms: In-depth knowledge of cloud platforms (AWS, GCP, Azure), with AWS being the primary focus.
* Leadership: Demonstrated ability to lead and inspire large, technically diverse teams (30-40 people) in a fast-paced environment.
* Background in Product Engineering: Experience in software development, especially in Python, and a product ownership mindset.
* Budget and Resource Management: Proven ability to manage budgets and resources effectively.
Preferred Backgrounds:
* Candidates from semiconductor companies, oil and gas, or other sectors with experience in high-performance computing (HPC) environments.
* Experience in large-scale infrastructure management, such as virtualized environments and containerization (Kubernetes).
The culture is collaborative and supportive, with high expectations for delivering results. You will be joining a team that values innovation, efficiency, and seamless collaboration across functions.
The client is looking to pay up to £170,000 per annum + benefits. This is a hybrid working role with a minimum of 3 days in the office per week in Cambridge.
For more information, please send your CV to me at kamni.sharma@lafosse.com