Our client, a multinational semiconductor and software design company, seeks a Linux Platform Engineer for an initial 6-month contract starting in January, based in Cambridge (2 days per week), Inside IR35.
Job Overview
The Linux Platform Engineer will be responsible for follow-the-sun global operational support and management of the Linux Platforms hardware, OS, and related services. Working closely with core teams, project managers, and end users across the Infrastructure, Engineering & Applications (IE&A) domains, you will be a key contributor in maintaining HPC hardware and the Linux OS and in coordinating break-fix work, upgrades, and patches. You will focus on the detailed implementation, improvement, and lifecycle management of Red Hat Linux-based systems, services, and vendor tools to ensure that efficient, consistent, and continuous support is provided within SLAs. You’ll need to be a positive, enthusiastic team player and a self-starter, with a flexible attitude to applying different techniques to help drive successful outcomes.
Responsibilities
1. Look after Linux servers, OS, hardware, and services to maintain and support a large-scale HPC cluster environment.
2. Improve the management of our hardware by maintaining and updating OME tools for BIOS/firmware updates and server management, including lifecycle management.
3. Maintain the availability of HPC hardware and OS through proactive monitoring to improve reliability and efficiency.
4. Swiftly coordinate and address hardware break-fix requests from the ServiceNow queue, while also handling additional system administration tasks.
5. Build clear and concise documentation of hardware & server infrastructure.
6. Be on call and available to support service improvements at weekends as scheduled.
Required Skills and Experience
1. Advanced Red Hat Linux administration skills in performance tuning, server deployment, the boot process, PXE boot, IPMI tools, out-of-band management, the Redfish API, and vendor toolkits.
2. Hands-on experience with OS deployment, server configuration and tuning, and hardware deployment and configuration, including resolving any issues that arise during server deployment.
3. Experience with rack power calculation & allocation and network & power patching.
4. Hands-on experience with vendor OEM operation tools like Dell OME, HP OneView and NLYTE.
5. Solid understanding of, and experience with, monitoring and alerting tools such as PRTG and Nagios, and other SNMP-based incident management.
6. Hands-on experience with scripting languages and automating repetitive tasks using CSH, Bash, Python, or any other programming language.
7. Exposure to infrastructure management tools such as Foreman, Puppet, Ansible, Git, vCenter, ESXi, HAProxy, and Bluecat DNS.
8. Excellent communication skills and the ability/willingness to learn new technologies.
#4639229 - Karl Randall