Our client, a multinational semiconductor and software design company, seeks a Linux Platform Engineer for an initial 6-month contract starting in January, based in Cambridge (2 days per week), Inside IR35, up to £650 per day.
Job Overview:
The Linux Platform Engineer will be responsible for follow-the-sun global operational support and management of the Linux platform hardware, OS and related services.
Working closely with core teams, project managers, and end users across Infrastructure, Engineering & Applications (IE&A) domains, you will be a key contributor to maintaining HPC hardware and the Linux OS, and to coordinating break-fix work, upgrades and patches.
You will focus on the detailed implementation, improvement, and lifecycle management of RedHat Linux-based systems, services and vendor tools to ensure efficient, consistent and continuous support is provided within SLAs.
You’ll need to be a positive, enthusiastic team player and self-starter, with a flexible attitude to applying different techniques to help drive successful outcomes.
Responsibilities
Look after Linux servers, OS, hardware and services to maintain and support a large-scale HPC cluster environment
Improve the management of our hardware by maintaining and updating OME tools for BIOS/firmware updates and server management, including lifecycle management
Maintain the availability of HPC hardware and OS through proactive monitoring, improving reliability and efficiency
Swiftly coordinate & address hardware break-fix from the ServiceNow queue, while also handling additional system administration tasks
Build clear and concise documentation of hardware & server infrastructure
Be on call & be available to support Service Improvements at weekends as scheduled
Required Skills and Experience
Advanced-level RedHat Linux administration skills in performance tuning, server deployment, the boot process, PXE boot, IPMI tools, out-of-band management, the Redfish API and vendor toolkits
Hands-on experience with OS deployment, server configuration/tuning, and hardware deployment and configuration, including resolving issues that arise during server deployment
Experience with rack power calculation & allocation and network & power patching
Hands-on experience with vendor/OEM operations tools such as Dell OME, HP OneView and NLYTE
Solid understanding of, and experience with, monitoring and alerting tools such as PRTG and Nagios, and other SNMP-based incident management
Hands-on experience with scripting languages and automating repetitive tasks using csh, Bash, Python or any other programming language
Exposure to infrastructure management tools such as Foreman, Puppet, Ansible, Git, vCenter, ESXi, HAProxy and Bluecat DNS
Excellent communication skills and the ability/willingness to learn new technologies