We are currently recruiting for an experienced Site Reliability Engineer (SRE) to join the Risk IT Ops group at a leading investment bank in London. The successful candidate will be tasked with improving high availability and resilience, enhancing load management with L4 and L7 load balancers, and building a dynamic, scalable infrastructure to accommodate high-volume business transactions.

In order to be considered for this role, applicants must have the following set of skills:
SRE & DevOps experience, ideally within an investment bank or a high-volume transactional environment
Ansible
Unix commands & shell scripting
Jenkins - CI/CD pipelines using Groovy scripts
Terraform
Load balancers (Apache or Nginx)
SRE practices, automation and monitoring

Main Tasks and Responsibilities:
Design, develop and implement systems software and scripts that improve the stability, scalability, availability and latency of the Risk system applications.
Solve problems occurring in our highly available production systems, and build solutions and automation using a combination of scripting and tooling to prevent them from happening again.
Define and drive adoption of a best-in-class monitoring framework to accomplish end-to-end flow monitoring and effective alerting.
Monitor system performance and capacity levels to ensure high availability of applications with minimal downtime.
Build and run capacity tests to manage the growth of systems.
Investigate any service disruptions or other service issues to identify their causes.
Perform regular audits of servers to check for signs of degradation or malfunction, covering infrastructure hygiene and end-of-life components.
Conduct examinations of failed systems to identify and address the root cause.
Accountable for the maintenance and improvement of IT continuity strategies.
Accountable for the generation, reporting and improvement of various Production KPIs, SLs and dashboards.
Be an advocate of release engineering best practices such as zero downtime, canary releases and incremental rollouts.
Share the on-call rotation and be an escalation contact for incidents.
Work with the Development, DevOps and IT operational teams throughout the Software Development Life Cycle to ensure sustainable software releases.

Please submit your CV immediately in order to be considered for this role.

iKas International Limited is providing recruitment services for this role.