Our Client

A sports brand that originated in the United States in 1886, our client already has a footprint in Canada, Australia, Japan and a number of other countries. They are excited to expand their international footprint and are searching for someone to define and lead their SRE practice.

Your Responsibilities

Design and implement monitoring solutions and dashboards for all critical components - we should be the first to know if there’s a problem
Liaise with development and support leads, and the IT Manager, to set up and maintain outage notification policies and escalation plans
Review the performance of our platform and develop load test strategies
Review existing structures and help optimise the design of new systems and configurations with reliability and security in mind
Review the infrastructure and help to automate it further
Develop an Incident Response process and plan
Develop a (blame-free) post-mortem culture within teams, where we can learn from failure
Develop Playbooks and Documentation to help share knowledge amongst the various teams who interact with the website platform. We want to break down knowledge silos.

Your Qualifications

AWS, including Kubernetes (EKS) and PostgreSQL (RDS)
Infrastructure as Code using Terraform
Preference for hosted solutions as opposed to “roll-your-own” or “reinventing the wheel” (hence RDS and EKS)
New Relic for monitoring and observability
The front-end of the website is detached from Drupal, written with Ruby on Rails
APIs built in Ruby and/or Node.js, powering services such as data ingestion

Halian Group

With over 20 years of experience, we have come to understand that innovation is the only way to provide agile, practical solutions that transform businesses and careers. Our resourcing and smart services help you to realize tomorrow’s potential. Discover the amazing things possible when you bring the right people and the right technologies together.