We are seeking a talented and experienced Site Reliability Engineer (SRE) with a strong focus on observability to join our growing team. In this contract role, you will play a crucial part in ensuring the reliability, visibility, and performance of our systems through advanced observability tools and practices. The system is an established product with an experienced team, and we’re looking for an SRE to help us get to the next level.

Key Responsibilities:
Design, implement, and maintain robust monitoring, logging, tracing, and alerting systems that provide comprehensive visibility into the health and performance of our production environment.
Proactively identify areas for improvement in our observability infrastructure and drive initiatives to extend monitoring coverage, reduce alert noise, and increase actionable insights.
Develop and maintain automated dashboards, reports, and analysis tools to support data-driven decision-making and troubleshooting.
Participate in incident response drills and, during early adoption, in post-mortems for any real incidents, using observability data to diagnose issues quickly, mitigate impact, and implement preventive measures.
Conduct capacity planning and performance analysis to ensure our systems can handle current and future growth while meeting performance targets.
Implement and enforce security best practices in system design and operation.

This is a hybrid role, with 2 days per week in the office (Manchester).

Skills and competencies:
Proven experience as a Senior SRE or in a similar role, with a strong background in software development and operations.
Expertise in DevOps and SRE methodologies, tools, and best practices.
Strong knowledge of cloud platforms (Azure), Virtual Machines, and containerization technologies.
Able to communicate effectively with stakeholders at all levels.
Strong analytical thinking and problem-solving skills.