City Plumbing is seeking an experienced Senior Site Reliability Engineer to join our newly created IT team and help ensure the reliability, scalability, and performance of our IT systems. This Senior Site Reliability Engineer role requires technical expertise and a customer-centric mindset.
Think you could be our new Senior Site Reliability Engineer? Then why not join us and be part of a wider tech and digital transformation project that will make us the digital leader in plumbing, heating, and sustainable heating solutions!
We are seeking a talented and experienced Site Reliability Engineer (SRE) with a strong focus on observability to join our growing team. As our Senior Site Reliability Engineer, you will play a crucial role in ensuring the reliability, scalability, and performance of our systems through the implementation and management of advanced observability tools and practices. It is a great time to join the City Plumbing team as we continue with our digital transformation, so why not join now as our new Senior Site Reliability Engineer?
Responsibilities:
* Design, implement, and maintain robust monitoring, logging, tracing, and alerting systems to provide comprehensive visibility into the health and performance of our production environment.
* Proactively identify areas for improvement in our observability infrastructure and drive initiatives to enhance monitoring coverage, reduce noise, and increase actionable insights.
* Develop and maintain automated dashboards, reports, and analysis tools to facilitate data-driven decision-making and troubleshooting.
* Participate in incident response and post-mortem processes, leveraging observability data to quickly diagnose issues, mitigate impact, and implement preventive measures.
* Conduct capacity planning and performance analysis to ensure our systems can handle current and future growth while meeting performance targets.
* Work closely with cross-functional teams to develop and maintain infrastructure as code (IaC) using tools such as Terraform and Helm.
* Implement and enforce security best practices in system design and operation.
This is a hybrid role with up to 2 days per week in the office. You should be based in the UK. Our offices are in Aston (Birmingham), Salford (Manchester), Glasgow and Crick (Northamptonshire).
You’ll live and breathe our digital-first ethos, with a proactive and “can-do” approach. You’ll enjoy working collaboratively with the wider SRE and IT team and you’ll be committed to acting with integrity and honesty in everything you do.
Minimum Requirements:
* Proven experience as a Senior SRE or in a similar role, with a strong background in software development and operations.
* Expertise in DevOps and SRE methodologies, tools, and best practices.
* Strong knowledge of cloud platforms (AWS) and containerization technologies (Kubernetes).
* Able to communicate with all levels of stakeholders.
* Proficient in analytical thinking and problem-solving.
* Continuous learning mindset, keeping up to date with industry trends.
We’re passionate about creating an inclusive workplace that celebrates and values diversity. Bring your whole self to work regardless of age, disability, gender identity or reassignment, marital or civil partner status, pregnancy or maternity, race, colour, nationality, ethnic or national origin, religion or belief, sex or sexual orientation. We don’t want you to ‘fit’ our culture, we want you to enrich it.