Senior SRE Contractor (Lead)
Contract details
* Lead Site Reliability Engineer
* Outside IR35 contract
* Hybrid in London area
Our client is seeking an experienced Site Reliability Engineering (SRE) Consultant to lead a team of six engineers, ensuring the resilience, efficiency, and ongoing optimisation of key components within its digital infrastructure. Your work will directly enhance the end-user experience by driving performance improvements and system stability.
About the Role
This opportunity involves assessing and refining existing processes to enhance reliability, efficiency, and operational stability. As the principal contact for the business, you will steer strategy, improve collaboration, manage risks, and oversee client communications.
Key Responsibilities
* Define and execute the strategic roadmap for the SRE function, setting clear goals and ensuring timely delivery of objectives.
* Establish robust monitoring frameworks to enhance system stability, optimise performance, and prevent downtime.
* Act as the bridge between engineering teams, product managers, and senior leadership to foster alignment and operational excellence.
* Analyse incidents and system performance, providing transparent reporting to stakeholders while driving reliability improvements.
* Embed best practices in incident management, automation, and reliability engineering to bolster system resilience.
* Identify vulnerabilities within the infrastructure and implement mitigation strategies to safeguard business continuity.
What You Bring
* Proven experience leading distributed engineering teams in dynamic, fast-paced environments, ideally within SRE or operations-focused roles.
* Expertise in incident response, including defining and maintaining Service Level Objectives (SLOs).
* Strong analytical skills to conduct in-depth post-incident evaluations, identifying root causes and implementing preventative measures.
* Hands-on technical experience with scalable system architectures, ensuring high availability of complex distributed systems.
* Familiarity with cloud-native technologies, such as microservices and serverless architectures (preferred).
* Ability to collaborate effectively across engineering, product, and operational teams to align priorities and deliver results.
* Experience in measuring customer satisfaction and implementing feedback mechanisms to refine service reliability.
* A problem-solving mindset with a proactive approach to process improvements and operational challenges.
* Excellent communication skills, with the ability to convey complex technical topics to a non-technical audience.
* Strong organisational skills, with the ability to manage priorities, meet deadlines, and balance multiple demands.
* Knowledge of service management frameworks such as ITIL is a plus.
Experience & Technical Skills
* Extensive background in Site Reliability Engineering or a similar discipline, with a focus on scalability, performance, and reliability.
* Practical experience with public cloud platforms (e.g., AWS, Azure, or Google Cloud).
* A proven track record of deploying and refining monitoring and alerting systems to ensure system health and performance.
Leadership & Collaboration
* A strong leader with the ability to mentor and support remote teams.
* Skilled in aligning technical teams and business stakeholders to drive shared success.
* Passionate about continuous improvement, fostering innovation, and embedding a culture of reliability within teams.
If you are a results-driven professional with a passion for system reliability and a knack for leadership, please apply online!
Unfortunately, we cannot accept applicants from outside the UK, and our client is unable to provide sponsorship.