Job Title: Senior Site Reliability Engineer
Role Overview:
As a Senior Site Reliability Engineer, you will be responsible for maintaining and enhancing the reliability, scalability, and performance of our clients' platform. You’ll collaborate with engineering teams to troubleshoot and prevent issues, and to build proactive solutions that improve both system operations and customer experience.
Key Responsibilities:
* Develop and implement monitoring, alerting, and diagnostic tools to identify and resolve infrastructure, platform, and application issues quickly and effectively.
* Proactively monitor system health to identify opportunities for reliability, performance, and operational improvements.
* Lead the incident response process, conduct root cause analysis, and drive improvements to prevent future incidents.
* Optimise resource usage in cloud environments, with a particular focus on AWS, to improve cost-efficiency and scalability.
* Create and maintain tools that promote best practices in service reliability, ensuring smooth adoption across the organisation.
* Write clean, efficient code that enhances system scalability, performance, maintainability, and security.
* Collaborate with cross-functional teams to share knowledge, provide technical guidance, and contribute to the broader engineering efforts.
* Mentor other team members on best practices for monitoring, deployments, and risk management.
Qualifications:
* 5+ years of experience as a Site Reliability Engineer or in a similar DevOps role.
* Proven experience managing the reliability, scalability, and performance of high-traffic cloud-based SaaS systems.
* Strong hands-on experience with cloud platforms, particularly AWS.
* Expertise in setting up and managing robust monitoring systems and alerts.
* Experience with PostgreSQL databases.
* Proficiency in one or more programming languages (e.g., Python, Go, Ruby).
* Familiarity with infrastructure automation tools, such as Terraform.
* Solid understanding of Cloud, PaaS, and SaaS environments.
* A self-starter who thrives in a fast-paced, evolving environment.
Requirements:
* Occasional on-call duties (only for high-priority issues).
* Willingness to work outside regular business hours to accommodate different time zones.
Details:
* Flexible remote or hybrid working arrangements.
* Access to a co-working space in Manchester with amenities such as gym access.
* Work Anniversary Rewards.
* Regular social events & employer-funded travel (throughout the UK, Europe and internationally).
* Opportunities to grow your career and make an impact quickly.
Seniority level: Mid-Senior level
Employment type: Full-time
Job function: Information Technology
Industries: Software Development