Job Description
Job Title: SRE Consultant
Location: Reading, hybrid (2-3 days a week onsite is mandatory)
Duration: 6-month rolling contract
Budget: £300 - £350 per day, all inclusive
IR35 Status: Inside IR35
Job Description:
We are seeking a skilled Site Reliability Engineer (SRE) with experience in Azure-based software platforms to ensure the reliability, performance, and scalability of our cloud-hosted applications. This role blends software engineering with operations, focusing on automation, monitoring, and incident response strategies within Azure environments.
Key Responsibilities:
* Design, implement, and maintain scalable, highly available cloud infrastructure on Azure.
* Develop automation and monitoring solutions to improve system reliability and operational efficiency.
* Lead incident response, root cause analysis, and post-mortem reviews to drive continuous improvement.
* Optimize system performance, plan capacity, and manage costs within Azure environments.
* Collaborate with development and operations teams to implement best practices for CI/CD, observability, and security.
* Use Infrastructure as Code (IaC) tools (e.g., Terraform, Bicep) to manage cloud resources efficiently.
Key Skills & Experience:
* Proven experience as an SRE or DevOps Engineer in an Azure-based environment.
* Strong expertise in Azure services, including Azure Monitor, Application Insights, Azure Kubernetes Service (AKS), and Azure DevOps.
* Hands-on experience with automation and configuration management tools and scripting languages (e.g., Terraform, Ansible, PowerShell, Python).
* Proficiency in monitoring, logging, and alerting solutions for cloud-native applications.
* Strong understanding of incident management, SLOs, SLIs, and error budgeting.
* Experience with CI/CD pipelines and containerization (Docker, Kubernetes).
* Excellent troubleshooting and problem-solving skills in large-scale distributed systems.