We are seeking a versatile and proactive Platform Engineer / Site Reliability Engineer (SRE) to join our team. You will be a cornerstone of our infrastructure and DevOps practice, responsible for designing, supporting, and improving our cloud-based platforms. Working closely with developers, you will play a critical role in ensuring stability, security, and performance across our systems. This role requires flexibility, strategic thinking, and practical problem-solving skills.
Responsibilities:
* Azure Infrastructure Management: Design, build, deploy, manage, and support core infrastructure components on Microsoft Azure, including Azure Web Apps, Azure SQL Managed Instances, and associated services.
* CI/CD Pipeline Management: Implement, maintain, and optimize Continuous Integration and Continuous Deployment (CI/CD) pipelines using Azure Pipelines to enable efficient and reliable software delivery.
* Monitoring & Incident Response: Establish, configure, and manage comprehensive monitoring and alerting systems (e.g., Azure Monitor) for the platform. Proactively identify potential issues, respond to alerts, troubleshoot incidents, and perform root cause analysis.
* Platform Evolution & Maintenance: Stay informed about Azure service updates, new features, and deprecation notices. Plan and execute necessary upgrades, migrations, or changes to ensure the platform remains current, secure, and efficient.
* Developer Collaboration & Support: Act as a key technical liaison for application development teams. Provide infrastructure support and guidance on best practices, and troubleshoot platform-related issues affecting development or deployment.
* Technical Documentation: Develop and maintain clear, accurate, and up-to-date technical documentation for infrastructure configurations, processes, runbooks, and architectural decisions.
* Security & Compliance: Understand and implement network and security best practices within the Azure environment. Play a key role in maintaining our security posture and ensuring the company achieves and annually renews its Cyber Essentials Plus certification.
* Operational Excellence: Drive automation, improve system reliability, and optimize performance across the platform.
Required Skills and Experience (Must-Haves):
* Proven hands-on experience designing, building, managing, and supporting infrastructure within Microsoft Azure.
* Demonstrable experience with Azure Web Apps and Azure SQL Managed Instances (or comparable PaaS database services).
* Solid experience implementing and managing CI/CD pipelines, specifically using Azure Pipelines (Azure DevOps).
* Experience with cloud monitoring tools (e.g., Azure Monitor, Application Insights) and implementing effective alerting strategies.
* Strong understanding of core networking concepts (TCP/IP, DNS, HTTP/HTTPS, firewalls, NSGs) and cloud security principles.
* Broad general IT knowledge covering infrastructure, operating systems, and security concepts.
* Experience contributing to or managing systems in line with security compliance standards (familiarity with Cyber Essentials or Cyber Essentials Plus is highly advantageous).
* Excellent analytical and problem-solving skills with a proactive approach to identifying and resolving issues.
* Excellent communication skills (both written and verbal) with the ability to articulate technical concepts clearly to diverse audiences.
* Ability to work independently, manage multiple priorities, and thrive in a dynamic, small-team environment.
* Experience with Infrastructure as Code (IaC) tools such as Terraform, ARM templates, or Bicep.