Role Overview
As an Azure Solutions Architect / DevOps Engineer, you will be a key player in shaping, securing, and scaling our infrastructure. You will solve complex infrastructure challenges by designing and implementing robust, secure, and scalable solutions. While Kubernetes on Azure (AKS) will be central to your toolkit, your role goes far beyond cluster management: you’ll help design systems, improve workflows, and ensure platform readiness for demanding AI-driven workloads.
You will collaborate closely with ML/AI teams and Research Engineers, enabling their workflows by delivering reliable infrastructure, optimised pipelines, and secure environments. Additionally, you will play a pivotal role in strengthening the company's security posture and advancing our progress towards ISO 27001 certification.
Key Responsibilities
* Solve Complex Infrastructure Challenges: Architect scalable, highly available, and secure solutions across Azure, ensuring performance, cost-efficiency, and alignment with business goals.
* Provision and Manage AKS: Design, provision, and optimise AKS clusters as part of a broader, cloud-native infrastructure strategy, ensuring resiliency, security, and observability.
* Enable System Design: Contribute to system design by evaluating trade-offs and identifying the tools, patterns, and architectures that best meet current and future needs.
* Implement Infrastructure as Code (IaC): Use tools like Terraform, Bicep, or ARM templates to automate and document infrastructure deployments in a repeatable, auditable way.
* Build and Maintain CI/CD Pipelines: Develop and manage CI/CD workflows to support APIs, general services, and internal tools, enabling reliable and automated delivery.
* Collaborate Across Teams: Work closely with ML/AI teams and Research Engineers to deliver infrastructure solutions that support model training, real-time insights, and graph-based knowledge systems.
* Security-First Mindset: Implement and enforce strong security practices, including role-based access control (RBAC), encryption, vulnerability management, and network security configurations.
* Ensure Monitoring and Observability: Set up monitoring, logging, and alerting solutions using Azure Monitor, Log Analytics, and other tools to ensure system health and performance.
* Support Internal IT Operations: Assist with internal IT tasks, including Microsoft Entra ID (formerly Azure AD) configurations, endpoint security, and compliance tooling, contributing to the company's ISO 27001 certification efforts.
* Drive Security and Compliance: Ensure infrastructure aligns with ISO 27001, SOC 2, and GDPR requirements by implementing logging, auditing, and incident response processes.
* Continuously Improve: Identify and address infrastructure bottlenecks, improve cost-efficiency, and optimise for evolving platform requirements.
Expertise and Skills
Core Technical Competencies:
* Infrastructure as Code (IaC): Strong proficiency in Terraform, Bicep, or ARM templates for automating cloud infrastructure.
* Azure Kubernetes Service (AKS): Extensive experience designing, provisioning, securing, and scaling Kubernetes clusters on Azure.
* CI/CD Pipelines: Hands-on experience with tools like Azure DevOps, GitHub Actions, or Jenkins to automate delivery pipelines.
* Cloud Security: Expertise in implementing RBAC, encryption (in transit and at rest), secure networking (NSGs, firewalls), vulnerability scanning, and monitoring.
* Monitoring & Observability: Experience setting up end-to-end monitoring, logging, and alerting using tools like Azure Monitor, Log Analytics, and Prometheus.
System Design and Collaboration:
* Solution Architecture: Proven ability to design scalable, fault-tolerant systems that meet complex business and technical requirements.
* Cross-Team Collaboration: Ability to collaborate effectively with ML/AI teams and Research Engineers to deliver infrastructure solutions that enable AI workloads and research workflows.
Security and Compliance:
* ISO 27001 and SOC 2: Familiarity with implementing controls, auditing processes, and incident response strategies to meet compliance requirements.
* Security First Mindset: Deep understanding of cloud security frameworks and a track record of building secure-by-design infrastructure.
IT Operations:
* Internal IT Support: Experience supporting internal IT operations, including Microsoft Entra ID (formerly Azure AD), endpoint security, and internal tooling.
Mindset & Approach
* Security First: You prioritise security, governance, and compliance as foundational principles in every solution.
* Problem Solver: You embrace complex technical challenges and focus on delivering practical, scalable solutions.
* System Thinker: You look at the big picture, designing solutions that are future-proof, reliable, and cost-efficient.
* Collaborative Partner: You work seamlessly across teams, enabling Research Engineers, ML/AI teams, and leadership to achieve their goals.
* Continuous Learner: You stay up to date with emerging tools, security trends, and infrastructure best practices.
What Success Looks Like
Success in this role will be measured by:
* Robust, scalable infrastructure that meets the company's evolving needs.
* Highly available and secure AKS clusters and cloud infrastructure.
* Efficient and automated deployments through IaC and CI/CD pipelines.
* A proactive approach to security and compliance, supporting ISO 27001 readiness.
* ML/AI teams and Research Engineers who can focus on innovation while relying on stable, optimised infrastructure.
* Strong internal IT operations that meet business and compliance needs.
What We Offer
* Competitive salary
* Bonus scheme
* Wellness allowance
* Fully remote working (with regular company get-togethers)
* Private medical and dental insurance
* Life assurance, critical illness cover, and income protection
Provision and availability of these benefits depend on your country of residence; we’ll discuss this with you.
Join us to solve complex infrastructure challenges, architect secure solutions, and enable the company's cutting-edge AI platform to scale reliably and securely.