Vacancy Name
MLOps and Cloud Engineer
Vacancy No
VN552
Employment Type
Full-Time
Location
Remote
About Us
Our Mission
Happy, healthy pets make for happy, healthy people.
We aim to strengthen the unique bond between pets and their parents through our innovative products and services, enabled by brilliant colleagues who embody our values of performance, exploration, togetherness, and sustainability.
Job Details
We are looking for a skilled MLOps Engineer with expertise in Azure Machine Learning (Azure ML) to collaborate closely with our ML development teams and optimize machine learning workflows. In this role, you will be responsible for deploying, monitoring, and maintaining ML models in production using Azure ML services, ensuring scalability, efficiency, and automation throughout the ML lifecycle.
This position serves as a bridge between the ML development team and the DevOps team, supporting both production and preproduction/experimental ML environments. Additionally, you will help facilitate cross-training with the DevOps team to ensure ongoing platform support for business-as-usual (BAU) operations.
Key Responsibilities
* Deploy and Manage ML Models: Work closely with data scientists and ML engineers to deploy and operationalize machine learning models using Azure ML endpoints, pipelines, and managed services.
* Optimize ML Workflows: Identify opportunities to enhance automated workflows, streamline build and release cycles, and improve overall ML infrastructure efficiency.
* Develop CI/CD Pipelines: Design and implement Azure DevOps CI/CD pipelines and reusable templates for deploying containerized services on Azure PaaS.
* Maintain Preproduction & Experimental Environments: Manage and support preproduction and experimental ML environments, ensuring ML developers have the necessary resources for testing and development.
* Infrastructure Design & Collaboration: Partner with development teams, DBAs, architects, and cloud providers to design, plan, and build infrastructure that meets business requirements.
* Documentation & Knowledge Sharing: Create and maintain comprehensive technical documentation to support cloud migration and infrastructure processes.
* Security & Compliance: Collaborate with the InfoSec team to integrate security best practices early in the development lifecycle (shift-left approach), ensuring compliance with access controls, encryption, and regulatory requirements.
* Advocate for Cloud Technologies: Promote best practices in cloud adoption, mentor team members, and provide technical guidance to drive innovation within the business.
* Customer-Centric Approach: Ensure solutions align with business goals, enhance customer experiences, and empower users to make informed decisions.
Successful Candidates Will Have
Key skills
* Experience with LLMs & Generative AI: Exposure to large language models (LLMs) and generative AI deployments.
* Infrastructure as Code (IaC): Proficiency in Terraform for managing cloud infrastructure.
* CI/CD & Version Control: Strong expertise in Git, CI/CD pipeline design, and automation using Azure DevOps, GitHub Actions, or similar tools.
* Containerization & Orchestration: Experience with Docker, Kubernetes (AKS), and containerized deployments in cloud environments.
* Linux Expertise: Strong knowledge of Linux systems, including performance tuning, security, and troubleshooting.
* Agile Mindset: Ability to adapt to evolving requirements, prioritize effectively, and work in an Agile development environment.
Person Specification
Required Skills and Work Experience
Essential
* Monitoring & Observability: Hands-on experience with Azure Log Analytics, Azure Monitor, and other monitoring tools.
* Cloud Adoption & Best Practices: Familiarity with the Microsoft Azure Cloud Adoption Framework and cloud-native design principles.
* Strong Communication: Ability to clearly articulate technical concepts and collaborate effectively across teams.
* Advanced Troubleshooting: Excellent problem-solving skills with a proactive approach to identifying and resolving issues.
* DevOps Best Practices: Experience in implementing DevOps methodologies, automation, and best practices.
* Security & Zero-Trust: Understanding of Zero-Trust Security (Secure by Design) principles and their implementation in cloud environments.
Required Qualifications
Essential
* Degree in computing or a related field
Working Pattern
Monday - Friday (plus out of hours)
Working Arrangement
Remote Working