We are seeking an experienced MLOps Engineer to bridge the gap between machine learning models and production environments. In this role, you will build, deploy, and maintain scalable machine learning infrastructure in AWS. You will work closely with data scientists, DevOps teams, and software engineers to ensure that machine learning models are successfully operationalized, monitored, and updated in real-time environments.
Key Responsibilities:
* Design and deploy scalable machine learning pipelines using AWS services (SageMaker, Lambda, ECS/EKS, DynamoDB) and automate infrastructure with CloudFormation, Terraform, or AWS CDK.
* Implement robust monitoring for model performance and drift with tools like CloudWatch and SageMaker Model Monitor, ensuring models meet business and compliance requirements.
* Automate the full machine learning lifecycle, integrating models into CI/CD pipelines (CodePipeline, Jenkins, GitLab CI) for seamless deployment and version control.
* Collaborate with data scientists and engineers to transition models from development to production, optimizing workflows and resource usage.
* Manage and optimize data pipelines, ensuring data is available for training, testing, and inference at scale, supporting model performance improvements.
* Design cloud-native, cost-efficient machine learning solutions that scale based on real-time data and increasing workloads.
Required Skills & Experience:
* Hands-on experience with AWS services such as SageMaker, Lambda, EKS, EC2, CloudFormation, and DynamoDB for deploying and managing machine learning models.
* Proficiency in containerization (Docker, Kubernetes) and automating ML pipelines using CI/CD tools like CodePipeline, Jenkins, and GitLab CI.
* Experience with model versioning tools (MLflow, DVC, SageMaker Model Registry) and automating data workflows to ensure data availability and traceability.
* Strong Python and Bash scripting skills for automating model management, training, and deployment processes.
* Knowledge of cloud infrastructure security practices, including data privacy, model security, and compliance standards like GDPR and SOC 2.
* Familiarity with AWS big data tools (Redshift, Glue, EMR) for processing large datasets to support machine learning models.
Preferred Qualifications:
* AWS Certified Machine Learning – Specialty or other relevant certifications.
* Experience with machine learning deployment frameworks (TensorFlow Serving, Kubeflow, MLflow) and managing containerized workloads with ECS/EKS.
* Deep understanding of data privacy regulations, model security, and designing solutions that are compliant with industry standards.
* Background in machine learning libraries such as TensorFlow, PyTorch, or XGBoost for model development and training.
* Familiarity with serverless computing for ML workflows using AWS Lambda and API Gateway, and with multi-cloud environments.
If you are a skilled MLOps Engineer with a passion for automating machine learning pipelines, deploying models at scale, and optimizing cloud-based infrastructures, we’d love to hear from you!