AI Ops and ML Engineer - NetDevOps & Observability
GSK is seeking a highly skilled and experienced AI Ops and ML Engineer - NetDevOps to join our dynamic IT infrastructure team. This role is pivotal in the ongoing development, deployment, and maintenance of our network infrastructure, with a specialized focus on integrating Artificial Intelligence (AI) and Machine Learning (ML) to enhance network automation, observability, and incident management.
The ideal candidate will bring a strong background in network engineering, AI/ML technologies, and automation, contributing to the modernization of GSK's network and observability platforms.
In this role you will:
AI Ops and ML Development
* Design and implement AI-driven solutions for anomaly detection, predictive analytics, and automated remediation.
* Build and fine-tune machine learning models using frameworks like TensorFlow, PyTorch, or scikit-learn to optimize network operations.
* Integrate AI Ops frameworks with tools such as Moogsoft, Dynatrace, or Splunk ITSI to deliver actionable insights.
Network Automation and Optimization
* Automate network tasks, configurations, and maintenance using tools like Ansible, Terraform, and scripting languages such as Python or PowerShell.
* Develop and maintain CI/CD pipelines for deploying AI Ops and ML solutions in network environments.
* Enhance monitoring capabilities using AI to reduce alert noise and prioritize critical issues.
Observability and Incident Management
* Collaborate with teams managing tools like SolarWinds, Zscaler ZDX, Elastic, and Juniper Mist to improve data collection, correlation, and visualization.
* Use AI/ML to proactively identify and resolve network vulnerabilities and performance issues.
* Conduct root cause analysis (RCA) for network incidents and implement long-term improvements based on AI insights.
Mentorship and Continuous Improvement
* Mentor junior engineers, providing guidance on adopting AI/ML technologies, automation, and NetDevOps best practices.
* Stay updated on industry trends in AI Ops, network automation, and observability, applying innovations to improve operational efficiency.
* Proactively identify opportunities for optimization and implement solutions to enhance network performance, reliability, and security.
Qualifications & Skills:
* Minimum of 8 years of experience in network engineering, with at least 2 years focused on AI Ops or ML applications.
* Proficiency in machine learning frameworks (e.g., TensorFlow, PyTorch, scikit-learn) and AI Ops platforms (e.g., Moogsoft, Dynatrace, Splunk ITSI).
* Expertise in network automation tools and frameworks like Ansible, Terraform, Puppet, or Chef.
* Strong understanding of network engineering principles, including routing, switching, firewalls, and load balancers.
* Advanced scripting skills in Python, PowerShell, or Bash.
* Relevant certifications such as AWS Certified Machine Learning, CCNP, or equivalent.
Closing Date for Applications: Tuesday 25th March 2025 (COB)
Please take a copy of the Job Description, as it will not be available after the advert closes.