Principal engineer - reliability engineering

Fleet (Hampshire)

Principal engineer

Posted: 1 March

Offer description

Ready for a challenge? Then Just Eat Takeaway.com might be the place for you. We’re a leading global online food delivery platform, and our vision is to empower everyday convenience. Whether it’s a Friday-night feast, a post-gym poke bowl, or grabbing some groceries, our tech platform connects tens of millions of customers with hundreds of thousands of restaurant, grocery and convenience partners across the globe.

About this role

We are seeking a seasoned Principal Engineer to lead the design, development, and evolution of our Observability Platform, ensuring it meets the needs of our rapidly scaling systems and engineering teams. This role will also focus on leveraging Machine Learning (ML) and Artificial Intelligence (AI) to deliver advanced insights that proactively improve system health and drive down Mean Time to Detection (MTTD) and Mean Time to Resolution (MTTR). The ideal candidate will be a visionary technologist with deep expertise in observability, monitoring, and distributed systems, capable of driving strategy, architecture, and execution for a world-class platform.

These are some of the key ingredients to the role:

Platform Leadership
- Architect, design, and implement a cutting-edge Observability Platform to support metrics, logs, traces, and events at scale.
- Integrate ML/AI-driven solutions to enhance anomaly detection, root cause analysis, and predictive insights.
- Lead the development and adoption of platform capabilities to ensure system health, reliability, and performance.
- Establish and evolve platform standards and best practices to align with the company’s overall engineering goals.

Strategic Initiatives
- Collaborate with engineering teams to define the observability strategy, ensuring alignment with business and operational objectives.
- Identify and integrate the latest observability technologies, including AI-based analytics, to improve system insights and developer productivity.
- Drive a platform-first mindset, ensuring observability is treated as a foundational capability across all services.
- Apply AI/ML to reduce detection and resolution times.

Operational Excellence
- Ensure the Observability Platform is highly available, performant, and secure across all environments.
- Optimize data collection, processing, and storage to balance performance with cost efficiency.
- Define SLAs, SLOs, and SLIs for observability services to support reliability engineering practices.
- Continuously improve MTTD and MTTR by leveraging advanced AI/ML models for predictive analysis and automated responses.

Mentorship and Collaboration
- Act as a mentor and technical leader for engineers, fostering a culture of learning, innovation, and excellence.
- Collaborate with stakeholders, including Site Reliability Engineering (SRE), infrastructure, and application teams, to gather requirements and deliver impactful solutions.
- Advocate for observability as a critical enabler of operational success across the organization.

What will you bring to the table?
- Extensive Engineering Experience: Proven experience in building and scaling observability platforms in a cloud-native environment.
- Observability Expertise: Deep understanding of observability pillars (metrics, logs, traces) and related tools (e.g., Prometheus, Grafana, OpenTelemetry, Jaeger, Kibana, Elastic Stack).
- AI/ML Proficiency: Hands-on experience integrating ML/AI models into observability systems to drive advanced insights, anomaly detection, and predictive analysis.
- Distributed Systems Knowledge: Strong expertise in designing scalable and reliable systems for high-throughput data collection and processing.
- Programming Skills: Proficiency in one or more languages (e.g., Go, Python, Java, Terraform, Pulumi) with a focus on building robust platforms.
- Cloud Proficiency: Hands-on experience with cloud platforms (e.g., AWS, GCP, Azure) and Infrastructure-as-Code tools (e.g., Terraform, Pulumi).
- Leadership and Mentorship: Experience leading and mentoring cross-functional, multicultural engineering teams, driving technical decisions, and delivering large-scale initiatives.
- Cost Optimization: Familiarity with strategies for managing the costs associated with observability data storage, processing, and analysis.

Desirable Qualifications:
- Expertise in applying AI/ML for proactive alerting, root cause analysis, and predictive scaling.
- Experience with service mesh technologies (e.g., Istio, Linkerd) and their observability implications.
- Contributions to open-source observability or ML/AI projects.
- Proficiency with container technologies (Docker, Kubernetes), best-practice configurations, and their implications for observability and monitoring.
- Understanding of statistical analysis, data mining, and feature engineering techniques to extract meaningful insights from observability data.

At JET, this is on the menu:

Our teams forge connections internally and work with some of the best-known brands on the planet, giving us truly international impact in a dynamic environment. Fun, fast-paced and supportive, the JET culture is about movement, growth and celebrating every aspect of our JETers. Thanks to them, we stay one step ahead of the competition.

Inclusion, Diversity & Belonging

No matter who you are, what you look like, who you love, or where you are from, you can find your place at Just Eat Takeaway.com. We’re committed to creating an inclusive culture, encouraging diversity of people and thinking, in which all employees feel they truly belong and can bring their most colourful selves to work every day.

What else is cooking?

Want to know more about our JETers, culture or company? Have a look at our career site, where you can find people’s stories, blogs, podcasts and more JET morsels.

