You will play a crucial role in designing, building, and maintaining our data platform, with a strong emphasis on streaming data, cloud infrastructure, and machine learning operations.

Key Responsibilities:

- Architect and Implement Data Pipelines: Design, develop, and maintain scalable, efficient data pipelines. Optimize ETL processes to ensure seamless data ingestion, processing, and integration across systems.
- Streaming Data Platform Development: Lead the development and maintenance of a real-time data streaming platform using tools such as Apache Kafka, Databricks, and Kinesis. Ensure streaming data integrates with batch processing systems for comprehensive data management.
- Cloud Infrastructure Management: Use AWS data engineering services (including S3, Redshift, Glue, Kinesis, and Lambda) to build and manage our data infrastructure. Continuously optimize the platform for performance, scalability, and cost-effectiveness.
- Communication: Collaborate with cross-functional teams, including data scientists and BI developers, to understand data needs and deliver solutions. Leverage the project management team to coordinate projects, requirements, timelines, and deliverables, allowing you to concentrate on technical excellence.
- ML Ops and Advanced Data Engineering: Establish ML Ops practices within the data engineering framework, focusing on automation, monitoring, and optimization of machine learning pipelines.
- Data Quality and Governance: Implement and maintain data quality frameworks, ensuring the accuracy, consistency, and reliability of data across the platform. Drive data governance initiatives, including data cataloguing, lineage tracking, and adherence to security and compliance standards.

Requirements

Experience:

- 3 years of experience in data engineering, with a proven track record of building and maintaining data platforms, preferably on AWS.
- Strong proficiency in Python; experience with SQL and PostgreSQL. PySpark, Scala, or Java is a plus.
- Familiarity with Databricks and the Delta Lakehouse concept.
- Experience mentoring or leading junior engineers is highly desirable.

Skills:

- Deep understanding of cloud-based data architectures and best practices.
- Proficiency in designing, implementing, and optimizing ETL/ELT workflows.
- Strong database and data lake management skills.
- Familiarity with ML Ops practices and tools, with a desire to expand skills in this area.
- Excellent problem-solving abilities and a collaborative mindset.

Nice to Have:

- Familiarity with containerization and orchestration tools (e.g., Docker, Kubernetes).
- Knowledge of machine learning pipelines and their integration with data platforms.

Benefits

- 22 days' holiday plus 8 bank holidays
- Staff discounts, plus friends and family discounts
- Cycle to Work scheme and Tech Scheme
- Breakfast and drinks provided
- One supported charity day per annum
- Summer and Christmas parties
- Street food days
- Perkbox membership
- Enhanced maternity & paternity leave
- Employee referral scheme