Data Engineer with ML Ops experience
6 Months
Hybrid Working with 2 Days Per Week onsite
Inside IR35
My client, a top global company, is looking to recruit a Data Engineer with ML Ops experience to join their team on a contract basis. Please note that, if successful, you will need to set up via an Umbrella Company/PAYE. The successful Data Engineer will migrate raw and derived unstructured datasets (images, videos, etc.) between on-prem and cloud data locations (e.g. GCP, Azure, AWS). Dataset sizes range from small scale (GB) to large scale (TB).
In this role, the Data Engineer will:
* Execute migration of raw and derived unstructured datasets (images, videos, etc.) between on-prem and cloud data locations (e.g. GCP, Azure, AWS).
* Ensure consistency between the data ingested and the data manifests.
* Organise raw and derived data into appropriate hierarchies.
* Collaborate with AI/ML engineers and product managers.
* Develop data pipelines for incoming batch data and update existing pipelines where necessary.
* Design and implement well-decoupled, modular, reusable, and scalable scripts and code for the retrieval and pre-processing of large-scale histopathology images for the AI/ML pipeline.
* Document data flows and ingestion pipelines, data use and re-use.
* Implement data flows to connect operational systems, data for analytics, and business intelligence (BI) systems (e.g. Power BI).
* Ensure completion of requisite documentation, i.e. the ingestion form and any related documentation.
* Track and report completion of data migration to stakeholders, and raise any blockers preventing migration.
* Migrate ML pipelines from on-prem HPC solutions to the cloud.
* Migrate ML pipelines between cloud environments and across cloud computing providers.
* Optimise and parallelise said ML pipelines for scalability, speed, and cost efficiency.
We are looking for professionals with these required skills to achieve our goals:
* Experience as a professional data/software engineer.
* Experience with migrating ML pipelines from on-prem HPC solutions to the cloud; migrating ML pipelines between cloud environments and across cloud computing providers; and optimising and parallelising said ML pipelines for scalability, speed, and cost efficiency.
* Experience with large-size images and data formats for computational pathology (e.g. .svs, .tiff, .h5) is highly desirable.
* Advanced programming expertise in Python and in developing and delivering robust software solutions.
* Machine learning experience.
* Computer Vision experience/knowledge.
* CI/CD experience.
* Industrial experience in design, development, and deployment of data engineering pipelines.
* Experience with cloud platforms such as Google Cloud Platform, Azure, and AWS (GCP preferred).
* Experience in handling big data at scale.
Carbon60, Lorien & SRG - The Impellam Group STEM Portfolio are acting as an Employment Business in relation to this vacancy.