Site Name: UK – London – New Oxford Street; UK – Hertfordshire – Stevenage
Posted Date: Nov 15 2024
GSK is a global leader in pharmaceuticals and healthcare, with a relentless commitment to advancing healthcare for the betterment of humanity. Our mission is to help people around the world do more, feel better, and live longer. We achieve this by researching, developing, and providing innovative medicines and vaccines. Our dedication to scientific excellence and ethical practices guides everything we do.
R&D at GSK is highly data-driven, and we’re applying AI/ML and data science to generate new insights, enable analytics, and drive efficiency and automation.
Job Description
This role is based in a team that works on projects involving AI/ML, generative AI, information retrieval, and data science. The team’s future projects will span diverse areas, such as regulatory, clinical, legal, and HR. Versatility is key, with an ability to quickly understand domain data and requirements and translate them into solutions. You will interact with architects, software and data engineers, modelers, data scientists, AI/ML engineers, and product owners, as well as other team members in Clinical Solutions and R&D. You will actively contribute to technical solutions, designs, and implementations, and participate in the continuous improvement of R&D Tech systems in alignment with agile and DevOps principles.
Data Engineering is responsible for the design, delivery, support, and maintenance of industrialized, automated, end-to-end data services and pipelines. They apply standardized data models and mappings to ensure data is accessible to end users in tools via APIs. They define and embed best practices, ensure compliance with Quality Management practices, and maintain alignment with automated data governance. They also acquire and process internal and external, structured and unstructured data in line with Product requirements.
As a Senior Principal Data Engineer, you will develop well-defined specifications for functions, pipelines, services, and other components, along with a technical approach to building them, and deliver them to a high standard. In that respect, you will be a technical contributor, but you will also provide leadership and guidance to junior data engineers. You will be aware of, and adhere to, best practices for software development in general (and data engineering in particular), including code quality, documentation, DevOps practices, and testing. You will ensure the robustness of our services and serve as an escalation point in the operation of existing services, pipelines, and workflows. You should be aware of the most common tools (languages, libraries, etc.) in the data space, such as Spark, Databricks, Kafka, ADF/Airflow, Snowflake, and Denodo, and have experience working on Azure.
In this role you will:
* Build modular code/libraries/services using modern data engineering tools (Python/Spark, Databricks, Kafka) and orchestration tools (e.g., ADF, Airflow/Composer); see the sketch after this list
* Produce well-engineered software, including appropriate automated test suites and technical documentation
* Apply platform abstractions consistently to ensure quality and consistency in logging and lineage
* Adhere to QMS framework and CI/CD best practices
* Provide leadership and guidance to junior data engineers
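For illustration only, below is a minimal sketch of the kind of modular, testable PySpark code the list above describes. All names (clean_events, the sample columns) are hypothetical, not taken from any GSK system, and the snippet assumes a local PySpark installation.

# Illustrative sketch only; all names are hypothetical, not GSK code.
from pyspark.sql import DataFrame, SparkSession
from pyspark.sql import functions as F

def clean_events(df: DataFrame) -> DataFrame:
    """Deduplicate events and parse the timestamp column."""
    return (
        df.dropDuplicates(["event_id"])
          .withColumn("event_ts", F.to_timestamp("event_ts"))
    )

if __name__ == "__main__":
    spark = SparkSession.builder.appName("clean-events-demo").getOrCreate()
    sample = spark.createDataFrame(
        [("e1", "2024-11-15 10:00:00"),
         ("e1", "2024-11-15 10:00:00"),  # duplicate row
         ("e2", "2024-11-15 11:30:00")],
        ["event_id", "event_ts"],
    )
    # A simple automated check: the duplicate event is removed.
    assert clean_events(sample).count() == 2
    spark.stop()

Keeping transformations as small, pure functions like this makes them straightforward to unit-test and to compose into pipelines orchestrated by tools such as ADF or Airflow.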
Qualifications & Skills:
We are looking for professionals with these required skills to achieve our goals:
* Bachelor’s degree in Data Engineering, Computer Science, Software Engineering, or a related discipline
* Experience in industry as a Data Engineer
* Solid experience working on Azure
* Experience with choosing appropriate data structures for scale and access patterns
* Knowledge and use of at least one common programming language (preferably Python), including toolchains for documentation and testing
* Exposure to modern software development tools/ways of working (e.g., Git/GitHub, DevOps tools)
* Software engineering experience
* Hands-on experience with logging and monitoring tools
* Exposure to common tools for data engineering (e.g., Spark, ADF/Airflow, Databricks, Snowflake, Kafka, Denodo)
* Demonstrable experience overcoming high-volume, high-compute challenges
* Familiarity with databases and SQL
* Familiarity with Data Mesh/Fabric concepts, with exposure to MS Fabric a bonus
* Exposure to automated testing techniques
Preferred Qualifications & Skills:
* Master’s or PhD in Data Engineering, Computer Science, Software Engineering, or a related discipline
* Azure certifications for data engineering
Closing Date for Applications: Friday 6th December 2024 (COB)
Please take a copy of the Job Description, as it will not be available after the advert closes. When applying for this role, please use the ‘cover letter’ section of the online application or your CV to describe how you meet the competencies for this role, as outlined in the job requirements above. The information you provide in your cover letter and CV will be used to assess your application.