A global market research company is looking for a passionate Data Engineer to join their team.
You will be working in the Data Engineering team, whose main function is developing, maintaining, and improving the end-to-end data pipeline. This includes real-time data processing; extract, transform, load (ETL) jobs; artificial intelligence; and data analytics on a large and complex dataset.
You must have strong AWS and PySpark experience.
Your role will primarily involve developing PySpark or Scala data transformations and maintaining data pipelines on AWS infrastructure, building innovative solutions to scale and maintain the data platform effectively. You will work on complex data problems in a challenging and fun environment, using some of the latest open-source Big Data technologies such as Apache Spark, alongside Amazon Web Services technologies including Elastic MapReduce (EMR), Athena, and Lambda, to develop scalable data solutions.
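To give a flavour of the day-to-day work, here is a minimal sketch of the kind of PySpark transformation the role involves. The bucket paths and column names are hypothetical examples, not taken from this description.

```python
# Minimal PySpark ETL sketch. All paths and column names
# (s3://example-bucket/..., "event_time", "country") are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("example-etl").getOrCreate()

# Read raw JSON events from S3.
raw = spark.read.json("s3://example-bucket/raw/events/")

# Clean and aggregate: drop malformed rows, then count events per country per day.
daily_counts = (
    raw.dropna(subset=["event_time", "country"])
       .withColumn("event_date", F.to_date("event_time"))
       .groupBy("event_date", "country")
       .agg(F.count("*").alias("event_count"))
)

# Write partitioned Parquet back to S3 for downstream analytics.
daily_counts.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3://example-bucket/curated/daily_event_counts/"
)
```

A job like this would typically run on EMR, with the curated output queryable from Athena.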
The role is hybrid.
Great benefits
* 25 days paid holiday plus bank holidays
* Purchase or sale of up to 5 leave days per annum, after 2 years’ service
* Life insurance
* Workplace pension with employer contribution
* Performance-based bonus scheme
* Informal dress code
* Cycle to work scheme
* Branded company merchandise
* New company laptop
* One-to-one learning and development coaching sessions
* Support and budget available for training programmes
* ‘Giving back’ to charities
Ideal Data Engineer
* Knowledge of serverless technologies, frameworks, and best practices.
* Experience using AWS CloudFormation or Terraform for infrastructure automation.
* Knowledge of Scala or PySpark, or another programming language such as Java or C#.
* SQL or Python development experience.
* High-quality coding and testing practices.
* Willingness to learn new technologies and methodologies.
* Knowledge of agile software development practices, including continuous integration, automated testing, and working with software engineering requirements and specifications.
* Good interpersonal skills, a positive attitude, and a willingness to help other members of the team.
* Experience debugging and dealing with failures on business-critical systems.
* Preferable:
* Exposure to Apache Spark, Apache Trino, or another big data processing system.
* Knowledge of streaming data principles and best practices.
* Understanding of database technologies and standards.
* Experience working on large and complex datasets.
* Exposure to Data Engineering practices used in Machine Learning training and inference.
* Experience using Git, Jenkins, and other CI/CD tools.