Job Description
Job Title: Data Engineer (Iceberg Experience) – 8+ years of overall experience
Domain: Banking
As a Data Engineer with Iceberg experience, you will play a crucial role in the design, development, and maintenance of our data infrastructure. Your work will empower data-driven decision-making and contribute to the success of our analytics initiatives.
Key Responsibilities:
• Data Integration: Develop and maintain data pipelines to extract, transform, and load (ETL) data from various sources into AWS data stores for both batch and streaming data ingestion.
• AWS Expertise: Utilize your expertise in AWS services such as Amazon EMR, S3, AWS Glue, Amazon Redshift, AWS Lambda, and more to build and optimize data solutions.
• Data Modeling: Design and implement data models to support analytical and reporting needs, ensuring data accuracy and performance.
• Data Quality: Implement data quality and data governance best practices to maintain data integrity.
• Performance Optimization: Identify and resolve performance bottlenecks in data pipelines and storage solutions to ensure optimal performance.
• Documentation: Create and maintain comprehensive documentation for data pipelines, architecture, and best practices.
• Collaboration: Collaborate with cross-functional teams, including data scientists and analysts, to understand data requirements and deliver high-quality data solutions.
• Automation: Implement automation processes and best practices to streamline data workflows and reduce manual interventions.
• Experience working with big data ACID table formats to build a delta lake, particularly with the Iceberg table format and its data-loading methods.
• Good knowledge of Iceberg functionality: using its delta features to identify changed records, and performing optimization and housekeeping (e.g., compaction and snapshot expiration) on Iceberg tables in the data lake.
Must have: AWS, ETL, EMR, Glue, Spark/Scala, Java, Python
Good to have: Cloudera – Spark, Hive, Impala, HDFS; Informatica PowerCenter; Informatica DQ/DG; Snowflake; Erwin
Qualifications:
• Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field.
• 5 to 8 years of experience in data engineering, including working with AWS services.
• Proficiency in AWS services like S3, Glue, Redshift, Lambda, and EMR.
• Knowledge of Cloudera-based Hadoop is a plus.
• Strong ETL development skills and experience with data integration tools.
• Knowledge of data modeling, data warehousing, and data transformation techniques.
• Familiarity with data quality and data governance principles.
• Strong problem-solving and troubleshooting skills.
• Excellent communication and teamwork skills, with the ability to collaborate with technical and non-technical stakeholders.
• Knowledge of best practices in data engineering, scalability, and performance optimization.
• Experience with version control systems and DevOps practices is a plus.