- Designing Hive/HCatalog data models, including table definitions, file formats, and compression techniques for structured and semi-structured data processing (see the first sketch after this list)
- Implementing Spark-based ETL frameworks
- Implementing Big Data pipelines for data ingestion, storage, processing, and consumption
- Experienced AWS developer, familiar with AWS services such as S3, EC2, EMR/Databricks, Lambda, and AWS CI/CD
- Enhancing Talend, Hive/Spark, and Unix-based data pipelines
- Developing and deploying Scala/Python-based Spark jobs for ETL processing (see the second sketch after this list)
- Strong grasp of SQL and data warehousing (DWH) concepts
- Functioning as an integrator between business needs and technology, helping to shape solutions that meet clients' business needs
- Leading project efforts in defining scope, planning, executing, and reporting to stakeholders on strategic initiatives
- Understanding the business's EDW and Big Data Lake systems, and producing high-level design documents and low-level implementation documents for each
- Designing Big Data pipelines for data ingestion, storage, processing, and consumption
- Must have strong programming experience in Scala and/or Python
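
To make the Hive/HCatalog modeling bullet concrete, here is a minimal Scala sketch that creates a partitioned, compressed Hive table from Spark. It is an illustration only: the `analytics.events` table, its columns, and the ORC/ZLIB choices are hypothetical, not part of this vacancy.

```scala
import org.apache.spark.sql.SparkSession

object CreateEventsTable {
  def main(args: Array[String]): Unit = {
    // Hive-enabled session so the DDL is registered in the Hive
    // metastore and therefore visible through HCatalog
    val spark = SparkSession.builder()
      .appName("create-events-table")
      .enableHiveSupport()
      .getOrCreate()

    // Partitioned ORC table with ZLIB compression; the raw JSON
    // payload column carries semi-structured attributes alongside
    // the structured fields (table name and schema are hypothetical)
    spark.sql("""
      CREATE TABLE IF NOT EXISTS analytics.events (
        event_id STRING,
        user_id  BIGINT,
        payload  STRING
      )
      PARTITIONED BY (event_date DATE)
      STORED AS ORC
      TBLPROPERTIES ('orc.compress' = 'ZLIB')
    """)

    spark.stop()
  }
}
```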
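And a minimal sketch of the kind of Scala-based Spark ETL job the role describes: extract semi-structured data from S3, apply a basic transform, and load it into a partitioned Hive table. The S3 landing path, column names, and output table are assumptions for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object DailyEventsEtl {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("daily-events-etl")
      .enableHiveSupport()
      .getOrCreate()

    // Extract: JSON files landed in S3 by the ingestion layer
    // (hypothetical bucket/path; on EMR the s3:// scheme resolves via EMRFS)
    val raw = spark.read.json("s3://my-bucket/landing/events/")

    // Transform: basic cleansing plus a derived partition column
    val cleaned = raw
      .filter(col("event_id").isNotNull)
      .withColumn("event_date", to_date(col("event_ts")))

    // Load: append into a partitioned Hive table for consumption
    cleaned.write
      .mode("append")
      .format("orc")
      .partitionBy("event_date")
      .saveAsTable("analytics.events_clean")

    spark.stop()
  }
}
```

In a pipeline like the ones described above, a job of this shape would typically be scheduled per day or per hour, with the partition column keeping downstream consumption queries cheap.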