We are looking for a skilled Data Engineer with strong expertise in PySpark and Hadoop to build and manage scalable data pipelines and support large-scale data processing.
Key Responsibilities:
Design, develop, and maintain scalable data pipelines using PySpark
Work with the Hadoop ecosystem for distributed data processing and storage
Develop and optimize Python-based data workflows
Schedule, monitor, and manage workflows using Airflow
Collaborate with cross-functional teams to ensure data availability and reliability
Must-have Skills:
Strong hands-on experience with PySpark
Solid knowledge of the Hadoop ecosystem (HDFS, Hive, etc.)
Proficiency in Python programming
Experience with Apache Airflow for workflow orchestration
Understanding of data processing, ETL concepts, and large-scale data systems
📌 Data Engineer
🏢 C5i
📍 Bengaluru