Senior Data Engineer (Nellore)

29 Mar | Idexcel | Nellore


Job Description for Senior Data Engineer

Experience: 4 to 8 years

Required Skills: AWS, Python, PySpark, Databricks

Notice Period: Immediate to 15 days

Databricks (Spark)

· Develop scalable ETL/ELT pipelines using PySpark (RDD/DataFrame APIs), Delta Lake, Auto Loader (cloudFiles), and Structured Streaming.

· Optimize jobs: partitioning, bucketing, Z-Ordering, OPTIMIZE + VACUUM, broadcast joins, AQE, checkpointing.

· Manage Unity Catalog: catalogs/schemas/tables, data lineage, permissions, secrets, tokens, and cluster policies.

· CI/CD for Databricks assets: notebooks, Jobs, Repos, MLflow artifacts.

· Build Medallion Architecture (Bronze/Silver/Gold) with Delta Live Tables (DLT) and expectations for data quality.

· Event-driven ingestion: Kafka/Kinesis → Databricks Streaming
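The Auto Loader ingestion pattern above can be sketched as follows — a minimal illustration, assuming a Databricks runtime where the `cloudFiles` source is available; the paths and table names are placeholders:

```python
def autoloader_options(schema_location: str, fmt: str = "json") -> dict:
    """Reader options for Databricks Auto Loader (cloudFiles).

    Option names are Databricks-specific; schema_location stores the
    inferred schema and its evolution history between runs.
    """
    return {
        "cloudFiles.format": fmt,
        "cloudFiles.schemaLocation": schema_location,
        "cloudFiles.inferColumnTypes": "true",
    }


def ingest_bronze(spark, source_path: str, target_table: str, schema_location: str):
    """Incrementally load new files from cloud storage into a Bronze Delta table.

    Requires a Databricks SparkSession; trigger(availableNow=True) drains the
    current backlog and stops, which suits scheduled Jobs.
    """
    (
        spark.readStream.format("cloudFiles")
        .options(**autoloader_options(schema_location))
        .load(source_path)
        .writeStream
        .option("checkpointLocation", f"{schema_location}/_checkpoint")
        .trigger(availableNow=True)
        .toTable(target_table)
    )
```

The checkpoint location is what makes the stream restartable and exactly-once; in practice it lives alongside the schema location in cloud storage.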

Snowflake (DW & ELT)

· Model and implement star/snowflake schemas, data marts, and secure views.

· Performance tuning: clustering keys, micro-partitions, result caching, warehouse sizing, query profile analysis.

· Implement Task/Stream patterns for CDC; external tables for data lakes (S3); Snowpipe for near-real-time ingestion.

· Python/Snowpark for transformations and UDFs; SQL best practices (CTEs, window functions).

· Security: Row Level Security (RLS), Column Masking, OAuth/SCIM, network policies, data sharing (reader accounts).
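The Task/Stream CDC pattern above can be sketched in Snowflake SQL — table, stream, and warehouse names here are hypothetical:

```sql
-- Capture changes on the source table.
CREATE OR REPLACE STREAM orders_stream ON TABLE raw.orders;

-- Merge pending changes every 5 minutes, but only when the stream has data.
CREATE OR REPLACE TASK merge_orders
  WAREHOUSE = etl_wh
  SCHEDULE = '5 MINUTE'
WHEN SYSTEM$STREAM_HAS_DATA('orders_stream')
AS
  MERGE INTO marts.orders AS t
  USING orders_stream AS s
    ON t.order_id = s.order_id
  WHEN MATCHED AND s.METADATA$ACTION = 'INSERT' THEN
    UPDATE SET t.status = s.status, t.updated_at = s.updated_at
  WHEN NOT MATCHED AND s.METADATA$ACTION = 'INSERT' THEN
    INSERT (order_id, status, updated_at)
    VALUES (s.order_id, s.status, s.updated_at);

-- Tasks are created suspended; resume to start the schedule.
ALTER TASK merge_orders RESUME;
```

Consuming the stream inside the MERGE advances its offset, so each change is processed once.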

AWS Data Engineering

· Storage & compute: S3 (lifecycle, encryption, partitioning), EMR (if needed), Lambda, Glue (ETL/Schema registry), Athena, Kinesis (Data Streams/Firehose), RDS/Aurora, Step Functions.

· Orchestration: MWAA/Airflow or Step Functions (error handling, retries, backfills, SLA alerting).
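For the Step Functions side of the orchestration bullet, the retry/error-handling pattern looks roughly like this Amazon States Language fragment, built here as a Python dict — the Glue job name and state names are placeholders:

```python
import json


def retry_policy(max_attempts: int = 3, interval_s: int = 2, backoff: float = 2.0) -> list:
    """Amazon States Language 'Retry' block with exponential backoff."""
    return [{
        "ErrorEquals": ["States.TaskFailed", "States.Timeout"],
        "IntervalSeconds": interval_s,
        "MaxAttempts": max_attempts,
        "BackoffRate": backoff,
    }]


# A single task state that retries transient failures, then routes
# anything unrecoverable to an alerting state.
extract_state = {
    "Type": "Task",
    "Resource": "arn:aws:states:::glue:startJobRun.sync",  # service integration
    "Parameters": {"JobName": "nightly_extract"},          # hypothetical job name
    "Retry": retry_policy(),
    "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "NotifyFailure"}],
    "Next": "Transform",
}

print(json.dumps(extract_state, indent=2))
```

The same `Retry`/`Catch` structure applies to any task state; backfills are typically handled by parameterizing the execution input with a date range.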

· Infra-as-code: Terraform/CloudFormation for reproducible environments (Databricks workspace, IAM, S3, networking).





· Security/compliance: IAM least privilege, KMS, VPC endpoints/private links, Secrets Manager, CloudTrail/CloudWatch, GuardDuty.

· Observability: CloudWatch metrics/logs, structured logging, Datadog/Prometheus (optional), cost monitoring (tags/budgets).

Data Quality, Governance & Security

· Implement unit/integration tests for pipelines (e.g., pytest + Great Expectations + DLT expectations).

· Data contracts and schema evolution; monitor SLA/SLO; DQ dashboards (missingness, drift, freshness, completeness).

· PII handling: tokenization/pseudonymization, field-level encryption, adherence to KYB/KYC data flows; audit trails.

· Cataloging & lineage through Unity Catalog and/or OpenLineage/Purview (if applicable).
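As a plain-Python illustration of the missingness and freshness metrics a DQ dashboard might surface (thresholds and field names are made up; in practice these would be Great Expectations or DLT expectations):

```python
from datetime import datetime, timedelta, timezone


def missingness(rows: list, column: str) -> float:
    """Fraction of rows where `column` is None or absent."""
    if not rows:
        return 0.0
    missing = sum(1 for r in rows if r.get(column) is None)
    return missing / len(rows)


def is_fresh(last_loaded: datetime, max_age: timedelta) -> bool:
    """True if the latest load falls within the allowed staleness window."""
    return datetime.now(timezone.utc) - last_loaded <= max_age


# Example: one null email and one missing email out of three rows.
rows = [{"id": 1, "email": "a@x.io"}, {"id": 2, "email": None}, {"id": 3}]
assert missingness(rows, "email") == 2 / 3
```

Checks like these run as pipeline steps and feed the SLA/SLO monitors mentioned above, failing the run or raising an alert when a threshold is breached.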

DevOps & CI/CD

· Git workflows (branching, PR reviews), Databricks CLI/Terraform modules for jobs/clusters/UC, Snowflake DevOps (object versioning via schemachange or SQL-based migration).

· Automated testing in pipelines; feature flags, canary releases for data jobs; rollback strategies.

Client-Facing PoCs & Delivery

· Rapid PoC builds: clearly defined success metrics, cost/performance benchmarks, and a transition plan to production.

· Present architectural decisions, trade-offs (Spark vs Snowflake ELT), and cost projections (Databricks DBU, Snowflake credits, storage egress).

· Produce runbooks, operational playbooks, and knowledge transfer documents for client teams.

Required Technical Skillset

· Databricks: PySpark, Delta Lake, Auto Loader, DLT, Jobs, Unity Catalog, MLflow basics.

· Snowflake: SQL, Snowpipe, Tasks/Streams, Snowpark (Python), warehouse sizing, performance tuning, security policies.

· Python: solid grasp of core DE packages (pandas, pyarrow, pytest), robust error handling, typing, and packaging.

· Orchestration: Airflow DAGs (Sensors, Operators, XCom), Step Functions state machines.

· Streaming & CDC: Kafka/Kinesis, Debezium (nice-to-have), CDC patterns to Delta/Snowflake.

· AWS: S3, Glue, Lambda, Kinesis, IAM/KMS, VPC, CloudWatch; Terraform/CloudFormation.

· Data Modeling: 3NF/dimensional modeling, slowly changing dimensions (SCD Type 2), surrogate keys, and surrogate vs. natural key trade-offs.

· Security & Compliance: encryption at rest/in transit, tokenization, key rotation, audit logging, governance controls.

· Performance & Cost: Spark job tuning, Snowflake warehouse right-sizing, partitioning/clustering, object storage best practices.
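The SCD Type 2 mechanics listed above — close the current record on change, append a new current row — can be sketched in plain Python; in production this would be a Delta or Snowflake MERGE, and the `valid_from`/`valid_to`/`is_current` column names are assumptions:

```python
from datetime import date


def apply_scd2(dim: list, updates: list, key: str, today: date) -> list:
    """Apply SCD Type 2 to a dimension held as a list of dicts.

    When a tracked attribute changes, the existing current row is closed
    (is_current=False, valid_to=today) and a new current row is appended.
    Note: closes records by mutating the input rows, for brevity.
    """
    current = {r[key]: r for r in dim if r["is_current"]}
    out = list(dim)
    for u in updates:
        old = current.get(u[key])
        changed = old is None or any(
            old.get(col) != val for col, val in u.items() if col != key
        )
        if not changed:
            continue  # no attribute change: keep the current row as-is
        if old is not None:
            old["is_current"] = False
            old["valid_to"] = today
        out.append({**u, "valid_from": today, "valid_to": None, "is_current": True})
    return out
```

Surrogate keys would normally be assigned to each appended row so facts can join to the version that was current at load time.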

Nice-to-Have:

· dbt (Snowflake) with tests & exposures; Great Expectations.

· Databricks SQL Warehouses and BI connectivity; Photon engine awareness.

· Lakehouse Federation (UC external locations); Delta Sharing; Iceberg experience.

· Kafka Connect/Debezium, NiFi or MuleSoft (for data integrations).

· Experience in financial services

· Exposure to ISO/IEC 27001 controls in data platforms.

Education & Certifications

· Bachelor’s/Master’s in CS/IT/EE or related.

· Certifications (plus): Databricks Data Engineer Associate/Professional, Snowflake SnowPro Core/Advanced, AWS Solutions Architect/Big Data/DP.
