03 Apr | Cactus Communications | Nellore
Apply on Kit Job: kitjob.in/job/45nplr
Overview:
CACTUS is a remote-first organization, and we embrace an "accelerate from anywhere" culture. You may be required to travel to our Mumbai office based on business requirements or for company/team events.
We are looking for a Data Engineering Lead to architect and manage the large-scale data foundations that power our analytics and AI systems. In this role, you will design robust data pipelines, implement enterprise-grade architectures, and ensure seamless integration across diverse Digital India platforms while maintaining strict governance and security standards. If you are a technical leader who thrives on building scalable ingestion frameworks and optimizing high-performance data environments, this role offers the opportunity to drive strategic data initiatives at a national scale.
Responsibilities:
- Architect and implement scalable data pipelines to support AI/ML models and analytical workloads.
- Define data lake and data warehouse architectures compliant with NeGD and MeitY standards.
- Implement data ingestion frameworks for real-time and batch processing of government datasets.
- Establish robust metadata management, lineage tracking, and data governance frameworks.
- Optimize data storage, compression, and retrieval strategies for large-scale AI applications.
- Collaborate with AI/ML teams to ensure clean, high-quality, and versioned datasets for model training and inference.
- Integrate APIs and connectors to unify data from multiple Digital India platforms.
- Maintain compliance with MeitY’s data retention, privacy, and security policies.
Requirements:
- B.Tech / M.Tech / M.S. in Computer Science, Data Engineering, or related fields.
- Professional certifications in cloud data platforms (AWS, Azure, GCP) are desirable.
- 8–12 years of professional experience in data engineering or data platform architecture.
- At least 4–5 years in designing and managing large-scale data pipelines for analytics or AI systems.
- 5–7 years designing and implementing enterprise data architectures for analytics and AI/ML use cases.
- Experience working with structured, semi-structured, and unstructured data from diverse sources.
Technical Competencies:
- Cloud Services: AWS Redshift, Glue, S3; Azure Synapse; GCP BigQuery and Dataflow.
- Big Data Technologies: Apache Spark, Hadoop, Kafka, Airflow, Databricks, Snowflake, and dbt for data transformation and orchestration.
- Programming: Python, Scala, Java, SQL.
- Databases: PostgreSQL, MongoDB, Cassandra, Elasticsearch.
- Data Management: ETL/ELT design, schema evolution, DataOps, CI/CD pipelines.
- Infrastructure: Docker, Kubernetes, Terraform, Jenkins for data automation.
- Governance: Data cataloging, access control, encryption, and compliance logging.
About Cactus:
Established in 2002, Cactus Communications (cactusglobal.com) is a leading technology company that specializes in expert services and AI-driven products that improve how research gets funded, published, communicated, and discovered. Its flagship brand Editage offers a comprehensive suite of researcher solutions, including expert services and cutting-edge AI products like Mind the Graph, Paperpal, and R Discovery. With offices in Princeton, London, Singapore, Beijing, Shanghai, Seoul, Tokyo, and Mumbai, and a global workforce of over 3,000 experts, CACTUS is a pioneer in workplace best practices and has been consistently recognized as an outstanding place to work.
📌 Data Engineer Lead (Nellore)
🏢 Cactus Communications
📍 Nellore