1. Role Overview
We are looking for a Data Engineer to design, develop, and optimize scalable data pipelines and storage systems. The candidate will work closely with analytics, product, and engineering teams to ensure that high-quality data is available for business intelligence and ML workloads.
2. Key Responsibilities
A. Data Pipeline Development
Build and maintain batch & real-time ETL/ELT pipelines.
Integrate data from multiple sources (APIs, databases, cloud storage, streaming platforms).
Implement data transformation and cleansing logic.
B. Data Architecture & Modeling
Design data lakes, data warehouses, and data marts.
Build optimized schemas, applying normalization or denormalization as the workload requires.
Maintain high data availability and consistency across systems.
C. Cloud & Big Data Technologies
Work with cloud ecosystems (AWS, Azure, or GCP).
Use services such as S3, Redshift, BigQuery, Azure Data Factory, Databricks, and EMR.
Manage distributed frameworks (Spark, Kafka, Hadoop).
D. Performance & Reliability
Monitor pipeline performance and optimize resource usage.
Implement logging, monitoring, and error-handling mechanisms.
Automate pipeline deployments using CI/CD workflows.
3. Qualifications
Required Skills
Strong programming skills in Python or Scala.
Experience with SQL, ETL tools, and cloud services.
Understanding of data structures, distributed computing, and pipeline orchestration (Airflow/Luigi).
Preferred Skills
Experience with dbt, Terraform, or containerization (Docker/K8s).
Knowledge of machine learning pipelines.
4. Salary Range
₹6 LPA - ₹18 LPA (India), depending on experience.