We are seeking an experienced Data Engineer to design, build, and maintain scalable data platforms and pipelines. The ideal candidate has strong hands-on experience across data ingestion, transformation, orchestration, and cloud-based analytics, with a focus on modern lakehouse architectures.
Key Responsibilities
* Design, develop, and maintain end-to-end data pipelines using Python, PySpark, and SQL
* Build and optimize data transformation workflows using dbt on Snowflake
* Develop scalable lakehouse architectures for structured and semi-structured data
* Implement reliable data ingestion frameworks using Kafka, AWS Glue, and custom connectors
* Orchestrate workflows and manage dependencies using Apache Airflow
* Manage cloud infrastructure on AWS (S3, Glue, EMR, Redshift/Snowflake integrations)
* Implement Infrastructure as Code (IaC) using Terraform
* Collaborate with cross-functional teams to deliver analytics-ready datasets
* Ensure data quality, performance optimization, and cost efficiency
* Use GitLab for version control, CI/CD, and collaborative development
* Monitor, troubleshoot, and resolve data pipeline issues in production environments
Required Skills & Qualifications
* 4+ years of experience with AWS data services and cloud-based data engineering
* Strong programming skills in Python and PySpark
* Hands-on experience with Snowflake and dbt for data modeling and transformations
* Solid understanding of SQL for complex analytical queries
* Experience with Apache Airflow for workflow orchestration
* Proficiency in Kafka for real-time/streaming data ingestion
* Experience with AWS Glue/Airflow for ETL development
* Experience with Terraform for infrastructure automation
* Strong experience with GitLab and CI/CD pipelines