with strong expertise in designing, building, and optimizing large-scale data pipelines and data lake/warehouse solutions on AWS. The ideal candidate will have extensive experience in data engineering, ETL development, cloud-based data platforms, and modern data architecture practices.
Key Responsibilities
- Design, build, and maintain scalable data pipelines and ETL workflows using AWS services (a representative sketch follows this list).
- Develop, optimize, and maintain data lake and data warehouse solutions (e.g., S3, Redshift, Glue, Athena, EMR, Snowflake on AWS).
- Work with structured and unstructured data from multiple sources, ensuring data quality, governance, and security.
- Collaborate with data scientists, analysts, and business stakeholders to enable analytics and AI/ML use cases.
- Implement best practices for data ingestion, transformation, storage, and performance optimization.
- Monitor and troubleshoot data pipelines to ensure reliability and scalability.
- Contribute to data modeling, schema design, partitioning, and indexing strategies.
- Support real-time and batch data processing using tools like Kinesis, Kafka, or Spark.
- Ensure compliance with security and regulatory standards (IAM, encryption, GDPR, HIPAA, etc.).
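
To make the day-to-day work concrete, here is a minimal sketch of the kind of batch ETL job this role would own: read raw data from S3, apply simple quality and deduplication rules, and write date-partitioned Parquet to a curated zone. It is illustrative only; the bucket names, dataset, and columns (order_id, event_ts) are hypothetical, and a production job would typically run on AWS Glue or EMR with the team's own schemas, configuration, and orchestration.

```python
from pyspark.sql import SparkSession, functions as F

# Hypothetical S3 locations; real buckets, prefixes, and schemas would
# come from the team's own data lake layout.
RAW_PATH = "s3://example-raw-bucket/orders/"
CURATED_PATH = "s3://example-curated-bucket/orders/"

spark = (
    SparkSession.builder
    .appName("orders-batch-etl")
    .getOrCreate()
)

# Ingest raw JSON records landed by an upstream source.
raw = spark.read.json(RAW_PATH)

# Basic data-quality and transformation steps: drop records without a key,
# deduplicate, and derive a partition column from the event timestamp.
curated = (
    raw
    .dropna(subset=["order_id"])
    .dropDuplicates(["order_id"])
    .withColumn("event_ts", F.to_timestamp("event_ts"))
    .withColumn("event_date", F.to_date("event_ts"))
)

# Write date-partitioned Parquet to the curated zone so downstream
# Athena or Redshift Spectrum queries can prune by date.
(
    curated.write
    .mode("overwrite")
    .partitionBy("event_date")
    .parquet(CURATED_PATH)
)

spark.stop()
```

The same pattern extends to the real-time side of the role, where Spark Structured Streaming (or a Kinesis/Kafka consumer) replaces the batch read.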
Required Skills & Experience
- 6+ years of experience in Data Engineering, with at least 3 years in the AWS cloud ecosystem.
- Strong programming skills in Python, PySpark, or Scala.
- Hands-on experience with AWS services:
  - Data Storage: S3, DynamoDB, RDS, Redshift
  - Data Processing: Glue, EMR, Lambda, Step Functions