We are looking for an experienced Data Engineer to help build the backbone of our enterprise data and AI initiatives. You'll work on modern data lake architectures and high-performance pipelines in AWS, enabling real-time insights and scalable analytics.
This role reports to the Head - Data Platform and AI Lead, offering a unique opportunity to be part of a cross-functional team shaping the future of data-driven innovation.
Key Responsibilities
Data Engineering & Pipeline Development
Design and develop reliable, reusable ETL/ELT pipelines using AWS Glue, Python, and Spark (illustrated in the sketch after this list).
Process structured and semi-structured data (e.g., JSON, Parquet, CSV) efficiently for analytics and AI workloads.
Build automation and orchestration workflows using Airflow or AWS Step Functions.
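To give a flavor of this work, here is a minimal sketch of a Glue PySpark job that lands raw JSON as partitioned, analytics-ready Parquet. The bucket paths, the event_date column, and the job arguments are illustrative assumptions, not references to any real environment.

```python
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

# JOB_NAME is supplied by Glue at run time
args = getResolvedOptions(sys.argv, ["JOB_NAME"])

sc = SparkContext()
glue_context = GlueContext(sc)
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read semi-structured JSON from the raw zone (placeholder path)
raw = glue_context.create_dynamic_frame.from_options(
    connection_type="s3",
    connection_options={"paths": ["s3://example-raw-zone/events/"]},
    format="json",
)

# Light cleanup, then switch to a Spark DataFrame for standard transforms
df = raw.toDF().dropna(how="all")

# Write curated Parquet, partitioned by event date (placeholder column) for fast queries
(
    df.write.mode("overwrite")
    .partitionBy("event_date")
    .parquet("s3://example-curated-zone/events/")
)

job.commit()
```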
Data Lake Architecture & Integration
Implement AWS-native data lake/lakehouse architectures using S3, Redshift, Glue Catalog, and Lake Formation.
Consolidate data from APIs, on-prem systems, and third-party sources into a centralized platform.
Optimize data models and partitioning strategies for high-performance queries (see the sketch after this list).
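As one example of catalog integration, the sketch below registers a partitioned Parquet dataset in the Glue Data Catalog with boto3 so it is queryable from Athena or Redshift Spectrum. The database, table, columns, and S3 location are placeholders.

```python
import boto3

glue = boto3.client("glue")

# Register an external, partitioned Parquet table in the Glue Data Catalog
glue.create_table(
    DatabaseName="curated",
    TableInput={
        "Name": "events",
        "TableType": "EXTERNAL_TABLE",
        "PartitionKeys": [{"Name": "event_date", "Type": "date"}],
        "StorageDescriptor": {
            "Columns": [
                {"Name": "event_id", "Type": "string"},
                {"Name": "payload", "Type": "string"},
            ],
            "Location": "s3://example-curated-zone/events/",
            "InputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
            "OutputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat",
            "SerdeInfo": {
                "SerializationLibrary": "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe"
            },
        },
    },
)
```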
Security, IAM & Governance Support
Ensure secure data architecture practices across AWS components using encryption, access control, and policy enforcement.
Implement and manage AWS IAM roles and policies to control data access across services and users (a brief example follows this list).
Collaborate with platform and security teams to maintain compliance and audit readiness (e.g., HIPAA, GxP).
Apply best practices in data security, privacy, and identity management in cloud environments.
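A hedged sketch of codifying a least-privilege data-access policy with boto3 follows; the policy name, bucket, and prefix are illustrative placeholders only.

```python
import json

import boto3

iam = boto3.client("iam")

# Least-privilege, read-only access to a single curated prefix (placeholder resources)
policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ReadCuratedZoneOnly",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::example-curated-zone",
                "arn:aws:s3:::example-curated-zone/events/*",
            ],
        }
    ],
}

iam.create_policy(
    PolicyName="analytics-read-curated-events",
    PolicyDocument=json.dumps(policy_document),
    Description="Read-only access to the curated events prefix",
)
```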
DevOps & Observability
Automate deployment of data infrastructure using CI/CD pipelines (GitHub Actions, Jenkins, or AWS CodePipeline).
Build Docker-based containers and manage workloads on ECS or EKS.
Monitor pipeline health, failures, and performance using CloudWatch and custom logs (sketched briefly after this list).
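For instance, a pipeline step might publish a custom CloudWatch metric that an alarm can watch. The namespace, metric, and dimension names below are assumptions for illustration.

```python
import datetime

import boto3

cloudwatch = boto3.client("cloudwatch")


def report_pipeline_result(pipeline_name: str, succeeded: bool) -> None:
    """Publish 1 for a successful run, 0 for a failure (placeholder namespace)."""
    cloudwatch.put_metric_data(
        Namespace="DataPlatform/Pipelines",
        MetricData=[
            {
                "MetricName": "RunSucceeded",
                "Dimensions": [{"Name": "Pipeline", "Value": pipeline_name}],
                "Timestamp": datetime.datetime.utcnow(),
                "Value": 1.0 if succeeded else 0.0,
                "Unit": "Count",
            }
        ],
    )


report_pipeline_result("daily-events-load", succeeded=True)
```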
Collaboration & Communication
Partner with the Data Platform Lead and AI Lead to align engineering efforts with AI product goals.
Engage with analysts, data scientists, and business teams to gather requirements and deliver data assets.
Contribute to documentation, code reviews, and architectural discussions with clarity and confidence.
Required Qualifications
Bachelor's degree in Computer Science, Engineering, or equivalent.
5-8 years of experience in data engineering, preferably in AWS cloud environments.
Proficient in Python, SQL, and AWS services: Glue, Redshift, S3, IAM, Lake Formation.
Experience managing IAM roles, security policies, and cloud-based data access controls.
Hands-on experience with orchestration tools like Airflow or AWS Step Functions.
Exposure to CI/CD practices and infrastructure automation.
Strong interpersonal and communication skills, with the ability to convey technical ideas clearly.
Preferred Additional Skills
Proficiency in Databricks, Unity Catalog, and Spark-based distributed data processing.
Background in Pharma, Life Sciences, or other regulated environments (GxP, HIPAA).
Experience with EMR, Snowflake, or hybrid-cloud data platforms.
Experience with BI/reporting tools such as Power BI or QuickSight.
Knowledge of integration tools (Boomi, Kafka) or real-time streaming frameworks.
Ready to build data solutions that fuel AI innovation?
Join Aptus Data Labs and play a key role in transforming raw data into enterprise intelligence.