Industry: Technology, Information and Media
Seniority level: Mid-Senior level
Min Experience: 5 years
Location: Remote (India)
JobType: full-time
We are seeking an experienced and versatile Data Engineer to join our growing data engineering team. In this role, you'll design and build scalable, high-performance data pipelines and infrastructure using modern Azure and big data technologies. You will collaborate closely with cross-functional teams to deliver clean, accessible, and trusted data that powers advanced analytics, AI/ML models, and key business decisions.
Key Responsibilities
Build Scalable Pipelines:
Design and implement robust ETL/ELT pipelines to ingest, transform, and store data from diverse structured and unstructured sources (e.g., APIs, Kafka, MongoDB, cloud services).
Develop Azure Data Solutions:
Use Azure Data Factory, Databricks, and related tools to manage orchestration, transformation, and pipeline deployment.
Architect & Model Data:
Develop scalable data lake and data warehouse solutions using ADLS Gen2, Delta Lake, and Azure SQL DB. Implement optimized data models for analytics and reporting use cases.
Optimize Performance:
Apply best practices for data pipeline performance tuning, resource optimization, and parallel processing using Apache Spark and PySpark.
Enable Automation:
Automate data workflows and validation processes, including test cases for ETL and Big Data pipelines, ensuring reliability and efficiency.
Collaborate Across Teams:
Partner with analysts, data scientists, engineers, and business stakeholders to understand requirements and deliver impactful data products.
Implement Governance & Security:
Ensure adherence to data governance, quality, privacy, and compliance standards. Use tools like Key Vault and DevOps CI/CD for secure and automated deployment.
Monitor & Maintain:
Establish monitoring, alerting, and logging for data pipeline health and quality, proactively resolving any failures or inconsistencies.
What We're Looking For
Hands-on Expertise in:
Programming: Python, PySpark, Scala
Azure: Data Factory, Databricks, Key Vault, DevOps CI/CD
Storage: ADLS Gen2, Delta Lake, Azure SQL DB
Big Data Ecosystem: Apache Spark, Hadoop
Experience With:
Data ingestion from real-time and batch sources such as Kafka and MongoDB
Building secure, reusable, and scalable data infrastructure
Agile methodology and DevOps principles
Automation testing frameworks for ETL/Big Data pipelines
Preferred (but not required):
Exposure to AI/ML workflows and data science teams
Understanding of MLOps and deployment of ML models at scale
Core Competencies
Strong grasp of data modeling, warehousing, and distributed processing
Solid problem-solving and debugging skills
Passion for clean, reliable, and well-documented data systems
Excellent communication and teamwork abilities
Self-starter attitude with a sense of ownership and accountability
Education & Experience
Bachelor's or Master's degree in Computer Science, Engineering, or a related field
5-8 years of relevant experience in data engineering roles with cloud and big data tech stacks
Beware of fraud agents! Do not pay money to get a job.
MNCJobsIndia.com will not be responsible for any payment made to a third party. All Terms of Use apply.