to join our offshore data engineering team. The ideal candidate will have strong expertise in Azure Databricks, data lake architecture, and ETL/ELT pipeline development. This role involves building, optimizing, and maintaining large-scale data pipelines, collaborating with onshore teams, and ensuring seamless data integration and availability across platforms.
Key Responsibilities:
Design, develop, and optimize data ingestion, transformation, and loading (ETL/ELT) pipelines using Azure Databricks and PySpark.
Implement data engineering solutions that support analytics, BI, and data science use cases.
Integrate data from various sources such as SAP, APIs, Azure Data Lake, SQL databases, and streaming platforms (Kafka, Event Hubs).
Work closely with onshore architecture, analytics, and business teams to translate requirements into scalable data models.
Optimize Databricks clusters, Spark jobs, and Delta Lake performance.
Develop reusable components, frameworks, and standard practices for data pipelines and workflows.
Ensure data governance, security, and compliance best practices are followed.
Participate in code reviews, testing, and deployment automation using CI/CD tools.
Troubleshoot data issues and ensure data quality and integrity.
Required Skills & Experience:
6+ years of hands-on experience in data engineering.
Strong working experience with Azure Databricks, PySpark, and Spark SQL.
Experience with Azure Data Factory (ADF), Azure Synapse, or Azure Data Lake Storage (ADLS).
Solid understanding of Delta Lake, Lakehouse architecture, and data partitioning strategies.
Proficiency in SQL and experience working with relational and non-relational databases.
Hands-on experience with CI/CD pipelines, Git, DevOps, and data versioning.
Experience with data modeling, schema design, and performance tuning.
Familiarity with ETL frameworks, data governance, and monitoring tools.
Excellent communication and collaboration skills to work effectively with distributed teams.
Preferred / Nice-to-Have Skills:
Knowledge of Airflow or Databricks Workflows for orchestration.
Exposure to streaming data (Kafka, Spark Streaming).
Experience with Python libraries for data processing and automation.
Familiarity with Power BI or other visualization tools for data validation and reporting.
Education:
Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
Relevant certifications (e.g., Azure Data Engineer Associate, Databricks Certified Data Engineer) are a plus.